KLM-DAQ Performance, Issues, & Progress

#### **Chris Ketter, 26 August 2019**



#### Overview



- Reminder of KLM and KLM DAQ scheme
- Scintillator firmware performance (since start of phase 3)
- New software for front-end control
- KLM DAQ issues in phase 3
- Reviving scintillator channels
- Constrained pseudo-random firmware verification
- Scintillator waveform readout

# Reminder – KLM Detector





The KLM consists of large-area thin planar detectors interleaved with the iron plates of the 1.5 T solenoid's flux return yoke.

Figure labels: backward endcap, barrel, forward endcap.

Slide courtesy of Leo Piilonen


### Reminder – RPCs





#### Slide courtesy of Leo Piilonen

# Scintillator Upgrade



- Inner two barrel layers and entire endcaps replaced with plastic scintillators in Belle II
- TiO2-coated PVT plastic scintillators with wavelength-shifting fibers
- Hamamatsu S10362-13-050c MPPCs





### Reminder – KLM DAQ






#### TARGETX Trig. Bit Encoding



- 5 trigger bits provided by each TARGETX (15 channels)
- Single hit (TB5 low):
  - Remaining 4 trigger bits encode hit channel number (one-based counting)
- Multi-hit (TB5 high):
  - Remaining 4 trigger bits encode groups of 4 channels which may have hits
  - Requires waveform readout to disambiguate which channels were actually hit
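The encoding above can be sketched in Python as a rough illustration. The slides do not give the bit layout, so several things here are assumptions: that TB5 is the most significant bit, that a single-hit code of 0 means "no hit", and that each of the remaining 4 bits flags one group of 4 consecutive channels (the last group holding only channels 13-15). The function name `decode_trigger_bits` is hypothetical.

```python
from typing import List, Tuple

N_CHANNELS = 15  # channels per TARGETX (from the slide)


def decode_trigger_bits(tb: int) -> Tuple[bool, List[int]]:
    """Decode a 5-bit trigger word into (is_multi_hit, candidate channels).

    Channel numbers are one-based, as on the slide.
    """
    if not 0 <= tb < 32:
        raise ValueError("trigger word must fit in 5 bits")
    multi_hit = bool(tb & 0b10000)  # ASSUMPTION: TB5 is the MSB
    low4 = tb & 0b01111             # remaining 4 trigger bits
    if not multi_hit:
        # Single hit: the 4 bits are the channel number.
        # ASSUMPTION: 0 means no channel fired.
        return False, ([low4] if low4 else [])
    # Multi-hit. ASSUMPTION: bit k flags channels 4k+1 .. 4k+4.
    candidates: List[int] = []
    for group in range(4):
        if low4 & (1 << group):
            first = 4 * group + 1
            last = min(first + 3, N_CHANNELS)
            candidates.extend(range(first, last + 1))
    return True, candidates
```

Under these assumptions, `decode_trigger_bits(0b00111)` returns `(False, [7])`, while `decode_trigger_bits(0b10101)` returns `(True, [1, 2, 3, 4, 9, 10, 11, 12])`: the multi-hit case only narrows the hit down to candidate channels, which is why waveform readout is needed to resolve it.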

# Scint. Firmware Performance

- Time-stamping scintillator trigger primitives with 4 ns clock period for KLM trigger
- Time-stamping scintillator DAQ packets with 8 ns clock period
- High rate test with dummy packets (4 hits per module)
  - Trigger up to 50 kHz Poisson w/ 200 ns holdoff
  - Bottleneck was the readout PCs
- KLM regularly joins global DAQ stress tests with 10 kHz Poisson / cosmic trigger and has not had any issues
  - SCROD firmware in "daq" mode, so mostly empty packets
- Latency of ~200 ns between receipt of global trigger and sending DAQ packets
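As a rough cross-check of the high-rate test, the effect of a holdoff on a Poisson trigger can be estimated. This sketch assumes the holdoff behaves as a non-paralyzable dead time, which the slides do not state; `accepted_rate` is a hypothetical helper.

```python
def accepted_rate(poisson_rate_hz: float, holdoff_s: float) -> float:
    """Accepted trigger rate for a Poisson source with a fixed holdoff.

    ASSUMPTION: non-paralyzable dead time, i.e. triggers arriving during
    the holdoff are dropped and do not extend it.
    """
    return poisson_rate_hz / (1.0 + poisson_rate_hz * holdoff_s)


if __name__ == "__main__":
    rate = accepted_rate(50e3, 200e-9)  # 50 kHz Poisson, 200 ns holdoff
    print(f"accepted: {rate:.0f} Hz, lost: {100 * (1 - rate / 50e3):.2f} %")
```

Under this model a 200 ns holdoff removes about 1% of a 50 kHz Poisson trigger.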

#### **KLM SCROD**



# Front-End Control


- SCROD control software is now all Python and runs directly on the COPPERs in a tmux session
  - Prior versions involved a lot of shell scripting and XTerms
- Improved reliability
- Time to send slow control reduced from ~3-4 min to ~5 s
- Slow-control data is now human-readable and easier to modify (thanks to the new register interface on the SCROD and the use of Python functions for reading/writing registers)
  - Prior version involved hammering the front-end with large strings of 32-bit words read from a binary file, which were very difficult to interpret
- New register interface includes a checksum
- New diagnostic script makes it easy to distinguish local link (Aurora) problems from problems on a single SCROD
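To illustrate what a checksummed register write looks like from the Python side, here is a minimal sketch. The real SCROD register interface is not described in these slides, so the frame layout (16-bit address, 16-bit value), the additive 8-bit checksum, and all function names are hypothetical.

```python
import struct
from typing import Tuple


def checksum(payload: bytes) -> int:
    """HYPOTHETICAL checksum: sum of payload bytes modulo 256."""
    return sum(payload) & 0xFF


def pack_register_write(address: int, value: int) -> bytes:
    """Build a frame: big-endian 16-bit address, 16-bit value, 1 checksum byte."""
    payload = struct.pack(">HH", address, value)
    return payload + bytes([checksum(payload)])


def unpack_register_write(frame: bytes) -> Tuple[int, int]:
    """Verify the checksum and return (address, value)."""
    payload, received = frame[:-1], frame[-1]
    if checksum(payload) != received:
        raise ValueError("checksum mismatch")
    address, value = struct.unpack(">HH", payload)
    return address, value
```

The receiver can then reject a corrupted write instead of silently loading a wrong register value.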



## Back-end Issues



- Intermittent BUSY from KLM COPPERs
  - When not recovered by SALS, requires rebooting all HSLBs to restore
  - Observed about once every 2 or 3 weeks
- Intermittent problems with HLT\_KLM (e.g. elog/KLM/37)
  - Once at beginning of phase 3, turned out that old node or port numbers were being used
  - Once at end of Spring run, had to restart HLT\_KLM
- Intermittent lock-up of COPPERs (e.g. elog/KLM/49)
  - Can't ssh into one or more (usually one) of them
  - Requires restarting the COPPER
  - Observed at least twice in Spring run

- Power cycling COPPER crate does not automatically bring up FINESSE
  - Have to reboot ttrx/HSLBs
  - Observed near end of Spring run



# Scintillator DAQ Issues



- Trigger scaler data used for calibration scans is sometimes unreliable
  - Discontinuities in scaler values for consecutive DAC changes (Values are read from SCROD registers via the HSLBs)
  - Implemented new status register autoupdate; still needs testing
- Vasily solved the problem with reprogramming some of the SCRODs
  - PLL in the b2tt module did not lock
  - Tried different multiplication/division factors for the PLL\_BASE parameter until finding a combination that works perfectly
- Intermittent checksum failures when sending slow control to FEEs



# Reviving Scint. Channels



- 13 stations tested with oscilloscope (1 station = 15 channels = 1 TARGETX ASIC)
  - 1st, checked for signals from ribbon cables on top of detector
  - 2nd, if necessary, checked for signals directly from modules
- 1 station fixed by TARGETX daughter card replacement
- 3 stations likely fixed after finding loose/unplugged cables at module side (due to routing of ARICH power)
- Attempted RHIC and daughter card replacement on 2 stations and still could not see signals; need to replace motherboard and/or SCROD. I will bring spares on next visit
- 2 stations had no signals from the modules themselves; unless they magically come back to life, these will remain dead for the foreseeable future
- 3 stations inaccessible without more scaffolding installed
- 2 stations unplugged due to HV shorts
- 1 station missing 3 channels at readout electronics, but okay at module connector; not pursued further at this time
- Barrel scintillator calibration success rate not yet remeasured, but should improve from ~90% to ~95%

# RPC Update / Issues



- Found 5 spare DCs at KEK; Indiana University procured 5 new spares, which have already been shipped to KEK
- RPC lookback window settings should be updated whenever the RPC FW or trigger timing changes
  - At the beginning of phase 3 we did not update the settings and, as a result, did not see RPC hits
  - Kirill Chilikin changed the settings to values determined by eye before run 1640, and hits were recovered
  - Later, the lookback window position was scanned systematically, and optimal settings were determined and applied
- One RPC sector has been down since start of phase 3 due to insufficient crate current
  - Replaced old Belle RPC electronic crates this summer
  - Reinstallation of electronics to be completed next week


# RPC Thresholds

- Work is underway to incorporate single channel thresholds for RPCs
- Archiving new PVs that will be useful for calculating / tuning these thresholds (D. Biswas)
- Working on a new tool to flash RPC front-end readout FPGAs through the jtagft utility (INFN)





# KLM Trigger



- Trigger FW for BKLM and EKLM works stably, is configurable, and provides a lot of debugging information
- Simulation and NSM2 modules exist
- BKLM UT3 is connected to GRL and GDL
- EKLM UT3 is connected to GDL; no connection to GRL is planned at the moment
- Trigger is not usable because of large trigger jitter
- Work to fix the trigger delay to a constant value is in progress
- Progress is slow because the developer has limited time; documentation needed to bring in more developers is in progress
- Details to follow in Dmitri Liventsev's talk in the trigger session

# Firmware Verification



- Group from PNNL (Lynn Wood, Eric Becker, Mitch Mannino) performing pseudo-random verification on Data Concentrator and SCROD firmware
  - Lynn is refining the combined Data Concentrator / RPC Front End / SCROD simulation
  - Eric & Mitch are writing the test harnesses for pseudo-random verification
- Verification is a continuous process, as Belle II will run for years and odd inputs will eventually occur during measurements!



#### Firmware Verification


- **Complementary** to targeted simulations and testing directly on the hardware
- Focuses on more "high-level" testing than just confirming expected behavior
  - **Code coverage:** are all lines of code tested in the simulations?
  - Functional coverage: are all possible (correct) inputs tested?
  - Constrained random inputs: send in random and/or unexpected (even incorrect) data and check that the system handles it properly
  - Vary event sequence: randomize triggers, reads, writes, etc. to look for issues with different sequences
- Run simulations 24/7 to find corner cases
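The ideas above (constrained random inputs, functional coverage, checking against expected behavior) can be illustrated with a small Python sketch. This is not the PNNL test harness, which targets VHDL firmware; the unit under test here is a toy hit-packing function with a hypothetical 16-bit word layout, and all names are made up for illustration.

```python
import random
from typing import Set, Tuple


def pack_hit(channel: int, time: int) -> int:
    """Toy unit under test: pack a hit into a 16-bit word (HYPOTHETICAL layout)."""
    if not 1 <= channel <= 15:
        raise ValueError("channel out of range")
    if not 0 <= time < 4096:
        raise ValueError("time out of range")
    return (channel << 12) | time


def unpack_hit(word: int) -> Tuple[int, int]:
    """Inverse of pack_hit, used as the reference check."""
    return (word >> 12) & 0xF, word & 0xFFF


def run_constrained_random(n_tests: int, seed: int = 1) -> Tuple[Set[int], int]:
    """Drive pack_hit with random legal and illegal stimuli.

    Returns (channels covered by legal tests, number of rejected stimuli).
    """
    rng = random.Random(seed)        # seeded so any failure is reproducible
    covered: Set[int] = set()        # functional-coverage bins: one per channel
    rejected = 0
    for _ in range(n_tests):
        legal = rng.random() < 0.8   # constraint: roughly 80% legal stimuli
        channel = rng.randint(1, 15) if legal else rng.choice([-1, 0, 16, 255])
        time = rng.randint(0, 4095)
        try:
            word = pack_hit(channel, time)
        except ValueError:
            assert not legal, "legal input was rejected"
            rejected += 1
            continue
        assert legal, "illegal input was accepted"
        assert unpack_hit(word) == (channel, time)
        covered.add(channel)
    return covered, rejected
```

Running this in a loop with new seeds is the software analogue of running simulations around the clock: the coverage set shows which legal inputs have actually been exercised, and the illegal stimuli check that bad data is handled rather than propagated.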



#### Firmware Verification


- Using advanced tools and libraries to perform verification tasks
  - ModelSim: code coverage, fast sims
  - OSVVM: functional coverage
  - UVVM: constrained random inputs
- Using python scripts to auto-generate code "wrappers" around the BKLM modules
  - Much faster than writing them by hand
- Currently focusing on data concentrator modules in the trigger path
  - Verifying lower-level modules, moving up to higher-level ones
  - So far have found several minor issues that are not affecting current data taking, but could cause problems in the future
- Will integrate (in some way) with front-end simulations to see full data flow from end-to-end



## Waveform Readout

- TARGETX reads out all 15 channels simultaneously
- Needed to disambiguate multi-hits
- Can improve leading edge timing to ~1 ns
- Subtract pedestals for best results
- Known issue of preamp saturation has to be dealt with
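A minimal Python sketch of pedestal subtraction followed by leading-edge timing is given below. It is not the firmware algorithm: the linear interpolation between samples, the function name, and the default sample period of 1 ns are assumptions made for illustration.

```python
from typing import Optional, Sequence


def leading_edge_time(
    samples: Sequence[float],
    pedestals: Sequence[float],
    threshold: float,
    sample_period_ns: float = 1.0,  # ASSUMPTION: not taken from the slides
) -> Optional[float]:
    """Return the threshold-crossing time in ns, or None if there is no crossing.

    Pedestals are subtracted sample by sample, then the first upward
    crossing is located by linear interpolation between adjacent samples.
    """
    corrected = [s - p for s, p in zip(samples, pedestals)]
    for i in range(1, len(corrected)):
        below, above = corrected[i - 1], corrected[i]
        if below < threshold <= above:
            frac = (threshold - below) / (above - below)
            return (i - 1 + frac) * sample_period_ns
    return None
```

Interpolating between samples is what lets the timing be finer than the sampling period, and it only works well once per-sample pedestal offsets have been removed.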



#### Scintillator Upgrade Path

- Logical path forward, in terms of ease of implementation:
  - 1. Implement waveform readout with pedestal-less threshold checking for multi-hit disambiguation.
    - Sets up required framework for waveform readout with more complex feature extraction later.
    - Without pedestals timing is not likely to be very good, but can see if it improves relative to current 8 ns.
  - 2. Implement pedestal subtraction to achieve ~1 ns leading edge timing.
  - 3. Add amplitude information.
- Schedule:
  - Target item 1 ready for fall running, item 2 as "stretch goal."
  - Schedule for item 3 to be assessed as work continues.
    - Recall that we have a saturation issue on preamps that may limit our ability to measure amplitudes.

Slide courtesy of Kurtis Nishimura

June 2019 B2GM Scintillator FW Flow Diagram





# WaveformProcessing.vhd

- This module and everything contained therein is the focus of my work
- Working closely with Vasily Shebalin to ensure it can be easily integrated into existing firmware
- We'll meet at KEK in second week of Sept. to have a firmware-merger powwow
- Still targeting Fall 2019 run to implement simplified waveform readout with pedestal-less threshold checking for multi-hit disambiguation



#### Pedestal Bottleneck


- Pedestals stored on Cypress CY62177EV30 CMOS static RAM
  - Configured as 4M × 8 static RAM (SRAM) (22-bit address space, 8-bit values)
- 55 ns read time yields 10.6 us per hit if using the same ROI as older firmware
- Strategies being implemented for improvement:
  - 8-bit peds: 10 us → 7 us
    - Requires optimization of mean values to make it work at all
    - Requires outlier detection in feature extraction in case even one sample rolls over the MSB
  - Read out a smaller ROI (e.g. 32 or 64 samples): 10 us → 2.5-5 us
    - Utilize 4-bit time-stamping in conjunction with sample-select lines to dynamically select ROI
- Have to cap occupancy at 2 hits / SCROD / event for pedestal subtraction
- Cluster events with higher occupancy are treated differently
  - If trigger-bit 5 is high, we perform a threshold check on raw waveforms during digitization in order to verify which TARGETX channels actually have hits, then perform feature extraction on those channels without subtracting pedestals.
  - I anticipate this will degrade timing resolution from < 1 ns to ~1 ns
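The read-time figures above can be reproduced with a short calculation. The slides do not state the ROI length or the pedestal word size, so this sketch assumes a 128-sample ROI with 12-bit pedestals packed into the 8-bit SRAM, and one 55 ns read per byte. These assumptions happen to match the quoted 10.6 us and 7 us, but they are a guess.

```python
READ_TIME_NS = 55  # SRAM read time per byte (from the slide)


def pedestal_read_time_ns(n_samples: int, bits_per_pedestal: int) -> int:
    """Time to fetch the pedestals for one hit, one byte per read."""
    n_bytes = (n_samples * bits_per_pedestal + 7) // 8  # round up to whole bytes
    return n_bytes * READ_TIME_NS


if __name__ == "__main__":
    # ASSUMPTION: 128-sample ROI and 12-bit pedestals in the older firmware
    print(pedestal_read_time_ns(128, 12))  # 10560 ns, i.e. ~10.6 us
    print(pedestal_read_time_ns(128, 8))   # 7040 ns, i.e. ~7 us
    print(pedestal_read_time_ns(64, 12))   # 5280 ns, i.e. ~5 us
    print(pedestal_read_time_ns(32, 12))   # 2640 ns, i.e. ~2.6 us
    print(2 ** 22 == 4 * 1024 * 1024)      # True: 22 address bits cover 4M values
```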



# DAQ Upgrade

- Already have one PCIe40 card from LAL in Hawaii
- Oskar Hartbrich helping to procure a server for using the PCIe40
- Will be able to test with KLM (and TOP) on testbenches in Hawaii
- New post-doc Harsh Purwar joining Hawaii group very soon to work on DAQ upgrade
  - He overlapped with Oskar recently at LAL and received some basic training on operation of the PCIe40





### Summary



- RPC readout and (simple) scintillator readout basically running stably
- Work under way to incorporate single-channel thresholds in RPC system
- Work continues to revive dead/missing scintillator channels
- Scintillator waveform readout firmware still under development, still targeting fall run for implementation
- KLM firmware verification is coming together and will be running 24/7 soon

## Questions / Comments?

- KLM and KLM DAQ
- Scintillator Firmware Performance
- Front-End Control
- Back-end DAQ Issues
- Scintillator DAQ Issues
- Reviving Scint. Channels
- RPC Update / Issues
- RPC Thresholds
- KLM Trigger
- Firmware Verification
- Scintillator Waveform Readout
- DAQ Upgrade

#### **Backup Material**

## KLM Testbench

- Commissioned fall '18
- DAQ equivalent to Belle II KLM (other than PocketDAQ)
- Instrumented with:
  - Scintillator readout: 1-layer eq. w/ MPPCs
  - RPC readout: 6-layer eq. w/ 1 DC (no test signals available)
- Has proven instrumental in debugging new scint. firmware releases and testing of control & calibration software



Figure labels: HV power supply; backplane; 6U custom backplane (power only).