### Status of ECL trigger upgrade

2025/10/22-24 TRG/DAQ workshop Y.Unno

## Upgrade plan

- (A) Clustering is based on **Graph Neural Network**
- (B) Change **granularity** of seed information in ECL trigger system
  - (base scenario) crystal by crystal (1x1)
  - (second scenario) 2x1 or 2x2



- Upgrade of the ShaperDSP and the downstream electronics is required.
  - Data size will be at least 16 times larger for 1x1 than for TC (4x4)
- Latency requirement will be 9 us(?) (4.4 us in the current system)

## Upgrade plan

#### (TC base) Current system



#### (1x1 base) Config A



#### (1x1 base) Config B



# Old and New ShaperDSP



#### Crystal E and T reconstruction on **new ShaperDSP**



$$\begin{pmatrix} A \\ B \\ P \end{pmatrix} = \begin{pmatrix} \sum_{i} C_{i}^{Ak} y_{i} \\ \sum_{i} C_{i}^{Bk} y_{i} \\ \sum_{i} C_{i}^{Pk} y_{i} \end{pmatrix}$$

where $A$ is the amplitude and $P$ is the pedestal.

- $y$ is data
- $f$ is signal PDF
- $S$ is noise covariance matrix
- $C$ is calculated from $f$ and $S$
- The same E and T reconstruction logic used for TC on FAM can be applied to crystals, but:
  - Pulse height is lower than in the TC case
    - Energy threshold will be lower than in the TC case (100 MeV)
  - Number of bits for the amplitude will be larger => difficult to meet timing constraints
    - If DSP is used for the division, (probably) no problem.
  - An optimization study is needed (from simulation at first)
    - Number of sampling points, number of bits for the amplitude, E threshold, noise level, timing closure
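
A minimal sketch of where $C$ comes from, assuming the same $\chi^2$ as in the current FAM fit, that $f'$ denotes the time derivative of the signal shape, and that $k$ labels the shape (phase) hypothesis. The matrix $F$ (whose last column is all ones) and the vector $\theta$ are notation introduced here only:

$$\chi^{2} = (y - F\theta)^{T} S^{-1} (y - F\theta), \qquad F = \begin{pmatrix} f^{k} & f^{\prime k} & 1 \end{pmatrix}, \qquad \theta = \begin{pmatrix} A \\ B \\ P \end{pmatrix}$$

$$\frac{\partial \chi^{2}}{\partial \theta} = 0 \;\Rightarrow\; \theta = (F^{T} S^{-1} F)^{-1} F^{T} S^{-1}\, y \equiv C^{k} y$$

Under these assumptions $C^{k}$ is a $3 \times n$ matrix fixed by $f$ and $S$, so it can be precomputed and the online work reduces to three weighted sums of the samples.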

### new ShaperDSP

- Data transmission from new ShaperDSP to Collector
  - 1 crystal = 1 (hit) + 7 (timing, LSB = 1 ns) + 18 (energy, LSB = 0.05 MeV) = 26 bits
  - 16 crystals = $26 \times 16 = 416$ bits
    - 416 bits x 66/64 (64B/66B encoding) x 8 MHz = ~3.5 Gbps
    - => 1 GTH is enough for 1 ShaperDSP
      - 1 GTH is enough for both ECL and TRG data
    - => 12 GTH are required at the Collector
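
The link budget above can be reproduced with a few lines. This is only a sketch of the arithmetic: the field widths, LSBs, 8 MHz rate and 66/64 factor are taken from the bullets, while the variable names and the derived full-scale ranges are illustrative additions.

```python
# Field widths in bits, taken from the data format above.
HIT_BITS, TIMING_BITS, ENERGY_BITS = 1, 7, 18
TIMING_LSB_NS = 1.0
ENERGY_LSB_MEV = 0.05
CRYSTALS_PER_SHAPERDSP = 16
WORD_RATE_MHZ = 8.0           # one set of crystal words per 8 MHz cycle
ENCODING_OVERHEAD = 66 / 64   # 64B/66B line encoding

bits_per_crystal = HIT_BITS + TIMING_BITS + ENERGY_BITS
bits_per_shaperdsp = bits_per_crystal * CRYSTALS_PER_SHAPERDSP
rate_gbps = bits_per_shaperdsp * ENCODING_OVERHEAD * WORD_RATE_MHZ * 1e6 / 1e9

# Full-scale ranges implied by the bit widths and LSBs.
timing_range_ns = (2 ** TIMING_BITS) * TIMING_LSB_NS
energy_range_gev = (2 ** ENERGY_BITS) * ENERGY_LSB_MEV / 1000.0

print(f"{bits_per_crystal} bit/crystal, {bits_per_shaperdsp} bit/ShaperDSP")
print(f"link rate  : {rate_gbps:.3f} Gbps")
print(f"time range : {timing_range_ns:.0f} ns")
print(f"E range    : {energy_range_gev:.2f} GeV")
```

The result is about 3.43 Gbps per ShaperDSP, consistent with the ~3.5 Gbps quoted above. The 7-bit timing field spans 128 ns and the 18-bit energy field spans about 13.1 GeV.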

#### Noise

- 1 ADC ~ 5 MeV
- TC E threshold = 19 ADC



- The ECL trigger has suffered from noise (especially in the endcap)
  - (A) ARICH FTSW => fixed at the end of 2017.
  - (B) TPC => TPC was removed at the end(?) of 2018.
  - (C) ECL => partially fixed by adjusting the connection between PD and PA and the grounding
  - (D) Unknown source
- In the high-granularity case (1x1, 1x2, 2x2),
  - Energy threshold will be lower than the 100 MeV of the TC base (4x4)
  - Noise effects need to be studied carefully and countermeasures prepared

#### Collector

- Design of the Collector depends on the downstream configuration
  - Number of mergers or number of GNN modules
- (A) Single Collector for both ECL and TRG, or (B) separate Collectors?
  - (A): 52 Collectors in total
  - (B): 104 Collectors in total
    - Unified design is better for R&D, cost, and maintenance
- No large logic resource, memory, or DSP is needed, but GTY is needed
  - (With many mergers or GNN modules, GTH would be OK)
  - UT4 is a good candidate
    - but a bit too expensive for 52 (104) Collectors
    - For the 1st test bench, UT4 is used at B2.
- Design has not started yet

#### **GNN-ETM**

| Granularity    | 4x4        | 1x1          | Comment           |
|----------------|------------|--------------|-------------------|
| Module         | UT4(VU190) | UT5(VP1802)  |                   |
| Data           | 576 TC     | 8736 crystal | 16 times larger   |
| Logic resource | 2350 K     | 7352 K       | 3 times larger    |
| (Used)         | (15%)      |              |                   |
| Memory         | 132Mb      | 994Mb        | 7 times larger    |
| (Used)         | (10%)      |              |                   |
| DSP            | 1800       | 14352        | 8 times larger    |
| (Used)         | (30%)      |              |                   |
| Latency budget | 1us        | 5us(?)       | 5(?) times longer |

- "(Used)" is resource consumption by reduced network firmware.
  - Performance is lower than in the ideal case
- Processing all 8736 crystal data with a single module is difficult
  - (Perhaps possible if we accept lower performance?)
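
As a quick cross-check of the "times larger" column, the ratios can be computed directly from the table. The dictionary keys are illustrative names; the numbers are copied from the table.

```python
# Values copied from the table above (UT4/VU190 vs UT5/VP1802).
ut4 = {"data": 576, "logic_k": 2350, "memory_mb": 132, "dsp": 1800}
ut5 = {"data": 8736, "logic_k": 7352, "memory_mb": 994, "dsp": 14352}

ratios = {key: ut5[key] / ut4[key] for key in ut4}

for key, ratio in ratios.items():
    print(f"{key:10s} x{ratio:.1f}")
```

The input data grows by about a factor of 15, while logic, memory and DSP grow by roughly 3, 7.5 and 8, which is in line with the concern that a single module is hard to fit.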

# GNN-ETM configuration

#### (1x1 base) Config A



- Data format
  - 1 crystal = 1 (hit) + 7 (timing, LSB = 1 ns) + 18 (energy, LSB = 0.05 MeV) = 26 bits
    - => 2155 Gbps is required in total
- Without merger
  - 104 GTY are required at UT5
    - => 2 additional daughter boards are required.
- With merger (e.g. 7 UT4)
  - 78 GTY are required at UT5
    - => 1 additional daughter board is required.
- GTY is required at the Collector (VirtexU, ArtixU+, KintexU+)
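
A rough sketch of the average line rate implied by the numbers above, assuming (this is not stated on the slide) that the quoted total of 2155 Gbps is spread evenly over the GTY links at UT5. The function name is illustrative.

```python
TOTAL_GBPS = 2155.0  # total bandwidth quoted above

def per_link_gbps(n_links, total_gbps=TOTAL_GBPS):
    """Average rate per transceiver if the load is spread evenly (assumption)."""
    return total_gbps / n_links

for label, n_links in [("without merger", 104), ("with merger", 78)]:
    print(f"{label}: {n_links} GTY -> {per_link_gbps(n_links):.1f} Gbps per link")
```

Under this assumption the average is about 20.7 Gbps per link without a merger and about 27.6 Gbps with one, which is why GTY-class transceivers are needed in this configuration.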

# GNN-ETM configuration

#### (1x1 base) Config B

(Figure: Config B block diagram. Recoverable labels: Detector (CsI(Tl)+PD+PreAmp), SDSP, Collector, GNN(X) on UT5, ETM(1), GRL(1), GDL(1), PCIe40(1); counts shown: (8736), (576), (52); "52 VME around detector"; E-hut.)

- Data format
  - 1 crystal = 1 (hit) + 7 (timing, LSB = 1 ns) + 18 (energy, LSB = 0.05 MeV) = 26 bits
    - => 2155 Gbps + alpha is required in total
- 2 GNN
  - 52 GTY (at least) are required for each UT5
    - 1 additional daughter board (or QSFP-DD 200G) is needed
- 3 or more GNN
  - UT5 is OK (no additional board for each UT5)
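
To make the 26-bit crystal word concrete, here is a minimal packing sketch. The field widths and LSBs come from the data format above; the bit ordering (hit in the MSB, then timing, then energy), the saturation behaviour and the function names are assumptions for illustration, not the actual firmware format.

```python
TIMING_BITS, ENERGY_BITS = 7, 18
TIMING_LSB_NS, ENERGY_LSB_MEV = 1.0, 0.05

def _quantize(value, lsb, n_bits):
    """Round to the nearest LSB and saturate to the unsigned n_bits range."""
    return min(max(int(round(value / lsb)), 0), (1 << n_bits) - 1)

def pack_crystal(hit, timing_ns, energy_mev):
    """Pack one crystal into a 26-bit word: [hit | timing | energy] (assumed order)."""
    t = _quantize(timing_ns, TIMING_LSB_NS, TIMING_BITS)
    e = _quantize(energy_mev, ENERGY_LSB_MEV, ENERGY_BITS)
    return (int(bool(hit)) << (TIMING_BITS + ENERGY_BITS)) | (t << ENERGY_BITS) | e

def unpack_crystal(word):
    """Inverse of pack_crystal; returns (hit, timing_ns, energy_mev)."""
    e = word & ((1 << ENERGY_BITS) - 1)
    t = (word >> ENERGY_BITS) & ((1 << TIMING_BITS) - 1)
    hit = (word >> (TIMING_BITS + ENERGY_BITS)) & 1
    return bool(hit), t * TIMING_LSB_NS, e * ENERGY_LSB_MEV

word = pack_crystal(True, 42.0, 123.45)
print(f"word = 0x{word:07x}", unpack_crystal(word))
```

Energies above the 18-bit full scale saturate in this sketch; whether the real format saturates or flags overflow is an open design choice.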

# GNN-ETM configuration

- Multiple GNN modules are a reasonable solution (at present)
  - e.g. divide into 2 regions in theta with some overlap



- Need to study how large an overlap region is necessary
  - Need to check the additional resources required for each scenario



- If the number of GNN modules is larger, GTH can be used at the Collector
  - Collector will be cheaper (but total cost for GNN is higher...)
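
The overlap question can be explored with a toy partition of theta rings. This is a sketch only: the ring count of 69, the overlap of 3 rings and the function name are illustrative assumptions, and it ignores that rings hold different numbers of crystals.

```python
def split_with_overlap(n_rings, n_regions, overlap):
    """Return [start, stop) ring ranges, each widened by `overlap` rings at interior edges."""
    edges = [i * n_rings // n_regions for i in range(n_regions + 1)]
    regions = []
    for i in range(n_regions):
        start = max(edges[i] - overlap, 0)
        stop = min(edges[i + 1] + overlap, n_rings)
        regions.append((start, stop))
    return regions

N_RINGS = 69   # illustrative number of theta rings
regions = split_with_overlap(N_RINGS, n_regions=2, overlap=3)
processed = sum(stop - start for start, stop in regions)

print("regions:", regions)
print(f"extra rings processed: {processed - N_RINGS} ({processed / N_RINGS - 1:.1%})")
```

Scanning `overlap` and `n_regions` in such a model gives a first estimate of the duplicated input each GNN module has to absorb, before a real study with crystal counts per ring.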

## (Preliminary) To-do list

- ShaperDSP
  - Study of crystal E and T reconstruction on ShaperDSP
    - MC study with electronic noise and beam background
      - E and T resolution for different sampling points and energy thresholds
    - Firmware design
      - timing closure, data format, etc.
  - Design of optical link, etc.
- Collector
  - Design of 1st prototype
    - FPGA, I/O, # of board, schedule, and cost
  - Preparation of test bench (Collector using UT4)
    - Design of optical link and the test
- GNN
  - Performance study for different granularities (1x1, 1x2, 2x2)
  - How many resources are required for each granularity
  - How many boards and how large an overlap region are required
  - How much background reduction power can be achieved, and which parameters (or ideas) are effective
- Clearer and better-considered strategy (before June 2026 for the TDR?)
  - Schedule, human resources, cost, etc.

# Backup

#### TC E and T reconstruction on FAM

In current FAM case,

$$\chi^{2} = \sum_{ij} (y_{i} - Af_{i}^{k} - Bf_{i}^{\prime k} - P)S_{ij}^{-1}(y_{j} - Af_{j}^{k} - Bf_{j}^{\prime k} - P)$$



$$\begin{pmatrix} A \\ B \\ P \end{pmatrix} = \begin{pmatrix} \sum_{i} C_{i}^{Ak} y_{i} \\ \sum_{i} C_{i}^{Bk} y_{i} \\ \sum_{i} C_{i}^{Pk} y_{i} \end{pmatrix}$$

where $A$ is the amplitude and $P$ is the pedestal.

- $y$ is data
- $f$ is signal PDF
- $S$ is noise covariance matrix
- $C$ is calculated from $f$ and $S$
- Perform chi2 fit on 12 sampling points (4 for pedestal, 8 for signal)
- Repeated every 127 ns with the previous fit results as initial parameters
- If A and T meet some conditions, they are sent to TMM as TC energy and timing
  - Requires 100 MeV energy threshold
  - TC energy (12 bit with LSB ~ 5 MeV) and timing (7 bit with LSB = 1 ns)
- All calculations run at 256 MHz since everything must be done within 127 ns
  - Division is required in the calculation and is done with LUT (w/o DSP).
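
The linear fit above can be sketched numerically. This is an offline floating-point illustration, not the FAM firmware: it uses the standard weighted least-squares solution of the chi2 for (A, B, P), and the pulse shape, noise model and function name are toy assumptions.

```python
import numpy as np

def fit_coefficients(f, fprime, S):
    """Return the (3, n) matrix whose rows are C^A, C^B, C^P for one shape hypothesis.

    Minimising chi2 = (y - F theta)^T S^-1 (y - F theta) with F = [f, f', 1]
    gives theta = (F^T S^-1 F)^-1 F^T S^-1 y.
    """
    F = np.column_stack([f, fprime, np.ones(len(f))])
    Sinv_F = np.linalg.solve(S, F)                  # S^-1 F (S is symmetric)
    return np.linalg.solve(F.T @ Sinv_F, Sinv_F.T)  # (F^T S^-1 F)^-1 F^T S^-1

# Toy inputs: 12 samples, the first ones before the pulse (pedestal region).
n = 12
t = np.arange(n, dtype=float)
x = np.clip(t - 4.0, 0.0, None) / 2.0
f = x * np.exp(1.0 - x)                  # toy pulse shape, peak value 1
fprime = np.gradient(f, t)               # numerical derivative of the shape
S = 1.5**2 * 0.3 ** np.abs(t[:, None] - t[None, :])  # toy correlated noise

C = fit_coefficients(f, fprime, S)

# Noise-free closure test: the fit must return the injected parameters.
y = 50.0 * f + 3.0 * fprime + 7.0
A, B, P = C @ y
print(f"A = {A:.3f}, B = {B:.3f}, P = {P:.3f}")
```

Because `C` depends only on the shape and the noise covariance, it can be precomputed and the online step is three dot products. If the time shift is then taken from the ratio B/A (an assumption about the linearisation), that ratio is where the division mentioned above enters.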

# Current ECL trigger



- ShaperDSP
  - 4x4 = 16 crystal analog signals are summed to make a single analog TC signal
  - 576 analog TC signals are generated on 576 ShaperDSP and sent to FAM
- FAM
  - Digitization of analog TC data from ShaperDSP at 8 MHz
  - Measure TC E and T with chi2 fit on the digitized waveform every 127 ns
  - Apply 100 MeV threshold to each TC
- ICN-ETM
  - From 576 TC data, perform clustering and calculate trigger bits
- GNN-ETM
  - From 576 TC data, reconstruct clusters with Graph Neural Network
    - => expect much better resolution of cluster energy, timing and position with crystal data instead of TC