TPC QA 2024

Goals and to do

Aim is to make QA more automatic at the FEE and bad-pad level.
The procedures to run ADC files last year seemed good. Improvements are needed in the following areas:

Update AnaQA.C and the associated cron job to execute the task both for dead channels (FEEs, RDOs, and single ones; include TPX too)
and for hot channels, to produce a time sequence of changes.
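A time sequence of changes amounts to diffing the dead-channel set of each run against the previous one. A minimal sketch of that diff (the `chanId` encoding and function names are illustrative, not the actual AnaQA.C code):

```cpp
#include <algorithm>
#include <iterator>
#include <set>
#include <utility>
#include <vector>

// One (sector, row, pad) channel, encoded as a single int for easy set math.
inline int chanId(int sector, int row, int pad) {
    return sector * 100000 + row * 1000 + pad;
}

// Compare the dead-channel sets of two consecutive runs.
// Returns {newly dead channels, recovered channels}.
std::pair<std::vector<int>, std::vector<int>>
diffDead(const std::set<int>& prev, const std::set<int>& curr) {
    std::vector<int> newlyDead, recovered;
    std::set_difference(curr.begin(), curr.end(), prev.begin(), prev.end(),
                        std::back_inserter(newlyDead));  // in curr, not in prev
    std::set_difference(prev.begin(), prev.end(), curr.begin(), curr.end(),
                        std::back_inserter(recovered));  // in prev, not in curr
    return {newlyDead, recovered};
}
```

Running this over the run list in order yields the per-run change log rather than a full dump per run.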

The hot channel task was tweaked and looks fine.

Hot channels for cosmics April 2024

It is difficult to get enough statistics for cosmic data runs. Two sets of runs with plenty of ADC data were taken, thanks to Jeff:
104032-034 for the FF field setting and
107059 for the RFF setting.

The analysis result for channels to be masked permanently is

Sector  Row  Pad
     4   72  110
     6   61   58
    11    3   25
    22   56   71
    23   21   15
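For applying this list, a simple lookup table keyed on (sector, row, pad) is enough. A sketch, using the five pads above (the container and function names are illustrative, not an existing interface):

```cpp
#include <set>
#include <tuple>

// Pads to be masked permanently, from the April 2024 cosmics analysis above.
static const std::set<std::tuple<int, int, int>> kMaskedPads = {
    {4, 72, 110}, {6, 61, 58}, {11, 3, 25}, {22, 56, 71}, {23, 21, 15},
};

// True if this (sector, row, pad) is on the permanent mask list.
bool isMasked(int sector, int row, int pad) {
    return kMaskedPads.count({sector, row, pad}) > 0;
}
```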

April 18 2024

Shift crew noticed a higher-yield area in run 109020 (cosmic). It looks like part of a FEE is misbehaving, possibly in
one event or a few; unlikely in all events. Sector 7; not present in the following run.

April 28
Something is not right in TPX Sector 3, RDO 4. Way too much data causing deadtime. One can see bad data in the online plots.
I power-cycled it just now, so let's see.

Tonko: There seem to have been some electronics issues with the TPX03:4 RDO. This in turn caused a lot of data for that RDO which in turn caused the TPC to dominate the deadtime.
After powercycling the TPC deadtime is fine: 17% at 3.5 kHz.

I need to take a better look to figure out why the checking software didn't cause an auto-recovery. TBD.

Tonko: The first errors started in run 25119018.
Run 25119017 was still fine.
And now run 25119024 is fine again.


Summary of such bad runs

day 133  sector 1  RDO 5   133026-133032  -- reported by Lanny in QA meeting
day 150  sector 21 RDO 5   150009-150014  -- should be masked (sector 21 TPX RDO 5)
day 150  sector 2  RDO 6   150052-150053

May 3
Tonko in shiftlog - next good run is 25124008 (pedestal run)

Looked at various masked RDOs and unmasked all apart from iS12-1 which looks bad and won't be enabled in this run.
iS12-2 -- might still be problematic but I can't capture an error now
iS05-1 -- power issues, had to kill FEEs
iS19-2 -- timeouts, had to kill a few FEEs
iS24-3 -- power issues, had to kill FEEs 


Day 129
One iTPC RDO was causing problems: iS09:4, in multiple consecutive runs. Had to mask it out.
Failed in run 192010

Day 131
TPX sector 1 RDO 5 failed to load proper pedestals
-- high deadtime due to data volume. Bad pedestals from the previous pedestal run; retaken and reloaded.

May 28, 2024
First version of an easier turnaround for keeping up with TPC/TPX status.
First, changed the scripts to digest the iTPC status dead channels.
Updated AnaQA.C to generate a brief summary file with just run#, #failed RDOs, failed FEEs, and dead iTPC channels.
These are saved in itpcplot per run. PoltiTPC.C reads all the files generated, based on a runlist.txt file. Output plot is here; latest update 7/8/24.
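A brief summary file like that is easiest to digest downstream if each run is one whitespace-separated line. A sketch of the parsing step, assuming a hypothetical line layout (`<run> <nFailedRDO> <nFailedFEE> <nDeadChannels>`); the actual format written by AnaQA.C may differ:

```cpp
#include <sstream>
#include <string>

// One parsed line of the per-run brief summary file.
struct RunSummary {
    int run = 0, nFailedRdo = 0, nFailedFee = 0, nDeadChan = 0;
};

// Parse "<run> <nFailedRDO> <nFailedFEE> <nDeadChannels>".
// Returns false if the line does not contain four integers.
bool parseSummaryLine(const std::string& line, RunSummary& out) {
    std::istringstream in(line);
    return static_cast<bool>(in >> out.run >> out.nFailedRdo
                                >> out.nFailedFee >> out.nDeadChan);
}
```

A plotting macro would loop over the runs named in runlist.txt, call this per line, and fill the counts versus run number.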

May29/30 from Tonko's google doc

  • iS01-1 masked but looks OK – unmasked 29-May

  • iS08-1 masked but looks OK – unmasked 29-May

  • iS09-4 masked. Disabled 1 FEE – unmasked 29-May

    • hm, still flaky, need a better look [30-May]

  • iS12-2 masked, repeatedly but I still can’t find anything wrong(?) – unmasked 29-May

    • will closely monitor its issues in the itpc.log

  • iS05-1 PROM – reburned 29-May

  • iS18-1 PROM – reburned 29-May

  • iS10-3 [before 29-May]

    • typical power issues. Had to mask/disable 4 FEEs to get the rest to work.

  • iS7-2 [before 29-May]

    • the board has power issues, no FEEs can configure. Similar to iS12-1 before.

    • Will likely stay masked until the end of the run although I’ll check it again.

      Day 156  June 4
      Run 156011: part of a FEE in sector 17 TPX seems not to have been initialized properly. Gone in 156012, not present in 156010.
      Only shows as a single hotspot in the cluster plot.

      Sector 19 iTPC lower rows have a noisy FEE response. It is in rows 11 & 12, pads ~0-7, and in rows 19 & 20, pads ~60-85. They should probably be masked out. It is quite obvious in the ADC counts but does not seem to give a significant contribution to clusters.

      • 05-Jun from Tonko's google doc

        • iTPC

          • iS07-2 old

          • iS09-4 old flaky — masked but will check tomorrow

          • iS11-3 new

            • board doesn’t respond to power, RDO dead, stays masked

          • iS12-1 old

          • iS12-2 had power problems earlier – unmasked but will check tomorrow

          • iS14-1 new – PROM – reburning, OK 

iS15-1 new – PROM – reburning, OK

June 19 2024 Tonko elog
Unmasked RDOs:

iS4-2 -- nothing wrong with it?
iS10-3 -- power problems, had to disable 2 more FEEs but the rest is OK
iS15-2 -- PROM reburned
iS15-4 -- PROM reburned

Also, based upon the prom_check reburned:

day 175, run 175022
Another case of one flaky FEE for one run.

July 2 Tonko 

Fixed RDOs.
S9-4 pedestal data was bad, so it was slowing the DAQ. Took fresh pedestals and now OK.
        All channels had bad pedestals; Tonko suspects some flaw in the pedestal software (3rd time it happened, each time a different TPX RDO).
S11-6 No obvious reason? Either it recovered magically or there was a cockpit error power-cycling manually.

iS1-2 bad PROM -- reburned.
iS9-1 bad PROM -- reburned.
iS9-3 I see nothing wrong?
iS10-3 the power of this RDO is slowly failing. Masked 2 more FEEs and now OK. 


July 6  17-5
This RDO has excessive counts, as can be seen in the RDO bytes histogram and also in the ADC 2D plot for sector 17.
This has been the case for all runs in the current fill, 042-046.

I suspect this is a case of bad pedestals, but you may try to power-cycle the RDO first and see if it goes away. If not, consult with Oleg on actions to take:

- take pedestals now
- if not, mask out for the rest of this fill, as data are suspect for this RDO

action: worked after power cycling.

July 10

I scanned the last day's runs (189001-189039) to gather statistics on which RDOs are
auto-recovering.
For the iTPC there is no pattern, but for TPX,
21:6 auto-recovered 11 times in 19 runs, whereas only 3 other TPX RDOs did so at all.
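A scan like this boils down to counting auto-recovery occurrences per RDO over the run range and flagging any RDO well above the rest. A sketch of that tally (the "sector:rdo" key string and threshold are illustrative choices, not the actual scan script):

```cpp
#include <map>
#include <string>
#include <vector>

// Given one entry per auto-recovery event, keyed as "sector:rdo"
// (e.g. "21:6"), return the RDOs that recovered more than
// `threshold` times over the scanned run range.
std::vector<std::string>
flagFrequentRecoveries(const std::vector<std::string>& recoveries,
                       int threshold) {
    std::map<std::string, int> counts;
    for (const auto& rdo : recoveries) ++counts[rdo];  // tally per RDO
    std::vector<std::string> flagged;
    for (const auto& kv : counts)
        if (kv.second > threshold) flagged.push_back(kv.first);
    return flagged;
}
```

With the numbers above (11 recoveries for 21:6 in 19 runs versus a handful elsewhere), any reasonable threshold singles out 21:6.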

I did take a look and also sent an email to star-ops.
In short: I can't find any cause that I can fix. I suspect
the power supply, which can potentially be alleviated on the
next longer access if Tim has time to turn it up a notch.