EPD Cold tiles again

Executive Summary

EQ1 QT32B board 0x16 (board #6) channels 24 - 31 are all fairly unstable.  This corresponds to EW = 1, PP = 8, TT = 16-23 in terms of the EPD geometry.  This card was replaced on 01/03/2020, and after replacement the card seems to be operating.

West PP5 TT23 is out.

Details

It was noticed that some tiles in the EPD were looking cold.  Run 20348027, for example, shows holes in a few tiles. Immediately after the pedestal we have run 20348031 with no holes.  From this run, there are a handful of problem tiles, including WPP8TT22 and WPP9TT12.  The shift log plot is shown below.

Figure 1: Shift QA plot from run 20348027 (online.star.bnl.gov/runPlots/20348027.shift.pdf), which shows some cold spots in some of the tiles.

The issue of cold tiles was studied rather extensively, see drupal.star.bnl.gov/STAR/blog/rjreed/epd-cold-tile-issue.  There it was found that the pedestal subtraction would sometimes shift, pushing the noise above the 16 ADC threshold so that it was counted.  However, these noise values were still below the "real" mip peak, and so <ADC> was lower.
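As a minimal sketch of why noise leaking over threshold lowers <ADC> even when the mip peak has not moved, the arithmetic can be written out as below. Only the 16 ADC threshold comes from the text above. The spectra, the counts, the function name, and the choice of a strict "greater than" comparison are all assumptions made for illustration.

```python
THRESHOLD = 16  # ADC threshold quoted above


def mean_adc_above_threshold(spectrum):
    """Mean ADC of hits strictly above THRESHOLD.

    spectrum: dict mapping ADC value -> number of hits (hypothetical format).
    """
    kept = {adc: n for adc, n in spectrum.items() if adc > THRESHOLD}
    total = sum(kept.values())
    if total == 0:
        return 0.0
    return sum(adc * n for adc, n in kept.items()) / total


# Hypothetical numbers: noise near 10 ADC, a mip peak near 120 ADC.
good = {10: 5000, 120: 1000}
# Same tile, but the noise has shifted by +10 ADC and now crosses threshold.
shifted = {20: 5000, 120: 1000}

print(mean_adc_above_threshold(good))     # only mip hits are counted
print(mean_adc_above_threshold(shifted))  # noise hits dilute the mean
```

With these made-up numbers the mip peak is identical in both spectra, yet the mean drops sharply in the second case because the many low-ADC noise hits now enter the average.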


Figure 2:  Online QA for the ADC spectra for 8 different channels.  The offending tile is WPP8TT22, which is read by EQ1 Board 0x16 (board 6) channel 30.  We see here that the peak of the first mip peak has a relatively similar height, but the noise seems to be much higher.

The symptom of the previous incident was a fluctuating pedestal.  However, looking at the pedestal before and after the problem run, there really isn't a difference for the offending channel.  Joey has confirmed that the pedestals seem to be pretty steady from a dump of the files.  Looking at the online monitoring:

Figure 3: Pedestal runs for the different channels connected to TT22 tiles.  The green that is ~150 is the set of pedestals that correspond to our problem tile.  We do not see the jump of ~50 or so ADC channels seen before, though it is obviously a lot messier.  (I do not know why the red/brown channel goes further than the rest.)  I'm sort of baffled at the dates and need to follow up - but when I look at the pedestals directly from the trg machine the story is the same.


Figure 4: On the left is the ADC versus run number (given by an index) of W PP8 TT22. We see here that the cold spot noted above was seen at about index 74.  On the right are a few of the spectra from that time of the same channel (normalized by the integral). We see that there is a significant drop in the ADC.

For those who might like to see Figure 4 (left) for all channels, a lower resolution plot is at: drupal.star.bnl.gov/STAR/system/files/ADCvsRunEPD12232019.pdf

To orient ourselves, index 73 is run 20347019, so we see this covers the area that was noted in the shift log. 
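A rough sketch of how drops like this could be picked out of an <ADC> versus run index series automatically is shown here. The function name, the 80% cut relative to the median, and the sample values are all hypothetical; this is not the procedure that was used to make the plots.

```python
from statistics import median


def flag_cold_runs(mean_adc_by_index, fraction=0.8):
    """Return run indices whose <ADC> is below fraction * median of the series.

    The 0.8 default is an arbitrary illustrative cut, not a tuned value.
    """
    reference = median(mean_adc_by_index)
    return [i for i, value in enumerate(mean_adc_by_index)
            if value < fraction * reference]


# Made-up <ADC> values for one tile, ordered by run index.
values = [100.0, 102.0, 99.0, 60.0, 58.0, 101.0]
print(flag_cold_runs(values))  # indices of the runs that look "cold"
```

Using the median as the reference keeps a short run of cold indices from dragging the baseline down with it.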


Figure 5: Pedestal values for W PP8 TT22 directly off of TRG machine.  The two (yeah not visible) black lines indicate the problem area.  It did not happen strictly between two pedestals.

We know that this isn't the problem seen before, which was characterized by an abrupt change of ~50 ADC channels in the pedestals and resulted in an ADC spectrum that was shifted to the high side.
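For reference, the signature of the earlier problem (an abrupt step between consecutive pedestal runs) is easy to test for. The sketch below assumes the pedestals for one channel are available as a simple time-ordered list. The function name, the 40 ADC cut, and the sample values are hypothetical.

```python
def find_pedestal_jumps(pedestals, min_jump=40.0):
    """Return (index, delta) for each pedestal run that differs from the
    previous one by at least min_jump ADC channels in absolute value."""
    jumps = []
    for i in range(1, len(pedestals)):
        delta = pedestals[i] - pedestals[i - 1]
        if abs(delta) >= min_jump:
            jumps.append((i, delta))
    return jumps


# Made-up pedestal histories for a single channel.
steady = [150.0, 152.0, 149.0, 151.0]  # messy but no step
jumpy = [150.0, 151.0, 201.0, 200.0]   # abrupt ~50 ADC step

print(find_pedestal_jumps(steady))
print(find_pedestal_jumps(jumpy))
```

A channel that is noisier but shows no step, as in the present case, would come back empty from a check like this.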

The noise has increased in terms of rate (so it seems) but maybe not in terms of <ADC>.

Comparing these pedestals to some other channels, we see:

Figure 6: Pedestal runs for W PP8 TT22, 23 and 20.  All are from the same QT board, but TT22 and TT20 share a FEE board.  Obviously these are all in the same crate.

The Excel sheet for these files is attached.


Figure 7: The ADC vs Run Index for the tiles above.  We see here that they also have drops which look "cold", but at different runs from each other and from the tile above.  Quickly looking at the interface of one, we see:


Figure 8: W PP8 TT23 for the range of its discontinuity.  It looks very similar - more noise but not any clear issue with the pedestal.

Unfortunately, this is utterly not visible to the shift crew.  (Ok, I'm sort of convincing myself that there is some slight teal in the right place, but this is not something that they will see!)


Figure 9: Shift plot of <ADC> for run 20349011.

Maybe what we want is the first mip peak value - <ADC>?  Something with some dynamic range?

So far what I have noticed is that these are all on the same QT board - do we have one or more boards that are noisy with some longer-than-a-pedestal type cycle?  (It looks like all the channels on the same FEE will have similar pedestal values.)

Looking at the values from the board themselves, all the channels are good until the very last 8.  So the 4th daughter card on this board is probably bad.
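The step from "the last 8 channels" to "the 4th daughter card" can be written down as a one-line mapping. This sketch assumes the board has 32 channels numbered 0 - 31, split evenly over 4 daughter cards numbered from 1. The function and constant names are made up for illustration.

```python
CHANNELS_PER_BOARD = 32  # assumption: channels are numbered 0 - 31
CHANNELS_PER_CARD = 8    # assumption: 4 daughter cards of 8 channels each


def daughter_card(channel):
    """Return the 1-based daughter card number for a QT board channel."""
    if not 0 <= channel < CHANNELS_PER_BOARD:
        raise ValueError(f"channel {channel} is out of range")
    return channel // CHANNELS_PER_CARD + 1


# The unstable channels 24 - 31 all land on the same card.
print(sorted({daughter_card(ch) for ch in range(24, 32)}))
# Channel 30 reads W PP8 TT22, the tile that started this investigation.
print(daughter_card(30))
```

Under these assumptions every unstable channel, including channel 30, sits on card 4, which is consistent with a single bad daughter card rather than a fault in the whole board.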

Replacement

Hank replaced this daughter board during the access of 1/3/2020.  We did a few pattern tests in order to see if the daughter board is performing.  The relevant run numbers are:

pedestal rhic clock clean 21003023 - zeros

ped as phys Pattern 1 21003024
ped as phys Pattern 2 21003025

pedestal rhic clock clean 21003026 - default

I borked getting the daq files so redid a little:

pedestal rhic clock clean 21003031 - zeros
ped as phys Pattern 1 21003032
ped as phys Pattern 7 21003033
pedestal rhic clock clean 21003034 - default


Figure 10: EPD Pattern 1 with tiles corresponding to the replaced daughter board in black.


Figure 11: EPD Pattern 2 with tiles corresponding to the replaced daughter board in black.


Figure 12: EPD Pattern 7 with tiles corresponding to the replaced daughter board in black.

Seems West PP5 TT23 is out, I will check on that.