MTD Issues During System Commissioning, 2011

 
Background: 

The new STAR muon detector (MTD) was installed in January 2011 for commissioning. At that point the new detector consisted of only a few MRPC packs.

The only problem reported was that the digitizer (TDIG) and processor (TCPU) boards were not communicating.

So I set up an oscilloscope to check whether the token bit in the communication stream was being acknowledged by the TDIG board.
However, the University of Texas was not willing to share the test scripts needed to repeat their testing.

BNL STAR management had indicated that the system was being deemed a “failed experiment” and that we should pack up the trays and send them back to the University of Texas. However, LiJuan Ruan, the MTD system manager representing the BNL STAR group, was concerned that the new detector trays would not be confirmed operational in time for system commissioning in February, especially given the delay that shipping the detector to Texas would cause.

So I asked the TOF/MTD group, which consists of representatives from both Rice University and the University of Texas, to share with us the test scripts and other proprietary debug information. This work is being done under contract for BNL and DOE, so there should not be any undisclosed or proprietary material.

Troubleshooting in a Blizzard:

The week of Monday, January 10, 2011, began with a blizzard here on Long Island. I was at the STAR experiment hall when the snow started falling. The rest of the lab was leaving on an early snow-day dismissal.

I still had some testing to do at STAR. LiJuan and I were again concerned about the lack of support from the University of Texas, but I told her there were a few things I could check here without their direct assistance.

Observed discrepancies between TOF tray and MRPC: 

I had observed that the TOF trays, which use the same card configuration, had equal-length cables. The cables between the TDIG and TCPU cards for the MTD, however, were not of equal length, so I called this into question (maybe a race condition?).
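
As a rough illustration of why unequal cable lengths could matter, the sketch below estimates the one-way propagation delay of a cable from its length, and the timing skew between two cables. The velocity factor of 0.66 is an assumed, typical-order value and not a measured property of the MTD cables. The function names are hypothetical, and the 30-inch and 80-inch lengths are simply figures that come up in the correspondence below, used here as example inputs.

```python
# Rough estimate of signal propagation delay in a cable.
# ASSUMPTION: the velocity factor is illustrative, not measured for the MTD cables.
C_M_PER_S = 299_792_458.0        # speed of light in vacuum, m/s
INCH_TO_M = 0.0254               # exact inch-to-metre conversion
ASSUMED_VELOCITY_FACTOR = 0.66   # hypothetical fraction of c for the cable


def cable_delay_ns(length_in, velocity_factor=ASSUMED_VELOCITY_FACTOR):
    """One-way propagation delay in nanoseconds for a cable of length_in inches."""
    if length_in < 0:
        raise ValueError("cable length must be non-negative")
    if not 0 < velocity_factor <= 1:
        raise ValueError("velocity factor must be in (0, 1]")
    length_m = length_in * INCH_TO_M
    return length_m / (velocity_factor * C_M_PER_S) * 1e9


def skew_ns(length_a_in, length_b_in, velocity_factor=ASSUMED_VELOCITY_FACTOR):
    """Absolute difference in one-way delay between two cables, in nanoseconds."""
    return abs(cable_delay_ns(length_a_in, velocity_factor)
               - cable_delay_ns(length_b_in, velocity_factor))


if __name__ == "__main__":
    # Example inputs only: lengths mentioned in the email thread.
    for length in (30, 80):
        print(f"{length} in cable: {cable_delay_ns(length):.2f} ns one-way")
    print(f"skew between 30 in and 80 in: {skew_ns(30, 80):.2f} ns")
```

Under these assumptions the skew between a 30-inch and an 80-inch cable comes out at several nanoseconds, which is not negligible next to a token pulse on the order of 20 ns. This is only an order-of-magnitude check; the actual timing margins depend on the board logic and the real cable characteristics.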


The following are my communications with Ted Nussbaum of Rice University. Ted was able to put together a test setup at his lab in Texas so that we could troubleshoot the issue more effectively:


Hi Ted,

 So...Aux_token_in should pull up HIGH...and only drop LOW for 20 ns during a TOKEN?

 Also, a silly question...Would it matter if we made both cables, TCPU to TDIG 1 and TDIG 1 to TDIG 0, equal in length?

 

TCPU---------30"---------TDIG1

TDIG1--------30"---------TDIG0

 

Tim

 

Tim Camarda
Sr. Technician
Brookhaven National Labs
STAR Physics Dept.
Ph 631-344-8153

 

 

From: Ted Nussbaum [mailto:tednuss@rice.edu]
Sent: Tue 1/18/2011 2:43 PM
To: Camarda, Timothy; Geary Eppley
Cc: Jo Schambach
Subject: Re: MTD 26

Hi,

 

I checked the signal "AUX_TOKEN_IN"  from J4 "Downstream" from U5.   It goes HIGH when the ribbon cable is disconnected.  This sets the board for continuous readout.

 

If it is being seen as "LOW" due to faulty timing from board "0" when it is connected, then the TCPU, getting nothing back, might stop sending more tokens.

 

 

Ted

 

 

----- Original Message -----
From: Camarda, Timothy
To: Ted Nussbaum; Geary Eppley
Cc: W.J. Llope; Jo Schambach; ruanlj@rcf.rhic.bnl.gov; Christie, William
Sent: Friday, January 14, 2011 9:36 AM
Subject: RE: MTD 26

 

Hi Ted,

 

You're right about not seeing the TOKEN at J17.

 

The attached PDF shows the readings I got while reading the TDIG chain in a loop.

 

 

 

Tim

 

 

 

From: Ted Nussbaum [mailto:tednuss@rice.edu]
Sent: Friday, January 14, 2011 9:55 AM
To: Geary Eppley; Camarda, Timothy
Cc: W.J. Llope; Jo Schambach
Subject: MTD 26

 

Hi,

 

I'm wondering if it's possible that clock timing of the token going back upstream could still be the problem.

Maybe the TCPU logic "locks up" and stops sending tokens if it doesn't see the first one come back properly. This would be easy to verify with a scope by the complete absence of tokens at J17. So, if possible, I'd still try the 80" cable.

The data corruption that Jo saw was a possible indication of crosstalk between adjacent clock & data signal pairs.  That was one reason for trying the 40" Twist & Flat.


Ted

 

 

Conclusion: The issue was a race condition. Because of the different cable lengths, the TDIG cards responded at different times and the TCPU card timed out. I also recommended that the cables be changed to twisted-pair cabling.
We did not have to ship the MTD packs to the University of Texas, and the system was commissioned on schedule.