FTPC embedding - review of the problem

 

Tried to add this to RT #2159 but it did not work, so I am making a blog.

 


Adding historical answers:

* Met face to face with Renee on June 1st - went over the issue
  and plots, and I requested that a set of specific questions
  be put to Janet.

 -----------------------------------------------------------------

6/5/2011 06:26 - Renee
;Hi Janet,
;
;Would you mind giving us a short review on how the FTPC gain tables are
;determined from the data?  I ask because Jerome and I are reviewing the
;FTPC embedding request that is open right now and neither of us knows
;the procedure.  If I remember correctly you and Peter calculated the
;gain tables for the Run 8 pp and verified this several times while
;trying to understand the embedding/data comparisons.  Is this correct?
;

 -----------------------------------------------------------------
Answer on 6/6/2011
An explanation of how the FTPC gain tables are calculated is in
http://www.star.bnl.gov/public/ftpc/Calibrations/documents/FTPC_calib_v2.2/FTPC_calib_v2.2.pdf
pg 18-20 (acroread pg 26-28)

 -----------------------------------------------------------------

Answer on 6/8/2011
Creating an FTPC gain table is a three-step process. First a pulser run is used to:
           - find dead channels
        and (I just looked at the code)
           - "normalize" the pulser signal for each channel using the mean
             pulser signal for the relevant FTPC
           - cut channels with extremely high or low gain (set = 0)

Then a data run is processed to display the signal on each channel. We look at these histograms to determine a
cut which will then be applied to eliminate noisy channels. The data run is reprocessed with the noise cut applied.
Noisy channels are set to 0. The noise corrected gain table is then added to the
offline database.

Normally a new gain table is created and added to the database whenever FTPC
electronics die or become noisy. This was not done for the 2008 pp run because
at that time no one was interested in using the data.

Just for curiosity's sake: in what way is our procedure "a bit non-standard"?

Peter, Alexei do you have anything to add?

Janet

On Mon, 6 Jun 2011, Renee Fatemi wrote:

> Thanks Janet - actually we can probably just ask Terry our additional questions from the document. This concerns
> the embedding request by Jim Drachenberg in the 2008 pp data.  You and Peter and the embedding team have discussed 
> it extensively but Jerome is reviewing it because the implementation is ultimately a bit non-standard.
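
Note: the pulser step Janet describes (normalize each channel to the mean pulser signal of its FTPC, then zero dead and extreme channels) can be illustrated with a minimal sketch. This is not the FTPC calibration code. The function name pulserGains, the cut parameters lowCut and highCut, and the choice to average only over channels with a non-zero signal are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not the FTPC calibration code.
// pulser[i] is the pulser amplitude recorded on channel i of one FTPC.
// lowCut/highCut bound the accepted relative gain; both are illustrative.
std::vector<double> pulserGains(const std::vector<double>& pulser,
                                double lowCut, double highCut) {
  assert(lowCut < highCut);
  // Mean pulser signal over the channels that responded at all (assumption).
  double sum = 0.0;
  std::size_t live = 0;
  for (double a : pulser) {
    if (a > 0.0) { sum += a; ++live; }
  }
  std::vector<double> gain(pulser.size(), 0.0);
  if (live == 0) return gain;  // every channel is dead
  const double mean = sum / static_cast<double>(live);
  for (std::size_t i = 0; i < pulser.size(); ++i) {
    const double g = pulser[i] / mean;  // normalized pulser response
    // Dead channels and extreme gains keep gain 0 and are ignored downstream.
    if (g >= lowCut && g <= highCut) gain[i] = g;
  }
  return gain;
}
```

Called as pulserGains(amplitudes, 0.5, 1.5), a channel with no pulser signal comes back with gain 0, which is how the thread describes dead channels being excluded.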

 -----------------------------------------------------------------

From 6/9/2011 10:50

Dear Renee,

The pulser runs are used both for computing gain factors and for
detecting malfunctioning electronics channels.

As you correctly assumed, the pulser should ideally produce signal
outputs of the same amplitude from all FTPC pads (each connected to a
separate electronics channel). The ratio of the actually measured
pulseheight to the average of all pads provides the gain factors.
These are stored in the gain table and are later used in the cluster
and track reconstruction software.

For some pads the electronics channel does not work properly, these
give zero or saturated signals. For such channels the gain factor
is set to zero. This way the pad/channel will not be used in the
further analysis.

Finally, as Janet already explained, we try to exclude excessively
noisy channels. These are detected by summing up the amplitudes
in each channel for many real events. Noisy channels will show
up as excessively large values for the sum.

Best regards,   Peter
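
Note: the noise cut Peter describes amounts to summing the amplitudes on each channel over many real events and zeroing the gain of channels whose sum is excessive. A minimal sketch follows, assuming a hypothetical maskNoisyChannels helper and an illustrative threshold noiseCut. In the actual procedure the cut is chosen by looking at the histograms, as Janet explained.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: events[e][c] is the ADC amplitude on channel c in event e.
// Channels whose summed amplitude exceeds noiseCut get their gain set to 0.
void maskNoisyChannels(const std::vector<std::vector<double>>& events,
                       double noiseCut, std::vector<double>& gain) {
  std::vector<double> sum(gain.size(), 0.0);
  for (const auto& ev : events) {
    assert(ev.size() == gain.size());  // one amplitude per channel
    for (std::size_t c = 0; c < ev.size(); ++c) sum[c] += ev[c];
  }
  for (std::size_t c = 0; c < gain.size(); ++c) {
    if (sum[c] > noiseCut) gain[c] = 0.0;  // excessively large sum: noisy
  }
}
```

The gain vector is modified in place, so the same table can first be filled from the pulser run and then passed through this step.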

 -----------------------------------------------------------------

6/14/2011

Hi Renee,

Now we have a question concerning Jim's procedure. How does he propose to introduce his additional factor? Will 
this factor be included in the gain table or will code changes be necessary?

We foresee no problems with a new, modified gain table since this has
a timestamp and would only apply to the pp run. Code changes are another matter
since any change would affect all existing and future data. If Jim proposes code changes, he would have to
test them with 2009->2011 data and we would have to review them BEFORE they are checked into the CVS repository.

Janet


 -----------------------------------------------------------------


6/14/2011 12:38

Hi Janet,

At this moment I believe that Pibero has modified the FTPCSimulatorMaker and the plan was to use this modified code 
for the embedding.  Pibero can speak better to the functionality, but the function is to introduce a fixed inefficiency 
on the reconstructed hits.  They then tune this until the nhits distribution looks the same in data and embedding.

I had assumed we would just use this modified code for this embedding. This is non-standard and the reason for the 
discussion with Jerome etc. But if the FTPC group would like to include it in the standard package then I am sure 
none of us would have a problem with that.

Best,
Renee

 -----------------------------------------------------------------

6/15/2011 14:50

Hi Janet,

The modified FTPC cluster maker used in Jim's FTPC embedding is attached.
The main changes were in lines 911-927:

    911   // Drop hits to simulate cluster finder inefficiencies in data
    912   Float_t ineff = .20;
    913   Int_t num_points = mHitArray->GetEntriesFast();
    914   Int_t del_points = Int_t(ineff * num_points + 0.5);
    915
    916   LOG_INFO << Form("Dropping %d clusters at %f inefficiency",del_points,ineff) << endm;
    917
    918   for (Int_t i = 0; i < del_points; ++i) {
    919     int k = gRandom->Integer(num_points);
    920     LOG_INFO << Form("Dropping cluster #%d",k) << endm;
    921     mHitArray->RemoveAt(k);
    922   }
    923
    924   mHitArray->Compress();
    925   num_points = mHitArray->GetEntriesFast();
    926
    927   LOG_INFO << Form("Keeping %d clusters",num_points) << endm;

Basically, here we emulate a 20% FTPC hit reconstruction inefficiency
by randomly selecting 20% of FTPC hits and removing them from the mHitArray
container. If this were to be integrated in the official code, then perhaps
a switch could be added to turn this feature on/off and to set a desired
inefficiency as well. When the switch is in the off state (default),
the FTPC cluster maker would function as it always has for many years.

Pibero
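
Note: in the quoted loop the index k is drawn with replacement. If mHitArray is a ROOT TObjArray or TClonesArray (as the Compress() call suggests), RemoveAt(k) leaves an empty slot, so a repeated k removes nothing and fewer than del_points hits may actually be dropped. The sketch below shows one way to drop exactly the requested number, with the on/off switch Pibero mentions. It works on plain indices rather than on mHitArray, and the function name keepAfterDrop is hypothetical. It is an illustration, not a proposed patch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

// Hypothetical stand-in for the hit container: indices 0..numPoints-1.
// Returns the indices of the hits that survive. ineff <= 0 is the "off" switch.
std::vector<std::size_t> keepAfterDrop(std::size_t numPoints, double ineff,
                                       std::mt19937& rng) {
  assert(ineff < 1.0);
  std::vector<std::size_t> idx(numPoints);
  std::iota(idx.begin(), idx.end(), std::size_t{0});
  if (ineff <= 0.0) return idx;  // feature off: keep every hit
  // Same rounding as the quoted code: Int_t(ineff * num_points + 0.5).
  const std::size_t delPoints =
      static_cast<std::size_t>(ineff * static_cast<double>(numPoints) + 0.5);
  // Shuffle, then discard the first delPoints: each hit is removed at most once.
  std::shuffle(idx.begin(), idx.end(), rng);
  idx.erase(idx.begin(), idx.begin() + static_cast<std::ptrdiff_t>(delPoints));
  std::sort(idx.begin(), idx.end());  // restore the original hit order
  return idx;
}
```

With 100 hits and ineff = 0.20 this always keeps 80 hits, whereas the with-replacement loop keeps 80 or more depending on how many draws repeat.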

 -----------------------------------------------------------------

6/21/2011 19:17


Hi Jerome,

Ultimately, my goal is to investigate spin asymmetries in pi0-charged particle correlations in the FMS and the FTPC. 
I'd like to do an IFF analysis which will focus on the near-side correlations. I have only just begun to unpack those. 
However, you can refer to

http://www.star.bnl.gov/protected/spin/drach/Run8FMS/FMS-FTPC_correlation/

for a summary of my work to obtain the correlations. You can also refer to my analysis/collaboration meeting talks 
linked under "Some Useful Links" at

http://www.star.bnl.gov/protected/spin/drach/Run8FMS/.

Please feel free to ask specific questions for those elements that are unclear.

Grace and peace,
Jim

On Jun 21, 2011, at 12:06 PM, Jerome LAURET wrote:

>
>     Dear all,
>
>     Sorry for the late answer and reaction (technical details,
> dead laptop, time to setup a new one, catching up with lag of EMail
> ... grompfff!!).
>
>     Dropping hits randomly seems VERY dangerous to me (even if
> it produces the desired effect on the final Nfit distribution as
> shown by Jim's study from April 14th). Doing so not only implements
> a very rough rejection of hits (not physical, no phi dependence effect
> which may be fundamental for correlation studies)  but also applies
> to only a global observable and would not reproduce (a repeat)
> the subtleties of Phi asymmetries. On the technical side, code hacks
> also have the tendency to be forgotten (and also seem to be a
> diverging path with our most recent embedding library release
> policy but this is another technical detail we can overcome).
>
>     What I do not understand is why adjusting gains does not
> produce the proper result. Renee told me it was attempted twice
> without success. I am mystified. Ideally, I would imagine that even
> if we did not produce a gain table for Run-8, one would be able to
> look at the data alone and map the dead areas (I understand those
> match as per Jim's study from January 7th onward; did not match
> prior) and efficiencies and rescale the gains accordingly and
> let the simulator do the job. Because at the end, if the
> simulation and data do not agree, the problem can be at
> - geometrical effects (not including dead sectors, but including
>  wrong geometry description)
> - correction factor levels (dead area, gains, ...)
> - code / simulator level (how clusters / hit are formed beyond
>  the gain issues - usually, we have a slow and a fast simulator
>  to assess diverse level of fines)
>
> If we do not have the proper result, forcing it by applying
> mysterious rejections to match what we want to see happening
> in one dimension does not resolve much (and again, seems like
> a dangerous path in science) but hides the fact that the
> simulator cannot reproduce the data hence, the simulation is
> not realistic and therefore, the corrections not super
> helpful.
>
>     Especially, let me point that the request is aimed
> to proceed with a FMS+FTPC correlation study in Run 8 for
> p+p 200 GeV. The FTPC seems to be at the very core of this
> analysis and trustworthy corrections are fundamental here.
>
>
>     So the questions ---->
>
>     Jim, I would need to understand your analysis - perhaps
> you are not as sensitive as I think and we could make a one
> time exception. Any writeup / web page / description of it I can
> go through? [I have no time to keep up to date with all analysis
> and I am not familiar with yours].
>
>     Janet, Peter, Terry - as experts, can you please get
> another shot at explaining again why this approach of randomly
> rejecting hits is acceptable and how to explain that we do not
> have the proper result by adjusting the gains and re-injecting
> those modified gains into the simulator?
>
>     Thank you,


 -----------------------------------------------------------------

6/24/2011 11:08

Dear Colleagues,

Janet and I are also not convinced of the justification of the
proposed procedure to simulate hit inefficiencies. At this point
we do not understand the reason and that makes one uneasy.

J.Putschke and F.Simon studied FTPC reconstruction quality in
their theses which you can find under detectors - FTPC. Their
result was that reconstruction efficiency was around 90% in the
high geometrical acceptance pseudorapidity region of the FTPC
for pp and dAu reactions. This was still true for peripheral
AuAu but deteriorated for central collisions. The simulated
hits were somewhat too ideal (fluctuations of hit extent
in pad/time directions too small) resulting in better DCA
and nhits distributions for the simulation. The discrepancy
was, however, nowhere near what Jim reports.

We should verify once more that the gain tables were introduced
and applied correctly. This could be done by plotting clusters
on track for the matched embedded tracks. We realise that this
requires rerunning a sample of your events through embedding
and saving the *hist.root files. We emphasise that the gain
tables have to be applied at the simulation/reconstruction
stage and not at the muDST stage. Janet will make sure that
the proper gain tables are put in the database (will happen
next week).

A major problem for the run 8 pp data analysed by Jim is the
huge pileup. Hui looked at this and you find his plots at

http://www.star.bnl.gov/protected/bulkcorr/wanghui6/FTPC_calibration/Run8/

You see that the DCA distributions are dominated by uniform
background, with small peaks due to the real in-time tracks.
2-d distributions DCA_x vs DCA_y show a clearer peak, but also
on a large background. Selecting the peak by cuts still leaves
order of 50% pileup background. Are there zero bias runs for
this pp period in which one could look at pure pileup ?

We should also look at the cluster parameters for the pp runs
and find out whether these are normal. They can be obtained
from the *hist.root files as well.

Best regards,   Janet and Peter



On Tue, 21 Jun 2011, Jerome LAURET wrote:

>
>     Dear all,
>
>     Sorry for the late answer and reaction (technical details,
> dead laptop, time to setup a new one, catching up with lag of EMail
> ... grompfff!!).
[...]

 -----------------------------------------------------------------

6/24/2011 13:42

    Hello all. Just to summarize: Peter reminds us of

- possible re-check of the application of the same gains for both
  simulations and real-data (looking in details at clusters will
  tell us too) - we also raised this question during the meeting

- pileup and subtle effects ... I interpret this as the clusters
  may not look alike between sim/data if simulation does not
  have pileup. Follows a suggestion to look at pileup only
  events (zerobias) - there were zerobias files (you can check
  by selecting the catalog with
  sname2=st_zerobias,year=2008,filetype=online_daq) and we should
  do as suggested - a quick test and look at what this would
  give (cluster shape, etc ...) after we make sure (I understood
  Janet will check next week) the proper gain table is in place.

- I am not accustomed to the FTPC histos in *hist.root (but this
  is a good pointer / we may have some of the answers already).
  Terry?


    Peter and Janet also express doubts about the proposed path of
tweaking the hit picking to have NFit match (I think most of us
feel the elephant in the room, just do not know how it got there).


    Thanks,


On 6/24/2011 12:42, Terence Tarnowsky wrote:
> Hi Peter, all,
>
> Thanks for reopening the conversation. We had a discussion at this
> week's S&C meeting about doing some further investigation into how these
> discrepancies come about.
>
> You made several good suggestions. The basic checks we can do before
> proceeding include comparing cluster finding parameters from real data
> and simulation. If those check out, we can compare cluster histograms
> (e.g. maxADC distribution) for clusters in both data and the simulator.
> This will give us a straightforward way to verify that the clustering is
> proceeding in the same way in both data and simulation.
>
> Thanks,
>
> Terry
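
Note: the histogram comparison Terry suggests (e.g. the maxADC distribution in data versus simulation) can be reduced to a single number when the two samples differ in size. A minimal sketch follows, assuming both histograms share the same binning. The function name shapeDistance and the choice of an L1 distance between unit-normalized shapes are assumptions made for illustration and were not proposed in the thread.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: bin contents of the same histogram (e.g. maxADC) filled
// from real data and from simulation, with identical binning.
// Returns the L1 distance between the unit-normalized shapes:
// 0 means identical shapes, 2 means no overlap at all.
double shapeDistance(const std::vector<double>& data,
                     const std::vector<double>& sim) {
  assert(data.size() == sim.size());
  double nData = 0.0, nSim = 0.0;
  for (double x : data) nData += x;
  for (double x : sim) nSim += x;
  assert(nData > 0.0 && nSim > 0.0);  // both histograms must be filled
  double dist = 0.0;
  for (std::size_t i = 0; i < data.size(); ++i)
    dist += std::fabs(data[i] / nData - sim[i] / nSim);
  return dist;
}
```

Because each histogram is normalized to unit area first, the value depends only on the shapes and not on how many clusters each sample contains.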

 -----------------------------------------------------------------

We then had a topic at the S&C meeting.
See meeting minutes at
http://drupal.star.bnl.gov/STAR/event/2011/06/22/software-and-computing-meeting#comment-663


 -----------------------------------------------------------------

6/30/2011 11:59


Hello Friends,

Janet now created the proper gain tables for the 2008 pp run starting
on day 051 and day 069 (where there was a change in the FTPC electronics
performance). We suggest to check the realism of the embedding by
doing a run on data with an azimuthally unbiased trigger in order not
to mix up efficiency effects from the FTPC and the properties and
problems of the FMS trigger. For this check we do not need the p_T cuts.
Jim, could you please suggest the required number of events.
The *.log and *.hist.root files should be kept and made available on rcas.

Meanwhile, Janet is willing to take a look at the cluster parameters
for real and simulated data. However, she does not know how to submit jobs to Condor. We will probably have 
to reconstruct 1000 events each.
Could someone please provide a recipe for submitting to Condor.

Best regards,  Janet and Peter


 -----------------------------------------------------------------

July 6th

Hi Jim,

Thank you very much for your e-mail. We have rerun a small sample of events through
embedding using the new gain tables for the 2008 pp 200 GeV run. We are now checking and comparing the cluster
parameters used for real and for simulated data for 2008 pp events.
We will circulate our findings as soon as we finish this analysis.
Janet and Peter
p.s. A special thanks for your Condor submit script - Janet


 -----------------------------------------------------------------



7/7/2011 03:34

Dear Jim,

Janet created the gain tables for the 2008 pp 200 GeV run and added them to
the database. To proceed with your FTPC efficiency study, we suggest that
you now reconstruct a sample of events with azimuthally unbiased trigger
using the new gain tables (you will get them automatically if you do not
include a DbV option in the reconstruction chain). Let us first look at
the azimuthal track distribution and then discuss how to proceed with the
embedding to get reasonable results for the FTPC efficiency.

Best regards,  Janet and Peter

 -----------------------------------------------------------------

7/12/2011 12:32

Hi Jerome,

I am currently working on getting a production of min-bias data with the new FTPC gain tables. I should 
have the muDST's and log files ready, shortly.

Grace and peace,
Jim

On Jul 12, 2011, at 11:16 AM, Jerome LAURET wrote:

>
>
>     Hello All,
>
>     I was on a leave of absence and am back now. Here are the
> actions we had set before I left:
> * Compare raw signals for the cluster shapes
> * Verify similar parameters are applied in real data and
>  simulation
>
>     As I understand the thread and status, this is what happened
> so far:
>
> - a revised gain table was produced and set in the DB for 2008.
>  I did not see the comparative results from Jim. Is there any?
>
> - ASIC parameters are on the way of being checked (talked to Jeff
>  this morning and should follow-up with an answer on Janet's
>  last question). NO conclusion yet.
>
>     If I missed something, could someone please fill me in?
>
>     Thank you,
>

-----------------------------------------------------------------


7/12/2011 12:49

    Great, thank you.

    Janet, would the hist.root contain plots about the
clusters "raw" shapes or characteristics? Discussing more with
Jeff, he suggested a plot as simple as the distribution of
MaxADC for the clusters formed (assumption is here that a
gross error in applying ASICs would show up immediately).

    Best,

On 7/12/2011 12:32, Jim Drachenberg wrote:
> Hi Jerome,

 -----------------------------------------------------------------

7/13/2011 11:08


Dear Jerome,

Janet is now back online after recovering from a dead computer.

We thank Jeff for finding out the parameters of the ASIC code.
We compared these to what is done in the FTPC slow simulation
and embedding codes. Fortunately both agree.

Janet then compared the cluster characteristics for real data
of run 9069059 with those from simulated tracks embedded into
the same run. We had to change the conversion factor
(StarDb/ftpc/ftpcSlowSimPars.C->adcConversion)
between electrons and ADC counts from 1000 -> 2100
With this change one obtains reasonable agreement between
real data and simulation as can be seen from the plots in:

/protected/ftpc/jcs/2008_Embedding/Run_9069059/RealData
                                               Simulation

Best regards,  Janet and Peter


 -----------------------------------------------------------------

7/13/2011 16:31

Hi Renee,

The histograms for adcConversion = 1000 (original value), 2000, 2100
and 2255 are now all on afs in
http://www.star.bnl.gov/protected/ftpc/jcs/2008_Embedding/Run_9069059
I have committed StarDb/ftpc/ftpcSlowSimPars.C with adcConversion = 2100
to the CVS repository. If Jim and Pibero cvs co StarDb/ftpc/ftpcSlowSimPars.C
into their working directory they will use this new value.

Janet

On Wed, 13 Jul 2011, Renee Fatemi wrote:

> Hi Peter and Janet,
>
> These comparisons look very good.  I didn't see them for the default conversion factor 
> but I assume they are much improved?
>
> Is this change checked into CVS - could Jim and Pibero run a test to see if this fixes the 
> nHitsFit points problem as well?
>
> Thanks,
> Renee

 -----------------------------------------------------------------

7/14/2011 10:05

    A side note that if this all works fine, we should
focus on sticking to the original (new) release policy for
embedding libraries and get it re-tagged at BNL, tested
there + re-deployed at PDSF from the core repository (this
will ensure maximal reproducibility and avoid the "works
at BNL but not at PDSF" syndrome we bumped into a few times
... and is VERY time consuming to chase down). In fact,
since we may be converging, please think of codes you may
have running from private directories (it will need to be
committed too).

    Thank you again ... was just a post-previous-email-sending
thought.




On 7/13/2011 17:23, Pibero Djawotho wrote:
> Hi All,
>
> PDSF is down for maintenance today and tomorrow.
> We'll resume testing of the embedding on Friday.
>
> Pibero

 -----------------------------------------------------------------

7/14/2011 11:51


Dear Jim,

Thanks for producing the run. Janet added up the histograms from
all your files. Those relevant for our discussion are placed in
 /protected/ftpc/jcs/2008_Embedding/Run_9064011/RealData/
Since from physics the azimuthal track distribution should be
uniform, one concludes that there are azimuthal regions with
considerable inefficiency of the FTPC.

As the next step you should try to derive correction factors
as a function of pseudorapidity and p_T. To this end, one
needs a new embedding run using the correct gain tables and
adcConversion factor for this run. You should then determine
the correction table from the ratio of found to embedded tracks.
Finally this correction table should be applied to the real
data and hopefully result in a flat distribution for the
real track azimuthal angle.

Best regards,    Peter
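
Note: the correction table Peter describes is the ratio of found to embedded tracks in bins of pseudorapidity and p_T, and applying it to real data means weighting each track by the inverse of that ratio. A minimal sketch follows, with the bins flattened into one index. The function name correctionWeights and the convention of returning weight 0 for bins without statistics are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: found[i] and embedded[i] are track counts in bin i of a
// flattened (pseudorapidity, p_T) grid from an embedding run.
// Returns the per-bin weight 1/efficiency. Bins with no embedded or no found
// tracks get weight 0, since no correction can be derived there.
std::vector<double> correctionWeights(const std::vector<double>& found,
                                      const std::vector<double>& embedded) {
  assert(found.size() == embedded.size());
  std::vector<double> w(found.size(), 0.0);
  for (std::size_t i = 0; i < found.size(); ++i) {
    if (embedded[i] > 0.0 && found[i] > 0.0) {
      const double eff = found[i] / embedded[i];  // found / embedded
      w[i] = 1.0 / eff;
    }
  }
  return w;
}
```

A bin where 50 of 100 embedded tracks are found gets weight 2, so each real track in that bin counts twice when the corrected distribution is filled.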

 -----------------------------------------------------------------

12:16

    Pibero, the formed clusters need to match otherwise, you
do not have the proper apple/apple comparison and all hell breaks
loose as far as simulation (and merging data/simulation) goes. It
is a classic hence, the reason why I insisted for the parameters
to be checked again (and again) and see the maxADC distribution.
It is clear now that we were not having the same clusters in
real and simulated data.

Every time we do an embedding, such tuning / comparison needs to
be done.

    Hope this helps.

On 7/15/2011 12:10, Pibero Djawotho wrote:
> Hi Janet and Peter,
>
> Someone asked at the last embedding meeting what the physical motivation
> was for changing adcConversion from 1000 to 2100?
>
> Thanks,
> Pibero
>


 -----------------------------------------------------------------

14:27

    Hello Pibero,

    Thank you for the log. A few questions:
- Why are you using a custom version of the geant_Maker?
- Same question for DetectorDbMaker and FtpcClusterMaker
  and AssociationMaker

    Otherwise, I do not see the lines we need for the DB
information. Could you add the option debug1 to the chains?
Also, would it please be possible to run at the RCF? A reminder
that our policy implies we assemble at BNL and make sure all
code works there and is in CVS and the library ... If code
needs to be compiled, we would definitely be missing something
in CVS ...

    Thanks,



On 7/15/2011 14:14, Pibero Djawotho wrote:
> There is a sample log file posted here:
>
> http://portal.nersc.gov/project/star/pibero/st_fmsslow_adc_9061101_raw_1510004.9532D9F37EEDE5F05E6D424D9246594A_442.log
>
> Pibero
>
>>     Could you please bring your log at the RCF?
>>     Thank you.
>>
>> On 7/15/2011 13:45, Pibero Djawotho wrote:
>>> Hi Jim,
>>>
>>> I submitted a test run with the new adcConversion parameter.
>>> The output should show up here:
>>>
>>> /eliza14/star/starprod/embedding/ppProduction2008/PiplusFTPC_101_20103904/
>>>
>>> I have no idea how to tell if we picked up the latest FTPC gains
>>> and the new adcConversion factor from the log, so we'll have to
>>> wait for the experts to tell us. For now, once test embedding
>>> is completed, proceed with QA as usual.
>>>
>>> Pibero
>>>


 -----------------------------------------------------------------