Re-weighting simulations of TPC efficiency for dead regions

A simulation production of W and QCD events was made to help understand efficiencies in W measurements. The production sampled 10 timestamps during pp500 running with an equal number of events per timestamp (1k each). I will call these simulation sets R0-R9, labeling each of the 10 timestamp sets by the last digit of its simulation number (e.g. R0 ⇒ sdt20090320, R9 ⇒ sdt20090413, etc.).

Justin has tried to calculate TPC tracking efficiencies using this simulation, and has the results at "TPC Track Finding Efficiency for W cross section". The efficiencies appear to be inconsistent with the relative yields observed in the different TPC sectors, particularly for sectors with dead regions.

One possibility is that the fraction of simulation events with dead TPC regions differs from that sampled by the real data. To correct for this, we can use information on the times (run numbers) when TPC regions were dead (see "Finding dead TPC regions"), together with the sampled luminosity in the real data for the W measurements.

Fractions of the simulated data and real data without the dead regions are reported below as A and B respectively.

A ≡ fraction of simulated data without dead regions
B ≡ fraction of real data without dead regions

If the simulated dataset has N total events (N=10k), that dataset consists of A*N events without dead regions, and (1-A)*N events with dead regions:

A*N + (1-A)*N = N
A + (1-A) = 1              (normalized forms)
A / (A + (1-A)) = A

We want to adjust these last two equations by re-weighting the simulated events without and with dead regions by U and V respectively, such that the following two statements hold:

U*A / (U*A + V*(1-A)) = B
U*A + V*(1-A) = 1

U ≡ re-weighting of simulation data without dead region
V ≡ re-weighting of simulation data with dead region

The first statement ensures that the dead regions are represented by the same fraction in the simulations as in the real data, while the second statement ensures that the denominator of the efficiency calculations has the same total number of events. Following through the math: the second statement sets the denominator of the first to 1, so U*A = B, and then V*(1-A) = 1 - B. We find:

U = B/A
V = (1-B) / (1-A)
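
As a quick numerical cross-check of these two formulas, here is a minimal Python sketch. The function name `reweight_factors` is my own for illustration, and the A and B values are the sector 4 entries from the table below. Note that V is undefined when A = 1 (no simulated events with the dead region), which the sketch returns as `None`.

```python
def reweight_factors(a, b):
    """Return (U, V) given the fractions without dead regions:
    a for simulated data, b for real data."""
    if not 0.0 < a <= 1.0:
        raise ValueError("A must be in (0, 1]")
    u = b / a
    # V = (1-B)/(1-A) is undefined if no simulated events have the dead region
    v = None if a == 1.0 else (1.0 - b) / (1.0 - a)
    return u, v

# Sector 4: A = 0.6, B = 0.438
a, b = 0.6, 0.438
u, v = reweight_factors(a, b)

# Second statement: the re-weighted total is still normalized to 1
total = u * a + v * (1.0 - a)
assert abs(total - 1.0) < 1e-12

# First statement: the re-weighted fraction without dead regions equals B
assert abs(u * a / total - b) < 1e-12

print(f"U = {u:.3f}, V = {v:.3f}")
```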

We then have the following, with notes on the recommended actions given below:

Sector | Global region          | A   | B     | U     | V     | Datasets to apply weight U | Datasets to apply weight V | Recommended actions
-------|------------------------|-----|-------|-------|-------|----------------------------|----------------------------|--------------------------------
2      | z>0, φ in (2±1)π/12    | 0.5 | 0.292 | 0.584 | 1.416 | R0-R4                      | R5-R9                      | Don't re-weight (negligible)
4      | z>0, φ in (-2±1)π/12   | 0.6 | 0.438 | 0.730 | 1.405 | R0-R5                      | R6-R9                      | Do re-weight
5      | z>0, φ in (-4±1)π/12   | 0.7 | 0.605 | 0.864 | 1.317 | R0-R6                      | R7-R9                      | Do re-weight
6      | z>0, φ in (-6±1)π/12   | 1.0 | 0.000 | 0.000 | ?     | ?                          |                            | Use sector 5, R7-R9
11     | z>0, φ in (8±1)π/12    | 0.3 | 0.169 | 0.563 | 1.187 | R0-R2                      | R3-R9                      | Do re-weight
20     | z<0, φ in (-2±1)π/12   | 0.6 | 0.438 | 0.730 | 1.405 | R0-R5                      | R6-R9                      | Moot (unused)
21     | z<0, φ in (0±1)π/12    | 1.0 | 0.996 | 0.996 | ?     | ?                          |                            | Exclude runs 10076134-10076161
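
To illustrate how the weights in the table might be applied per event, here is a hedged Python sketch. The lookup structure and the function `event_weight` are hypothetical names of mine, not part of any existing analysis code. It covers only the sectors marked "Do re-weight" and leaves all other sectors unweighted. It assumes 1k events per simulation set, as stated above.

```python
# (U, V, number of leading sets R0..R(k-1) that get weight U), copied from the table
SECTOR_WEIGHTS = {
    4: (0.730, 1.405, 6),   # U for R0-R5, V for R6-R9
    5: (0.864, 1.317, 7),   # U for R0-R6, V for R7-R9
    11: (0.563, 1.187, 3),  # U for R0-R2, V for R3-R9
}

EVENTS_PER_SET = 1000  # 1k events per timestamp

def event_weight(sector, sim_set):
    """Weight for an event from simulation set R<sim_set> (0-9) in a given sector."""
    if sector not in SECTOR_WEIGHTS:
        return 1.0  # assumption of this sketch: other sectors are not re-weighted
    u, v, n_without_dead = SECTOR_WEIGHTS[sector]
    return u if sim_set < n_without_dead else v

for sector, (_, _, n_without_dead) in SECTOR_WEIGHTS.items():
    weighted_total = sum(event_weight(sector, r) * EVENTS_PER_SET for r in range(10))
    weighted_without_dead = sum(
        event_weight(sector, r) * EVENTS_PER_SET for r in range(n_without_dead)
    )
    # The total should stay close to N = 10k, and the fraction close to B
    print(sector, round(weighted_total), round(weighted_without_dead / weighted_total, 3))
```

Because U and V are rounded to three decimals in the table, the weighted totals come out close to, but not always exactly, 10k.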

 

NOTES:

Because the issue in sector 2 is quite small (the red region here) and implies a very significant amount of re-weighting, my recommendation is to ignore the re-weighting and use the simulations as they are.

The issue in sector 21 is not small (an entire RDO, the green region here, and actually becoming alive from dead, opposite to the other dead regions), but it represents a very small amount of the sampled data (less than half a percent). I recommend either excluding these runs (easiest; they are all runs from fill 10383) or at least excluding sector 21 in these runs.

Sector 20 is incorrect in the data regardless, because the dead anode region 20-5 makes the GridLeak distortion corrections wrong (and we are excluding it from analyses). In principle, sectors 12 and 18 also have dead anode regions (12-4 for all of the data, and 18-4 for most of the data) which should also affect GridLeak distortion corrections, but likely to a lesser degree.

Sector 6 is wrong in the simulations because the sector is treated as perfect despite having a dead region (RDO 6-5) for the entire real dataset. This dead region is identical to the one in sector 5 from simulation sets R7-R9, so it might be worth using the efficiencies calculated for sector 5 in those simulation sets for sector 6 in all of the data. Of note, however, is that the calculated efficiencies for sectors 5 and 6 do not look far off, implying that this particular dead region may have little impact on the sector's efficiency.

 

CONCERNS:

These re-weightings should add emphasis to the simulations where TPC regions were dead, because most of the sampled luminosity came at the end of the pp500 running, as did the dead regions. However, Justin's plots show that sectors 4 and 11 appear to be too inefficient in the simulations and this re-weighting will only amplify that inefficiency (decrease the efficiency). It appears that this re-weighting correction is going the wrong way!
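
The direction of the effect can be made explicit with a small sketch. It assumes, as a simplification of mine, that a sector's efficiency is the event-weighted average of one efficiency without the dead region and one with it. After re-weighting, the two populations carry fractions U*A = B and V*(1-A) = 1-B, so the efficiency shifts by (B - A) * (eff_without_dead - eff_with_dead). The two efficiency values below are HYPOTHETICAL placeholders, not measured numbers; A and B are the sector 4 entries from the table.

```python
def combined_efficiency(frac_without_dead, eff_without_dead, eff_with_dead):
    """Event-weighted average efficiency of the two populations (simplifying assumption)."""
    return frac_without_dead * eff_without_dead + (1.0 - frac_without_dead) * eff_with_dead

A, B = 0.6, 0.438                          # sector 4, from the table
EFF_WITHOUT_DEAD, EFF_WITH_DEAD = 0.90, 0.70   # HYPOTHETICAL placeholder values

before = combined_efficiency(A, EFF_WITHOUT_DEAD, EFF_WITH_DEAD)  # unweighted simulation
after = combined_efficiency(B, EFF_WITHOUT_DEAD, EFF_WITH_DEAD)   # re-weighted simulation
shift = after - before

# The shift reduces to (B - A) * (difference of the two efficiencies)
assert abs(shift - (B - A) * (EFF_WITHOUT_DEAD - EFF_WITH_DEAD)) < 1e-12
# With B < A and a lower efficiency in the dead-region population, efficiency drops
assert shift < 0
print(f"efficiency shift from re-weighting: {shift:+.4f}")
```

Under these assumptions, any sector with B < A can only lose efficiency from the re-weighting, whatever the actual efficiency values are, as long as the dead region lowers the efficiency.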

 

-Gene