Low pT bin and unfolding

In today's meeting, a question arose about how the 4 < pT < 5 GeV/c bin is handled in the unfolding.  Two cases are considered:
  1. Include the 4-5 bin when unfolding, but do not present it in the final results
  2. Do not include the 4-5 bin when unfolding
Two issues arose:
  • Whether the central values come from unfolding with the bin in (1) or with it out (2)
  • Whether to take an uncertainty proportional to the difference between (1) and (2)
It was requested to show plots for each case and to double-check which case is used for the central values.  Sometime last fall it was decided to use case (2) for the central values, because the data/MC discrepancies in the 4-5 GeV bin appeared to be causing trouble.  Specifically, when the 4-5 GeV bin was included, the 5-6 GeV bin was pulled much lower than seemed reasonable, and much lower than when the unfolding starts with the 5-6 GeV bin.  While there is a real effect of smearing from the 4-5 GeV bin into the 5-6 GeV bin, the 4-5 GeV bin is not modeled well by the Monte Carlo, which is believed to be mainly because it lies too far below the trigger threshold.

Option 1 for Central Values: Include 4-5 GeV


Option 2 for Central Values: do not include 4-5 GeV when unfolding


Further Analysis

While the uncertainties look much different in the upper panels (due to the log plot), the data/Theory panel shows that the uncertainties have not changed all that much.  Mainly, the central values have moved.  The shift is large for the first pT bin, minor for the 6-7 and 7-8 GeV bins, and basically imperceptible for the higher pT bins.  The systematic uncertainty due to this (item 4 on the list of cross section uncertainties on this blog) is computed as half the difference.  For example, for the 5-6 GeV bin the values are
Option 1: E d^3 sigma/dp^3 = 3.263e-06 [mb]
Option 2: E d^3 sigma/dp^3 = 7.946e-06 [mb]
Half the difference        = 2.341e-06 [mb]
Half the difference / Option 2 = 0.295
Taking half the difference as the uncertainty on the option 2 central value gives a relative uncertainty of 29.5%, as reported on the other blog.
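
As a quick cross-check of the arithmetic above, here is a minimal Python sketch.  The variable names are illustrative only (they are not taken from the analysis code), and the inputs are the two values quoted above, in the same units.

```python
# Values quoted above (same units as in the text).
option_1 = 3.263e-06  # unfolding includes the 4-5 GeV bin
option_2 = 7.946e-06  # unfolding without the 4-5 GeV bin (central value)

# Systematic: half the difference between the two unfolding choices.
half_difference = abs(option_2 - option_1) / 2.0

# Relative uncertainty on the option 2 central value (dimensionless).
relative_uncertainty = half_difference / option_2

print(f"half difference      = {half_difference:.4e}")
print(f"relative uncertainty = {relative_uncertainty:.1%}")
```

This reproduces the half difference of about 2.34e-06 and the 29.5% relative uncertainty.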

One should note that the uncertainty on the 6-7 GeV bin has changed significantly, most likely due to the change in the bin height for the 5-6 GeV bin.  Note: the EEMC scale uncertainty computed for the 6-7 GeV bin is affected by the height of the 5-6 GeV bin.  Specifically, if the 5-6 GeV point is higher, then the cross section estimate at the 6 GeV bin edge is higher.  The integral within +/- 3% of the bin edge will then also be higher, and thus the uncertainty on the 6-7 GeV point due to the EEMC scale uncertainty is higher.  Conceptually, the EEMC energy scale uncertainty is the uncertainty that some of the counts in the bin below should have been in a given bin.  Moving the bin below to a higher value means that there is more cross section that perhaps should be in the given bin, and thus a higher uncertainty.
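
The dependence on the lower bin can be illustrated with a toy Python sketch.  Everything here is an assumption made for illustration, not the actual procedure: the function name, the use of the geometric mean of the two neighbouring bin heights as the estimate at the edge, the treatment of that estimate as constant across the +/- 3% window, and the bin heights themselves (arbitrary units).

```python
import math

def edge_window_estimate(lower_bin, upper_bin, edge_pt, scale_unc=0.03):
    """Toy estimate of the cross section within +/- scale_unc of a bin edge.

    Assumptions (illustrative only): the value at the edge is the geometric
    mean of the two neighbouring bin heights, and it is taken as constant
    across the narrow window around the edge.
    """
    value_at_edge = math.sqrt(lower_bin * upper_bin)
    window_width = 2.0 * scale_unc * edge_pt  # 0.36 GeV/c for a 6 GeV/c edge
    return value_at_edge * window_width

# Hypothetical bin heights in arbitrary units, NOT the measured values.
upper_bin = 2.0                                                  # 6-7 GeV bin
low_case = edge_window_estimate(4.0, upper_bin, edge_pt=6.0)     # lower 5-6 point
high_case = edge_window_estimate(8.0, upper_bin, edge_pt=6.0)    # higher 5-6 point

print(f"window integral, lower 5-6 point:  {low_case:.3f}")
print(f"window integral, higher 5-6 point: {high_case:.3f}")
```

In this toy picture, raising the 5-6 GeV point raises the estimate at the 6 GeV edge, and with it the amount of cross section that the scale uncertainty could move into the 6-7 GeV bin.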

Conclusions

Personally, I feel the MC/data comparison is a little poorer than we would like below 5 GeV.  This was the main reason for choosing to drop the 4-5 GeV bin from the spin asymmetries.  While there are real effects of smearing in from below on the 5-6 GeV bin, using the present Monte Carlo to estimate the smearing matrix and unfold the cross section also introduces non-physical effects, due to the degraded quality of the data/MC comparison below 5 GeV.  For these reasons, the best balance seems to me to be what we have been doing since about January: a) use the central values from unfolding starting at 5 GeV (since we don't trust the MC too much below 5 GeV) and b) assign half the difference as a systematic, to account for the fact that we may not have fully corrected real physical effects because we didn't include a bin below the first one we show.

Note: before early February, we chose a different approach: since we didn't trust the 4-5 GeV bin so well, the 5-6 GeV bin was the first "trusted" bin, and we presented in the result plot the first bin past this, i.e. the 6-7 GeV bin.  We also included the uncertainty for the difference between using the 4-5 bin or not, but this systematic is small for all pT > 6 GeV bins.  However, in early February it was decided that even though the 5-6 GeV bin doesn't have a "trusted" bin below it, and thus has a higher uncertainty, it was still worth showing.

Though I lean towards continuing what we are doing, if there is strong feeling for requiring the central values to have been unfolded with a more "trusted", unshown bin below, I would recommend dropping the 5-6 GeV bin and staying with the central values that we have, i.e. those unfolded starting with the 5-6 GeV bin.


Clarification on the uncertainty estimation

The uncertainty due to the low pT bin is actually slightly more complex than presented.  The unfolding is done three times: once starting at 4-5, once at 5-6, and once at 6-7 GeV.  The standard deviation among as many points as are available is then used for the uncertainty.  Specifically, for bins above 6 GeV the uncertainty is the standard deviation of the three points, although I divide by n=3 instead of n-1=2.  For the 5-6 GeV bin there are only two points, and the standard deviation using n=2 is exactly half the difference.  If I had used n-1=1 instead, it would have been the difference divided by sqrt(2), i.e. about 0.71 of the full difference.  Note: the 6-7 GeV point does not move much regardless of whether the 5-6 GeV bin is included or not.  To demonstrate this, here is the plot for unfolding starting at 6-7 GeV and showing the first point as the 6-7 GeV bin:



Thus, in the region where the MC and data compare well, starting the unfolding at the lowest plotted point or at one bin below makes only a very minor difference.  Numerically, the values of the cross section for the 6-7 bin are
Unfolding starting at 5-6: 1.69e-6 [mb]
Unfolding starting at 6-7: 1.89e-6 [mb]
Half the difference:       0.10e-6 [mb]
Half difference / starting at 6-7: 5.3%
The "half difference / starting at 6-7" is meant as a quantity comparable to the 29.5% uncertainty computed earlier: half the difference divided by the value obtained when starting at the higher of the two options.  Thus, not including a lower bin is a more than 5x larger effect when the bin in question is the 4-5 GeV bin rather than the 5-6 GeV bin.  This supports the conclusion that, in general, one does not always need a trusted "padding" bin below the lowest presented bin when unfolding.
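
The standard-deviation prescription described at the start of this clarification can be sketched in Python as follows.  This is only a sketch under stated assumptions: the function name and the `ddof` argument are illustrative and not taken from the analysis code, and the two-point example reuses the two unfolded values quoted earlier in the "Further Analysis" section.

```python
import math

def spread_uncertainty(values, ddof=0):
    """Standard deviation of the available unfolding results for one bin.

    ddof=0 divides by n (the choice described in the text);
    ddof=1 divides by n-1 (the usual sample standard deviation).
    """
    n = len(values)
    if n - ddof <= 0:
        raise ValueError("need more points than ddof")
    mean = sum(values) / n
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (n - ddof))

# Two-point case: the two unfolded values quoted earlier.
two_points = [3.263e-06, 7.946e-06]
full_difference = abs(two_points[1] - two_points[0])

with_n = spread_uncertainty(two_points, ddof=0)          # half the difference
with_n_minus_1 = spread_uncertainty(two_points, ddof=1)  # difference / sqrt(2)

print(f"dividing by n:   {with_n:.4e}")
print(f"dividing by n-1: {with_n_minus_1:.4e}")
```

For two points, dividing by n gives exactly half the difference, while dividing by n-1 gives the difference divided by sqrt(2).  For bins with three unfolding results, the same function can be called with all three values.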