# Yield Extraction and Correction Systematic

Goal:

To calculate a combined systematic error for the yield extraction (i.e. background subtraction) and the correction factor, and to do so in a way that allows me to combine systematic and statistical uncertainties meaningfully.

Method:

As shown here, my background-subtracted raw yields are calculated using what I call the jigsaw method. I model the background shapes using simulated data (both single particle and full PYTHIA). These shapes are simultaneously fit to the data and then subtracted from the data, leaving a pure pion peak. This peak is integrated to find the raw yield in any particular bin. This method is susceptible to uncertainty, especially in the normalization of the background shapes to the data. This normalization determines how many counts are subtracted from the data and so has a direct influence on the final counts.

Previous analyses have taken a maximum-extent approach to this problem. The raw yields are calculated using some extreme scenario, such as assuming no background at all or fitting the background to a polynomial. A major problem with this method is that these extreme scenarios are improbable. They sample only the largest outliers of whatever underlying distribution the background actually follows. Further, these systematic uncertainties are then quoted on an equal footing with statistical uncertainties, which arise from Gaussian processes and constitute 1 sigma errors. Thus, the systematics are vastly overestimated, which leads to a large overall uncertainty and a weaker measurement. This problem is compounded when separate maximum-extent errors are calculated for related variables (such as yield extraction and correction factor) and then added together in quadrature. We ought to be able to do better.

As stated above, the end result of the jigsaw method is a set of scaling parameters, one for each of the background shapes. The shapes are scaled by these parameters and then subtracted away. If the scaling parameters are wrong, the final yield will be wrong. Fortunately, the fitting procedure provides not only a scale for each shape but also an uncertainty on that scale, so we know how likely the scale is to be wrong. Instead of picking an outlying scenario (e.g. all scaling factors = 0), we can calculate the yields with a range of scaling factors sampled from an underlying Gaussian probability distribution with a mean equal to the nominal scaling value and a width equal to the error on that scaling factor. By sampling enough points, we can build up a probability distribution for the measured point (which should also be Gaussian) and take a 1 sigma error on that distribution. This error will not only be more accurate but will also be on an equal footing with the statistical error of the measured point.
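The sampling idea above can be sketched in a few lines of Python. This is only an illustration, not the analysis code: the function name `sample_yields`, the two background shapes, and all numbers are hypothetical, and the scale factors are assumed to be independent (any correlations from the simultaneous fit are ignored here).

```python
import random
import statistics

def sample_yields(data_counts, shape_counts, scales, scale_errors,
                  n_iter=10_000, seed=1):
    """Return n_iter background-subtracted yields with Gaussian-varied scales."""
    rng = random.Random(seed)
    yields = []
    for _ in range(n_iter):
        # Draw each scale factor from a Gaussian centred on its fitted value,
        # with a width equal to its fit uncertainty (assumed independent).
        background = sum(
            rng.gauss(scale, error) * counts
            for scale, error, counts in zip(scales, scale_errors, shape_counts)
        )
        yields.append(data_counts - background)
    return yields

# Hypothetical numbers for a single bin (not taken from the analysis).
data_counts = 1200.0           # data counts in the peak integration window
shape_counts = [400.0, 250.0]  # unscaled background-shape counts in that window
scales = [0.80, 1.10]          # fitted scale factors
scale_errors = [0.05, 0.08]    # fit uncertainties on the scale factors

yields = sample_yields(data_counts, shape_counts, scales, scale_errors)
nominal = data_counts - sum(s * c for s, c in zip(scales, shape_counts))
mean_yield = statistics.mean(yields)
sigma_yield = statistics.stdev(yields)
print(f"nominal = {nominal:.1f}, mean = {mean_yield:.1f}, sigma = {sigma_yield:.1f}")
```

With independent Gaussian scale factors, the width of the sampled yield distribution should come out close to the quadrature sum of (scale error × shape counts), which is a useful sanity check on the sampling.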

Of course, the final cross section value is a convolution of the background-subtracted raw yields and the generalized correction factor. We need to vary the fitting parameters for both at the same time to obtain an accurate estimate of the error on the final value. When we do this, we get distributions for the final cross sections on a bin-by-bin basis. See below.
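A minimal sketch of the simultaneous variation follows. It makes two simplifying assumptions that are not stated in the text: the cross section in a bin is taken to be proportional to the raw yield divided by the correction factor, and the yield and correction factor are each drawn directly from a Gaussian rather than from their underlying fit parameters. The function name and all inputs are hypothetical.

```python
import random
import statistics

def sample_cross_sections(raw_yield, yield_err, corr, corr_err,
                          n_iter=10_000, seed=2):
    """Draw yield and correction factor together; return yield / correction per draw."""
    rng = random.Random(seed)
    return [
        # Both quantities are varied in the same iteration, so each entry is
        # one self-consistent "pseudo-measurement" of the cross section.
        rng.gauss(raw_yield, yield_err) / rng.gauss(corr, corr_err)
        for _ in range(n_iter)
    ]

# Hypothetical inputs for one bin (not taken from the analysis).
xsec_samples = sample_cross_sections(
    raw_yield=600.0, yield_err=30.0, corr=0.50, corr_err=0.03
)
xsec_mean = statistics.mean(xsec_samples)
xsec_width = statistics.stdev(xsec_samples)
print(f"mean = {xsec_mean:.1f}, width = {xsec_width:.1f}")
```

Varying both inputs inside the same loop gives one distribution per bin for the final quantity, rather than two separate errors that must be combined afterwards.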

Bins:

9 pt bins, with boundaries {5.5, 6., 6.5, 7., 7.75, 9., 11.5, 13.5, 16., 21.}

Plots:

1) The above shows the bin-by-bin cross section values after 10,000 iterations of the sampling procedure described above. The systematic error for yield extraction + correction factor can be taken to be the ratio of the width to the mean of the above Gaussian fits.
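As an illustration of the width/mean step, the sketch below estimates Gaussian parameters from a set of sampled values and forms their ratio. It uses `statistics.NormalDist.from_samples`, which estimates the mean and width from the sample moments and is only a stand-in for the histogram fits used for the plots. The toy input (mean 100, width 10) is hypothetical.

```python
import random
from statistics import NormalDist

# Toy stand-in for the 10,000 sampled cross section values of one bin.
rng = random.Random(3)
toy_samples = [rng.gauss(100.0, 10.0) for _ in range(10_000)]

# Estimate the Gaussian mean and width from the samples.
fit = NormalDist.from_samples(toy_samples)

# Relative systematic error = width / mean of the fitted Gaussian.
rel_sys = fit.stdev / fit.mean
print(f"relative systematic = {100 * rel_sys:.1f}%")
```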

The bin-by-bin relative systematic errors are as follows:

| Bin | Rel. Sys. |
|-----|-----------|
| 1   | 14.7%     |
| 2   | 8.2%      |
| 3   | 10.2%     |
| 4   | 9.6%      |
| 5   | 12.3%     |
| 6   | 11.6%     |
| 7   | 12.0%     |
| 8   | 13.2%     |
| 9   | 25.0%     |

Previous measurements of the systematic uncertainties for these two contributions (when added in quadrature) yield an average systematic of ~20%. As you can see, this method produces substantially reduced uncertainties in every bin except the last (25.0%), in a fashion that is (as I argue above) more robust than previous methods.
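The comparison with the ~20% figure can be checked directly from the per-bin values quoted above. The short sketch below only re-uses those nine numbers; the variable names are illustrative.

```python
# Relative systematic errors (in percent) for bins 1-9, as quoted above.
rel_sys_percent = [14.7, 8.2, 10.2, 9.6, 12.3, 11.6, 12.0, 13.2, 25.0]

# Unweighted average over the nine bins.
average = sum(rel_sys_percent) / len(rel_sys_percent)

# Bins (1-indexed) that fall below the ~20% average of previous measurements.
previous_average = 20.0
bins_below = [i + 1 for i, value in enumerate(rel_sys_percent)
              if value < previous_average]

print(f"average relative systematic = {average:.1f}%")
print(f"bins below {previous_average:.0f}%: {bins_below}")
```

The unweighted average comes out at about 13%, and only bin 9 exceeds the previous ~20% average.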