Signal Reweighting Tests


A potential weakness of extracting a signal from data with fits is bias from unwanted dynamics in the underlying simulation.  For the photon analysis in particular, the signal extraction must not depend strongly on the prompt photon cross section in Pythia (which is known to be incorrect).  Fitting the signal and background normalizations independently in each Et bin should provide this independence, modulo Et dependencies within the span of each bin.
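As a rough illustration of why independent per-bin normalizations decouple the extraction from the simulated cross section, the sketch below fits each Et bin on its own.  It assumes an unweighted linear least-squares fit as a stand-in for whatever fit the analysis actually uses, and the function names, template shapes, and yields are all hypothetical.

```python
import numpy as np


def fit_normalizations(data, sig_template, bkg_template):
    """Fit data ~ n_sig * sig_template + n_bkg * bkg_template in one Et bin.

    Unweighted linear least squares is an assumption made for brevity;
    it is not claimed to be the fit used in the analysis.
    """
    design = np.column_stack([sig_template, bkg_template])
    target = np.asarray(data, dtype=float)
    (n_sig, n_bkg), *_ = np.linalg.lstsq(design, target, rcond=None)
    return float(n_sig), float(n_bkg)


def fit_all_et_bins(data_bins, sig_templates, bkg_templates):
    """Fit every Et bin separately, so no normalization is shared across bins."""
    return [
        fit_normalizations(d, s, b)
        for d, s, b in zip(data_bins, sig_templates, bkg_templates)
    ]


# Toy template shapes in some discriminating variable (hypothetical).
sig_shape = np.array([0.1, 0.3, 0.6])
bkg_shape = np.array([0.5, 0.3, 0.2])

# Two Et bins with very different yields but the same template shapes.
toy_data = [40 * sig_shape + 100 * bkg_shape, 5 * sig_shape + 20 * bkg_shape]
toy_fits = fit_all_et_bins(toy_data, [sig_shape] * 2, [bkg_shape] * 2)
```

Because each bin carries its own pair of normalizations, the yield assumed by the simulation in one Et bin places no constraint on the fitted yield in another; only the template shapes within a bin matter.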

To validate the independence directly, a quick study was performed.  The entire simulation was split into two samples, one taken as simulation and the other as pseudo-data, and the pseudo-data sample was reweighted to have different properties than the simulation sample used to determine the signal and background templates.  Below are the results from four such tests:

  • Delta w = 0.0: The weights of all signal events in the pseudo-data are multiplied by zero, removing all signal entirely.
  • Delta w = 1.0: The weights of all signal events in the pseudo-data are multiplied by one, reproducing the nominal results.
  • Delta w = 2.0: The weights of all signal events in the pseudo-data are multiplied by two, doubling the signal-to-background ratio.
  • Correction: The weights of all signal events in the pseudo-data are tuned to match the signal/background fractions seen in the real data.
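The reweighting in the tests above can be sketched as follows.  This is a minimal illustration, assuming a random 50/50 split and a per-event truth flag; the helper names, the toy event sample, and the closed-form factor used for the Correction case are assumptions and not taken from the analysis code.

```python
import numpy as np


def split_sample(n_events, rng):
    """Randomly flag each event as pseudo-data (True) or template sample (False)."""
    return rng.random(n_events) < 0.5


def reweight_signal(weights, is_signal, delta_w):
    """Return a copy of the weights with signal events scaled by delta_w."""
    out = np.array(weights, dtype=float)  # np.array copies, input is untouched
    out[np.asarray(is_signal, dtype=bool)] *= delta_w
    return out


def correction_factor(sig_sum, bkg_sum, target_fraction):
    """Delta w for which Delta w * S / (Delta w * S + B) equals target_fraction."""
    return target_fraction * bkg_sum / ((1.0 - target_fraction) * sig_sum)


def signal_to_background(weights, is_signal):
    """Ratio of summed signal weights to summed background weights."""
    return weights[is_signal].sum() / weights[~is_signal].sum()


rng = np.random.default_rng(0)
weights = np.ones(1000)                  # toy unit-weight events
is_signal = np.arange(1000) % 4 == 0     # hypothetical truth flag, 25% signal
in_pseudo = split_sample(len(weights), rng)

pseudo_w = weights[in_pseudo]
pseudo_sig = is_signal[in_pseudo]

nominal = signal_to_background(reweight_signal(pseudo_w, pseudo_sig, 1.0), pseudo_sig)
doubled = signal_to_background(reweight_signal(pseudo_w, pseudo_sig, 2.0), pseudo_sig)
removed = reweight_signal(pseudo_w, pseudo_sig, 0.0)[pseudo_sig].sum()
```

Only the pseudo-data weights are modified; the template sample keeps its nominal weights, which is what makes the comparison a test of the fit rather than of the templates.  For the Correction case, the factor would presumably be evaluated separately in each Et bin.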

Note that, because the extracted signal cannot be negative, the Delta w = 0.0 extraction results are expected to be biased positive: the posterior distributions are not Gaussian, with positive fluctuations canceling negative fluctuations, but rather exponential or Gamma distributions, where all fluctuations are positive.
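The origin of this positive bias can be seen in a small toy.  The sketch below does not reproduce the posterior construction used in the study; it swaps in a simpler mechanism, a Gaussian estimate clipped at zero, to show how a non-negativity constraint alone shifts the mean upward when the true signal is zero.  The unit resolution and the number of toys are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(1)

true_signal = 0.0      # corresponds to the Delta w = 0.0 test
resolution = 1.0       # arbitrary units, assumed for illustration

# Unconstrained estimates fluctuate symmetrically around the truth.
raw = rng.normal(true_signal, resolution, size=100_000)

# Forbidding negative signal removes every downward fluctuation.
constrained = np.clip(raw, 0.0, None)

raw_bias = raw.mean() - true_signal
constrained_bias = constrained.mean() - true_signal
```

For a unit-width Gaussian centered on zero, the mean of the clipped distribution is 1/sqrt(2 pi), about 0.40 in units of the resolution, while the unconstrained mean remains compatible with zero.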

Results are also considered in terms of relative deviation, excluding the Delta w = 0.0 test to avoid division by zero.

The differences between the extracted signal and truth for all tests are shown below; the extracted signal is found to be consistent with the true values.  There is no evidence of non-linearities as a function of the signal/background ratio, nor in the Et-reweighted Correction test.
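For concreteness, the two comparison quantities can be computed as in the sketch below.  The function name and the input numbers are purely illustrative and are not results from the study; entries with zero truth, as in the Delta w = 0.0 test, are left undefined in the relative deviation rather than divided.

```python
import numpy as np


def residuals(extracted, truth):
    """Return (extracted - truth, (extracted - truth) / truth).

    The relative deviation is NaN wherever truth is zero.
    """
    extracted = np.asarray(extracted, dtype=float)
    truth = np.asarray(truth, dtype=float)
    diff = extracted - truth
    rel = np.full_like(diff, np.nan)
    nonzero = truth != 0
    rel[nonzero] = diff[nonzero] / truth[nonzero]
    return diff, rel


# Made-up yields for three Et bins; the last bin has no true signal.
diff, rel = residuals([105.0, 48.0, 3.0], [100.0, 50.0, 0.0])
```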

Lastly, the correlation in the residuals across tests is in fact expected, given that the same simulation split is used for all of them.