2008.09.23 Sided residual plot projection: toward s/b efficiency/rejection plot

Ilya Selyuzhenkov September 23, 2008

Data sets:

  • pp2006 - STAR 2006 pp longitudinal data (~ 3.164 pb^-1)
    after applying gamma-jet isolation cuts (note: R_cluster > 0.9 is used below).
  • gamma-jet - data-driven Pythia gamma-jet sample (~170K events). Partonic pt range 5-35 GeV.
  • QCD jets - data-driven Pythia QCD jets sample (~4M events). Partonic pt range 3-65 GeV.

Notations used in the plots:

  • Fit peak energy:
    F_peak - integral of the fit function within +-2 strips of the maximum strip.
    The maximum strip is determined by the fitting procedure;
    its floating-point position is truncated to an integer strip Id.
  • Data peak energy:
    D_peak - energy sum within +-2 strips from maximum strip (the same strip Id as for F_peak).
  • Data tails:
    D_tail^left and D_tail^right.
    Energy sum from the 3rd strip up to the 30th strip on the
    left and right sides of the maximum strip (excludes the strips which contribute to D_peak).
  • Fit tails:
    F_tail^left and F_tail^right.
    Same definition as for D_tail, but integrals are calculated from a fit function.
  • Maximum sided residual:
    max(D_tail-F_tail)
    The larger of the two tail residuals (data minus fit energy), taken over the left and right sides of the peak.
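
The definitions above can be sketched in code. This is only an illustration: the fit function is not specified in this note, so the Gaussian below is a hypothetical stand-in, the fit "integral" is approximated by summing the fit value at each strip Id, and the names sided_residual, peak_half_width and tail_end are invented for the example.

```python
import math

def gaussian(strip, amplitude, mean, sigma):
    """Hypothetical fit shape; the actual fit function is not given in this note."""
    return amplitude * math.exp(-0.5 * ((strip - mean) / sigma) ** 2)

def sided_residual(energies, fit, fitted_max, peak_half_width=2, tail_end=30):
    """Return (F_peak, D_peak, max(D_tail-F_tail)) for one strip-energy profile.

    energies   -- measured strip energies, indexed by strip Id
    fit        -- callable returning the fitted energy at a strip Id
    fitted_max -- floating-point position of the maximum from the fit
    """
    max_strip = int(fitted_max)  # float position truncated to an integer strip Id
    n = len(energies)

    def window(lo, hi):
        # Strip Ids in [lo, hi], clipped to the available strips.
        return range(max(lo, 0), min(hi, n - 1) + 1)

    peak = window(max_strip - peak_half_width, max_strip + peak_half_width)
    # Tails run from the 3rd to the 30th strip away from the maximum strip.
    left = window(max_strip - tail_end, max_strip - peak_half_width - 1)
    right = window(max_strip + peak_half_width + 1, max_strip + tail_end)

    f_peak = sum(fit(s) for s in peak)        # assumption: sum approximates the integral
    d_peak = sum(energies[s] for s in peak)
    res_left = sum(energies[s] - fit(s) for s in left)    # D_tail^left - F_tail^left
    res_right = sum(energies[s] - fit(s) for s in right)  # D_tail^right - F_tail^right
    return f_peak, d_peak, max(res_left, res_right)
```

Each event then contributes one (max(D_tail-F_tail), D_peak) point to the sided residual plot.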

Maximum sided residual: MC vs. data comparison

Figure 1: Maximum sided residual plot
To get more statistics for the MC QCD sample, the plot is redone with a softer R_cluster > 0.9 cut

Figure 2: D_peak (projection on vertical axis for Fig. 1)
Upper left plot (no pre-shower fired case) reveals some difference
between MC gamma-jet and pp2006 data at lower D_peak values.
This difference could be due to a background contribution at low energies.
More statistics for the MC QCD jet sample are needed to confirm this.

Figure 3: max(D_tail-F_tail) (projection on horizontal axis for Fig. 1)
One can get an idea of signal/background separation (red vs. black) depending on pre-shower condition.

Figure 4: Mean < max(D_tail-F_tail) > vs. D_peak (profile on vertical axis from Fig. 1)
For the gamma-jet sample the average sided residual is independent of D_peak energy
and has a slight positive shift for all pre-shower>0 conditions.
For large D_peak values (D_peak>0.16) the MC gamma-jet and pp2006 data results approach each other.
This corresponds to higher energy gammas, where we have a better signal/background ratio,
and thus more real gammas among the gamma-jet candidates from pp2006 data.
(Note: the legend's color coding is wrong; the color scheme is the same as in Fig. 3)

Figure 5: Mean < D_peak > vs. max(D_tail-F_tail) (profile on horizontal axis from Fig. 1)
For the "no-preshower fired" case the MC gamma-jet sample has larger average values than the pp2006 data.
This reflects the same difference between pp2006 and the MC gamma-jet sample at small D_peak values (see Fig. 2, upper left plot).
(Note: the legend's color coding is wrong; the color scheme is the same as in Fig. 3)

Figure 6: D_peak vs. gamma pt

Figure 7: D_peak vs. gamma 3x3 tower cluster energy

Figure 8: 3x3 cluster tower energy distribution

Figure 9: Gamma pt distribution

Signal/background separation

The simplest way to get signal/background separation is to draw a straight line
on the sided residual plot (Fig. 1) in such a way that
it contains most of the counts (signal) on its left side,
and to use the distance to that line, for both MC and pp2006 data samples,
as a discriminant for signal/background separation.
To get the distance to the straight line one can rotate the sided residual plot
by the angle which corresponds to the slope of this line,
and then project it on the "rotated" max(D_tail-F_tail) axis.
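
A minimal sketch of the rotation and projection, assuming a standard counter-clockwise rotation about the origin of the (max(D_tail-F_tail), D_peak) plane. The direction and sign convention actually used for the plots is not stated here, so treat it as an assumption; the function names are illustrative only. The default angle is the 5/6*(pi/2) value quoted for Fig. 10.

```python
import math

# Rotation angle quoted for Fig. 10 (picked by eye).
THETA = 5.0 / 6.0 * (math.pi / 2.0)

def rotate(residual, d_peak, theta=THETA):
    """Rotate one (max(D_tail-F_tail), D_peak) point counter-clockwise by theta.

    Returns ("rotated" max(D_tail-F_tail), "rotated" D_peak).
    Assumption: rotation about the origin; the convention used for the
    actual figures is not specified in this note.
    """
    c, s = math.cos(theta), math.sin(theta)
    return residual * c - d_peak * s, residual * s + d_peak * c

def project_rotated(points, theta=THETA):
    """Horizontal coordinate of each rotated point.

    Up to the chosen sign convention, this is the signed distance of the
    point to the straight separation line through the origin.
    """
    return [rotate(x, y, theta)[0] for x, y in points]
```

With this convention a point lying along the D_peak axis (zero residual) lands at a negative rotated coordinate, i.e. to the left of the vertical axis.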

Figure 10: Sided residual plot "rotated" by the angle 5/6*(pi/2) (this angle has been picked by eye).
One can see that now most of the counts for gamma-jet sample (middle column)
are on the left side from vertical axis.

Figure 11: "Rotated" max(D_tail-F_tail) [projection on horizontal axis for Fig. 10]
A cut on "rotated" max(D_tail-F_tail) can be used for signal/background separation.
From the figure below one can see much better signal/background separation than in Fig. 3
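
As a step toward the efficiency/rejection plot named in the title, a cut scan on the rotated discriminant could look like the sketch below. It assumes unweighted event counts (the MC samples may in practice need per-event weights) and that signal is kept when the discriminant is below the cut; both function names are hypothetical.

```python
def efficiency_rejection(signal, background, cut):
    """Signal efficiency and background rejection for 'discriminant < cut'.

    signal, background -- lists of discriminant values, e.g. "rotated"
                          max(D_tail-F_tail) for MC gamma-jet and MC QCD jets
    Assumption: unweighted events.
    """
    if not signal or not background:
        raise ValueError("both samples must be non-empty")
    eff = sum(1 for v in signal if v < cut) / len(signal)            # signal kept
    rej = sum(1 for v in background if v >= cut) / len(background)   # background removed
    return eff, rej

def scan_cuts(signal, background, cuts):
    """Return a list of (cut, efficiency, rejection), one entry per cut value."""
    return [(cut,) + efficiency_rejection(signal, background, cut) for cut in cuts]
```

Plotting rejection against efficiency over the scanned cuts gives the curve that can be compared between the rotated and unrotated discriminants.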

Figure 12: "Rotated" D_peak [projection on vertical axis for Fig. 10]

Optimizing the shape of s/bg separation line

Ideally, instead of a straight line one needs to use
the actual shape of the sided residual distribution for the MC gamma-jet sample.
This shape can be extracted and parametrized by the following procedure:

  1. Get slices from the sided residual plot for different D_peak values
  2. From each slice get the max(D_tail-F_tail) value
    for which most of the counts (for example 80%) appear on its left side
  3. Fit this set of points {D_peak slice, max(D_tail-F_tail)} with a polynomial function

The distance to that polynomial function can be used to determine our signal/background rejection efficiency.
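
The procedure above can be sketched as follows. This is an illustration under several assumptions: events are unweighted, the slice edges and polynomial degree are free choices, the fit is an ordinary least-squares fit solved through the normal equations (kept dependency-free), and the "distance" is taken as the horizontal offset from the fitted curve at the same D_peak rather than the perpendicular distance. All function names are hypothetical.

```python
def slice_quantile(points, d_peak_edges, fraction=0.8):
    """Steps 1-2: for each D_peak slice return (slice centre, residual value
    that has `fraction` of the slice's counts on its left side).

    points -- list of (max(D_tail-F_tail), D_peak) pairs, e.g. MC gamma-jet
    """
    result = []
    for lo, hi in zip(d_peak_edges[:-1], d_peak_edges[1:]):
        residuals = sorted(r for r, d in points if lo <= d < hi)
        if not residuals:
            continue  # skip empty slices
        idx = min(int(fraction * len(residuals)), len(residuals) - 1)
        result.append((0.5 * (lo + hi), residuals[idx]))
    return result

def polyfit(xs, ys, degree):
    """Step 3: least-squares polynomial fit; returns [c0, c1, ...] for
    c0 + c1*x + c2*x**2 + ... (normal equations, Gauss-Jordan elimination)."""
    n = degree + 1
    # Augmented matrix [A^T A | A^T y] for the monomial basis.
    m = [[sum(x ** (i + j) for x in xs) for j in range(n)]
         + [sum(y * x ** i for x, y in zip(xs, ys))]
         for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(n):
            if row != col:
                factor = m[row][col] / m[col][col]
                m[row] = [a - factor * b for a, b in zip(m[row], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def distance_to_curve(residual, d_peak, coeffs):
    """Horizontal offset of a point from the fitted boundary at its D_peak
    (assumption: used here as a simple proxy for the distance)."""
    boundary = sum(c * d_peak ** k for k, c in enumerate(coeffs))
    return residual - boundary
```

The quantile points from slice_quantile feed polyfit as xs = slice centres and ys = residual values; distance_to_curve then plays the role that the rotated max(D_tail-F_tail) plays for the straight line.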

This work is in progress...
One last figure shows the shapes for 6 slices of the sided residual plot.

Figure 13: max(D_tail-F_tail) for different slices in D_peak (scaled by the integral for each slice)