Training output - BDT D±, Run14

BDT setup

The same Boosted Decision Tree (BDT) setup is used for all p_T and centrality bins. During the training phase, 850 trees with a maximum depth of three were produced.
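As a rough illustration of the setup above, the two stated parameters can be written as a TMVA-style option string. This is only a sketch: NTrees and MaxDepth are the TMVA BDT option names for the number of trees and the maximum tree depth, but the helper function, the "!H:!V" flags and the method name "BDT" are assumptions made for illustration and are not taken from the analysis.

```python
# Hypothetical helper (not from the analysis): assemble a TMVA-style
# option string for booking a BDT with the parameters quoted above.
def bdt_options(n_trees=850, max_depth=3, extra=("!H", "!V")):
    # "!H" and "!V" (no help text, not verbose) are assumed defaults here.
    parts = list(extra) + [f"NTrees={n_trees}", f"MaxDepth={max_depth}"]
    return ":".join(parts)

options = bdt_options()
print(options)

# With PyROOT available, the booking call would look roughly like this
# (factory and dataloader are assumed to exist already):
# factory.BookMethod(dataloader, ROOT.TMVA.Types.kBDT, "BDT", options)
```

Any further BDT options (boosting type, number of cuts, and so on) are not stated in this note and are therefore left out.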


Training output

In the application part, we need to choose the cut on the BDT response. Within TMVA, every trained method comes with a classifier cut efficiency plot, from which the optimal cut value can be deduced. For this analysis, we decided to use the cut that maximizes the significance S = N_s / \sqrt{N_s + N_b}, where N_s and N_b are the numbers of signal and background pairs, respectively. However, to obtain the significance as a function of the BDT response, we need to predict the signal-to-background ratio in the real data (where we do not know whether a given candidate is signal or not). In other words, we need to predict how many D± mesons we can find in a sample with a fixed number of entries.

Cut efficiency plots for different signal-to-background ratios (S:B = 1:1, 1:100, 1:1000, 1:10000) are shown in the attached figures. The significance, the quantity we are interested in, is plotted in green. One can observe that the optimal cut value differs significantly between the displayed signal-to-background ratios.
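The dependence of the optimal cut on the assumed S:B ratio can be reproduced with a toy model. The sketch below is not the TMVA output: it assumes Gaussian-shaped BDT responses for signal and background (the means 0.2 and -0.2, the width 0.15 and the 1000 signal candidates are invented for illustration), computes the cut efficiencies analytically, and scans the cut for the maximum of S = N_s / \sqrt{N_s + N_b}.

```python
import math

def gaussian_efficiency(cut, mean, sigma):
    """Fraction of a Gaussian-distributed response lying above `cut`."""
    return 0.5 * math.erfc((cut - mean) / (sigma * math.sqrt(2.0)))

def significance(n_sig, n_bkg):
    """S = N_s / sqrt(N_s + N_b); returns 0 for an empty selection."""
    total = n_sig + n_bkg
    return n_sig / math.sqrt(total) if total > 0 else 0.0

def best_cut(n_sig_total, n_bkg_total, cuts):
    """Scan the cut values and return (cut, significance) at the maximum."""
    best = (None, -1.0)
    for cut in cuts:
        # Toy response shapes (assumed): signal peaks at +0.2, background at -0.2.
        n_sig = n_sig_total * gaussian_efficiency(cut, 0.2, 0.15)
        n_bkg = n_bkg_total * gaussian_efficiency(cut, -0.2, 0.15)
        s = significance(n_sig, n_bkg)
        if s > best[1]:
            best = (cut, s)
    return best

cuts = [i / 100.0 for i in range(-100, 101)]  # cut values from -1 to 1
n_sig_total = 1000.0                          # assumed number of signal candidates

optimal = {}
for ratio in (1, 100, 1000, 10000):           # S:B = 1:ratio
    optimal[ratio] = best_cut(n_sig_total, n_sig_total * ratio, cuts)
    cut, s = optimal[ratio]
    print(f"S:B = 1:{ratio:<6d} optimal cut = {cut:+.2f}, significance = {s:.2f}")
```

In this toy model the optimal cut moves to higher response values as the assumed background fraction grows, which is the same qualitative behaviour seen in the attached plots. In the real analysis the efficiencies come from the TMVA test sample rather than from an analytic shape.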

There are several ways to choose the ratio without blindly guessing. The first deduces the ratio from the D0 analysis. However, this method is strongly biased, since one has to rely on the result of another analysis. For this reason, we chose the method in which the optimal cut is deduced in the application phase (as done by the KF group).

ATTACHED
output from TMVA, cut efficiencies and optimal cut value plots for different signal to background ratios
S:B 1:1, 1:100, 1:1000, 1:10000

This text was a part of my research task.