Run 9 200GeV Data / Simulation Comparison: Eta Spectra Mismatch

Here I summarize my efforts to understand the mismatch seen between data and simulation in the jet eta spectra in the endcap ...

 

In my previous blog post, I defined my trigger sorting scheme and showed a number of data / simulation comparisons. In general, the agreement between data and simulation was good, but there was an obvious and systematic excess of simulation relative to data in the endcap region when looking at jets as a function of eta. This feature was most prominent in the L2JetHigh category, but could also be seen (albeit with worse statistics) in the other trigger categories.

 

Figure 1: This figure shows the eta spectra for the six trigger categories for the 5 point branch. Data is in blue and simulation is in red. The simulation excess is distinctly visible in the L2JetHigh panel (upper left).

 

A number of possible causes were suggested, and I investigated each of them:

  • Looked at data / simu comparisons of the L2Jet filters (random, monojet, or dijet) separately
  • Looked at the 5 point branch to see if tracking could be an issue
  • Did a z-vertex reweighting to see if data / simu z-vertex distribution mismatch was an issue
  • Made sure I used the same runs in data and in simulation
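
The z-vertex reweighting in the list above can be sketched as follows. This is a minimal illustration rather than the code used for this analysis: the function names, the binning, and the use of NumPy arrays of per-event z-vertex values are all assumptions. Each simulation event gets a weight equal to the ratio of the normalized data and simulation z-vertex histograms in its bin.

```python
import numpy as np

def zvertex_weights(data_z, simu_z, bins):
    """Per-bin weights that reshape the simulation z-vertex distribution to match data.

    data_z, simu_z: arrays of z-vertex positions (hypothetical inputs).
    bins: bin edges, or a number of bins, passed to np.histogram.
    """
    data_counts, edges = np.histogram(data_z, bins=bins)
    simu_counts, _ = np.histogram(simu_z, bins=edges)
    # Compare shapes rather than raw counts: normalize each histogram to unit sum.
    data_shape = data_counts / data_counts.sum()
    simu_shape = simu_counts / simu_counts.sum()
    # Keep the weight at 1 wherever the simulation bin is empty.
    weights = np.ones_like(data_shape, dtype=float)
    np.divide(data_shape, simu_shape, out=weights, where=simu_shape > 0)
    return weights, edges

def event_weights(simu_z, weights, edges):
    """Look up the weight for each simulation event from its z-vertex bin."""
    idx = np.clip(np.digitize(simu_z, edges) - 1, 0, len(weights) - 1)
    return weights[idx]
```

The per-event weights would then be applied when filling the simulation eta spectrum, so that any residual data / simu difference cannot be attributed to the z-vertex shape.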

 

The plots showing these investigations can be seen on the previous page but the bottom line was that none of these issues seemed to contribute in any significant way to the simulation excess I see.

 

With these relatively simple causes ruled out, I started looking for areas of disagreement in the spectra of other quantities, such as the sum of the track pt, the sum of the tower pt, the numbers of tracks and towers, the neutral fraction, etc. I placed various cuts on the data and simu spectra of these quantities to avoid regions where there was a mismatch, but nothing I did seemed to resolve the eta spectra problem.

 

I then wanted to see what would happen if I only placed a cut on the simulation, so I discarded all jets which had fewer than 2 tracks in the simulation and compared the resultant eta spectrum to the unmodified data eta spectrum. The agreement was spot-on, as seen in the figure below.
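
As a minimal sketch of this simulation-only cut, assuming the jets are available as parallel arrays of eta and track multiplicity (the function and argument names here are hypothetical, not taken from the analysis code):

```python
import numpy as np

def simu_eta_after_track_cut(jet_eta, n_tracks, min_tracks=2):
    """Return the eta values of simulation jets with at least min_tracks tracks."""
    jet_eta = np.asarray(jet_eta, dtype=float)
    n_tracks = np.asarray(n_tracks)
    # Jets with fewer than min_tracks tracks are discarded.
    return jet_eta[n_tracks >= min_tracks]
```

The data spectrum is left untouched in this comparison; only the simulation jets pass through the cut.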

 

Figure 2: This figure shows the unmodified eta spectra (left column) and the eta spectra where I have excluded all simu jets which have fewer than 2 tracks (right column).

 

Despite the success of applying the track number cut to the simulation only, I wondered about the validity of applying a cut in simulation and not in data. This unease was shared by other members of the jet group when I presented the above plot on 8/21/12. Even so, the success of the track number cut made me feel that it was a strong hint as to what was causing the data / simu discrepancy.

 

At the jet meeting, the point was made that this could still be an endcap gain issue. The idea of looking at the emc only branch was raised as a way to check this since the only source of disagreement would be the towers. Pibero created a sample of 10 data jet trees and the full simulation sample including the emc only branch.

 

Figure 3: This figure shows the eta spectra for the L2JetHigh trigger for the EMC only jet tree branch. The left panel shows all jets, the middle panel shows jets with no endcap towers, and the right panel shows jets which do contain endcap towers. All panels use the same normalization factor. The data / simu discrepancy is still clearly visible.

 

A pdf containing data / simulation comparisons for a number of quantities using the emc only branch can be found here.

 

Figure 4: This figure shows the eta spectra for the L2JetHigh trigger for the EMC only jet tree branch, now with jet pt cuts applied. The left panel includes all jets, the middle panel only includes jets with pt > 10 GeV, and the right panel only includes jets with pt > 14 GeV. Despite the worsening statistics, the data / simu discrepancy is still visible.

 

 

In figure 3 above, I divided jets into those which contained no endcap towers and those which contained at least one endcap tower. I then decided to split the jets into three categories instead: jets with only barrel towers, jets with only endcap towers, and jets with both barrel and endcap towers.
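
The three-way split can be written down compactly. The sketch below assumes each jet carries a count of its barrel towers and of its endcap towers; the names are illustrative and not those used in the jet trees.

```python
from enum import Enum

class TowerContent(Enum):
    BARREL_ONLY = "barrel only"
    ENDCAP_ONLY = "endcap only"
    BARREL_AND_ENDCAP = "barrel and endcap"

def classify_jet(n_barrel_towers, n_endcap_towers):
    """Assign a jet to one of the three tower-content categories."""
    if n_barrel_towers > 0 and n_endcap_towers > 0:
        return TowerContent.BARREL_AND_ENDCAP
    if n_barrel_towers > 0:
        return TowerContent.BARREL_ONLY
    if n_endcap_towers > 0:
        return TowerContent.ENDCAP_ONLY
    # Assumption: with an EMC-only branch every jet has at least one tower.
    raise ValueError("jet contains no towers")
```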

 

Figure 5: This figure shows the data (Blue) and simulation (Red) jet eta spectra for L2JetHigh jets from the EMC-Only branch. The left column shows jets which have barrel towers only, the middle column shows jets which have endcap towers only, and the right column shows jets which have both barrel and endcap towers. The simulation curves in all three panels are scaled by the same number, which is the total number of data L2JetHigh jets divided by the total number of simulation L2JetHigh jets.
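
The common scale factor described in the caption can be illustrated as below. This is a sketch under the assumption that each panel is stored as an array of bin contents and that the three panels together contain all L2JetHigh jets; the function name is hypothetical.

```python
import numpy as np

def scale_simulation(data_panels, simu_panels):
    """Scale every simulation panel by one factor: total data jets / total simu jets.

    data_panels, simu_panels: lists of bin-content arrays, one per panel.
    """
    n_data = sum(np.sum(p) for p in data_panels)
    n_simu = sum(np.sum(p) for p in simu_panels)
    scale = n_data / n_simu
    scaled = [scale * np.asarray(p, dtype=float) for p in simu_panels]
    return scale, scaled
```

Using a single factor for all three panels preserves the relative population of the categories in the simulation, so a mismatch in one category is not hidden by normalizing each panel separately.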

 

If the disagreement seen between data and simulation were a result of using incorrect gains to generate the simulation, we would expect to see better agreement after adjusting the gains. Because rerunning the simulation from the beginning is time consuming, Pibero modified the trigger emulator run from the jet finder so that the ADC values it reads in can be modified. As a trial, he re-ran the FF simulation sample and lowered the endcap tower gains by 2.5%.

 

Figure 6: This figure shows the jet eta spectra for the original simulation (Blue) and the new simulation (Red) which has had the endcap gains lowered by 2.5%. The panels are the same as in figure 5 and no scaling has been applied to any of the spectra.

 

One reason that the simulation with the lower gains did not seem to change much compared to the old simulation could be that the events which don't pass the new filter requirements are events which would not have had enough transverse energy to pass the 8.4 GeV cut to be included in the L2JetHigh trigger category. To make sure the modifications to the trigger filter are doing what we expect, I have made several plots which look at the Endcap Jet Patch response in the old and new simulation.

 

Figure 7: This figure shows the number of times the EJP1 (Bin 1) and EJP2 (Bin 2) bits fired in the original simulation (Blue) and the new simulation (Red).

 

In Bin 1, there are 4.19969*10^8 events from the original simulation and 3.11521*10^8 events from the new simulation. In Bin 2, there are 6.04652*10^7 events from the original simulation and 4.60386*10^7 events from the new simulation.
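
To put these counts in relative terms, the short calculation below takes the four numbers quoted above and computes the fraction of events that still fire each bit in the new simulation. Only the counts come from the text; the variable names are illustrative. The fractions come out to roughly 74% for EJP1 and 76% for EJP2, so about a quarter of the events that fired each bit in the original simulation no longer do so.

```python
# Event counts quoted in the text for Figure 7.
counts = {
    "EJP1": {"original": 4.19969e8, "new": 3.11521e8},
    "EJP2": {"original": 6.04652e7, "new": 4.60386e7},
}

def surviving_fraction(original, new):
    """Fraction of the original events that still fire the bit in the new simulation."""
    return new / original

fractions = {
    bit: surviving_fraction(c["original"], c["new"]) for bit, c in counts.items()
}

for bit, frac in fractions.items():
    print(f"{bit}: {frac:.3f} of the original events remain ({1 - frac:.1%} fewer)")
```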

 

Figure 8: This figure shows the ADC spectra for the Endcap Jet Patches above threshold 1 (left column) and for Jet Patches above threshold 2 (right column). Again the blue curve is the original simulation and the red curve is the new simulation. Neither curve has been scaled. A figure showing ADC vs Jet Patch ID can be seen here.

 

I have also rerun the old and new simulations through my analysis using the 5-Point hit branch. The idea is that the additional charged pt can promote some of the events which pass the trigger condition but fall under the 8.4 GeV L2JetHigh trigger category threshold. There will be more subthreshold jets in the original simulation than in the new simulation so the addition of tracking may increase the size of the difference we see between the old and new simulation.

 

Figure 9: This figure shows the old simulation (Blue) and new simulation (Red) jet eta spectra for the 6 different trigger categories, as well as the old/new ratio. No scale factor has been applied.

 

Figure 10: This figure shows only the L2JetHigh category, but the eta spectrum has been broken into jets containing only barrel towers (left), containing only endcap towers (middle) and containing both barrel and endcap towers (right). No scale has been applied.

 

In order to make a true data / simu comparison, the data gains must be shifted by the same amount as the simulation gains. The data gains are changed in the StjEEmcMuDst class of the jet maker. This code reads in the endcap tower ADCs, applies the pedestal and gain values to convert them to energy, and adds that tower energy to the jet finder. Pibero created a set of 10 data jet trees which have the gains manually lowered by 2.5% in this code.
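
The ADC-to-energy step with a shifted gain can be sketched as follows. This is not the StjEEmcMuDst code: the function name, its arguments, and in particular the gain convention (gain expressed in ADC counts per GeV, so that energy = (ADC - pedestal) / gain) are assumptions made for illustration. If the gains are stored the other way around (GeV per ADC count), the division becomes a multiplication and the effect of the shift on the energy reverses.

```python
def tower_energy(adc, ped, gain, gain_shift=0.0):
    """Convert a pedestal-subtracted ADC value to energy with an optional gain shift.

    Assumption: gain is in ADC counts per GeV, i.e. E = (adc - ped) / gain.
    gain_shift is the fractional change applied to the gain (-0.025 for a 2.5% decrease).
    """
    shifted_gain = gain * (1.0 + gain_shift)
    return (adc - ped) / shifted_gain

# Hypothetical tower: ADC = 120, pedestal = 20, gain = 10 counts per GeV.
nominal = tower_energy(120.0, 20.0, 10.0)
shifted = tower_energy(120.0, 20.0, 10.0, gain_shift=-0.025)
```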

 

Figure 11: The top six panels of this figure show the jet eta spectra for the six different trigger categories. The blue curve is the original data production and the red curve is the new data production with the lowered gains. The bottom six panels show the corresponding old / new ratios. No scaling has been applied. The 5-point branch is shown.

 

The changes to the eta spectra in the data behave as we would expect. We can now do a true comparison between data and simulation as if both had been created with endcap gains 2.5% lower than the values currently set in the database.

 

Figure 12: This figure shows the jet eta spectra data / simu comparisons for the six different trigger categories. The top six panels show the comparison using the original simulation and data production for reference (Blue=Data, Red=Simu). The next six panels show the comparison using the new simulation and data production which have had the gains lowered by 2.5% (Blue=Data, Red=Simu). The final six panels show the data/simu ratio for the six trigger categories. The Blue curve shows the ratio for the original data / simu comparison (top six panels) and the Red curve shows the ratio for the new data / simu comparison (middle six panels).
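
A bin-by-bin data / simu ratio of the kind shown in the final panels can be formed as in the sketch below. The function name is hypothetical, and the uncertainty assumes unweighted, independent Poisson counts in both histograms; a weighted simulation would need the sum of squared weights per bin instead.

```python
import numpy as np

def ratio_with_errors(num, den):
    """Bin-by-bin ratio num/den with uncertainties, assuming Poisson counts.

    Bins where either histogram is empty are set to NaN.
    """
    num = np.asarray(num, dtype=float)
    den = np.asarray(den, dtype=float)
    ratio = np.full_like(num, np.nan)
    err = np.full_like(num, np.nan)
    ok = (num > 0) & (den > 0)
    ratio[ok] = num[ok] / den[ok]
    # Relative errors add in quadrature: (dR/R)^2 = 1/num + 1/den.
    err[ok] = ratio[ok] * np.sqrt(1.0 / num[ok] + 1.0 / den[ok])
    return ratio, err
```

A ratio consistent with 1 across the endcap bins, within these errors, would indicate that the gain change had removed the discrepancy.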

 

 

 

It is possible that some of the data / simu disagreement in the endcap region may be caused by Pythia not getting the actual forward physics quite right. One area that may not be simulated correctly is the underlying event effect. As a rough check, Carl suggested at the 9/11/12 Jet Meeting that I look at the eta spectra data / simu comparison using the anti-kt jet algorithm with the 0.5 radius, as this is less sensitive to underlying event effects than the cdf-midpoint algorithm I have been using to this point.

 

Figure 13: This figure shows the eta spectra data / simu comparisons for the six trigger categories. The Blue curves are data and the Red curves are simulation. The top six panels show the comparison for the original data and simulation and the bottom six panels show the comparison for the new data and simulation with the gains lowered by 2.5%. All panels use the Anti-kt algorithm with the 0.5 radius.

 

Figure 14: This figure shows the data / simu ratios for the jet eta spectra for the six different trigger categories. The Blue curves show the data / simu ratios from the CDF-Midpoint algorithm and the Red curves show the data / simu ratios from the Anti-Kt algorithm. The top six panels are the ratios obtained from comparing the original data and simulation and the bottom six panels are the ratios obtained from comparing the data and simu with the 2.5% gain change.

 

Figure 14 shows that there is very little difference in the eta spectra data / simulation ratios when using the CDF-Midpoint algo vs the Anti-Kt algo.