Clustering

Second most important step in EMC data reduction is to find clusters from EMC data.

Main code performing EMC clustering is located in StRoot/StPreEclMaker and StRoot/StEpcMaker

The main idea behind the BEMC cluster finder is to allow multiple clustering algorithms. With that in mind, any user can develop his/her own finder and plug it in the main cluster finder.

In order to develop a new finder the user should follow the guidelines described in this page.

Display Macro

A Small cluster finder viewer was written for the BEMC detector. In order to run it, use the macro:

$CVSROOT/StRoot/StPreEclMaker/macros/clusterDisplay.C

This macro loops over events in the muDST files, runs the BEMC cluster finder and creates an event by event display of the hits and clusters in the detector. This is an important tool for cluster QA because you can test the cluster parameters and check the results online. It also has a method to do statistical cluster QA over many events.

The commands available are:

setHitThreshold(Int_t det, Float_t th)

This method defines the energy threshold for displaying a detector HIT in the display. The default value in the code is 0.2 GeV. Values should de entered in GeV.  The parameters are:

  • Int_t det = detector name. (BTOW = 1, BPRS = 2, BSMDE = 3 and BSMDP = 4)
  • Int_t th = hit energy threshold in GeV. 

next()

This method displays the next event in the queue.

qa(Int_t nevents)

This method loops over many events in order to fill some clusters QA histograms. The user needs to open a TBrowser n order to access the histograms.  The parameters are:

  • Int_t nevents = number of events to be processed in QA 

help()

Displays a small help in the screen.

How to write a new Cluster Finder

To create a new algorithm it is important to understand how the cluster finder works.

EmcClusterAlgorithm.h

This file defines a simple enumerator for the many possible cluster algorithms. In order to add a new algorithm the user should add a new position in this enumerator

StPreEclMaker.h and .cxx

The cluster finder itself (StPreEclMaker) is a very simple maker. This maker is responsible only for creating the finder (in the Init() method) and call some basic functions in the Make() method.

StPreEclMaker::Init()

This method just instantiates the finder. In the very beginning of Init() method, the code checks which algorithm is being requested and creates the proper finder.

StPreEclMaker::Make()

The Make() method grabs the event (StEvent only) and exits if no event is found. If the event is found it calls 4 methods in the finder. These methods are:

  • mFinder->clear(); // to clear the previous event local cluster
  • mFinder->clear(StEvent*); // to clean any old cluster or pointer in StEvent
  • mFinder->findClusters(StEvent*); // to find the clusters in this event
  • mFinder->fillStEvent(StEvent*); // to fill StEvent with the new clusters
  • mFinder->fillHistograms(StEvent*); // to fill QA histograms

The modifications the user should do in StPreEclMaker.cxx are only to add the possibility of instantiating his/her finder in the StPreEclMaker::Init() method.

Creating a new cluster algorithm

Before creating a new cluster algorithm it is important to know the basic idea behind the code.  The basic classes are

StEmcPreCluster

There is an internal data format for the clusters in the code. The clusters are StEmcPreCluster which are derived from plain ROOT TObject. StEmcPreCluster is more complete than the regular StEvent cluster object (StEmcCluster) because it has methods to add and remove hits, split and merge clusters and well set matching id between different detectors.

StEmcPreClusterCollection

This is a placeholder for the StEmcPreCluster objects that are created. This object derives from the regular ROOT TList object. It has methods to create, add, remove and delete StEmcPreClusters. StEmcPreCluster objects that are created or added to the collections are owned by the collection so be careful.

StEmcVirtualFinder

This is the basic finder class. Any cluster algorithm should inherit from this class. It already create the necessary collections for the clusters in each detector.

To create a new finder algorithm you should define a finder class that inherits from the StEmcVirtualFinder and overwrite the method findClusters(StEvent*). Lets suppose you want to create a StEmcMyFinder algorithm. You should create a class with, at least, the following:
StEmcMyFinder.h

#ifndef STAR_StEmcMyFinder
#define STAR_StEmcMyFinder

#include "StEmcVirtualFinder.h"

class StEvent;

class StEmcOldFinder : public StEmcVirtualFinder
{
private:

protected:
public:
StEmcOldFinder();
virtual ~StEmcOldFinder();
virtual Bool_t findClusters(StEvent*);

ClassDef(StEmcMyFinder,1)
};

#endif

StEmcMyFinder.cxx

#include "StEmcMyFinder.h"
#include "StEvent.h"
#include "StEventTypes.h"

ClassImp(StEmcOldFinder)

StEmcMyFinder::StEmcMyFinder():StEmcVirtualFinder()
{
// initialize your stuff in here
}
StEmcMyFinder::~StEmcMyFinder()
{
}
Bool_t StEmcMyFinder::findClusters(StEvent* event)
{
// check if there is an emc collection

StEmcCollection *emc = event->emcCollection();
if(!emc) return kFALSE;

// find your clusters

return kTRUE;
}

The method findClusters(StEvent*) is the method that StPreEclMaker will call in order to find clusters in the event. All the StEmcVirtualFinder methods are available for the user.

The user has 4 pre cluster collections available. They are named

mColl[det-1]

where det =1, 2, 3 and 4 (btow, bprs, bsmde and bsmdp)

How to deal with clusters and collections

Lets suppose you identified a cluster in a given detector. How do I work with StEmcPreCluster objects and the collections? Please look at the code itself to be more familiar with the interface. The next lines will give basic instructions with the most common tools:

To create and add a cluster to the collection for the detector ´det´
StEmcPreCluster *cl = mColl[det-1]->newCluster();

This line creates a cluster in the collection and returns its pointer.
 

To remove and delete a cluster from a collection
mColl[det-1]->removeCluster(cl); // cl is the pointer to the cluster

or

mColl[det-1]->removeCluster(i); // i is the index of the cluster in the collection

This will remove AND delete the cluster.
 

To get the number of clusters in a collection
mColl[det-1]->getNClusters();
 

To add hits to a StEmcPreCluster (pointer to the cluster is called ´cl´)
cl->addHit(hit);

where hit is a pointer to a StEmcRawHit object.
 

To add the content of a pre cluster ´cl1´ to a cluster ´cl´
cl->addCluster(cl1);

The added cluster ´cl1´ is not deleted. This is very useful if you identified a spitted cluster and would like to merge them.
 

How to do matching in the cluster finder?
Depending on the cluster finder algorithm one can do clusters in one detector using the information in the other detector as seed. In this sense, it is important do have some kind of matching available in the cluster finder. In the original software scheme, StEpcMaker is the maker responsible for matching the clusters in the BEMC sub detectors. This maker is still in the chain.

Because of StEpcMaker, we CAN NOT create StEmcPoints in the cluster finder. This should be done *ONLY* by StEpcMaker. In order to have a solution for that, StEmcPreCluster object has a member that flags the cluster with matching information. This is done by setting a matching id in the cluster. Use the methods matchingId() and setMatchingId(Int_t).

The matching id is an integer number. If matching id = 0 (default value), no matching was done and the matching will be decided in StEpcMaker. If matching id is not equal to 0, StEpcMaker will create points with clusters with the same matching Id.

Using this procedure, we can develop advanced matching methods in the cluster finder algorithm.

How to plug your finder into the cluster finder

In order to plug your algorithm to the cluster finder you need to change two files in the main finder

EmcClusterAlgorithm.h
This file defines an enumerator with the cluster finders. To add your algorithm in this enumerator add an entry in the enumerator definition, for example:


enum EmcClusterAlgorithm
{ kEmcClNoFinder = 0, kEmcClDefault = 1, kEmcClOld = 2, kEmcMyFinder = 3};

StPreEclMaker.cxx
You should change the Init() method in StPreEclMaker in order to instantiate your finder. To instantiate your StEmcMyFinder object, add, in the Init() method of StPreEclMaker:

if(!mFinder)
{
if(mAlg == kEmcClOld) mFinder = new StEmcOldFinder();
if(mAlg == kEmcMyFinder) mFinder = new StEmcMyFinder();
}

 

Original cluster algorithm (kEmcClOld)

This page describes the original cluster finder algorithm. In order to use this algorithm you should set with the algorithm kEmcClOld.

Salient features of the method implemented in the program are,

  • Clustering have been performed for each sub detector separately.
  • Currently clusters are found for each module in the sub detectors. There are some specific reasons for adopting this approach especially for Shower Max Detectors (SMD's). For towers, we are still discussing how it should be implemented properly. We have tried to give some evaluation results for this cluster finder.
  • There are some parameters used in the code with their default values. These default values are obtained after preliminary evaluation, but for very specific study it might be necessary to change these parameters. 
  • The output is written in StEvent format.

Cluster algorithm

  • Performs clustering module by module
  • Loops over hits for each sub detector module
  • Looks for local maximums

Cluster parameters

  • mEnergySeed – minimum hit energy to start looking for a cluster
  • mEnergyAdd -- minimum hit energy to consider the hit part of a cluster
  • mSizeMax – maximum size of a cluster
  • mEnergyThresholdAll – minimum hit energy a cluster should have in order to be saved

Neighborhood criterion

Because of the difference in dimension and of readout pattern in different sub detectors, we need to adopt different criterion for obtaining the members of the clusters.

  • BEMC: Tower gets added to the existing cluster if it is adjacent to the seed. Maximum number of towers in a cluster is governed by the parameter mSizeMax. It should be noted that BEMC, which takes tower as unit is 2-dimensional detector and by adjacent tower, it includes the immediate neighbors in eta and in phi.
  • BSMD: As SMDs are basically one-dimensional detector, so the neighborhood criterion is given by the condition that for a strip to be a neighbor, it has to be adjacent to any of the existing members of the clusters. Here also maximum number of strips in a cluster is governed by mSizeMax parameter.

Cluster Object

After obtaining the clusters, following properties are obtained for each cluster and they are used as the members of the cluster object.

  • Cluster Energy (Total energy of the cluster member).
  • Eta cluster (Mean eta position of the cluster).
  • Phi cluster (Mean phi position of the cluster).
  • Sigma eta, Sigma phi (widths of the cluster in eta nd in phi).
  • Hits (Members of the cluster).

Some Plots

BSMDE clusters for single photon events

Performance Evaluation

Note:  This is a rather old study that I found on BEMC public AFS space and ported into Drupal.  I don't know who conducted it or when.  -- A. Kocoloski

On the subject of reconstruction made by cluster finder and matching, evaluation has been developed  to determine how good is the reconstruction, efficiency, purity, energy and position resolution of reconstructed particles originally generated by simulation data.

Cluster finder is being evaluated at the moment using single particle events, which favors the evaluation of efficiency and purity. In this case, we can define efficiency and purity for single particle events as:

  • efficiency - ratio between the number of events with more than 1 cluster and the total number of events.
  • purity - ratio between the number of events with only 1 cluster and the number of events with at least one cluster.

There are other quantities that could be used for evaluation. They are:

  • energy ratio - ratio between the cluster energy and the geant energy.
  • position resolution - difference between cluster position and particle position.

All these quantities can be studied as a function of the cluster finder parameters,  mSizeMax, mEnergySeed and mEnergyThresholdAll. The results are summarized below.

BTOW Clusters

mEnergyThresholdAll evaluation

Nothing was done to evaluate this parameter.

mEnergySeed evaluation

The purity as a function of mEnergySeed gets better and the efficiency goes down as this parameter increases.  Some figures (the others parameters were kept as minimum as possible) are shown for for different values of mSizeMax for single photons in the middle of a tower with pt = 1,4,7 and 10 GeV/c.

The following plots were generated using energy seeds of 0.1 GeV (left), 0.5 GeV (middle), and 1.5 GeV (right).  Full-size plots are available by clicking on each image:

Eta dfference

Phi difference


Number of clusters



Energy ratio

mSizeMax evaluation

Nothing was done to evaluate this parameter.

BSMD Clusters

mEnergyThresholdAll evaluation

Nothing was done to evaluate this parameter.

mEnergySeed evaluation

The purity as a function of mEnergySeed gets better and the efficiency goes down as this parameter increases.  Some figures (the others parameters were kept as minimum as possible) are shown for for different values of mSizeMax for single photons in the middle of a tower with pt = 1,4,7 and 10 GeV/c.

The following plots were generated using energy seeds of 0.05 GeV (left), 0.2 GeV (middle), and 0.8 GeV (right).  Full-size plots are available by clicking on each image:

Eta Difference (BSMDE only)


Eta difference RMS as a function of photon energy for different seed energies:


Phi Difference (BSMDP only)

Phi difference RMS as a function of photon energy for different seed energies:

Number of clusters for BMSDE

Number of clusters for BSMDP:

BSMDE efficiency and purity as a function of seed energy

BSMDP efficiency and purity as a function of photon energy for different seed energies

mMaxSize evaluation

The efficiency and purity as a function of mSizeMax get better as this parameter increases but for values greater than 4 there is no difference. Some figures (the others parameters were kept as minimum as possible) are shown for for different values of mSizeMax for single photons in the middle of a tower with pt = 5 GeV/c.

Point Maker

After obtaining the clusters for 4 subdetectors, we need to obtain the information about the incident shower by making the proper matching between clusters from different subdetectors.

It should be mentioned that, because of the better energy resolution and higher depth BEMC is to be used for obtaining the energy of the shower. SMDs on the other hand will be used for obtaining the position of the showers because of their better position resolution.

Currently Preshower detector (PSD) has not been included in the scheme of matching, so we discuss here the details of the method adopted for matching other 3 sub detectors.

Following steps are adopted to obtain proper matching:

  • The clusters obtained from different sub detectors are sorted to obtain the clusters in a single subsection of the SMD phi.
    • Each module of SMD phi consists of 10 subsections. 
      • Each subsection consists of 15 strips along eta, giving phi positions of the clusters for the eta-integrated region of 0.1. 
    • In SMD eta same subsection region consists of 15 strips along phi and the same region consists of 2 x 2 towers. 
  • In the present scheme of matching, we match the clusters obtained in 3 sub detectors in each of this SMD phi subsection.
  • There are two levels of matching:
    • SMD eta - SMD phi match
      • First matching aims to obtain the position of the shower particle in (eta, phi). Because of the integration of signals in SMD etas in the SMD phi subsection region, (i.e.. each SMD eta strip adds the signal in 0.1rad of phi and each SMD phi strip adds the signal in delta eta=0.1 region.) For this type of configuration of detectors, we have decided to get the position matching for each SMD phi subsection. Even though for cases like 1 cluster in SMD eta and 1 cluster in SMD phi in the subsection, we have mostly unique matching of (eta, phi), but for cases where SMD eta/SMD phi subsection consists of more than 1 cluster (see figure) then the possibility of matching does not remain unique. the task is handled from the following standpoint, The number of matched (eta, phi) pairs are taken as minimum of (number of clusters in SMD eta, number of clusters in SMD phi). The (eta, phi) assignment for the pairs are given considering the pair having minimum of (E1-E2)/(E1+E2), where E1, E2 are the cluster energy of SMD eta and SMD phi clusters.
    • Position - energy match
      • It should be mentioned that position/energy matching is done only when there exists a cluster in BEMC for the subsection under study. Because of the large dimension of towers, the clusters obtained in BEMC is made up of more than one incident particles. In case of AuAu data, where particle density is very large, the definition of clusters in towers become ambiguous, especially for low energy particles. In the present scheme, as we have discussed earlier, we have followed the method of peak search for finding clusters, where maximum cluster size is 4 towers. 4 towers make the region of one SMD phi subsection, so we have obtained the energy for the pairs obtained in (SMD eta, SMD phi) from the cluster energy of BEMC cluster in the same subsection. For the cases where we have 1 cluster in BEMC, 1 cluster in SMD eta, 1 cluster in SMD phi, we assign the BEMC cluster energy to (eta, phi) pairs. But for the cases when the number of matched pairs is more than one, then we need to split the BEMC cluster energy. Presently it is done according to the ratio of the strengths of (eta, phi) pair energies. This method of splitting has the drawback of using SMD energies, where energy resolution is worse compared to BEMC energy resolution.

Codes for the point maker are found in StRoot/StEpcMaker

Usage

The cluster finder can run in many different chains. The basic modes of running it are:

  1. bfc chain: This is the usual STAR reconstruction chain for real data and simulation.
  2. standard chain: this is the most common way of running the cluster finder. The cluster finder runs with real data and simulated data.
  3. embedding chain

There are some rules a user needs to follow in order to run the cluster finder properly:

  1. Include the file $STAR/StRoot/StPreEclMaker/EmcClusterAlgorithm.h in order to define the cluster algorithm enumerator
  2. Need to load the shared libraries
  3. StPreEclMaker should run *AFTER* the BEMC hits are created (StADCtoEMaker or StEmcSimulatorMaker)
  4. The cluster algorithm should be defined *BEFORE* Init() method is called
  5. Any change in the cluster parameters should be done *AFTER* Init() is called.

The following is an example on how to run the cluster finder in a standard chain:

// include the definitions of the algorithms
#include "StRoot/StPreEclMaker/EmcClusterAlgorithm.h"

class StChain;
StChain *chain=0;

void DoMicroDst(char* list = "./file.lis",
int nFiles = 10, int nevents = 2000)
{
gROOT->LoadMacro("$STAR/StRoot/StMuDSTMaker/COMMON/macros/loadSharedLibraries.C");
loadSharedLibraries();
gSystem->Load("StDbUtilities");
gSystem->Load("StDbLib");
gSystem->Load("StDbBroker");
gSystem->Load("St_db_Maker");
gSystem->Load("libgeometry_Tables");
gSystem->Load("StDaqLib");
gSystem->Load("StEmcRawMaker");
gSystem->Load("StEmcADCtoEMaker");

// create chain
chain = new StChain("StChain");

// Now we add Makers to the chain...
maker = new StMuDstMaker(0,0,"",list,"",nFiles);
StMuDbReader* db = StMuDbReader::instance();
StMuDst2StEventMaker *m = new StMuDst2StEventMaker();
St_db_Maker *dbMk = new St_db_Maker("StarDb","MySQL:StarDb");

StEmcADCtoEMaker *adc=new StEmcADCtoEMaker();
adc->setPrint(kFALSE);

StPreEclMaker *ecl=new StPreEclMaker(); // instantiate the maker
ecl->setAlgorithm(kEmcClDefault); // set the algorithm
ecl->setPrint(kFALSE); // disables printing

chain->Init();

StEmcOldFinder* finder = (StEmcOldFinder*)ecl->finder(); // gets pointer to the finder
finder->setEnergySeed(1,0.8); // change some setting in the finder

int n=0;
int stat=0;
int count = 1;
TMemStat memory;

while ( (stat==0 || stat==1) && n<nevents)
{
chain->Clear();
stat = chain->Make();
n++;
}
chain->Finish();
}