Overview of StRoot FMS clustering code for reviewers

The FMS clustering code reconstructs possible photons ("points") from energy depositions ("hits") in groups ("clusters") of FMS towers. There are two steps to this process:

  1. "Clustering" a group of towers into a single contiguous area of energy deposition.
  2. Fitting the cluster with the shower-shape profile of a photon in FMS towers, to distinguish clusters formed by a single photon from those formed by two overlapping photon showers.

Note that the "photons" yielded by the fitting process may in fact be electrons, hadrons, etc., so the more generic term "point" is used to refer to the result of this fitting. The full FMS reconstruction therefore flows as hits -> clusters -> points. As the clusters and points are closely related, they are reconstructed by a single STAR maker, StFmsPointMaker.

Overview of cluster-finding and photon-fitting

Before discussing the code structure, it may be helpful to explain the approach used.

Cluster finding is performed on the full list of hits in an event. Clusters are initiated from one or more "seed" towers, selected by their high energy and isolation from any other seed towers. Hits are then added to clusters, repeatedly looping over the hits to "grow" the cluster outward from the seed by the addition of adjacent towers. Clustering continues until no more towers can be added to any clusters. Sometimes a tower may be equidistant between seed towers, in which case it is assigned to a cluster at the end of the process, based on its distance from the mean cluster position, calculated over all towers in the cluster. No tower is allowed to be shared between clusters.
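The growth step described above can be sketched in plain C++. This is a simplified illustration, not the StFmsClusterFinder implementation: the Tower struct, the kUnassigned/kContested markers and growClusters() are hypothetical names, seed selection is assumed to have happened already, adjacency is assumed to mean "shares an edge", and the mean cluster position is assumed to be an unweighted average of tower positions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

const int kUnassigned = -1;  // tower not yet in any cluster
const int kContested = -2;   // tower reached by two clusters in the same pass

// Hypothetical, simplified tower: grid position, energy and owning cluster.
struct Tower {
  int col;
  int row;
  double energy;  // unused here: seed selection is not shown
  int cluster;    // cluster index, kUnassigned or kContested
};

// Assumption: towers are adjacent if they share an edge.
bool adjacent(const Tower& a, const Tower& b) {
  return std::abs(a.col - b.col) + std::abs(a.row - b.row) == 1;
}

// Grow clusters outward from the seed towers (given as indices into `towers`).
// No tower ends up shared between clusters.
void growClusters(std::vector<Tower>& towers,
                  const std::vector<std::size_t>& seeds) {
  const std::size_t nClusters = seeds.size();
  for (std::size_t c = 0; c < nClusters; ++c) {
    assert(seeds[c] < towers.size());
    towers[seeds[c]].cluster = static_cast<int>(c);
  }

  // Repeat passes over the towers until nothing more can be added.
  bool added = true;
  while (added) {
    added = false;
    std::vector<int> decision(towers.size(), kUnassigned);
    for (std::size_t i = 0; i < towers.size(); ++i) {
      if (towers[i].cluster != kUnassigned) continue;
      for (std::size_t j = 0; j < towers.size(); ++j) {
        const int c = towers[j].cluster;
        if (c < 0 || !adjacent(towers[i], towers[j])) continue;
        if (decision[i] == kUnassigned) decision[i] = c;
        else if (decision[i] != c) decision[i] = kContested;
      }
    }
    // Apply the decisions only after the pass, so growth is order-independent.
    for (std::size_t i = 0; i < towers.size(); ++i) {
      if (decision[i] == kUnassigned) continue;
      towers[i].cluster = decision[i];
      added = true;
    }
  }

  // Mean position of each cluster, over its uncontested towers.
  std::vector<double> sumCol(nClusters, 0.0), sumRow(nClusters, 0.0);
  std::vector<int> count(nClusters, 0);
  for (std::size_t i = 0; i < towers.size(); ++i) {
    if (towers[i].cluster < 0) continue;
    const std::size_t c = static_cast<std::size_t>(towers[i].cluster);
    sumCol[c] += towers[i].col;
    sumRow[c] += towers[i].row;
    ++count[c];
  }

  // Assign each contested tower to the cluster with the nearest mean position.
  for (std::size_t i = 0; i < towers.size(); ++i) {
    if (towers[i].cluster != kContested) continue;
    int best = kUnassigned;
    double bestDist2 = 0.0;
    for (std::size_t c = 0; c < nClusters; ++c) {
      const double dx = towers[i].col - sumCol[c] / count[c];
      const double dy = towers[i].row - sumRow[c] / count[c];
      const double dist2 = dx * dx + dy * dy;
      if (best == kUnassigned || dist2 < bestDist2) {
        best = static_cast<int>(c);
        bestDist2 = dist2;
      }
    }
    towers[i].cluster = best;
  }
}
```

In this sketch a contested tower does not propagate growth to its own neighbours, and towers that touch no cluster stay unassigned. The real clustering code should be consulted for how those cases are actually treated.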

Clusters are categorised as formed either by one photon, two photons merging, or possibly either. The last case is referred to as an "ambiguous" cluster in the code. Categorisation is based on the number of towers in a cluster, the cluster energy and its shape. The exact criteria were arrived at empirically from past analyses of FMS data and can be found in the source code. To determine the actual properties of the photon(s) creating the cluster, the towers in a cluster are subsequently fitted.

The fitting procedure utilises the expected energy deposition of a photon in adjacent towers, according to a "shower shape" function. Minuit minimisation is used to fit the shower shapes of either one or two photons to a cluster, depending on its categorisation. In the case of an ambiguous cluster, both one- and two-photon fits are performed, and the cluster is identified as a one- or two-photon cluster on the basis of the better fit. Detailed background information about the shower shape function can be found here:
Once each cluster has been fit individually, a simultaneous "global" photon fit of all towers in all clusters is performed, using the results of the first round of fitting as initial values for the fit parameters. This helps "tighten up" the fit results, and correct for errors in distributing towers amongst clusters.
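As an illustration of how an ambiguous cluster might be resolved, the sketch below keeps whichever of the two fits has the lower chi-square per degree of freedom. This is an assumption about what "better fit" means; the actual criterion is defined in the StFmsPointMaker source. FitResult, reducedChi2() and betterFit() are hypothetical names introduced only for this example.

```cpp
#include <cassert>

// Hypothetical summary of one Minuit fit to a cluster.
struct FitResult {
  int nPhotons;  // 1 or 2
  double chi2;   // chi-square of the fit
  int ndf;       // degrees of freedom, must be positive
};

double reducedChi2(const FitResult& fit) {
  assert(fit.ndf > 0);
  return fit.chi2 / fit.ndf;
}

// Assumption: the "better" fit is the one with the lower chi2/ndf.
// A tie keeps the simpler one-photon hypothesis.
const FitResult& betterFit(const FitResult& onePhoton,
                           const FitResult& twoPhoton) {
  return reducedChi2(twoPhoton) < reducedChi2(onePhoton) ? twoPhoton
                                                         : onePhoton;
}
```

Comparing chi2/ndf rather than raw chi2 avoids automatically favouring the two-photon fit, which has more free parameters.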

The final result is an event with some number (possibly zero) of clusters, each formed from a number of adjacent towers and fitted to yield either one or two photons.

Code structure

The code to be reviewed is in three main packages: StFmsPointMaker, StEvent and StMuDSTMaker. Additions and modifications to StEvent and StMuDSTMaker code are to be checked by the relevant experts, while the main review is concerned with StFmsPointMaker. However, the notes below on StEvent and StMuDSTMaker may be of interest to the StFmsPointMaker reviewers to provide context for some of the design choices.

Additions and modifications to StRoot/StEvent

FMS hits were already included in StEvent, via the class StFmsHit. We have added structures representing FMS clusters and points. The full list of additions and changes is:

  • Class StFmsCluster (in StFmsCluster.h and StFmsCluster.cxx), describing a cluster of StFmsHits.
  • Class StFmsPoint (in StFmsPoint.h and StFmsPoint.cxx), describing a single point fitted to a cluster.
  • Extend StFmsCollection (in StFmsCollection.h and StFmsCollection.cxx) to implement cluster and point collections, in addition to the existing hit collection.
  • Changes to StContainers.h and StContainers.cxx to define St[S]PtrVecFms{Cluster/Point} containers via StCollection{Def/Imp} macros.
  • Add StFmsCluster.h and StFmsPoint.h to the files included in StEventTypes.h.

Additions and modifications to StRoot/StMuDSTMaker/COMMON

Changes to StMuDSTMaker largely mirror those in StEvent; previously there was only hit information, now we have added cluster and point objects to the MuDst. The cluster and point objects are somewhat slimmed-down from their StEvent counterparts, to reflect the "micro" nature of the MuDst. Additionally, there have been some changes to non-FMS-related parts of StMuDSTMaker, in order to support persistent referencing between hits, clusters and points (see below).

Aside on persistent referencing

It is useful if a cluster can access a list of (1) the hits that formed that cluster and (2) the points that were fitted to it. Similarly, it is useful if a point can be used to access the cluster to which it was fitted. This is necessary if an end-user wishes to, e.g., omit clusters/points containing certain hits (from hot towers or some region of the detector, say), or to select points based on some property of a cluster. Ideally this is done by referencing the original hits/clusters/points in the relevant MuDst branches (recall that hits, clusters and points are each written to a separate TTree branch in the MuDst). This avoids duplication of data, as would be necessary if we simply, say, stored a copy of all the hits forming a cluster in that cluster object. It also overcomes complications related to recursion (if a cluster stores a point, which then stores a cluster…). Finally, it means that users can apply operations to the main object arrays (e.g. sort hits by energy or sub-detector) and still retain the referencing, which would not be the case if we made associations by, say, storing the indices of the relevant hits. Clearly this referencing must be writable to file to be of use. ROOT supports such persistent references via the classes TRef (http://root.cern.ch/root/htmldoc/TRef.html) and TRefArray (http://root.cern.ch/root/htmldoc/TRefArray.html). We have therefore used these to support the cross-referencing scheme outlined above. Note that as a result, a TProcessID object is now written to the MuDst in addition to the tree. While the provided implementation functions as we expect (and can be tested with the macros provided in the review distribution), we would appreciate feedback from the MuDst coordinator (and any other software experts) if they have concerns about the appropriateness of our approach, and whether it may be better incorporated into the StRoot framework in some other way.
One issue outstanding is where to make calls to TProcessID::SetObjectCount(), which is used to reset the persistent object count after each event. This is currently done in StFmsPointMaker, but it would likely be more appropriate in a maker of "higher priority" in the chain than StFmsPointMaker, so that it is robust in the case that other users make their own use of TRef[Array].
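The argument against storing indices can be shown with a small plain-C++ analogy. This does not use ROOT and is not the TRef API: the Hit and Cluster structs and the helper functions are hypothetical, and the integer id merely stands in for the unique ID that the TRef mechanism relies on.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical hit with a unique identifier (a stand-in for the TObject unique ID).
struct Hit {
  unsigned id;
  double energy;
};

// Hypothetical cluster recording its hits in two ways, for comparison.
struct Cluster {
  std::vector<std::size_t> hitIndices;  // positions in the hit array (fragile)
  std::vector<unsigned> hitIds;         // unique IDs (survive reordering)
};

bool higherEnergy(const Hit& a, const Hit& b) { return a.energy > b.energy; }

// Sum the energies of the hits found at the stored array positions.
double energyByIndex(const std::vector<Hit>& hits, const Cluster& cluster) {
  double sum = 0.0;
  for (std::size_t i = 0; i < cluster.hitIndices.size(); ++i) {
    sum += hits[cluster.hitIndices[i]].energy;
  }
  return sum;
}

// Sum the energies of the hits whose unique ID matches a stored ID.
double energyById(const std::vector<Hit>& hits, const Cluster& cluster) {
  double sum = 0.0;
  for (std::size_t i = 0; i < cluster.hitIds.size(); ++i) {
    for (std::size_t j = 0; j < hits.size(); ++j) {
      if (hits[j].id == cluster.hitIds[i]) sum += hits[j].energy;
    }
  }
  return sum;
}
```

If the hit array is sorted with std::sort(hits.begin(), hits.end(), higherEnergy), energyById() still sums the original hits, whereas energyByIndex() silently picks up whichever hits now occupy the stored positions.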

The full list of additions and changes is:

  • Class StMuFmsCluster (in StMuFmsCluster.h and StMuFmsCluster.cxx), describing a cluster of StMuFmsHits.
  • Class StMuFmsPoint (in StMuFmsPoint.h and StMuFmsPoint.cxx), describing a single point fitted to a cluster.
  • Add support for "FmsCluster" and "FmsPoint" arrays in StMuArrays.h and StMuArrays.cxx, following the procedure and naming for the existing "FmsHit" branch.
  • Extend StMuFmsCollection (in StMuFmsCollection.h and StMuFmsCollection.cxx) to implement cluster and point collections, in addition to the existing hit collection.
  • Extend StMuFmsUtil (in StMuFmsUtil.h and StMuFmsUtil.cxx) to populate StMuFmsCollection from StFmsCollection and vice versa.
  • Modify StMuDstMaker.cxx to fill and read FMS cluster and point arrays. In detail, these changes include: relevant calls in StMuDstMaker::connectFmsCollection(); removal of a call to StMuFmsHit::IgnoreTObjectStreamer(), as the TRef mechanism requires a unique ID, written as part of TObject; call TChain::BranchRef(), to allow automatic loading of referenced objects; pass option "C" to TClonesArray::Clear() when clearing the FmsCluster branch, to clear the arrays of referenced hits and points (which won't happen otherwise).

StRoot/StFmsPointMaker

StFmsPointMaker is the STAR maker responsible for finding FMS clusters and points according to the procedure described above. It provides access to upstream data (FMS hits, database information), calls the cluster/point-finding routines, and populates StEvent with the resultant cluster and point objects. It does not interact directly with the MuDst; the cluster and point data is migrated from StEvent to the MuDst via StMuFmsUtil, which is part of StMuDSTMaker, following the existing pattern used for StFmsHit.

The actual clustering and photon fitting is handled by a number of more specialised classes, which StFmsPointMaker uses either directly or indirectly. Please see the comments in the relevant files for more detail on each class. They are:

  • StFmsClusterFinder: performs the association of adjacent towers into a cluster.
  • StFmsClusterFitter: defines the photon fitting routine applied to clusters, to determine the properties of the photon(s) forming the cluster.
  • StFmsEventClusterer: manages clustering and photon-fitting for all clusters/points in a single FMS sub-detector for one event. This is the class with which StFmsPointMaker directly interacts, with StFmsPointMaker looping over the four FMS sub-detectors.
  • StFmsFittedPhoton: a lightweight structure describing a single "photon" found by fitting a cluster.
  • StFmsGeometry: provides a simple interface to FMS database information.
  • StFmsTower: a wrapper around StFmsHit, storing additional information needed during clustering. Each hit provided by StFmsHitMaker is wrapped in an instance of this object before being passed to the clustering routine.
  • StFmsTowerCluster: an extended version of StFmsCluster, storing additional information useful during clustering.

Each class is declared in StFmsPointMaker/<class name>.h and most classes have a corresponding implementation file, StFmsPointMaker/<class name>.cxx.

Other code

There are other changes that have been made to enable testing or debugging, namely:

  • Addition of an "fmsPoint" chain option in StRoot/StBFChain/BigFullChain.h. This was added to allow us to run our maker in a BFC chain. Of course, after the review the BFC maintainer(s) will add the appropriate option as they see fit.
  • Extend StFmsHitMaker to allow it to re-generate hits (including reapplication of calibrations) from an existing MuDst. This was useful in our debugging tests, but isn't necessary for the proposed cluster/point functionality and isn't being submitted for review here. However, to use all the test macros provided, the reviewers should use this version of StFmsHitMaker, not the one checked out from CVS.
  • An additional maker, StFmsQAHistoMaker, generates a number of standard QA plots with hit, cluster and point properties, which were used by the developers when debugging the code. It is provided for the reviewers for testing output, but should not be considered part of the review.

Downloading and building

Browsable, documented code is here: www.star.bnl.gov/~tpb/stfms/
The full development source code is hosted here: https://github.com/yuxip/fms/ and is also on RCF at /star/u/tpb/fmsSoftware/reviewFmsPointMaker

Either copy the code from the RCF directory, or download your copy via:

git clone http://github.com/yuxip/fms.git [optional name]

This will give you a directory "fms" (assuming you didn't provide an alternative name to git clone), so

cd fms

to navigate there. You will see an StRoot directory containing the development code, plus a number of test ROOT macros and scripts. Before building we need to get some other files: for StEvent and StMuDSTMaker, only those files affected by our modifications are tracked in the above repository. The remaining parts of the existing STAR packages should therefore be checked out via CVS. As the code is in development, please make sure you are using the DEV version of STAR code:

stardev

To get the full test suite, do:

cvs co StRoot/StBFChain StRoot/StEvent StRoot/StMuDSTMaker StRoot/StMuDSTMaker/COMMON

then build with cons:

cons

Building has been tested and succeeds on SL6 under STAR version DEV as of 2014-08-07. Due to C++11 features used in some places, the code will not compile under gcc 4.3.2 in the SL5 STAR environment.

For developers

  • Though not required by the STAR standards, developers are requested to additionally enable the flag "-pedantic" to catch other warnings. Please treat any warnings in your code as errors and fix them.
  • StFmsPointMaker is to be used as a testbed for enabling C++11 support in STAR, so feel free to use it. We are currently developing for gcc 4.4.7, so the list of supported C++11 features is given here: gcc.gnu.org/gcc-4.4/cxx0x_status.html 
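As a rough guide, the sketch below sticks to C++11 features that the linked status page lists as available by GCC 4.4: auto-typed variables, initializer lists, strongly typed enums and static_assert. Features listed there for later GCC releases, such as lambdas and range-based for loops, are deliberately avoided; please check the table before relying on anything else. ClusterCategory, exampleEnergies() and totalEnergy() are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <vector>

// Strongly typed enum: enumerators are scoped and do not convert to int implicitly.
enum class ClusterCategory { OnePhoton, TwoPhoton, Ambiguous };

// Compile-time check: the default underlying type of a scoped enum is int.
static_assert(sizeof(ClusterCategory) == sizeof(int),
              "scoped enums default to an int underlying type");

// Initializer list: fill a container directly from a braced list.
std::vector<double> exampleEnergies() {
  std::vector<double> energies = {0.5, 1.5, 2.0};
  return energies;
}

// auto: let the compiler deduce the iterator type.
double totalEnergy(const std::vector<double>& energies) {
  double sum = 0.0;
  for (auto it = energies.begin(); it != energies.end(); ++it) {
    sum += *it;
  }
  return sum;
}
```

With GCC 4.4 these features need the -std=c++0x flag; how that flag is set in the STAR cons build is not covered here.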

Testing

You are now ready to run the test scripts. The following are provided:
  • bfc.C runs a BFC chain with StFmsPointMaker and StFmsQAHistoMaker. This will yield both a MuDst file (along with others like <name>.event.root) and a PostScript file with some example figures. Many example figures have a version for StEvent and a version for StMuDst. If there are no errors these should be the same, i.e. data in StEvent should be propagated without change to the MuDst.
  • The script bfc.sh is provided to run bfc.C for some file and number of events with a minimal set of chain options.
  • load.C loads all necessary libraries. You should run this before e.g. opening a MuDst file.
  • testFms_stfms.C can be run on a MuDst to generate example figures. These are the same StFmsQAHistoMaker figures produced by bfc.C. The StMuDst versions should be the same as those from running bfc.C; this checks that the results at runtime are correctly written to (and readable from) a file. The StEvent versions will be blank in this case as there is no StEvent information in the MuDst.
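The "should be the same" comparisons above are made by eye on the output figures. If a reviewer wanted to automate them, one possible approach is to compare bin contents directly, as in the sketch below. It is not part of the provided macros: firstMismatch() is a hypothetical helper, and the vectors stand in for bin contents extracted from the StEvent and StMuDst versions of a figure.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the index of the first bin that differs between two histograms,
// or -1 if they are identical. If one histogram has extra bins, the index
// of the first extra bin is returned.
long firstMismatch(const std::vector<double>& a, const std::vector<double>& b) {
  const std::size_t n = a.size() < b.size() ? a.size() : b.size();
  for (std::size_t i = 0; i < n; ++i) {
    // Exact comparison: the data are expected to propagate without change.
    if (a[i] != b[i]) return static_cast<long>(i);
  }
  if (a.size() != b.size()) return static_cast<long>(n);
  return -1;
}
```

Exact equality is used because the data are expected to be copied unchanged; a tolerance would be needed if any quantity were stored with reduced precision.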