The TSP EEMC Strip Cluster Finder Algorithm

This is another EEMC strip cluster finder algorithm, the TSP algorithm.  TSP is short for Tukey-Smoother Plus.


The algorithm proceeds as follows:

  1. The distribution of the energy per strip in a given layer and sector is smoothed using an algorithm of John Tukey, called "353QH twice", from his book Exploratory Data Analysis.  The method was presented by J. Friedman in Proc. of the 1974 CERN School of Computing, Norway, 11-24 August, 1974, page 271. The method is available in ROOT as
    void TH1::SmoothArray(Int_t nn, Double_t *xx, Int_t ntimes)
    
    The function is passed the value of mNumSmoothIters for the value of ntimes.

    Note: as an optimization, only strips with indices between (index of the first non-zero strip - mSearchMargin) and (index of the last non-zero strip + mSearchMargin) are considered.

  2. The distribution is further smoothed by comparing the smoothed energy of each strip with the average of its adjacent strips.  If the strip energy lies above (1 + mAnomalySupFactor) or below (1 - mAnomalySupFactor) times that average, it is set to the corresponding threshold.  Note: all averages are computed using the smoothed energies from step 1.  This step can be considered an anomaly suppression step.

    Note: in the following steps, strip energy refers to the energy of the strips as computed after steps 1 and 2.

  3. Peaks are identified as strips that

    1. Have greater energy than the two adjacent strips.

    2. Have energy above an absolute threshold mSeedAbsThres, and above mSeedRelThres times the maximum strip energy in the layer and sector.

    3. Have energy more than mAbsPeakValleyThres greater than the smallest strip energy between this peak and adjacent peaks.

  4. Each peak then forms the seed of a cluster. Strips on either side of the peak are added as long as the strip energy is monotonically decreasing, stopping at the first strip with zero energy or once the strips are mMaxDist strips away from the seed strip.  Clusters with fewer than mMinStripsPerCluster strips are dropped, and their strips are not reassigned to other clusters.

  5. The position of the cluster is set to the energy-weighted mean position of the strips in the cluster, and the energy of the cluster is set to the (smoothed) energy of the seed strip.  Note: because of the smoothing and the lack of a good way to share the energy of a single strip between clusters, the energy of the peak tends to be a better estimate of the relative energy of the clusters than the sum of all strip energies in the cluster.
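
Steps 3 to 5 can be sketched in plain C++ as follows.  This is only an illustration under stated assumptions, not the actual implementation: the names StripCluster, findPeaks, buildCluster and findClusters are hypothetical, the input is assumed to already hold the energies after steps 1 and 2, the valley test of condition 3.3 is simplified to look only toward the previously accepted peak on the left, the monotonic decrease is taken to be strict, zero-energy strips are excluded, and the cluster position is expressed as a strip index rather than a geometric position.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical cluster record for illustration; not a class from the EEMC code.
struct StripCluster {
    std::size_t seed;   // index of the seed (peak) strip
    std::size_t first;  // lowest strip index in the cluster (inclusive)
    std::size_t last;   // highest strip index in the cluster (inclusive)
    double position;    // energy-weighted mean strip index
    double energy;      // smoothed energy of the seed strip
};

// Step 3: return the indices of strips that pass the three peak conditions.
// Assumption: the valley test only looks left, toward the last accepted peak.
std::vector<std::size_t> findPeaks(const std::vector<double>& e,
                                   double seedAbsThres,
                                   double seedRelThres,
                                   double absPeakValleyThres) {
    std::vector<std::size_t> peaks;
    if (e.size() < 3) return peaks;
    const double maxE = *std::max_element(e.begin(), e.end());
    for (std::size_t i = 1; i + 1 < e.size(); ++i) {
        // 3.1: strictly greater than both adjacent strips
        if (e[i] <= e[i - 1] || e[i] <= e[i + 1]) continue;
        // 3.2: above the absolute and the relative seed thresholds
        if (e[i] <= seedAbsThres || e[i] <= seedRelThres * maxE) continue;
        // 3.3: sufficiently above the valley between this and the previous peak
        if (!peaks.empty()) {
            const double valley =
                *std::min_element(e.begin() + peaks.back(), e.begin() + i);
            if (e[i] - valley <= absPeakValleyThres) continue;
        }
        peaks.push_back(i);
    }
    return peaks;
}

// Steps 4 and 5: grow a cluster around a seed and compute position and energy.
StripCluster buildCluster(const std::vector<double>& e, std::size_t seed,
                          std::size_t maxDist) {
    assert(seed < e.size() && e[seed] > 0.0);
    std::size_t first = seed;
    std::size_t last = seed;
    // Extend left while energies strictly decrease, stay non-zero, and
    // the strip is at most maxDist strips from the seed.
    while (first > 0 && seed - (first - 1) <= maxDist &&
           e[first - 1] > 0.0 && e[first - 1] < e[first]) {
        --first;
    }
    // Same rule on the right side.
    while (last + 1 < e.size() && (last + 1) - seed <= maxDist &&
           e[last + 1] > 0.0 && e[last + 1] < e[last]) {
        ++last;
    }
    double sumE = 0.0;
    double sumXE = 0.0;
    for (std::size_t i = first; i <= last; ++i) {
        sumE += e[i];
        sumXE += static_cast<double>(i) * e[i];
    }
    return {seed, first, last, sumXE / sumE, e[seed]};
}

// Steps 3 to 5 combined; clusters that are too narrow are dropped.
std::vector<StripCluster> findClusters(const std::vector<double>& e,
                                       double seedAbsThres,
                                       double seedRelThres,
                                       double absPeakValleyThres,
                                       std::size_t maxDist,
                                       std::size_t minStripsPerCluster) {
    std::vector<StripCluster> clusters;
    const std::vector<std::size_t> peaks =
        findPeaks(e, seedAbsThres, seedRelThres, absPeakValleyThres);
    for (std::size_t k = 0; k < peaks.size(); ++k) {
        const StripCluster c = buildCluster(e, peaks[k], maxDist);
        if (c.last - c.first + 1 >= minStripsPerCluster) clusters.push_back(c);
    }
    return clusters;
}
```

In this sketch the cluster energy is taken from the seed strip alone, as described in step 5, so strips shared by neighbouring clusters do not need to be split.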


The full set of parameters for the algorithm is:

UInt_t mNumSmoothIters;        // number of Tukey smoothing iterations
UInt_t mMinStripsPerCluster;   // min number of strips per cluster
UInt_t mMaxDist;               // max distance for which to include a strip in a cluster
UInt_t mSearchMargin;          // how many strips around the first and last non-zero strip to include
Double_t mSeedAbsThres;        // Absolute energy threshold for seed strips 
Double_t mSeedRelThres;        // Relative energy threshold for seed strips 
Double_t mAbsPeakValleyThres;  // Absolute energy difference between peaks and valleys (lowest point between peaks)
Double_t mAnomalySupFactor;    // Factor for considering if strip energy is anomalous
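
As an illustration of how two of these parameters might be used, the sketch below groups them in a struct and applies mSearchMargin (the optimization noted in step 1) and mAnomalySupFactor (step 2).  It is written under assumptions rather than taken from the actual code: TspParams, StripRange, searchRange and suppressAnomalies are hypothetical names, plain C++ types stand in for the ROOT typedefs UInt_t and Double_t, the thresholds are taken as (1 +/- mAnomalySupFactor) times the average of the two adjacent strips, and the first and last strips are left unchanged because they have only one neighbour.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical grouping of the parameters listed above.
struct TspParams {
    unsigned int mNumSmoothIters;       // number of Tukey smoothing iterations
    unsigned int mMinStripsPerCluster;  // min number of strips per cluster
    unsigned int mMaxDist;              // max distance of a strip from the seed
    unsigned int mSearchMargin;         // strips to add around the non-zero region
    double mSeedAbsThres;               // absolute energy threshold for seeds
    double mSeedRelThres;               // relative energy threshold for seeds
    double mAbsPeakValleyThres;         // required peak-to-valley difference
    double mAnomalySupFactor;           // anomaly suppression factor
};

// Inclusive index range of strips to process; valid is false if all are zero.
struct StripRange {
    bool valid;
    std::size_t lo;
    std::size_t hi;
};

// Step 1 note: restrict work to the non-zero region widened by mSearchMargin,
// clipped to the available strips.
StripRange searchRange(const std::vector<double>& e, const TspParams& p) {
    const std::size_t n = e.size();
    std::size_t firstNz = n;
    std::size_t lastNz = 0;
    for (std::size_t i = 0; i < n; ++i) {
        if (e[i] != 0.0) {
            if (firstNz == n) firstNz = i;
            lastNz = i;
        }
    }
    if (firstNz == n) return {false, 0, 0};
    const std::size_t margin = p.mSearchMargin;
    const std::size_t lo = firstNz > margin ? firstNz - margin : 0;
    const std::size_t hi = std::min(lastNz + margin, n - 1);
    return {true, lo, hi};
}

// Step 2: clamp each interior strip to (1 +/- factor) times the average of its
// neighbours. Averages always use the input energies, not the clamped ones.
std::vector<double> suppressAnomalies(const std::vector<double>& smoothed,
                                      const TspParams& p) {
    assert(p.mAnomalySupFactor >= 0.0);
    std::vector<double> out = smoothed;
    for (std::size_t i = 1; i + 1 < smoothed.size(); ++i) {
        const double avg = 0.5 * (smoothed[i - 1] + smoothed[i + 1]);
        const double lo = (1.0 - p.mAnomalySupFactor) * avg;
        const double hi = (1.0 + p.mAnomalySupFactor) * avg;
        out[i] = std::min(std::max(smoothed[i], lo), hi);
    }
    return out;
}
```

The parameter values used to fill TspParams are left to the caller here; this sketch does not suggest any defaults.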