STAR coding and naming standards

STAR coding standards

  1. C++ programming Guidelines
  2. C++ coding style guidelines
  3. General STAR rules (FORtran and C coding style, STAR specific requirements)
  4. File extensions

STAR StRoot/ makers naming standards

  1. The directory structure under StRoot tree
  2. Trees and implicit/hidden rules
  3. Current patterned exceptions

STAR coding standards

Fortran and C coding style manual

postscript

STAR Specific requirements

The C++ coding guide should be consulted for the general standards and allowed/discouraged features in the SRA environment. In addition, the following information is provided:
  • Hard-coded numbers within the code must be avoided for portability and maintainability reasons. The use of constants (const) is a valid solution. For values which are likely to change with time, a database approach should be considered. Refer to the database Web page area for more information.
  • For printing messages in the STAR framework use the StMessage message manager package documented here. For all messages from a given portion of code, use a unique string, like:
    { LOG_XXX << "StZdcVertexMaker::Init(): in ZdcVertexMaker did not find ZdcCalPars." << endm; }

    where XXX is either

    • DEBUG
    • INFO
    • WARN
    • ERROR
    • FATAL
    • QA

    Then, you can filter in/out the wanted / unwanted messages using the logger filter mechanism. The Logger documentation is available here and its use is encouraged.

  • To exit your code on an error condition in the STAR framework, return one of the STAR return codes (an enum) from your Maker:
    enum EReturnCodes{
      kStOK=0,  // OK
      kStOk=0,  // OK
      kStWarn,  // Warning, something wrong but work can be continued
      kStEOF,  // End Of File
      kStErr,  // Error, drop this and go to the next event
      kStFatal  // Fatal error, processing impossible
    };

    Outside Makers, in your application code, AND for testing or debugging conditions that should never happen and indicate disaster if they do, we recommend the use of assert(), which aborts the program if the assertion is false. Don't use exit(). For information on assert(), see the man page ('man assert').
    assert() should not be used in official code (apart from rare cases). Generally speaking, simply aborting the program is BAD practice and should be avoided; it is better to issue an error message and propagate a fatal error flag upward. Unless your Maker is a fundamental, base Maker (DAQ detecting a corruption, db finding an unlikely condition which should really never happen, IO maker not able to stream, ...), the use of assert() is prohibited, as a single detector sub-system error shall not lead to an entire chain abort. The use of clear messages with level FATAL is emphasized.

  • See also Class and method naming good advice
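
The advice above on named constants and on return codes can be sketched together as follows. This is only a stand-alone illustration: the enum is copied from the list above so the snippet compiles by itself, while kMinZdcHits and checkZdcHits() are hypothetical names invented for the example. Real code would return the value from a Maker method and report through LOG_ERROR or LOG_FATAL rather than std::cerr.

```cpp
#include <iostream>

// Return codes as listed above, reproduced so the sketch compiles on its own.
enum EReturnCodes {
  kStOK = 0,  // OK
  kStOk = 0,  // OK
  kStWarn,    // Warning, something wrong but work can be continued
  kStEOF,     // End Of File
  kStErr,     // Error, drop this and go to the next event
  kStFatal    // Fatal error, processing impossible
};

// Hypothetical threshold: a named constant instead of a bare number in the code.
const int kMinZdcHits = 2;

// Hypothetical check standing in for part of a Maker's event processing.
// It reports the problem and hands a return code back to the caller
// instead of calling assert() or exit().
int checkZdcHits(int nHits) {
  if (nHits < 0) {
    std::cerr << "checkZdcHits(): negative hit count, cannot continue" << std::endl;
    return kStFatal;
  }
  if (nHits < kMinZdcHits) {
    std::cerr << "checkZdcHits(): too few hits, dropping this event" << std::endl;
    return kStErr;
  }
  return kStOK;
}
```

A caller would test the returned code and pass it up the chain, so that the framework, not the sub-system code, decides whether processing stops.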

File Extensions

  • C++ header files containing ROOT related code must have the extension .h; the corresponding source files must have the extension .cxx. This rule is imposed on us by ROOT.

  • Plain C++ header files, i.e. those without ROOT related code, have the extension .hh and their source files the extension .cc. Only header files which contain definitions usable in both C and C++ may have the extension .h. This is a convention commonly used in HEP (e.g. CLHEP, Geant4).




STAR StRoot makers naming standards

This document attempts to explain

  • the code directory structure and layout in STAR

  • the rules and assumptions triggered in the make system (cons) solely on the basis of the name choice

  • the existing exceptions to the rules

Naming conventions shown in green are discouraged for users (to prevent proliferation), and any request to use them should be accompanied by strong reasoning. Items in magenta are obsolete forms (and code) which will disappear in the near future. Anything appearing in red should be forgotten immediately, as it is forbidden (and was introduced due to an unfortunate momentary lapse of reason, a power surge or extreme special transitional needs).

Why a naming convention?
A naming convention keeps code and packages written by many users consistent. Not only does it give users a feel for what-does-what, it also allows managers to define a basic default set of compilation rules that sometimes depend on the naming convention. Naming conventions in general are fundamental to all large projects, and although N users will surely have N best solutions, the rules should be enforced as much as possible.

The directory structure under StRoot tree

The StRoot/ tree is of the following form

StRoot/

XXX/

ONLY base classes should be freely named. Example: StarClassLibrary, StarRoot, Star2Root

 

StXXX/

A directory tree which will contain a base class many makers will use and derive from.
In this category, XXX can be anything. For example, StChain, StEvent, StEventUtilities

 

St_XXX_Maker/

XXX is not a detector sub-system but a set of characters with underscores:

the code is FORtran derived. XXX is a sub-system in a general sense (not necessarily a detector sub-system)

 

StXXXMaker/

A tree for a Maker, that is, code compiled in this tree will be assembled as one self-sufficient package. A maker is a particular class deriving from StMaker. Its purpose is to run from within a chain (StChain) of makers and perform a specific task.

In this category, the sub-name conventions are as follows:
* StXXXDbMaker a maker containing the database calls for the sub-system XXX (+)
* StXXXSimulationMaker or StXXXSimulatorMaker a simulation maker for the sub-system XXX
* StXXXCalibMaker or StXXXCalibrationMaker a calibration maker for the sub-system XXX
* StXXXMixerMaker a data/simulation mixer code for the sub-system XXX
* StXXXDisplayMaker a graphical tool, as the name suggests (+)
* StXXTagMaker a maker collecting tags for the sub-system or analysis XX

While XXX is in principle a detector sub-system identification (3 to 4 letters uniquely designating the sub-system), it may also be something other than a detector sub-system (StAssociationMaker, StMiniMcMaker, StDbMaker) or of the form XX=analysis or physics study.

 

StXXXRaw*/

Any directory named with the word Raw will make our make system automatically include the necessary path for the Run-Time-System DAQ reader files. This convention is additive to any other description and convention herein.

Example: StEmcRawMaker is a "maker" (as described above) and a code base using the DAQ reader; the same would be expected of Stl3RawReaderMaker or StFgtRawMaker.

 

StXXXUtil/
StXXXUtilities/

Code compiled in a Util or Utilities tree should be code which does not perform any action (and is not a maker) but constitutes by itself a set of utility classes and functions. Other classes may depend on a Utility library.
* XXXUtil : XXX IS a detector sub-system.
* XXXUtilities : XXX IS NOT a detector sub-system (this is reserved)

 

StXXXPool/

This tree will contain a set of sub-directories chosen by the user; each sub-directory may be a self-contained project with no relation to anything else. Each sub-directory will therefore lead to the creation of a separate library. The naming convention for the library creation is as follows:
* If the sub-directory is named like StYYY, the library will inherit the same name. Beware of potential name clashes in this case.
* If the sub-directory has an arbitrary name YYY, the final library will have the name StXXXPoolYYY.
The Pool category has some special internal compilation rules: if it does not compile, it may be removed from compilation entirely. As such, code appearing in Pool directory trees cannot be part of a production maker dependency. A typical usage for this structure is to provide a Pool (or collection) of loose code not used in production (utility tools for sub-systems, analysis code or utilities).
XXX can be either a Physics Work Group acronym or a detector sub-system acronym.

 

StXXXClient/

This tree will behave like the Pool trees in terms of library naming creation (separate libraries will be created, one per compilable sub-directory).
XXX can be anything relevant for a sub-system. Client directories MUST compile (unlike the pools) and may be part of a dependency of a data processing chain. The general goal is to provide a different tree structure for a set of code providing a "service" widely used across makers. For example, the Run Time System (RTS) has a Client tree containing DAQ related reading code.
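
The suffix-based part of the conventions above can be summarized in a small classifier. This is a rough sketch only, not the logic used by cons: DirKind, classifyStRootDir() and usesDaqReader() are hypothetical names, and free-form names such as StarClassLibrary or StEvent cannot be recognized from the name alone, so they fall under Other.

```cpp
#include <string>

// Categories suggested by the directory name alone (hypothetical enum).
enum class DirKind { FortranMaker, Maker, Pool, Client, Util, Utilities, Other };

static bool startsWith(const std::string& s, const std::string& prefix) {
  return s.compare(0, prefix.size(), prefix) == 0;
}

static bool endsWith(const std::string& s, const std::string& suffix) {
  return s.size() >= suffix.size() &&
         s.compare(s.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Maps a directory name under StRoot/ to the naming pattern it follows.
DirKind classifyStRootDir(const std::string& name) {
  if (startsWith(name, "St_") && endsWith(name, "_Maker")) return DirKind::FortranMaker;
  if (!startsWith(name, "St")) return DirKind::Other;
  if (endsWith(name, "Maker")) return DirKind::Maker;
  if (endsWith(name, "Pool")) return DirKind::Pool;
  if (endsWith(name, "Client")) return DirKind::Client;
  if (endsWith(name, "Utilities")) return DirKind::Utilities;
  if (endsWith(name, "Util")) return DirKind::Util;
  return DirKind::Other;
}

// The Raw rule is additive: it applies on top of whatever category matched.
bool usesDaqReader(const std::string& name) {
  return name.find("Raw") != std::string::npos;
}
```

For instance, StEmcRawMaker would be classified as a Maker and would also be flagged as using the DAQ reader.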

 

Trees and implicit/hidden rules

StRoot/

StXXX./
(./)

README

A basic documentation in plain text (not mandatory). If exists, the software guide will display the information contained in this file.

 

 

doc/

A directory containing more elaborate documentation, either in html or in LaTeX. Note that if a file named index.html exists, the software guide will link to it

 

 

local/

A subdirectory containing stand-alone Makefiles for the package and/or standalone configuration files

 

 

examples/

A directory containing a collection of code using the Maker or utility package of interest (case insensitive)

 

 

macros/

A directory containing example ROOT macros making use of the maker

 

 

kumac/

This is an obsolete directory name (from STAF time) but still considered by the make system. It may also appear in the pams/ tree structure.

 

 

test/

This directory may contain test programs (executables should in principle not appear in our standard but be assembled)

 

 

html/

A directory possibly containing cross-linked information for Web purposes. However, note that the documentation is, since 2002, auto-generated via the doxygen documentation system (see the sofi page for more information).

 

 

images/

A directory containing images such as bitmaps, pixmaps or other images used by your program but NOT assembled by any part of the build process. XPM files necessary for Qt, for example, should not be placed in this directory, as explicit rules exist in 'cons' to handle those (but cons will ignore the xpm files placed in images/).

wrk/
run/
 

 

 

include/

A directory containing a set of common include files

 

 

Any other name

Will be searched for code one level down only.
All compiled code will be assembled in one library named after StXXX...
Each sub-directory will be compiled separately; that is, each must contain code using explicit include paths, as the only default search paths for includes will be the ones described by CPPPATH and the sub-directory itself. [1]
Include statements can ALWAYS refer to the relative path after the StRoot/ portion, as the StRoot/ path is a default in CPPPATH.

 

StXXXPool/
StXXXClient/
(./)

doc/
local/
examples/
macros/
kumac/
test/
html/
images/
wrk/
run/
include/

As noted above (i.e. the content of those directories will be skipped by the make system)

 

 

Any other name

Each sub-directory will produce a separate dynamic library. Note that this is NOT the case with the other name formats (where all compiled code goes into a single library).
The convention is as follows:
* If the name starts with 'St', for example 'StZZZ', a library StZZZ.so will be created containing all compiled code available in the StZZZ directory [2]
* If the name does NOT start with 'St', for example 'WWW', a library StXXXPoolWWW.so will be created containing all compiled code available in the WWW directory
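
The library naming rule for these sub-directories can be written out as a short function. This is a minimal sketch of the rule as stated above, not the actual cons code; subLibraryName() and its parameters are hypothetical, and applying the same prefixing to Client trees is an assumption based on the statement that they behave like Pool trees.

```cpp
#include <string>

// parentDir is the Pool (or Client) directory, e.g. "StEEmcPool";
// subDir is the sub-directory being compiled.
std::string subLibraryName(const std::string& parentDir, const std::string& subDir) {
  // A sub-directory named like StZZZ gives its own name to the library.
  if (subDir.compare(0, 2, "St") == 0) {
    return subDir + ".so";
  }
  // Any other name is prefixed with the parent directory name.
  return parentDir + subDir + ".so";
}
```

With these assumptions, a sub-directory WWW under StEEmcPool would yield StEEmcPoolWWW.so, while a sub-directory StZZZ would yield StZZZ.so regardless of the parent, which is why name clashes are possible in that form.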

 

Current patterned exceptions.

 

StEventDisplay.*

Directories matching this pattern will be compiled using the extra include path pointed to by the environment variable QTDIR. The moc program will run on any include containing the Q_OBJECT directive, and the -DR__QT define is added to CXXFLAGS.

 

StDbLib
StDbBroker

Those are special. Compilation will consider MySQL includes and the created dynamic library will be linked against MySQL

 

St.*Db.*

Any directory following this pattern will use the MySQL include as an extra include path for the CPPPATH directive

 

StTrsMaker
StRTSClient

These are two exceptions of kind (b) [see footnote 1]: each uses its own include/ directory as a general extra include path.

 

StHbtMaker

For this maker, a pre-defined list of sub-directories is added to the CPPPATH

 

StAssociationMaker
StMuDSTMaker
.*EmcUtil
StEEmcPool
StTofPool
StRichPool
Sti.*

This form will include in the CPPPATH every sub-directory found one level below. Only macros/, examples/ and doc/ are excluded within this form, which is the kind noted as (a) [see footnote 1]. For the Pool directories, the extra rule mentioned here is additive to the rules for Pool directories.
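
To check which of the patterned exceptions a given directory name falls under, the patterns listed above can be tried one by one. The sketch below uses std::regex and assumes each pattern must match the whole directory name; how cons actually anchors its patterns is not stated here, so treat that as an assumption. BuildException and matchingExceptions() are hypothetical names, and the effect strings are shortened paraphrases of the descriptions above.

```cpp
#include <regex>
#include <string>
#include <vector>

struct BuildException {
  const char* pattern;  // directory name pattern as listed above
  const char* effect;   // short paraphrase of the special treatment
};

// Returns the effects of every exception whose pattern matches dir entirely.
std::vector<std::string> matchingExceptions(const std::string& dir) {
  static const BuildException kExceptions[] = {
    {"StEventDisplay.*", "Qt include path from QTDIR, moc, -DR__QT"},
    {"StDbLib", "MySQL includes, linked against MySQL"},
    {"StDbBroker", "MySQL includes, linked against MySQL"},
    {"St.*Db.*", "MySQL include path added to CPPPATH"},
    {"StTrsMaker", "own include/ directory added to the include path"},
    {"StRTSClient", "own include/ directory added to the include path"},
    {"StHbtMaker", "pre-defined list of sub-directories added to CPPPATH"},
    {"StAssociationMaker", "sub-directories one level below added to CPPPATH"},
    {"StMuDSTMaker", "sub-directories one level below added to CPPPATH"},
    {".*EmcUtil", "sub-directories one level below added to CPPPATH"},
    {"StEEmcPool", "sub-directories one level below added to CPPPATH"},
    {"StTofPool", "sub-directories one level below added to CPPPATH"},
    {"StRichPool", "sub-directories one level below added to CPPPATH"},
    {"Sti.*", "sub-directories one level below added to CPPPATH"},
  };
  std::vector<std::string> effects;
  for (const BuildException& e : kExceptions) {
    if (std::regex_match(dir, std::regex(e.pattern))) {
      effects.push_back(e.effect);
    }
  }
  return effects;
}
```

Note that a name may match more than one pattern: under the full-match assumption, StDbLib matches both its own entry and St.*Db.*.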



Direct comments to the STARSOFT list.


[1] However, if the compilation of StRoot/StXXX sub-directories needs to include every available sub-path (other than the exceptions noted above) as a list of default paths in a compiler option (a), or if you want a default include/ directory to always be added to the default include path compiler option (b), you may request this feature to be enabled. To do that, send an email to STARSOFT.

[2] In this form, the sub-directory MUST be self-sufficient, i.e. all code and includes (apart from the default paths) must be in the sub-directory StZZZ.

 

Class and method naming good advice

Ottinger's Rules for Variable and Class Naming


When a new developer joins a project which is already in progress, there is a steep learning curve. If the new developer already knows the methodology and programming language, some of this is reduced. If the new developer already knows the problem domain fairly well, this also shortens the ramp-up time.

There is often a great deal of artificial curve which is added to a project by decree or by accident.

The goal of this rule set is to help avoid creating one type of artificial learning curve, that of deciphering or memorizing strange names.

The rules were developed in group discussions, largely by examining poor names and dissecting them to determine the cause of their 'badness'.

  1. Use Pronounceable names:
    If you can't pronounce it, you can't discuss it without sounding like an idiot. "Well, over here on the bee cee arr three cee enn tee we have a pee ess zee kew int, see?"
    A company I know has genymdhms (generated date, year, month, day, hour, minute and second) so they walked around saying "gen why emm dee aich emm ess". I have an annoying habit of pronouncing everything as-written, so I started saying "gen-yah-mudda-hims". It later was being called this by a host of designers and analysts, and we still sounded silly. But we were in on the joke, so it was fun.
    Don't do that. It would have been so much better if it had been called 'timestamp' or something.

  2. Avoid Encodings:
    Encoded names require deciphering. This is true for Hungarian and other 'type-encoded' or otherwise encoded variable names. Besides, encoded names are seldom pronounceable (#1).
    When you worked in name-length-challenged programs, you probably violated this rule with impunity and regret. Fortran forced it by basing type on the first letter, making the first letter a 'code' for the type. Hungarian has taken this to a whole new level.
    We've all seen bizarre encoded naming standards for files, producing (real name) cccoproi.sc and SRD2T3. This is an artificially-created naming standard in the modern world of long filenames, though it had its time.
    Of course, you can get used to anything, but why create an artificial learning curve for new hires? Avoid this if you can avoid it.

  3. Don't be too cute:
    If the names are too clever, they will be memorable only to people who share your sense of humor and remember the joke.

  4. Most meanings have multiple words. Pick ONE:
    Pick one word and stick with it. For instance, it's confusing to have 'fetch', 'retrieve' and 'get' as members of the same class.
    How do you choose?
    Likewise, it's confusing to have a 'controller' and a 'manager' and a 'driver' in the same process.
    The names are synonyms, making you think the variables are the same, but there are several with mechanically different names, which leads you to think that they're not the same. You don't know what's different, or whether there is a difference.
    Instead of using different words, use words which describe the different uses or aspects of things (be specific).

  5. Most words have multiple meanings:
    Don't use the same word for two purposes, if you can at all avoid it.
    Remember that it's not polite at all to have the same name in two scopes.

  6. Nouns and Verb Phrases
    Classes and objects should have noun or noun phrase names.
    There are some methods (commonly called "accessors") which calculate and/or return a value. These can and probably should have noun names. This way accessing a person's first name can read like:
    string x = person.name();
    Other methods (sometimes called "manipulators", but not so commonly anymore) cause something to happen. These should have verb or verb-phrase names. This way, changing a name would read like:
    fred.changeNameTo("mike") 
  7. Use Solution Domain Names:
    Use CS terms, algorithm names, pattern names, math terms, etc ...

    Yeah, it's a bit heretical, but you don't want your developers having to run back and forth to the customer asking what every name means *if* they already know the concept by a different name.

    We're talking about code here, so you're more likely to have your code maintained by a CS major or informed programmer than by a domain expert with no programming background. End users of a system very seldom read the code, but the maintainers have to.

     

  8. Also Use Problem Domain Names:
    When there is no 'programmer-ese' for what you're doing, use the name from the problem domain. At least the programmer who maintains your code *can* ask his boss what it means. In analysis, this becomes the superior rule over "Use Solution Domain Names", because the end-user is the target audience.

     

  9. Avoid Mental Mapping:
    Readers shouldn't have to mentally translate your names into other names they already know.

     

  10. Nothing is intuitive:
    Sadly, and in contradiction to the above, all names require some mental mapping, since this is the nature of language.
    If you use a term which might not be known to your audience, you must map it to the concept you'd like it to represent.

    For this reason, most important names should be in a glossary or should be explained in comments at least.

     

  11. Avoid Disinformation:
    Avoid words which already mean something else. For example, "hp", "aix", and "sco" would be horrible variable names because they are the names of Unix platforms or variants. Even if you are coding a hypotenuse and hp looks like a good abbreviation, it violates too many rules and also is disinformative.

    Likewise don't refer to a grouping of accounts as an AccountList unless it's actually a list. A list means something to CS people. It denotes a certain type of data structure. If the container isn't a list, you've disinformed the programmer who has to maintain your code. AccountGroup or BunchOfAccounts would have been better.

     

  12. Names are only Meaningful in Context:
    There are few names which are meaningful in and of themselves. Most, however, are not. Instead, you need to place names in context for your reader by enclosing them in classes, well-named functions, or comments.

    The term 'tree' needs some disambiguation, for example if the application is a forestry application. You may have syntax trees, red-black or b-trees, and also elms, oaks, and pines. The word 'tree' is a good word, and is not to be avoided, but it must be placed in context every place it is used.

     

  13. Don't add Artificial Context
    In an imaginary application called 'Gas Station Deluxe', it is a bad idea to prefix every class with 'GSD' if there is a chance that the class might later be used in 'Inventory Manager' (at which time the prefix becomes meaningless).

    Likewise, say you invented a 'Mailing Address' class in GSD's accounting module, and you named it AccountAddress.
    Later, you need a mailing address for your customers. Do you use 'AccountAddress'?

    In both these cases, the naming reveals an earlier short-sightedness regarding reuse. It shows that there was a failing at the design level to look for common classes across an application.

    The names 'accountAddress' and 'customerAddress' are fine names for instances of the class.

     

  14. No Disambiguation without Differentiation
    The first part of this is to avoid "noise words" in your variable names.

    Imagine that you have a Product class. If you have another called ProductInfo or ProductData, you have failed to make the names different. Info and Data are like "stuff": basically meaningless. Likewise, using the words Class or Object in an OO system is so much noise.

    In short, sometimes people disambiguate for the compiler without differentiating for the reader.

    MoneyAmount is no better than 'money'. CustomerInfo is no better than Customer.
    The word 'variable' should never appear in a variable name.
    The word 'table' should never appear in a table name.
    How is NameString better than Name? Would a Name ever be a floating point number? Probably not. If so, it breaks an earlier rule about disinformation.
    There is an application I know of where this is illustrated. I've changed the name of the thing we're getting to protect the guilty:

     

             getSomething();
             getSomethings();
             getSomethingInfo();


    The second tells you there are many of these things. The first lets you know you'll get one, but which? The third tells you nothing more than the first, but the compiler (and hopefully the author) can tell them apart. You are going to have to work harder.
    If you have a name used twice, you must disambiguate in such a way that the reader knows what the different versions offer her, instead of merely that they're different.
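
As an illustration of rule 6 above, here is a small class whose accessor has a noun name and whose manipulator has a verb-phrase name. The Person class is a hypothetical example built around the two calls quoted in that rule; it is a sketch, not code from any STAR package.

```cpp
#include <string>
#include <utility>

class Person {
public:
  explicit Person(std::string name) : name_(std::move(name)) {}

  // Accessor: returns a value, so it gets a noun name.
  const std::string& name() const { return name_; }

  // Manipulator: makes something happen, so it gets a verb-phrase name.
  void changeNameTo(const std::string& newName) { name_ = newName; }

private:
  std::string name_;
};
```

With this class, person.name() reads as a noun and fred.changeNameTo("mike") reads as an action, which is the distinction the rule is after.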

The hardest thing about choosing good names is that it requires good descriptive skills and a shared cultural background. This is a teaching issue, rather than a technical, business, or management issue. As a result many people in this field don't do it very well.