Class and method naming good advice

Updated on Sat, 2007-08-18 18:29 by testadmin. Originally created by jeromel on 2007-08-18 17:46. Under:

computing

Ottinger's Rules for Variable and Class Naming

When a new developer joins a project which is already in progress, there is a steep learning curve. If the new developer already knows the methodology and programming language, some of this is reduced. If the new developer already knows the problem domain fairly well, this also shortens the ramp-up time.

There is often a great deal of artificial curve which is added to a project by decree or by accident.

The goal of this rule set is to help avoid creating one type of artifical learning curve, that of decyphering or memorizing strange names.

The rules were developed in group discussions, largely by examining poor names and dissecting them to determine the cause of their 'badness'.

Use Pronouncable names:
If you can't pronounce it, you can't discuss it without sounding like an idiot. "Well, over here on the bee cee arr three cee enn tee we have a pee ess zee kew int, see?"
I company I know has genymdhms (generated date, year, month day, hour, minute and second) so they walked around saying "gen why emm dee aich emm ess". I have an annoying habit of pronouncing everything as-written, so I started saying "gen-yah-mudda-hims". It later was being called this by a host of designers and analysts, and we still sounded silly. But we were in on the joke, so it was fun.
Don't do that. It would have been so much better if it had been called 'timestamp' or something.
Avoid Encodings:
Encoded names require decyphering. This is true for hungarian and other 'type-encoded' or otherwise encoded variable names. Besides, encoded names are seldom pronouncable (#1).
When you worked in name-length-challenged programs, you probably violated this rule with impunity and regret. Fortran forced it by basing type on the first letter, making the first letter a 'code' for the type. Hungarian has taken this to a whole new level.
We've all seen bizarre encoded naming standards for files, producing (real name) cccoproi.sc and SRD2T3. This is an artificially-created naming standard in the modern world of long filenames, though it had it's time.
Of course, you can get used to anything, but why create an artificial learning curve for new hires? Avoid this if you can avoid it.
Don't be too cute:
If the names are too clever, they will be memorable only to people who share your sense of humor and remember the joke.
Most meanings have multiple words. Pick ONE:
Pick one word and stick with it. For instance, it's confusing to have 'fetch', 'retrieve' and 'get' as members of the same class.
How do you choose?
Likewise, it's confusing to have a 'controller' and a 'manager' and a 'driver' in the same process.
The names are synonyms, making you thing the variables are the same, but there are several with mechanically different names, which leads you to think that they're not the same. You don't know what's different, or whether there is a difference.
Instead of using different words, use words which describe different the different use or aspects of things (be specific).
Most words have multiple meanings:
Don't use the same word for two purposes, if you can at all avoid it.
Remember that it's not polite at all to have the same name in two scopes.
Nouns and Verb Phrases
Classes and objects should have noun or noun phrase names.
There are some methods (commonly called "accessors") which calculate and/or return a value. These can and probably should have noun names. This way accessing a person's first name can read like:
```
string x = person.name();
```
Other methods (sometimes called "manipulators", but not so commonly anymore) cause something to happen. These should have verb or verb-phrase names. This way, changing a name would read like:
```
fred.changeNameTo("mike") 
```
Use Solution Domain Names:
Use CS terms, algorithm names, pattern names, math terms, etc ...

Yeah, it's a bit heretical, but you don't want your developers having to run back and forth to the customer asking what every
name means *if* they already know the concept by a different name.

We're talking about code here, so you're more likely to have your code maintained by a CS major or informed programmer than by a domain expert with no programming background. End users of a system very seldom read the code, but the maintainers have to.
Also Use Problem Domain Names:
When there is no 'programmer-ese' for what you're doing, use the name from the problem domain. At least the programmer who maintains your code *can* ask his boss what it means. In analysis, this becomes the superior rule over "Use Solution Domain Names", because the end-user is the target audience.
Avoid Mental Mapping:
Readers shouldn't have to mentally translate your names into other names they already know.
Nothing is intuitive:
Sadly, and in contradiction to the above, all names require some mental mapping, since this is the nature of language.
If you use a term which might not be known to your audience, you must map it to the concept you'd like it to represent.

For this reason, most important names should be in a glossary or should be explained in comments at least.
Avoid Disinformation:
Avoid words which already mean something else. For example, "hp", "aix", and "sco" would be horrible variable names because
they are the names of Unix platforms or variants. Even if you are coding a hypotenuse and hp looks like a good abbreviation,it violates too many rules and also is disinformative.

Likewise don't refer to a grouping of accounts as an AccountList unless it's actually a list. A list means something to CS people. It denotes a certain type of data structure. If the container isn't a list, you've disinformed the programmer who has to maintain your code. AccountGroup or BunchOfAccounts would have been better.
Names are only Meaningful in Context:
There are few names which are meaningful in and of themselves. Most, however are not. Instead, you need to place names in
context for your reader by enclosing them in classes, well-named functions, or comments.

The term 'tree' needs some disambiguation, for example if the application is a forestry application. You may have syntax trees, red-black or b-trees, and also elms, oaks, and pines. The word 'tree' is a good word, and is not to be avoided, but it must be
placed in context every place it is used.
Don't add Artificial Context
In an imaginary application called 'Gas Station Deluxe', it is a bad idea to prefix every class with 'GSD' if there is a chance that the class might later be used in 'Inventory Manager' (at which time the prefix becomes meaningless).

Likewise, say you invented a 'Mailing Address' class in GSD's accounting module, and you named it AccountAddress.
Later, you need a mailing address for your customers. Do you use 'AccountAddress'?

In both these cases, the naming reveals an earlier short-sightedness regarding reuse. It shows that there was a failing at the design level to look for common classes across an application.

The names 'accountAddress' and 'customerAddress' are fine names for instances of the class.
No Disambiguation without Differentiation
The first part of this is to avoid "noise words" in your variable names.

Imagine that you have a Product class. If you have another called ProductInfo or ProductData, you have failed to make the names different. Info and Data are like "stuff": basically meaningless. Likewise, using the words Class or Object in an OO system is so much noise.

In short, sometimes people disambiguate for the compiler without differentiating for the reader.

MoneyAmount is no better than 'money'. CustomerInfo is no better than Customer.
The word 'variable' should never appear in a variable name.
The word 'table' should never appear in a table name.
How is NameString better than Name? Would a Name ever be a floating point number? Probably not. If so, it breaks an earlier rule about disinformation.
There is an application I know of where this is illustrated. I've changed the name of the thing we're getting to protect the guilty:
```
         getSomething();
```
```
         getSomethings();
```
```
         getSomethingInfo();
```
The second tells you there are many of these things. The first lets you know you'll get one, but which? The third tells you nothing more than the first, but the compiler (and hopefully the author) can tell them apart. You are going to have to work harder.
If you have a name used twice, you must disambiguate in such a way that the reader knows what the different versions offer her, instead of merely that they're different.

The hardest thing about choosing good names is that it requires good descriptive skills and a shared cultural background. This is a teaching issue, rather than a technical, business, or management issue. As a result many people in this field don't do it very well.

Printer-friendly version
Login or register to post comments

The STAR experiment

Software & Computing

User login

Navigation

Class and method naming good advice

Ottinger's Rules for Variable and Class Naming