Things get so much more complicated when your development team grows larger than one person. You have to worry not only about coordinating the efforts of two or more people, but also about agreeing on data structures and naming conventions, designing interfaces so that you insulate developers as much as possible from others’ work, and a whole host of other problems. A source code control system relieves a large part of the worry, although you should be using source code control as soon as your development team grows larger than zero people. I won’t belabor that point here, as it’s the subject for a rather long post in itself.
One of the many things programmers like to argue about is code formatting. This topic is right up there with the discussion of the best language and best text editor. And as is the case with those topics in which each programmer has his own opinion that is unquestionably right, each programmer has his own code formatting style that is The One True Way. Anybody who does things differently is, at best, suspect.
But most development shops impose code formatting standards along with naming conventions. Why? In order to reduce confusion when somebody other than the code’s original author works on it. Without a code formatting standard, confusion arises in several ways.
The format of the code tells you things about it. For example, indentation implies a compound statement. In many languages (the C-derived languages), an opening brace begins a compound statement, which is normally indented. As creatures of habit, when we see indentation we expect to see an opening brace, and vice versa. If the brace is not where we expect it, or the indentation is different from what we’re accustomed to, then our brains have to adjust. This adjustment takes longer than you might expect. For example, consider these three blocks of code, all of which do the same thing but are formatted slightly differently:
    if (someValue == 0) {
        DoSomething();
        DoSomethingElse();
    }

    if (someValue == 0)
    {
        DoSomething();
        DoSomethingElse();
    }

    if (someValue == 0)
        {
        DoSomething();
        DoSomethingElse();
        }
Each of those styles has its adherents and detractors, and objectively each is as valid as the others. But two of them make my brain hurt.
Most programmers these days use editors that perform some level of automatic formatting when you enter new code. If my editor is configured to create things in the One True Way, the result of editing a file that was created with one of the heretical formatting techniques is a complete mess.
Absent imposed coding standards, there are four possible solutions to the problem, none of which is satisfactory:
- Just ignore the problem. Believe me when I say that you don’t want to do this. Few things are as confusing as working on a single file that has inconsistent formatting.
- Disable the automatic formatting features of the editor. This is possible, but makes you much less productive. The automatic formatting features save a lot of time, especially when you’re refactoring code, because they spare you from manually indenting or unindenting code to make sure that the braces match up.
- Reconfigure your editor to match the coding style of whichever file you’re currently working on. This is problematic because programmers often (probably most often) are working on multiple files at the same time. No editor that I know of will allow you to specify the configuration on a per-file basis. (On a file type basis, yes. But not per file.)
- Instruct the editor to reformat code to your style whenever you open a file. Although this sounds like a good idea at first, it’s disastrous when source control is involved, and as I pointed out above, source control should be involved in every project. Source control typically saves the deltas (the changes between two versions of a file) rather than saving the entire file every time it’s modified. This saves a huge amount of space, and also allows you to view the history to see the specific changes between two versions. If you reformat the entire file, then the source control system will assume that (almost) every line changed. The result is that your source control database will be far larger than it has to be, and the significant changes between versions will be lost in the reformatting noise.
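To make the reformatting noise in that last option concrete, here is a minimal Python sketch that uses the standard library's difflib as a stand-in for a source control system's line-based comparison. The three snippets and the `changed_lines` helper are made up for illustration, and a real system computes its deltas differently, but the effect should be similar.

```python
import difflib

# Hypothetical "before" version, with each brace on its own line.
ORIGINAL = """\
if (someValue == 0)
{
    DoSomething();
    DoSomethingElse();
}
"""

# Same formatting, one real change: a call was added.
EDITED = """\
if (someValue == 0)
{
    DoSomething();
    DoSomethingElse();
    DoOneMoreThing();
}
"""

# The same real change, but the editor also reformatted the block.
EDITED_AND_REFORMATTED = """\
if (someValue == 0) {
  DoSomething();
  DoSomethingElse();
  DoOneMoreThing();
}
"""


def changed_lines(old: str, new: str) -> int:
    """Count the lines a line-based diff reports as added or removed."""
    diff = difflib.ndiff(old.splitlines(), new.splitlines())
    # "+ " marks an added line and "- " a removed one; "? " hint lines
    # and unchanged lines are ignored.
    return sum(1 for line in diff if line.startswith(("+ ", "- ")))


if __name__ == "__main__":
    print("edit only:        ", changed_lines(ORIGINAL, EDITED))
    print("edit and reformat:", changed_lines(ORIGINAL, EDITED_AND_REFORMATTED))
```

With these inputs the plain edit shows up as one changed line, while the reformatted version shows up as eight, and the one line that actually matters is buried among them.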
There is one other option that is not currently available, but should be. If the source control system (or a filter in front of the source control system) could normalize a file–convert it to a standard form–then the fourth option above would be possible. You could get a file from source control in whatever the normalized form is, open it in your editor and reformat to your heart’s content. When you saved the file and checked it in, the source control system would reformat it to the common form and store only the deltas. This allows each programmer to view and edit code in whatever format he is most comfortable with, but also allows the source control system to be efficient and effective.
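As a rough illustration of what such a normalizing step might look like, here is a toy Python sketch. The `normalize` function and its rules are my own assumptions for the sake of the example: it only understands braces, and it ignores strings, comments, brace-less bodies, and blank lines. A real filter would need a proper formatter for each language.

```python
def normalize(source: str, indent: str = "    ") -> str:
    """Rewrite brace-delimited code into one canonical layout.

    Toy rules (assumptions, not a real formatter): an opening brace is
    moved to the end of the preceding line, indentation follows brace
    depth, and blank lines are dropped.
    """
    # Pass 1: discard existing indentation and attach any lone "{"
    # to the statement that precedes it.
    joined = []
    for raw in source.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line == "{" and joined:
            joined[-1] += " {"
        else:
            joined.append(line)

    # Pass 2: re-indent every line according to brace depth.
    out = []
    depth = 0
    for line in joined:
        if line.startswith("}"):
            depth = max(depth - 1, 0)
        out.append(indent * depth + line)
        if line.endswith("{"):
            depth += 1
    return "\n".join(out) + "\n"


# Three layouts of the same hypothetical statement.
KR = "if (someValue == 0) {\n    DoSomething();\n    DoSomethingElse();\n}\n"
ALLMAN = "if (someValue == 0)\n{\n    DoSomething();\n    DoSomethingElse();\n}\n"
WHITESMITHS = "if (someValue == 0)\n    {\n    DoSomething();\n    DoSomethingElse();\n    }\n"

if __name__ == "__main__":
    # Every layout collapses to the same stored form, so a check-in that
    # only changes formatting produces no delta at all.
    print(normalize(KR) == normalize(ALLMAN) == normalize(WHITESMITHS))
```

Where a function like this would run depends on the source control system. In Git, for instance, a clean filter declared through .gitattributes is one place a script along these lines could be hooked in.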
I realize that I’m glossing over some implementation details, particularly the many filters that would be required to format different kinds of files. But nothing here sounds terribly difficult with today’s tools. Has this been done? If not, why not?