Software Refactoring: Comments

Comment everything

This may seem like a no-brainer: adding comments to code makes it more maintainable. Maintainable code is, above all, code that communicates its intent clearly, and for that purpose comments are indispensable.

Regardless of how well organized the code is, and how self-documenting the names of its variables and functions are, some information will always be missing from the code itself. A mathematical paper that contained only equations and no descriptive text might be brilliant, but it would also be incomprehensible to anyone but the original author, and therefore worthless. So it is with code.

Comments go beyond what can be expressed in the code: they express in natural language the intended usage, inputs and outputs of a function; they provide a conceptual description of what is represented by a data structure or variable; they provide a step-by-step description of the algorithm being implemented; and they provide the means to verify whether a given segment of code is correct.

Comments from Nothing

When you are faced with the task of analyzing and comprehending legacy code, the exercise of adding comments can help you focus on and extract the original intent. The comments you add are also testable hypotheses: each one makes a claim about what the code is supposed to do, according to your current understanding. Even an erroneous comment is better than no comment, since inscrutable code is all the more likely to hide errors, and a claim that can be checked is at least a starting point. If the claim turns out to be untrue, either the code is wrong and must be modified, or your understanding was wrong and you have learned something. In the latter case, the corrected comment will reflect the fruits of your investigation, and the code is ultimately more valuable.
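As a minimal sketch of a comment used as a hypothesis, consider the C++ fragment below. The function `f` and its body are hypothetical stand-ins for an undocumented legacy routine; the comment records what a reader believes the code does, phrased so that it can be checked.

```cpp
#include <vector>

// HYPOTHESIS (inferred from reading the code, not confirmed by the
// original author): returns the index of the first negative element
// of v, or -1 if every element is non-negative.
int f(const std::vector<int>& v) {
    int i = 0;
    for (int x : v) {
        if (x < 0) return i;
        ++i;
    }
    return -1;
}
```

A few calls with known inputs are enough to test the claim. If one of them disagrees with the comment, either the code or the comment has to change, and either outcome leaves the code better understood.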

Commenting the Implementation

When starting with a completely undocumented piece of code, you must usually work from the bottom up, documenting small segments of code first and working your way up to the function level. While doing so, it is often useful to apply other refactoring techniques, such as breaking up long functions and choosing mnemonic variable names.

When commenting the implementation, you should focus on the effect of a block of code, not on its structure. In this way, you retell what is being said by the code but in a way that is more abstract and thus easier to comprehend. Do not waste comment space restating the obvious: your job is to interpret, not parrot. For example, the comment "return zero" provides no interpretation, while "return success" does.
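The difference between parroting and interpreting can be sketched in a few lines of C++. The function `add_user` and the status codes `kSuccess` and `kErrorEmptyName` are hypothetical names invented for this illustration.

```cpp
#include <string>
#include <vector>

// Hypothetical status codes, introduced only for this example.
constexpr int kSuccess = 0;
constexpr int kErrorEmptyName = 1;

int add_user(std::vector<std::string>& users, const std::string& name) {
    // Parroting (restates the code): "if name is empty, return one"
    // Interpreting: anonymous users are not allowed.
    if (name.empty()) {
        return kErrorEmptyName;
    }

    // Record the new user at the end of the roster.
    users.push_back(name);

    // Parroting: "return zero"
    // Interpreting: report success to the caller.
    return kSuccess;
}
```

The interpreting comments tell the reader why each branch exists, which the statements alone cannot do.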

Adding comments at this level may reveal errors in the implementation. For example, you might read a piece of code that appears to dispatch messages of certain types, returning an error for the types it does not handle. You write "Dispatch messages of type foo and return an error for all other message types." But you have already noticed that more message types now exist than the routine enumerates. You write "TODO: add cases for missing message types."
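The dispatch scenario might look roughly like the following C++ sketch. The message types, the `Result` codes, and the `log` parameter are all assumptions made up for illustration; the point is only where the two comments go.

```cpp
#include <string>

// Hypothetical message types and result codes for this example.
enum class MessageType { Ping, Data, Shutdown, Resize };
enum class Result { Ok, UnhandledType };

// Dispatch messages of type Ping and Data, and return an error for
// all other message types.
// TODO: add cases for missing message types (Shutdown, Resize).
Result dispatch(MessageType type, std::string& log) {
    switch (type) {
        case MessageType::Ping:
            log += "pong;";
            return Result::Ok;
        case MessageType::Data:
            log += "data;";
            return Result::Ok;
        default:
            // Anything not listed above is rejected without side effects.
            return Result::UnhandledType;
    }
}
```

Writing the first comment forces you to list what is handled, and the gap between that list and the full enumeration is what prompts the TODO.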

Commenting the Interface

When the behavior of a function can be comprehended as a whole, you can add a block comment to document its interface. This should include a description of the inputs and outputs, as well as the expected behavior of the function.

The inputs and outputs should be fairly obvious from the signature. However, even when you comment each argument individually, the signature leaves much room for interpretation. For example, a pointer or reference that is only read within the function could be marked "const". Even if you can't immediately change the interface, you can make a note of that in your description. If the function checks that input values lie in a certain range, or makes other assumptions about them, state these as well.

The overall description can include potential error conditions, and list desirable and undesirable interactions between the operands. If exceptions are used, it should list at least the exceptions generated in the function itself, as well as important exceptions that may be generated in subordinate functions and propagated to the caller (if known).
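An interface comment along these lines might look like the C++ sketch below. The function `mean_of_first` is a hypothetical example, and the layout of the block comment (Inputs, Output, Errors) is one possible convention rather than a prescribed one.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Computes the arithmetic mean of the first `count` samples.
//
// Inputs:
//   samples - readings to average; only read, never modified
//             (hence the const reference).
//   count   - number of leading elements to use; must satisfy
//             1 <= count <= samples.size().
// Output:
//   The mean of samples[0] .. samples[count - 1].
// Errors:
//   Throws std::invalid_argument if count is 0 or exceeds
//   samples.size(). No subordinate function is expected to throw.
double mean_of_first(const std::vector<double>& samples, std::size_t count) {
    if (count == 0 || count > samples.size()) {
        throw std::invalid_argument("count out of range");
    }
    double sum = 0.0;
    for (std::size_t i = 0; i < count; ++i) {
        sum += samples[i];
    }
    return sum / static_cast<double>(count);
}
```

Everything a caller needs (the valid range of `count`, the fact that `samples` is untouched, and the one exception that can escape) is stated without requiring a look at the body.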

Commenting at this level, you can already say something about the strengths and weaknesses of the function. You may have identified assumptions on which the function is relying. Write these down! Also, if you have noticed potential performance pitfalls, write those down as well. For example, "This function uses a quadratic search algorithm, so it will become very slow if the list L is allowed to grow past 1000 elements or so."
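Assumptions and performance caveats can sit in the same block comment. The C++ sketch below uses a hypothetical `has_duplicate` function whose nested loops make it quadratic; the size threshold in the comment simply echoes the example wording above and is not a measured figure.

```cpp
#include <cstddef>
#include <vector>

// Returns true if any value occurs more than once in `items`.
//
// Assumption: `items` is short. Callers are expected to pass a
// handful of entries, not a bulk data set.
// Performance: compares every pair of elements, so the running time
// is quadratic in items.size(). It will become very slow if the list
// is allowed to grow past 1000 elements or so.
bool has_duplicate(const std::vector<int>& items) {
    for (std::size_t i = 0; i < items.size(); ++i) {
        for (std::size_t j = i + 1; j < items.size(); ++j) {
            if (items[i] == items[j]) {
                return true;
            }
        }
    }
    return false;
}
```

A later maintainer who needs to pass a large list learns from the comment, rather than from a profiler, that this function is the place to look.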

Commenting on the Intended Usage

Commenting on the intended usage of a function goes beyond the boundaries of the function itself. It requires broader comprehension regarding the purpose of the function within the program. This kind of understanding is especially valuable to someone trying to add new features or identify the cause of a bug. If such high-level information does not fit reasonably within the code itself, it should still be recorded in a separate file (a README or a design document).

How Comments Help

As you add comments to the program, you will build up a more abstract view of the functions, the data structures, and their interactions. Whereas the behavior of a leaf function may be easy to discern, doing the same for a function that calls others also requires referencing (or recalling) those subordinate functions. After working through a half-dozen or so, it becomes easy to forget the overall effect of the main function. Moreover, referencing and understanding each of the subordinate functions in turn consumes a lot of time.

When someone has taken the trouble to write a high-level description of the function, one can often gain all of the required information there. No time at all need be spent referencing the subordinate functions. The abstract description conveys a view of the structure of the program and the behavior of a function at an appropriate level of abstraction. The conceptual model it conveys saves readers from having to digest the code and form a similar conceptual model on their own.

The importance of comments cannot be stressed enough, and yet I continue to see uncommented code written, delivered and accepted. Add comments wherever you go, and insist on the same level of commentary from others.

Software Refactoring

Sunday, August 31, 2014