Software Refactoring: Avoid Chained if ... else if .. Constructs, Part 3

Use continue and return to highlight where special cases have been fully handled.

In the previous post, we showed that if a sequence of conditionals are written such that they are mutually independent (not necessarily mutually exclusive), then the "else" keywords are optional. It is also the case that the clauses can be reordered arbitrarily (assuming that the semantics of the body of each "if" clause is also order-independent). If the reordering proves to be legal, it can enable other simplifications in the code. But more importantly, this reordering can be used to emphasize the most important case by weeding out and handling special cases briefly (at the head of a loop or function.

A large amount of the code I deal with is devoted to handling trivial or special cases and filtering out cases that are not of interest. For these cases, I like to use statements that disrupt the control flow (break, return, continue, throw). Adherents to structured programming principles will consider this style distasteful, but there is a good reason to prefer it over the strict structured-programming style: readability. Marking the end of the special-case code with a continue or return statement tells the reader that the special case as been removed from further consideration. This summarization can be a significant aid in understanding the purpose of the routine as a whole.

Consider the following example:

typedef std::vector<T*> myVectorType;
myVectorType myVector;
...
// Traverse a vector that contains a mix of valid T pointers and null pointers.
for (myVectorType::iterator i = myVector.begin();
i != myVector.end();
++i)
{
T* ptr = *i;
if (ptr)
{
// A valid pointer.
<Do lots of stuff for several pages.>
}
else
{
// Oho! I'll bet you didn't expect this!
++nullPtrCount;
}
}

The problem here is that the if ... else (even without "else if" clauses) involves action-at-a-distance. To understand the case in which ptr is null, the reader has to scan ahead to find the matching else.

Recoding the two clauses as "if (ptr)" and "if (!ptr)" does not change the semantics of the snippet, and since these two clauses are independent, they can be reordered freely. Using the knowledge that they are mutually exclusive, we can place a "continue" at the end of the first clause. This leaves us with

// Traverse a vector that contains a mix of valid T pointers and null pointers.
for (myVectorType::iterator i = myVector.begin();
i != myVector.end();
++i)
{
T* ptr = *i;
if (!ptr)
{
++nullPtrCount;
continue;
}
if (ptr)
{
// A valid pointer.
<Do lots of stuff for several pages.>
}
}

Now the reader can see clearly that after nullPtrCount has been incremented, nothing more is done with that case. His attention is free to focus on the pages of code that follow, dealing with the interesting case. The coder can again use the knowledge that the two cases are mutually exclusive to remove the condition and brackets from the case where the pointer is non-null. Since this is the interesting case, it makes sense to promote it to the top level within the loop:

// Traverse a vector that contains a mix of valid T pointers and null pointers.
for (myVectorType::iterator i = myVector.begin();
i != myVector.end();
++i)
{
T* ptr = *i;
if (!ptr)
{
++nullPtrCount;
continue;
}
// A valid pointer.
<Do lots of stuff for several pages.>
}

This form is useful for dealing with other special cases. They may be weeded out at the top of the loop, leaving unchanged the fact (and representation) that the main purpose of the loop is dealing with the common case.

After the code is rendered relatively readable in this way, we may turn our attention to efficiency considerations. Though it may be the case that all special cases must be weeded out before reaching the common case, it may also be the case that the order in which they are treated is unimportant. Suppose also that the predicate for identifying one special case is relatively cheap while another is relatively expensive. In that case, it will obviously improve the efficiency of the code if the cheaper predicate is tested first. That is, one should follow the pattern:

for (coo::iterator i = c.begin(); i != c.end(); ++i)
{
if (cheapPredicate(i))
{
// handle special case 1.
continue;
}
if (expensivePredicate(i))
{
// handle special case 2.
continue;
}
// handle common case.
}

In the context function, the continue statements are replaced by return statements. Otherwise, the code looks much the same. To illustrate this, we factor out the body of the above loop as a helper function.

for (coo::iterator i = c.begin(); i != c.end(); ++i)
coo_iter_helper(i);
void coo_iter_help(coo::iterator i)
{
if (cheapPredicate(i))
{
// handle special case 1.
return;
}
if (expensivePredicate(i))
{
// handle special case 2.
return;
}
// handle common case.
}

Software Refactoring

Monday, August 25, 2014

Avoid Chained if ... else if .. Constructs, Part 3

No comments:

Post a Comment