Software Refactoring: 2014

Monday, September 29, 2014

Establishing and Preserving Invariants

State clearly the invariants associated with a given type.
Make sure that these invariants are preserved.
If necessary, split a type with conflicting uses, so the invariants of each can be described precisely.

Invariants are the properties you ascribe to an object that should be true throughout the lifetime of that object. Often, these are structural assumptions that can be used to traverse a data structure. For example, in a tree structure, every child might have a parent pointer that is expected to point to its parent. This is easily checked.

class TreeNode{ TreeNode* parent;
TreeNode* first_child; TreeNode* next; public: bool isOK() { // Check the structure at this node.
for (TreeNode* i = first_child; i; i = i->next) { if (i->parent != this) return false; if (! i->isOK()) return false; } return true; }};

Implicit in the above example is that the list of children at a given level are represented as a linked list -- linked through the next pointer. The last child in the list will have a next pointer that is NULL. Also implied is that if a node has no children, its first_child pointer will be NULL.

Other invariants are possible. Rather than letting the parent field of the root node contain random junk, one might impose the invariant that it be NULL. That way a traversal from a leaf up to the root will know when to stop.

bool TreeNode::isRoot() { return parent == NULL; }

The designer of a class should state the class invariants clearly and ensure that they are maintained. They must be established by (each of) the constructors, and left undisturbed or re-established by any operation upon an instance of the class.

In this example, the TreeNode type does double-duty, representing not only a node in the tree, but also the subtree rooted at that point (along with any siblings to the right). This makes it obvious that isRoot() can return true for all the siblings at the top level, so the way the tree is defined, there is not a unique definition for "the root". In the interior of the tree, it is possible to enumerate all of ones siblings by getting one's parent, then first_child and then traversing the list following next points. At the top level, this is not possible.

To fix the problem, one can establish a new invariant -- that the root node has no siblings. Then we would have an additional check in isOK():

bool TreeNode::isOK(){ if (isRoot()) if (next) return false; // Check the structure at this node, as before. ...}

It seems a bit wasteful to perform this extra check at every node in the tree, just to make sure that the root node is OK. Another approach would be to create a special type to represent the root node, inheriting from the more general TreeNode type. That way, the RootNode will have all of the properties of a TreeNode, and also the special property that its parent and next pointers are null.

class RootNode : public TreeNode{ public: bool isOK() { if (parent || next) return false; return TreeNode::isOK(); // Explicitly call parent class version. }}

So. What does all this have to do with refactoring?

It is quite often the case that the invariants associated with a type defined in legacy code are not clearly stated. This may be because the original author did not understand the issues involved or he simply didn't take the time to document them. In any case, as capabilities were added and the purpose of the type migrated over time, the original invariants were not preserved. That makes the class a fertile area for bug-creation.

Suppose, in our toy example, that one of the capabilities added later was to insert a new node after a given sibling. The code looks like this:

class TreeNode::insertAfter(TreeNode*& new){ assert(new->next == NULL); new->next = this->next; assert(new->parent == NULL); new->parent = this->parent; this->next = new; new = NULL; // Steal this pointer.
}

That all works fine, but if this happens to be the root node, we have unwittingly violated the invariant that the root node has no siblings. If some other part of the code depends on the root node being the solitary sibling at the top level, that code may break in mysterious ways. Had the invariant been clearly stated and religiously adhered to, the bug could never have crept in.

Often, the lack of clarity surrounding class invariants will lead to a type being used in an overloaded way. It expresses one set of invariants in one usage and a different set in another. Client code compensates for this by inserting predicates to sort out which set of invariants is being expressed in a given context. Not only does this make the client code very ugly, it is an open invitation for bugs.

A ready example can be found in a bug I recently isolated in the Chapel code. The problem was that (with optimizations turned on), the C compiler was complaining that there was an attempt to deallocate a statically-allocated object. Here is the failing procedure:

inline proc _cast(type t, x) where t == string && x.type != c_string { const cs = _cast(c_string, x); const ret = toString(cs); if !isBoolType(x.type) && !isEnumType(x.type) then chpl_free_c_string(cs); return ret;}

The fault occurred in the call to chpl_free_c_string(). The variable cs -- initialized with the return value from the nested call to _cast() -- is inferred to have type c_string. A case in which the c_string was statically allocated was reaching the call to chpl_free_c_strring().

The root cause goes back to the early implementation of the run-time support for character strings in Chapel. In it, generally, low-level character strings (c_strings) were sometimes dynamically allocated and sometimes not. Memory management was not a concern, so the dynamically-allocated strings were leaked, the others being automatically reclaimed (if stack allocated) or never reclaimed (if statically-allocated). One might have chosen

Much later, calls (to chpl_free_c_string()) were added to reclaim some of the leaked memory. At that point in time, cases where the return value from _cast were known to be statically allocated were filtered out. Unfortunately, there were other overloads of _cast that returned a statically-allocated c_string and that were not enumerated in the cases where the call to chpl_free_c_string() was to be avoided.

This code has exactly the symptoms I mentioned earlier. The lack of an explicit invariant (regarding whether a c_string is heap-allocated or not) forces the client code to use predicates (here ! isBoolType() and ! isEnumType) to compensate. Not only is this ugly, but it's incomplete. That was the cause of the bug. A weak specification of the interface led to weak code.

How can we remedy the situation? One way would be to store in the class whether an instance contains a heap-allocated c_string. This would make the interface expectations explicit. A statically-allocated c_string that had its heap-allocated bit set to "true" would clearly be lying. Then it would be a matter of tracing back to see where the bit was set, and ensuring that that piece of code satisfied the new, stricter interface. All downstream code could assume that the bit was correctly set.

Code that was concerned about how a c_string was allocated would still have to test that bit, but that detail need not necessarily be exposed in client code. As long as c_string objects are manipulated only through the c_string interface, manipulations involving memory allocation can examine the bit and "do the right thing". Let us assume, for example, that toString() will make a copy of its operand if it is statically-allocated (or stack-allocated), and will otherwise "steal" its operand and set the input pointer to zero. We further assume that chpl_free_c_string() will accept a null pointer as its input, and in that case do nothing. Then the client code can be simplified to:

inline proc _cast(type t, x) where t == string && x.type != c_string { const cs = _cast(c_string, x); const ret = toString(cs); chpl_free_c_string(cs); return ret;}

The error-prone predicates are eliminated from the client code.

Another alternative is to use a distinct types to represent heap-allocated and non-heap-allocated c_string objects. In languages that support function overloading, this approach can be as elegant as hiding the allocation state within the object itself. In addition, where static function binding is possible, it can provide a slight performance advantage. The decision of whether to use the code path for heap-allocated vs. non-heap-allocated c_strings is made at compile time; the extra time required to extract and test the allocation flag in a c_string object is avoided. The downside of this approach is that it expands the size of the executable.

Suppose that we use the type c_string_copy to represent a heap-allocated c_string. For this case, the corrected client code would be

inline proc _cast(type t, x) where t == string && x.type != c_string_copy { const cs = _cast(c_string_copy, x); const ret = toString(cs); return ret;}

In this case, we can only have the nested call to _cast return one type (since we want calls to be statically bound). We choose to promote the input variable to the c_string_copy type. The overloaded version of toString() always steals its operand so the call to chpl_free_c_string() is no longer necessary. That makes the cast routine very simple indeed.

You will encounter much legacy code in which the interfaces are not clearly specified. In the cases where the invariants associated with a type are not clear, it may suffice to simply state those assumptions clearly and then make all of the methods affecting that type obey the new, stricter interface.

If it turns out (as in this case), that there are two or more conflicting uses of the same time, the options are to:

Hide those choices within the type itself
Split the type into two or more static types
Split the type into two or more types using inheritance

External considerations will determine the best choice.

Sunday, August 31, 2014

Comments

Comment everything

This may seem like a no-brainer: adding comments to code will make it more maintainable. Given that a main property of maintainable code is that it communicates well the intent of the code, comments are indispensable.

Regardless how well-organized the code is, and how self-documenting the names of variables and functions, there will always be some information missing from the code itself. A mathematical paper that contained only equations and no descriptive text might be brilliant, but it would also be incomprehensible to anyone but the original author -- and therefore worthless. So it is with code.

Comments go beyond what can be expressed in the code: they express in natural language the intended usage, inputs and outputs of a function; they provide a conceptual description of what is represented by a data structure or variable; they provide a step-by-step description of the algorithm being implemented; and they provide the means to verify whether a given segment of code is correct.

Comments from Nothing

When faced with the task of analyzing and comprehending legacy code, the exercise of adding comments may help you to focus on and extract the original intent. Moreover, the addition of comments provides testable hypotheses. An erroneous comment is better than no comment, since inscrutable code is proportionally more likely to be incorrect. Moreover, the comment makes a claim about what the code is supposed to do (according to your current understanding). If that turns out to be untrue, either the code must be modified, or you have learned something from the assertion. If the latter, the corrected comment will reflect the fruits of your investigation, and the code is ultimately more valuable.

Commenting the Implementation

When starting with a completely undocumented piece of code, you must usually work from the bottom up, documenting small segments of code first, and working your way up to the function level. While doing so, it is often useful to apply other factoring techniques, such as breaking up long functions and choosing mnemonic variable names.

When commenting the implementation, you should focus on the effect of a block of code, not on its structure. In this way, you retell what is being said by the code but in a way that is more abstract and thus easier to comprehend. Do not waste comment space restating the obvious: your job is to interpret, not parrot. For example, the comment "return zero" provides no interpretation, while "return success" does.

Adding comments at this level may reveal errors in implementation. For example, you might read a piece of code which appears to dispatch messages of a certain type, returning an error for the types not handled. You write "Dispatch messages of type foo and return an error for all other message types." But you have already noticed that there are now more unhandled messages than those enumerated in the routine. You write "TODO: add cases for missing message types."

Commenting the Interface

When the behavior of a function can be comprehended as a whole, you can add a block comment to document its interface. This should include a description of the inputs and outputs, as well as the expected behavior of the function.

The inputs and outputs should be fairly obvious from the signature. However, even when providing a comment on individual arguments, much room for interpretation is possible. A pointer or reference that is only read within the function could be marked as "const". Even if you can't immediately change the interface, you can make a note of that in your description. If the function tests that input values lie in a certain range, or makes other assumptions about them, these need to be stated as well.

The overall description can include potential error conditions, and list desirable and undesirable interactions between the operands. If exceptions are used, it should list at least the exceptions generated in the function itself as well as important exceptions that may be generated in subordinate functions and propagated to the call (if known).

Commenting at this level, you can already say something about the strengths and weaknesses of the function. You may have identified assumptions on which the function is relying. Write these down! Also, if you have noticed potential performance pitfalls, write those down as well. For example, "This function uses a quadratic search algorithm, so it will become very slow if the list L is allowed to grow past 1000 elements or so."

Commenting on the Intended Usage

Commenting on the intended usage of a function goes beyond the boundaries of the function itself. It requires broader comprehension regarding the purpose of the function within the program. This kind of understanding is especially valuable to someone trying to add new features or identify the cause of a bug. If such high-level information does not fit reasonably within the code itself, it should still be recorded in a separate file (a README or a design document).

How Comments Help

As you add comments to the program, you will build up a more abstract view of the functions, data structures and their interactions. Whereas it may be possible to easily discern the behavior of a leaf function, doing the same for a function that calls others requires also referencing (or recalling) those subordinate functions. After parsing out a half-dozen or so, it becomes easy to forget the overall effect of the main function. Moreover, it consumes a lot of time to reference and understand each of the subordinate functions in turn.

When someone has taken the trouble to write a high-level description of the function, one can often gain all of the required information there. No time at all need be spent referencing the subordinate functions. The abstract description conveys a view of the structure of the program and the behavior of a function at an appropriate level of abstraction. The conceptual model it conveys saves the reader from having to digest the code and form a similar conceptual model on his own.

The importance of comments cannot be stressed enough, and yet I continue to see uncommented code written, delivered and accepted. Add comments wherever you go, and insist on the same level of commentary from others.

Saturday, August 30, 2014

Refactoring for Correctness, Robustness and Maintainability

Being faced with the task of maintaining legacy code can be daunting. Legacy code is often poorly structured, sparsely commented and rife with bugs. However, merely restructuring the code can expose logical inconsistencies. These inconsistencies often reflect a lack of precision in the implementation, or design decisions that were never addressed before coding began. Resolving such inconsistencies often makes large numbers of bugs simply disappear.

I have worked on large, inscrutable legacy code bases for many years. Time after time, applying the simple transformations catalogued in this blog has allowed me to correct long-standing weaknesses, and improve both the internal and observable quality of the code.

I treat the various transformations described here as a toolbox, each tool to be applied as appropriate to improve my understanding of the code while also cleaning it up. In "Bitter Java", [ ] describes software "smells" and the tools he applies to improve the code. My approach is similar: each technique is mostly orthogonal to the others. Applying any one of them can improve the quality of the code in some way. Applying many in series can make the revised code completely unrecognizable when compared to the original, even though all of the changes are merely formal and almost purely formulaic.

Certain assumptions underlie this approach. Primarily, I assert that the adage, "You can't test in quality," holds. Robustness and maintainability are both manifestations of a concise and consistent design carried through to the implementation. If you don't believe that, you may as well not read this blog.

Fixing bugs one at a time is a way to keep your customers happy one bug at a time. But restructuring software is a way to keep the customers who are complaining and the ones who are not complaining happy in the long term. The customers who are not complaining are the deadly ones: They have already written off your enterprise and are quietly advising their friends to stay away.

In addition to capturing the silent detractors, not having to fix bugs one-by-one frees up resources for other endeavors, such as adding new features. Depending on the expected longevity of the software and the nature of the competition, taking the long view and investing the resources required to produce correct, robust and maintainable code at any point in the software lifecycle can pay huge dividends. Note: Software always hangs around longer than you expect.

Correctness

Correctness is whether the actual output of the program agrees with its expected output. However, testing all possible inputs and outputs is usually infeasible for any nontrivial program. To be verified reasonably, correctness has to be abstracted in terms of the model(s) the program is expected to implement. This can only be done by specifying the expected behavior of the program in terms of models and then verifying -- by inspection of the implementation itself -- that the models are faithfully rendered in the code. Testing may indicate that certain important cases are handled as desired, but without examining the underlying model, there can be no assurance that a nearby test case will yield the correct result.

Whether P=NP may be the fundamental question in Computer Science, but the fundamental problem in Computer Science is how to express what you want the machine to do. Theorem-proving systems may be used to verify the correctness of a given piece of code. But then the problem is merely mapped into a different domain: Once you have fed the essential information into the theorem-proving system, you may as well write a translator to go from theorem-proving language into executable code. That is, the formal description of how the program is expected to behave contains all of the information required to generate that program. The problem is "fundamental" because it cannot be transformed out of existence.

Given that at least part of the expression of a program is fundamental, there is no technique that can be proposed which will ensure that a given program is correct. The best that can be achieved is to arrange the code so it is hard to write code that is incorrect. Although there are many transformations that support this goal, they can be summarized as "arrange the code so as to expose the essential design decisions".

Techniques discussed here that are useful in exposing design points will be flagged as such, with a description of where they are especially applicable. We cannot guarantee correctness, but we can make it very hard for bugs to remain hidden in the code.

Robustness

Robustness is the ability of a program to respond reasonably to incorrect or unusual inputs. Error handling is typically left out of software designs and implementations, or added as an afterthought. Targeting robustness elevates the role of correct error handling to the same importance as the core functionality.

Erroneous inputs can arise from faults elsewhere in the system, or due to maintenance (e.g. adding a new feature). When dealing with these odd inputs, robust software will do something reasonable, and "just work". Code that is not robust will fail without giving any useful indication of where the problem lies.

Robustness is also a practical measure of modularity and encapsulation within the code, since being able to reason about the behavior of the software toward invalid inputs require a complete comprehension of the code that will encounter those cases. Logic that is distributed across several modules (lack of modularity) and assumptions that can be arbitrarily rendered invalid (lack of encapsulation) will defeat attempts to gain a comprehensive view. In this sense robustness is the flip-side of correctness: Correctness deals with the response of the program to anticipated inputs; robustness deals with the response of the program to unexpected inputs; both are only accomplished effectively when the coder can understand the implementation completely.

Maintainability

Maintainability is a measure of how easy it is to alter the code -- in response to bug reports or requests for new features. In well-factored code, all of the logic required to deal with the new or erroneous case is in one place. In that case, it is easy for a programmer to identify and implement the necessary changes. This can be done with high confidence and entail a minimum of testing.

On the other hand, poorly factored code is hard to maintain because a successful change will require the programmer to locate all of the areas of the code affected, and make the modifications repeatedly until the desired behavior is achieved. In that case, a debugger is almost always involved. Ultimately, the coder has confidence only in the code he has inserted (if that). Thus, a declaration of success entails extensive testing, as there is little confidence that the changes have not adversely affected other areas of the code.

The result of a modification to unstructured code is usually code that is even less structured. At some point, the code will go over the precipice, requiring ever-increasing amounts of effort to implement succeeding changes. The logical approach to battling this decline is to invest the time to add structure to the program as you are maintaining it. That is the point of this blog. Blaming the original coder for the state of the code is only a recognition of the problem -- not a solution. Making the software better requires corrective action, and the techniques described here give you the tools to do so.

Maintainability is also about communication. Because correct solutions to coding problems depend on the coder's comprehension of the code, maintainability is a measure of how easy it is for anyone (not just the original programmer) to read and understand the code. Whether it is your future self or someone else that is tasked with maintaining your code, you will be thanked for producing correct, complete, comprehensible, commented and well-structured code.

Monday, August 25, 2014

Avoid Chained if ... else if .. Constructs, Part 3

Use continue and return to highlight where special cases have been fully handled.

In the previous post, we showed that if a sequence of conditionals are written such that they are mutually independent (not necessarily mutually exclusive), then the "else" keywords are optional. It is also the case that the clauses can be reordered arbitrarily (assuming that the semantics of the body of each "if" clause is also order-independent). If the reordering proves to be legal, it can enable other simplifications in the code. But more importantly, this reordering can be used to emphasize the most important case by weeding out and handling special cases briefly (at the head of a loop or function.

A large amount of the code I deal with is devoted to handling trivial or special cases and filtering out cases that are not of interest. For these cases, I like to use statements that disrupt the control flow (break, return, continue, throw). Adherents to structured programming principles will consider this style distasteful, but there is a good reason to prefer it over the strict structured-programming style: readability. Marking the end of the special-case code with a continue or return statement tells the reader that the special case as been removed from further consideration. This summarization can be a significant aid in understanding the purpose of the routine as a whole.

Consider the following example:

typedef std::vector<T*> myVectorType;
myVectorType myVector;
...
// Traverse a vector that contains a mix of valid T pointers and null pointers.
for (myVectorType::iterator i = myVector.begin();
i != myVector.end();
++i)
{
T* ptr = *i;
if (ptr)
{
// A valid pointer.
<Do lots of stuff for several pages.>
}
else
{
// Oho! I'll bet you didn't expect this!
++nullPtrCount;
}
}

The problem here is that the if ... else (even without "else if" clauses) involves action-at-a-distance. To understand the case in which ptr is null, the reader has to scan ahead to find the matching else.

Recoding the two clauses as "if (ptr)" and "if (!ptr)" does not change the semantics of the snippet, and since these two clauses are independent, they can be reordered freely. Using the knowledge that they are mutually exclusive, we can place a "continue" at the end of the first clause. This leaves us with

// Traverse a vector that contains a mix of valid T pointers and null pointers.
for (myVectorType::iterator i = myVector.begin();
i != myVector.end();
++i)
{
T* ptr = *i;
if (!ptr)
{
++nullPtrCount;
continue;
}
if (ptr)
{
// A valid pointer.
<Do lots of stuff for several pages.>
}
}

Now the reader can see clearly that after nullPtrCount has been incremented, nothing more is done with that case. His attention is free to focus on the pages of code that follow, dealing with the interesting case. The coder can again use the knowledge that the two cases are mutually exclusive to remove the condition and brackets from the case where the pointer is non-null. Since this is the interesting case, it makes sense to promote it to the top level within the loop:

// Traverse a vector that contains a mix of valid T pointers and null pointers.
for (myVectorType::iterator i = myVector.begin();
i != myVector.end();
++i)
{
T* ptr = *i;
if (!ptr)
{
++nullPtrCount;
continue;
}
// A valid pointer.
<Do lots of stuff for several pages.>
}

This form is useful for dealing with other special cases. They may be weeded out at the top of the loop, leaving unchanged the fact (and representation) that the main purpose of the loop is dealing with the common case.

After the code is rendered relatively readable in this way, we may turn our attention to efficiency considerations. Though it may be the case that all special cases must be weeded out before reaching the common case, it may also be the case that the order in which they are treated is unimportant. Suppose also that the predicate for identifying one special case is relatively cheap while another is relatively expensive. In that case, it will obviously improve the efficiency of the code if the cheaper predicate is tested first. That is, one should follow the pattern:

for (coo::iterator i = c.begin(); i != c.end(); ++i)
{
if (cheapPredicate(i))
{
// handle special case 1.
continue;
}
if (expensivePredicate(i))
{
// handle special case 2.
continue;
}
// handle common case.
}

In the context function, the continue statements are replaced by return statements. Otherwise, the code looks much the same. To illustrate this, we factor out the body of the above loop as a helper function.

for (coo::iterator i = c.begin(); i != c.end(); ++i)
coo_iter_helper(i);
void coo_iter_help(coo::iterator i)
{
if (cheapPredicate(i))
{
// handle special case 1.
return;
}
if (expensivePredicate(i))
{
// handle special case 2.
return;
}
// handle common case.
}

Saturday, August 23, 2014

Avoid Chained if ... else if ... Constructs, Part 2

Make each clause independent, and remove the "else" keyword.

Even when the conditionals in a chained if-else construct are all testing the same dimension, the clauses may not be mutually exclusive. This causes the reader to bear in mind the negation of all of the preceding clauses. It also makes the code fragile with respect to the introduction of a new clause. These are actually two views of the same problem -- action at a distance -- which should generally be avoided in robust programming.

Let's look at a simple snippet that tests whether an variable is valid decimal digit:

int digit = ... ;
if (digit < 0) printf("Your digit is too small.\n");
else if (digit > 9) print("Your digit is too large.\n");
else printf("You entered a valid digit.\n");

Now suppose we want to do something special when the digit is exactly zero. So we add an "else if" clause after the "if" clause:

int digit = ... ;
if (digit < 0) printf("Your digit is too small.\n");
else if (digit == 0) printf("You entered zero.\n");
else if (digit > 9) print("Your digit is too large.\n");
else printf("You entered a valid digit.\n");

The problem is that introduction of the new clause has eroded the subspace captured by the "else" clause. We really want "You entered a valid digit.\n" to be printed even when the digit is zero.

This may not seem like much of a problem in this toy example. The programmer making the change can easily scan forward and see that the "else" clause will now capture only {1..9}. He can then verify whether that is the intended behavior. However, if the chained if ... else contains many clauses and spans several pages, the programmer may not see that his change will introduce a bug. How can this be avoided? By removing the "else" keywords.

We started out with two clauses that were mutually exclusive, but the third is dependent on the other two. The first step is to rewrite the construct so that all of the clauses are mutually exclusive:

int digit = ... ;
if (digit < 0) printf("Your digit is too small.\n");
else if (digit > 9) print("Your digit is too large.\n");
else if (digit >= 0 && digit <=9) printf("You entered a valid digit.\n");

Now that the clauses are mutually exclusive, the "else" keywords can be removed without changing the behavior of the code snippet:

int digit = ... ;
if (digit < 0) printf("Your digit is too small.\n");
if (digit > 9) print("Your digit is too large.\n");
if (digit >= 0 && digit <=9) printf("You entered a valid digit.\n");

Observe that it is no longer important whether the clauses are mutually exclusive or not. We have converted a dependent form (where the last clause executes only if the preceding two do not, etc.) into an imperative form. Whenever the condition of the first clause is true, its body executes; whenever the condition on the second clause is true, its body executes, etc.
This kind of imperative programming is very useful in writing robust code, because the behavior of the snippet no longer depends on the order of the clauses, and is not disrupted by the addition of another clause. Now, adding the new clause in the same style obtains the expected behavior:

int digit = ... ;
if (digit < 0) printf("Your digit is too small.\n");
if (digit == 0) printf("You entered zero.\n");
if (digit > 9) print("Your digit is too large.\n");
if (digit >= 0 && digit <=9) printf("You entered a valid digit.\n");

Someone will immediately point out that the tests in the third clause are redundant with the tests in the first two. Imagine how much processing time will be wasted! To be concerned about that is an example of early optimization. In general the benefits of writing imperative code outweigh any runtime efficiency consideration. And where optimization is required, the imperative style actually helps to achieve that goal. That will be the topic of the next installment.

Although it can help with maintainability, this style of coding isn't always ideal. For example, it may still be necessary to use a chained if-else construct where you want to capture an unhandled case and flag an error. I also use simple if-else statements (containing no "else if" clauses) quite a bit, because forming the negation of the conditional of the "if" statement would be verbose and error-prone.

Friday, August 22, 2014

Avoid Chained If ... else if ... Constructs, Part 1

Conditionals testing different dimensions should have different indentations

There are a number of reasons to avoid using chained "if ... else if ... else" constructs. First among these is the ease with which they hide bugs.

Chained if-else constructs can hide bugs because each conditional can test a different dimension of the state space. When this is done, successive "if" clauses explore successively smaller half-spaces in the state space. As a result, much of the matrix of possibilities in the state space is left unexplored.

What does all of that mean? We need an example.

if (A) doA();
else if (B) do_AB();
else if (C) do_A_BC();
else do_A_B_C();

Here, I use an underscore to indicate "not", so do_AB() represents "do something when A is false and B is true".

If we have three predicates (A, B and C) and they are orthogonal (independent), then there are 8 possibilities in the full matrix of possibilities:

_A_B_C
_A_BC
_AB_C
_ABC
A_B_C
A_BC
AB_C
ABC

yet, we have enumerated only four of these. The source code using a chained if-else construct looks neat, but it has obscured the fact that all possibilities have not been canonically examined. Time after time, I have exposed (and corrected) long-standing bugs by merely expanding a chained if-else construct to its canonical form, and filling in the missing cases.

The simple rule-of-thumb that may be followed to achieve this is that the conditionals in a chained if-else construct must all lie along the same dimension. For example, if we have a straight enumeration:

enum signal_color
{
SIGNAL_NONE,
SIGNAL_RED,
SIGNAL_YELLOW,
SIGNAL_GREEN
};

then the following chained if-else would be permissible:

signal_color sc = ... ;
if (sc == SIGNAL_NONE) printf("Stop, then go.\n");
else if (sc == SIGNAL_RED) printf("Stop.\n");
else if (sc == SIGNAL_YELLOW) printf("Floor it!\n");
else if (sc == SIGNAL_GREEN) printf("Go.\n");
else error("Unexpected signal_color.\n");

Note that every clause tests the same variable, sc, and that every test uses equality. (Bit tests are different, as shown below.) Since all the tests lie along the same dimension, it makes sense to add an else clause, to capture unhandled cases. This is good Software Engineering practice, as it guards against unexpected behavior in this code, should another element be added to the enumeration in the future.

However, if we have a bit-mapped enumeration:

enum signal_color_components
{
SIGNAL_COMPONENT_BLUE = 1<<0,
SIGNAL_COMPONENT_GREEN = 1<<1,
SIGNAL_COMPONENT_RED = 1<<2,
};

Then the following chained if-else is not permissible:

signal_color_components scc = ... ;
if (scc & SIGNAL_COMPONENT_RED) printf("Stop.\n");
else if (scc & SIGNAL_COMPONENT_GREEN) printf("Go.\n");
else printf("Stop, then go.\n");

In this version, it is evident that we left room for a bug, and the damned thing crawled right in! We want the case scc & SIGNAL_COMPONENT_RED && scc & SIGNAL_COMPONENT_GREEN to print out "Floor it!\n", and yet this is not done.

The predicates testing each bit of a bitmap are orthogonal predicates (i.e. predicates testing different dimensions), so we fell right into the trap. If the rule of thumb is followed, we would first rewrite the above as:

signal_color_components scc = ... ;
if (scc & SIGNAL_COMPONENT_RED)
{
printf("Stop.\n");
}
else
{
if (scc & SIGNAL_COMPONENT_GREEN) printf("Go.\n");
else printf("Stop, then go\n");
}

Now it is obvious that the first clause (where the signal color has a red component) is missing a case. After adding back the missing case:

signal_color_components scc = ... ;
if (scc & SIGNAL_COMPONENT_RED)
{
if (scc & SIGNAL_COMPONENT_GREEN) printf("Floor it!\n");
else printf("Stop.\n");
}
else
{
if (scc & SIGNAL_COMPONENT_GREEN) printf("Go.\n");
else printf("Stop, then go\n");
}

the code segment will behave as intended. Goodbye bug!

Wednesday, July 30, 2014

Useless Asserts

Avoid asserts that restate the obvious

In code, it is fairly common to see something like:

class B { ... } ;
class D : B { ... } ;
void foo(B* b) {
D* d = dynamic_cast<D*>(b);
assert(d != 0);
if (d->somefield) ... ;
}

In this context, the assert is just visual noise, and should therefore be removed. The dereference of d in the subsequent line will abort the program just as effectively as the assert: The fact that d is expected to be a valid pointer is implied by the dereference, and any competent programmer will recognize this.

Rather than using an assert to restate the obvious, the space would be better spent by inserting a comment stating why the program may assume that d is valid at that point in the code.

It makes more sense to add an assert ahead of the dynamic_cast, since dynamic_cast will return a null pointer when b is a valid pointer but cannot be dynamically cast to type D*. In debugging, that would permit the programmer to distinguish between abuse (passing in a null pointer) and misuse (passing in a pointer of the wrong type).

Of course, strict use of exceptions solves both problems:

void foo(B* b) {
if (! b) throw NullPointerException("foo must be called with a valid B*");
D* d = dynamic_cast<D*>(b);
if (! d) throw InvalidArgumentException("foo's argument must be convertible to type D*");
if (d->somefield) ... ;
}

But the least a programmer should do is never add (and conscientiously remove) asserts that belabor the obvious.

Tuesday, June 10, 2014

Factoring Kernels

Distribute control across complex kernels
Break down complex transformations into a linear sequence of smaller transformations

A technique that I use repeatedly to make the structure of a program more flexible and easier to understand is to distribute control flow across a complex kernel. The control structure is usually some kind of iteration, the kernel is the body of the loop. This transformation can be applied if the steps contained in the kernel are independent of one another and they do not have to be completed in any particular order.

It is a hallmark of premature optimization to mash together several unrelated operations under a single copy of the control structure. This is done under the mistaken notion that it is more efficient to amortize the loop overhead across them. The assumption may be true at the outset, but coalescing loops that have slightly different filter functions leads to the insertion of additional conditions within the loop. The disruption of control flow due to these conditions will very quickly negate the efficiency that loop coalescing was intended to provide. That being the case, application of this kernel refactoring often enables further simplifications.

The factoring is straightforward, and can be performed incrementally using the following steps:

Insert a comment to identify the start (and end) of each kernel;
Remove the control structure and insert a copy around each kernel;
Factor the resulting code into separate functions, if appropriate.

It should be stressed, that this refactoring is based on the assumption that each kernel can be performed independently over the entire data structure being traversed. This assumption must be verified, either by using detailed knowledge of the effects of each kernel, or by applying verification testing after the refactoring is performed. As a general rule of thumb, each function/method in a program should fit on a single screen. To put numbers to this, a function longer than about 50 source lines should be considered for refactoring.

Here is a function I recently refactored using this technique [courtesy of the Chapel project, https://sourceforge.net/projects/chapel/23535/tree/trunk/compiler/resolution/functionResolution.cpp:7134]:

static void removeUnusedFormals() {
forv_Vec(FnSymbol, fn, gFnSymbols) {
    if (fn->defPoint && fn->defPoint->parentSymbol) {
      Vec<SymExpr*> symExprs;
      for_formals(formal, fn) {
        // Remove formal default values
        if (formal->defaultExpr)
          formal->defaultExpr->remove();
        // Remove formal type expressions
        if (formal->typeExpr)
          formal->typeExpr->remove();
        // Remove method and leader token formals
        if (formal->type == dtMethodToken || formal->hasFlag(FLAG_INSTANTIATED_PARAM))
          formal->defPoint->remove();
        if (formal->hasFlag(FLAG_TYPE_VARIABLE) &&
            (!formal->type->symbol->hasFlag(FLAG_HAS_RUNTIME_TYPE) &&
             !fn->hasFlag(FLAG_EXTERN))) {
          SET_LINENO(formal);
          formal->defPoint->remove();
          VarSymbol* tmp = newTemp("_formal_type_tmp_", formal->type);
          fn->insertAtHead(new DefExpr(tmp));
          if (symExprs.n == 0)
            collectSymExprs(fn->body, symExprs);
          forv_Vec(SymExpr, se, symExprs) {
            if (se->var == formal) {
              if (CallExpr* call = toCallExpr(se->parentExpr))
                if (call->isPrimitive(PRIM_DEREF))
                  se->getStmtExpr()->remove();
              se->var = tmp;
            }
          }
        }
        if (formal->hasFlag(FLAG_TYPE_VARIABLE) &&
            formal->type->symbol->hasFlag(FLAG_HAS_RUNTIME_TYPE)) {
          if (FnSymbol* fn = valueToRuntimeTypeMap.get(formal->type)) {
            Type* rt = (fn->retType->symbol->hasFlag(FLAG_RUNTIME_TYPE_VALUE)) ?
                        fn->retType : runtimeTypeMap.get(fn->retType);
            INT_ASSERT(rt);
            formal->type = rt;
            formal->removeFlag(FLAG_TYPE_VARIABLE);
          }
        }
      }
      if (fn->where)
        fn->where->remove();
      if (fn->retTag == RET_TYPE) {
        VarSymbol* ret = toVarSymbol(fn->getReturnSymbol());
        if (ret && ret->type->symbol->hasFlag(FLAG_HAS_RUNTIME_TYPE)) {
          if (FnSymbol* rtfn = valueToRuntimeTypeMap.get(ret->type)) {
            Type* rt = (rtfn->retType->symbol->hasFlag(FLAG_RUNTIME_TYPE_VALUE)) ?
                        rtfn->retType : runtimeTypeMap.get(rtfn->retType);
            INT_ASSERT(rt);
            ret->type = rt;
            fn->retType = ret->type;
            fn->retTag = RET_VALUE;
          }
        }
      }
    }
}
}

The control structure is the loop over FnSymbols, filtering out those that do not have a valide defPoint or parentSymbol.

Step 1. Add kernel comments:

static void removeUnusedFormals() {
forv_Vec(FnSymbol, fn, gFnSymbols) {
    if (fn->defPoint && fn->defPoint->parentSymbol) {
      Vec<SymExpr*> symExprs;
      for_formals(formal, fn) {
        // Kernel 1: Remove formal default values
        if (formal->defaultExpr)
          formal->defaultExpr->remove();

        // Kernel 2: Remove formal type expressions
        if (formal->typeExpr)
          formal->typeExpr->remove();

        // Kernel 3: Remove method and leader token formals
        if (formal->type == dtMethodToken || formal->hasFlag(FLAG_INSTANTIATED_PARAM))
          formal->defPoint->remove();

        // Kernel 4: Convert type variable formals.
        if (formal->hasFlag(FLAG_TYPE_VARIABLE) &&
            (!formal->type->symbol->hasFlag(FLAG_HAS_RUNTIME_TYPE) &&
             !fn->hasFlag(FLAG_EXTERN))) {
          SET_LINENO(formal);
          formal->defPoint->remove();
          VarSymbol* tmp = newTemp("_formal_type_tmp_", formal->type);
          fn->insertAtHead(new DefExpr(tmp));
          if (symExprs.n == 0)
            collectSymExprs(fn->body, symExprs);
          forv_Vec(SymExpr, se, symExprs) {
            if (se->var == formal) {
              if (CallExpr* call = toCallExpr(se->parentExpr))
                if (call->isPrimitive(PRIM_DEREF))
                  se->getStmtExpr()->remove();
              se->var = tmp;
            }
          }
        }
        if (formal->hasFlag(FLAG_TYPE_VARIABLE) &&
            formal->type->symbol->hasFlag(FLAG_HAS_RUNTIME_TYPE)) {
          if (FnSymbol* fn = valueToRuntimeTypeMap.get(formal->type)) {
            Type* rt = (fn->retType->symbol->hasFlag(FLAG_RUNTIME_TYPE_VALUE)) ?
                        fn->retType : runtimeTypeMap.get(fn->retType);
            INT_ASSERT(rt);
            formal->type = rt;
            formal->removeFlag(FLAG_TYPE_VARIABLE);
          }
        }
      }

      // Kernel 5: Remove where clauses
      if (fn->where)
        fn->where->remove();

      // Kernel 6: Convert runtime return types.
      if (fn->retTag == RET_TYPE) {
        VarSymbol* ret = toVarSymbol(fn->getReturnSymbol());
        if (ret && ret->type->symbol->hasFlag(FLAG_HAS_RUNTIME_TYPE)) {
          if (FnSymbol* rtfn = valueToRuntimeTypeMap.get(ret->type)) {
            Type* rt = (rtfn->retType->symbol->hasFlag(FLAG_RUNTIME_TYPE_VALUE)) ?
                        rtfn->retType : runtimeTypeMap.get(rtfn->retType);
            INT_ASSERT(rt);
            ret->type = rt;
            fn->retType = ret->type;
            fn->retTag = RET_VALUE;
          }
        }
      }
    }
}
}

Step 2. Distribute control structure.

static void removeUnusedFormals() {
forv_Vec(FnSymbol, fn, gFnSymbols) {
    if (fn->defPoint && fn->defPoint->parentSymbol) {
      for_formals(formal, fn) {
        // Kernel 1: Remove formal default values
        if (formal->defaultExpr)
          formal->defaultExpr->remove();
      }
    }
}
forv_Vec(FnSymbol, fn, gFnSymbols) {
    if (fn->defPoint && fn->defPoint->parentSymbol) {
      for_formals(formal, fn) {        // Kernel 2: Remove formal type expressions
        if (formal->typeExpr)
          formal->typeExpr->remove();
      }
    }
}
forv_Vec(FnSymbol, fn, gFnSymbols) {
    if (fn->defPoint && fn->defPoint->parentSymbol) {
      for_formals(formal, fn) {        // Kernel 3: Remove method and leader token formals
        if (formal->type == dtMethodToken || formal->hasFlag(FLAG_INSTANTIATED_PARAM))
          formal->defPoint->remove();
      }
    }
}
forv_Vec(FnSymbol, fn, gFnSymbols) {
    if (fn->defPoint && fn->defPoint->parentSymbol) {
      Vec<SymExpr*> symExprs;
      for_formals(formal, fn) {        // Kernel 4: Convert type variable formals.
        if (formal->hasFlag(FLAG_TYPE_VARIABLE)) {
          if (!formal->type->symbol->hasFlag(FLAG_HAS_RUNTIME_TYPE)) {
            if (!fn->hasFlag(FLAG_EXTERN))) {
            SET_LINENO(formal);
            formal->defPoint->remove();
        VarSymbol* tmp = newTemp("_formal_type_tmp_", formal->type);
            fn->insertAtHead(new DefExpr(tmp));
            if (symExprs.n == 0)
              collectSymExprs(fn->body, symExprs);
            forv_Vec(SymExpr, se, symExprs) {
              if (se->var == formal) {
                if (CallExpr* call = toCallExpr(se->parentExpr))
                  if (call->isPrimitive(PRIM_DEREF))
                se->getStmtExpr()->remove();
                se->var = tmp;
              }
        }
            }
          } else {
            if (FnSymbol* fn = valueToRuntimeTypeMap.get(formal->type)) {
            Type* rt = (fn->retType->symbol->hasFlag(FLAG_RUNTIME_TYPE_VALUE)) ?
                      fn->retType : runtimeTypeMap.get(fn->retType);
              INT_ASSERT(rt);
              formal->type = rt;
              formal->removeFlag(FLAG_TYPE_VARIABLE);
            }
          }
        }
      }
    }
}
forv_Vec(FnSymbol, fn, gFnSymbols) {
    if (fn->defPoint && fn->defPoint->parentSymbol) {
      for_formals(formal, fn) {     // Kernel 5: Remove where clauses
      if (fn->where)
        fn->where->remove();
      }
    }
}
forv_Vec(FnSymbol, fn, gFnSymbols) {
    if (fn->defPoint && fn->defPoint->parentSymbol) {
      for_formals(formal, fn) {
        // Kernel 6: Convert runtime return types.
      if (fn->retTag == RET_TYPE) {
          VarSymbol* ret = toVarSymbol(fn->getReturnSymbol());
        if (ret && ret->type->symbol->hasFlag(FLAG_HAS_RUNTIME_TYPE)) {
          if (FnSymbol* rtfn = valueToRuntimeTypeMap.get(ret->type)) {
            Type* rt = (rtfn->retType->symbol->hasFlag(FLAG_RUNTIME_TYPE_VALUE)) ?
                        rtfn->retType : runtimeTypeMap.get(rtfn->retType);
            INT_ASSERT(rt);
            ret->type = rt;
            fn->retType = ret->type;
            fn->retTag = RET_VALUE;
          }
        }
      }
    }
}
}

Notice that the factoring makes it clear that the vector symExprs is used only in Kernel 4, so I removed it elsewhere. Also, in kernel 4, the test for a formal having the FLAG_TYPE_VARIABLE flag can be factored out of the two cases. Similarly, the mutually-exclusive cases involving the FLAG_HAS_RUNTIME_TYPE can be converted into an if ... else ... construct. Apparently, the function cannot carry FLAG_EXTERN if FLAG_HAS_RUNTIME_TYPE is true, so this test should probably be moved into an outer scope.

Step 3. Break into separate functions.

static void removeDefaultExprs() {
forv_Vec(FnSymbol, fn, gFnSymbols) {
    if (fn->defPoint && fn->defPoint->parentSymbol) {
      for_formals(formal, fn) {
        // Kernel 1: Remove formal default values
        if (formal->defaultExpr)
          formal->defaultExpr->remove();
      }
    }
}
}

static void removeTypeExprs() {
forv_Vec(FnSymbol, fn, gFnSymbols) {
    if (fn->defPoint && fn->defPoint->parentSymbol) {
      for_formals(formal, fn) {        // Kernel 2: Remove formal type expressions
        if (formal->typeExpr)
          formal->typeExpr->remove();
      }
    }
}
}

static void removeMethodAndTokenFormals() {
forv_Vec(FnSymbol, fn, gFnSymbols) {
    if (fn->defPoint && fn->defPoint->parentSymbol) {
      for_formals(formal, fn) {        // Kernel 3: Remove method and leader token formals
        if (formal->type == dtMethodToken || formal->hasFlag(FLAG_INSTANTIATED_PARAM))
          formal->defPoint->remove();
      }
    }
}
}

static void convertTypeVariableFormals() {
forv_Vec(FnSymbol, fn, gFnSymbols) {
    if (fn->defPoint && fn->defPoint->parentSymbol) {
      Vec<SymExpr*> symExprs;
      for_formals(formal, fn) {        // Kernel 4: Convert type variable formals.
        if (formal->hasFlag(FLAG_TYPE_VARIABLE)) {
          if (!formal->type->symbol->hasFlag(FLAG_HAS_RUNTIME_TYPE)) {
            if (!fn->hasFlag(FLAG_EXTERN))) {
            SET_LINENO(formal);
            formal->defPoint->remove();
        VarSymbol* tmp = newTemp("_formal_type_tmp_", formal->type);
            fn->insertAtHead(new DefExpr(tmp));
            if (symExprs.n == 0)
              collectSymExprs(fn->body, symExprs);
            forv_Vec(SymExpr, se, symExprs) {
              if (se->var == formal) {
                if (CallExpr* call = toCallExpr(se->parentExpr))
                  if (call->isPrimitive(PRIM_DEREF))
                se->getStmtExpr()->remove();
                se->var = tmp;
              }
        }
            }
          } else {
            if (FnSymbol* fn = valueToRuntimeTypeMap.get(formal->type)) {
            Type* rt = (fn->retType->symbol->hasFlag(FLAG_RUNTIME_TYPE_VALUE)) ?
                      fn->retType : runtimeTypeMap.get(fn->retType);
              INT_ASSERT(rt);
              formal->type = rt;
              formal->removeFlag(FLAG_TYPE_VARIABLE);
            }
          }
        }
      }
    }
}
}

static void removeWhereClauses() {
forv_Vec(FnSymbol, fn, gFnSymbols) {
    if (fn->defPoint && fn->defPoint->parentSymbol) {
      for_formals(formal, fn) {     // Kernel 5: Remove where clauses
      if (fn->where)
        fn->where->remove();
      }
    }
}
static void convertRuntimeReturnTypes() {
forv_Vec(FnSymbol, fn, gFnSymbols) {
    if (fn->defPoint && fn->defPoint->parentSymbol) {
      for_formals(formal, fn) {
        // Kernel 6: Convert runtime return types.
      if (fn->retTag == RET_TYPE) {
          VarSymbol* ret = toVarSymbol(fn->getReturnSymbol());
        if (ret && ret->type->symbol->hasFlag(FLAG_HAS_RUNTIME_TYPE)) {
          if (FnSymbol* rtfn = valueToRuntimeTypeMap.get(ret->type)) {
            Type* rt = (rtfn->retType->symbol->hasFlag(FLAG_RUNTIME_TYPE_VALUE)) ?
                        rtfn->retType : runtimeTypeMap.get(rtfn->retType);
            INT_ASSERT(rt);
            ret->type = rt;
            fn->retType = ret->type;
            fn->retTag = RET_VALUE;
          }
        }
      }
    }
}
}

I used the function name convention in Renaming Functions [TBS] to choose appropriate names for the new functions.

The original function must be replaced with one that calls each of the new functions:
static void removeUnusedFormals() {
removeDefaultExprs();
removeTypeExprs();
removeMethodAndTokenFormals();
convertTypeVariableFormals();
removeWhereClauses();
convertRuntimeReturnTypes();
}

This makes it readily apparent that the original function was performing (at least) six separate actions -- some related, some unrelated. It is conceivable that the conversion of runtime type formals and return types was added after the original remove<>() functions were already there, so the programmer could save the effort of cutting-and-pasting the control structure (thereby also avoiding providing a clear delineation of the actions being performed). Those conversions should have been put in their own routine (convertRuntimeTypeFormalsAndReturnValues(), e.g.) to make obvious the intent of the added code.

Making the code run efficiently is the compiler's job. Making the code obvious, and consequently easy to understand and maintain is the programmer's job.