
Walk Through Each Section

To walk through code, you have to learn how to "think like a computer": that is, how to walk through source code while tracking the exact state that the computer is in. The hope is that this triggers the "Eureka" moment when you realize where the actual state diverges from the intended state. In other words, you find the bug.

Emulating a computer might seem obvious, but in practice, it can be quite hard.

It can be difficult, especially after reading through lots of code, to avoid simply sliding over statements that look reasonable. Remember that the computer devotes its full attention to each statement as it is being executed, and you need to do the same. No matter if a statement seems obvious, if a constant definition looks trivial, if an expression seems correct at first glance, you have to force yourself to focus on what is actually in the code, not what is supposed to be there or what you think is there. This involves walking through the code for a specific input. You are not walking through trying to keep track of a range of possibilities based on different inputs, such as "this variable will be 0 unless the height was greater than 100, in which case, it will be 1." Every input will have an exact value, and this will determine the exact values of other variables.

Track Variables

When walking through code, you need to keep track of what value is in every variable, unless you have determined that a variable is no longer important to the function (and even then, you might discover that such a determination was false).

There are really two ways to keep track of variables:

  • Say to yourself, as you begin to look at each statement, "OK, so x is 12 and subtotal will be 32 right here . . . ." This can work well for simple cases.

  • Write down all the variables on a piece of paper. This way is better if there are many variables, or the statements contain complicated expressions where parsing would require too much brainpower for you to simultaneously remember what values were stored in every variable.

Take this example:


userid = get_userid();
access = (privilege > 3) ?
            max_access(userid) : (privilege << 2) + 1;


Hmmm, now what was the value of userid again?

Keep in mind that for every variable, every statement in the program either modifies the variable or does not modify the variable. The computer never loses track and forgets to modify a variable if instructed to do so; writing it all down on paper helps prevent you from losing track. It also helps you realize which variables change during the section and which ones remain constant.
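
The paper table can be imitated in code. The following Python sketch is only an illustration, not a real debugger: it runs the two statements from the example for one exact input and records every variable after each statement. The bodies of get_userid and max_access are hypothetical stand-ins (the fragment above does not define them), and walk_through is a name invented for this sketch.

```python
def get_userid():
    # Hypothetical stand-in; the original fragment does not define this.
    return 1007


def max_access(userid):
    # Hypothetical stand-in: pretend every user tops out at level 99.
    return 99


def walk_through(privilege):
    """Run the two statements for one exact input, logging state after each."""
    trace = []

    userid = get_userid()
    trace.append(("userid = get_userid()",
                  {"privilege": privilege, "userid": userid}))

    access = max_access(userid) if privilege > 3 else (privilege << 2) + 1
    trace.append(("access = ...",
                  {"privilege": privilege, "userid": userid, "access": access}))
    return trace


for statement, state in walk_through(privilege=2):
    print(f"{statement:24} {state}")
```

With privilege equal to 2, the last row shows access equal to 9, and the table makes it plain that userid and privilege did not change between the two statements.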

If you discover that the inputs you have selected make it too difficult to keep track of all the variables (for example, the array you have chosen is too large), you can go back and change your inputs. Keep in mind, however, that certain bugs might appear only with large enough inputs.

Code Layout

The layout of the code in most languages is intended as a hint to a person who is reading the code, but it usually is not used by the compiler or interpreter when determining how to execute a program. Unless a language specifically requires it, indentation and the placement of curly braces should not be used to infer the semantics of code; you have to check that the actual semantics are correct. Code such as the following


if (a == b)
    function_A();
    function_B();


likely has a different meaning from


if (a == b) {
    function_A();
    function_B();
}


You might need to read the code very carefully to notice it.

On the other hand, in some languages, layout issues, such as indentation or which column a character appears in, are significant, and can cause the opposite sort of confusion, where you miss the significance of indentation. In Python, the code


if a == b:
    function_A()
    function_B()


is different from


if a == b:
    function_A()
function_B()
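
The difference between the two Python fragments shows up only for an input where a is not equal to b, which is one more reason to walk through with specific values. Here is a minimal sketch of that walk-through; the calls to function_A and function_B are replaced by appending their names to a list, and run_indented and run_dedented are names invented for the illustration.

```python
def run_indented(a, b):
    """Both calls belong to the if block."""
    calls = []
    if a == b:
        calls.append("function_A")
        calls.append("function_B")
    return calls


def run_dedented(a, b):
    """Only the first call belongs to the if block."""
    calls = []
    if a == b:
        calls.append("function_A")
    calls.append("function_B")
    return calls


print(run_indented(1, 2))   # []
print(run_dedented(1, 2))   # ['function_B']
```

When a equals b, both versions record the same two calls, so an input chosen that way would hide the difference entirely.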


Improperly terminated comments can also obscure the true nature of code. In the following C code fragment


/*
 * Add x
 *

tot += x;

/*
 * now add y;
 */

tot += y;


the statement


tot += x;


is not executed because it is part of a comment. If you are debugging code and you have narrowed the problem down to a small section of code, but you simply cannot determine where the bug is, some languages allow you to remove the comments (for example, running the code through the C preprocessor) to check whether the bug is related to a statement unexpectedly being commented out.
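
If no preprocessor is at hand, the same check can be approximated with a few lines of script. The following Python sketch is a rough stand-in, not a C preprocessor: it removes only /* ... */ comments, and it assumes the fragment contains no string literals, character constants, or // comments that might themselves contain comment markers. The name strip_block_comments is invented for this example.

```python
def strip_block_comments(source):
    """Remove /* ... */ comments, keeping newlines so line numbers still match."""
    out = []
    in_comment = False
    i = 0
    while i < len(source):
        if in_comment:
            if source.startswith("*/", i):
                in_comment = False
                i += 2
                continue
            if source[i] == "\n":
                out.append("\n")
            i += 1
        elif source.startswith("/*", i):
            in_comment = True
            i += 2
        else:
            out.append(source[i])
            i += 1
    return "".join(out)


FRAGMENT = """\
/*
 * Add x
 *

tot += x;

/*
 * now add y;
 */

tot += y;
"""

stripped = strip_block_comments(FRAGMENT)
surviving = [line.strip() for line in stripped.splitlines() if line.strip()]
print(surviving)  # ['tot += y;']
```

Only the second statement survives, which confirms that the first comment swallowed everything up to the end of the second one.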

Also, be careful when reading complicated arithmetic expressions, especially those that rely on operator precedence rather than parentheses to determine how the operands are grouped. If you are not sure how an expression will be parsed, you can add parentheses yourself in the way that you believe matches the intent, and then see if this changes the program's behavior.
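
As a small illustration, take the shift-and-add expression from the earlier access example. Python, like C, gives + a higher precedence than <<, so leaving out the parentheses changes the result. The value 3 for privilege is an arbitrary choice for this sketch.

```python
privilege = 3

without_parens = privilege << 2 + 1      # parsed as privilege << (2 + 1)
with_parens = (privilege << 2) + 1       # shift first, then add

print(without_parens)  # 24
print(with_parens)     # 13
```

If adding the parentheses changes the output, the expression was not being parsed the way you read it.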

Loops

Loops can be especially tricky to walk through because you cannot usually simulate every iteration of a loop.

With code that proceeds linearly without loops, it is often easy to spot bugs by examining each line in turn. With loops, however, it is usually impossible to walk through the entire set of instructions that will be executed when the loop is completely iterated.

With any loop, pay attention to where the loop exits and where it exits to. Normally, a loop exits when its loop condition becomes false, but loops can also exit because of break statements in the middle, or return statements that leave the enclosing function. Note whether a loop has a break statement and where it will jump to. Some languages have a way to specify code that is executed when the loop ends normally, such as the else clause you can add to a loop in Python. That clause is executed if the loop ends naturally (when a for loop's list is finished, or a while condition becomes false), but not if the loop is exited because of a break statement.
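
A short sketch makes the two exit paths of a Python loop with an else clause visible. The function name describe_search and the sample lists are made up for the illustration.

```python
def describe_search(values, target):
    """Report how the loop ended for one specific input."""
    for index, value in enumerate(values):
        if value == target:
            outcome = f"break at index {index}"
            break
    else:
        # Runs only when the for loop exhausts values without hitting break.
        outcome = "list finished, else clause ran"
    return outcome


print(describe_search([4, 8, 15], 8))    # break at index 1
print(describe_search([4, 8, 15], 16))   # list finished, else clause ran
```

Note that the else clause also runs when the list is empty, because the loop then ends without ever reaching a break.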

Of course, remember that the exit condition of a loop is tested only at one fixed point in each iteration (at the top, for a while loop), not continuously while the body runs. With code such as the following


while x > 0:
    # code block A
    if some_condition:
        x = 0
    # code block B


code block B will still execute after x is set to 0, unless an explicit break is added after the x = 0 statement. You might be constantly evaluating loop exit conditions in your head, but the computer isn't. This means that if somewhere in code block B there is an assumption that x is always greater than 0, the code may fail.
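
To see this on a concrete input, the loop can be instrumented so that each block records the value of x it observes. This is a sketch only: the events list, the run_loop name, and the x -= 1 step that makes the loop terminate are all additions for the illustration and are not part of the fragment above.

```python
def run_loop(x, some_condition):
    """Record which blocks run and the value of x each block sees."""
    events = []
    while x > 0:
        events.append(("A", x))   # code block A
        if some_condition:
            x = 0
        events.append(("B", x))   # code block B, still runs after x = 0
        x -= 1                    # hypothetical step so the loop terminates
    return events


print(run_loop(2, some_condition=True))   # [('A', 2), ('B', 0)]
```

Block B runs with x equal to 0 even though the loop condition says x > 0; the condition is not checked again until the next trip to the top of the loop.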

When a loop is done, it is also important to be aware of the state in which a particular language leaves the loop counter. In particular, will it be set to the value it had during the last iteration, or one more than that? The following Python loop statement


for i in range(3, 10):


and the C loop


for (i = 3; i < 10; i++)


appear to do the same thing: loop i through the values 3, 4, 5, 6, 7, 8, and 9. However, after the Python loop, i will have the value 9, whereas after the C loop, i will have the value 10.
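
Both outcomes can be checked from Python. In this sketch the C loop is emulated with a while loop that performs the same initialize, test, and increment steps; the names after_for and after_c_style are invented for the example.

```python
for i in range(3, 10):
    pass
after_for = i          # the last value the loop produced

# Emulation of the C loop for (i = 3; i < 10; i++) with a while loop.
i = 3
while i < 10:
    i += 1
after_c_style = i      # the increment runs once more before the test fails

print(after_for, after_c_style)  # 9 10
```

One further difference: if the Python range is empty, the loop never assigns i at all, whereas the C loop always performs its initialization.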

When you have a loop that needs to iterate many times, you have to choose certain iterations of the loop to walk through. A good choice to begin with is to walk through the first iteration, the second iteration, the second-to-last iteration, and the last iteration. For example, in code such as the following


for (k = 0; k < MAX_COUNT; k++) {
    // loop body
}


walk through with k equal to 0, 1, MAX_COUNT-2, and MAX_COUNT-1. Of course, this won't catch every bug, but in general, if the loop does the right thing for those values, it probably does the right thing for the intermediate values that you don't walk through.
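
The choice of iterations can be written down as a small helper. This is one possible sketch, with iterations_to_walk as an invented name; it also copes with loops so short that the first two and last two iterations overlap.

```python
def iterations_to_walk(max_count):
    """Pick the first two and last two values of k for a loop from 0 to max_count - 1."""
    candidates = {0, 1, max_count - 2, max_count - 1}
    return sorted(k for k in candidates if 0 <= k < max_count)


print(iterations_to_walk(100))  # [0, 1, 98, 99]
print(iterations_to_walk(3))    # [0, 1, 2]
```

A loop that iterates fewer than four times has few enough iterations that you can simply walk through all of them.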

In cases where the result of an iteration depends on what happened in the previous iteration, you can often use an inductive process to prove to yourself that the loop is correct: Assume the loop worked correctly on the previous iteration, and then see if this implies that it will work correctly on this one.
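
The inductive argument can also be written into the loop as assertions, so that the computer checks each step for the input you chose. The following is a minimal sketch for a running sum; running_total and the sample list are made up for the illustration, and recomputing the sum on each iteration is far too slow for anything but a debugging aid.

```python
def running_total(values):
    """Sum values while checking the inductive step on every iteration."""
    total = 0
    for k, value in enumerate(values):
        # Inductive assumption: total is the sum of the first k values.
        assert total == sum(values[:k])
        total += value
        # Inductive step: total is now the sum of the first k + 1 values.
        assert total == sum(values[:k + 1])
    return total


print(running_total([5, 7, 11]))  # 23
```

If the first assertion holds before the first iteration and the second follows from the first on every iteration, the loop is correct for any number of iterations.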
