D-Data

D.index

A D.index error occurs when an invalid index is used when walking through an array or other data structure.

Many languages use zero-based arrays; that is, the valid indices for an array of size n go from 0 to n-1. A common indexing error when looping through such an array is to start at 1 instead of 0. (In the Python example here, note that range(1,n) includes numbers from 1 to n-1.)


for i in range(1, n):

    # code that processes array[i]

Beginning the indexing at 1 instead of 0 causes the code to miss the first element of the array. (In Python, you can write range(n) as a shortcut for range(0,n), which makes this error less likely to occur.) Similarly, you can make the same mistake on the other end, going past the end of the array:


for (i = 0; i <= n; i++)

    // code that processes array[i]
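
The two loops above can be compared side by side in Python. This is only a minimal sketch: the function names and the sample list are invented for illustration and do not come from any particular program.

```python
def visit_from_one(array):
    """Buggy: starts at index 1, so array[0] is never visited."""
    n = len(array)
    return [array[i] for i in range(1, n)]


def visit_all(array):
    """Correct: range(n) yields 0 through n-1."""
    n = len(array)
    return [array[i] for i in range(n)]


def visit_past_end(array):
    """Buggy: range(n + 1) mirrors the i <= n test and reaches index n."""
    n = len(array)
    return [array[i] for i in range(n + 1)]  # raises IndexError when i == n


sample = ["a", "b", "c"]
print(visit_from_one(sample))  # ['b', 'c']
print(visit_all(sample))       # ['a', 'b', 'c']
try:
    visit_past_end(sample)
except IndexError:
    print("index n is out of range")
```

Python reports the overrun as an exception. In C, the same out-of-range access is undefined behavior and may go unnoticed.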

As mentioned earlier, such errors can be categorized as A.off-by-one errors, but here they will be listed as D.index instead. An index error does not have to be off by one. It can be off by much more, especially when the index is part of a calculation:


void process_array(int my_array[]) {

    int index_to_check;

    for (int k = 0; k < my_array.length; k++) {

        if (k < my_array.length / 2)

            index_to_check = k;

        else

            index_to_check = my_array.length + k;

        check(my_array[index_to_check]);

    }

}

This code miscalculates index_to_check in the else clause of the if, which leads to an index error.
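
To see how far off the calculation is, the same arithmetic can be replayed in Python. The sketch below assumes only the index logic from the listing; the helper names indices_checked and out_of_range are made up for this illustration.

```python
def indices_checked(length):
    """Return the index_to_check value computed for each k."""
    result = []
    for k in range(length):
        if k < length // 2:
            result.append(k)
        else:
            result.append(length + k)  # same miscalculation as the else clause
    return result


def out_of_range(length):
    """Return the computed indices that fall outside 0..length-1."""
    return [i for i in indices_checked(length) if not 0 <= i < length]


print(indices_checked(6))  # [0, 1, 2, 9, 10, 11]
print(out_of_range(6))     # [9, 10, 11]
```

Every index computed in the else clause is at least as large as the array length, so the error here is far more than one position.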

D.limit

A D.limit error involves failing to process data correctly at the limits: the first or last element of the data set (or possibly, the first few or last few elements).

An index error often leads to a limit error. It may cause the code to skip the first or last element entirely (if the indexing is too restrictive, as in the earlier example where the range() starts at 1). Or, it may cause the code to crash when it accesses data past the end (if the indexing is too expansive, as in the earlier example where the loop termination check is i <= n instead of i < n).

Other D.limit errors occur when the code makes assumptions that are true except on the first or last element. For example, code that is parsing lines in a file into sections delimited by lines containing "###" might have a section like this:


String line;

while (true) {

    line = getnewline();

    if (line.equals("###")) {

        break;

    }

    // other code

}

If the file ends without a "###" line, the break is never reached and the code might loop forever.
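
One way to guard against a missing final delimiter is to let the end of the input end the loop as well. The following Python sketch assumes the lines arrive through an iterator; the source does not say how getnewline() behaves at end of file, so read_section is a hypothetical stand-in rather than a translation.

```python
def read_section(lines):
    """Collect lines up to the next "###" delimiter or the end of input."""
    section = []
    for line in lines:  # the for loop also stops when the input runs out
        if line == "###":
            break
        section.append(line)
    return section


source = iter(["alpha", "beta", "###", "gamma"])
print(read_section(source))  # ['alpha', 'beta']
print(read_section(source))  # ['gamma'], even with no trailing "###"
```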

I also classify as D.limit errors those cases where the code works incorrectly on certain inputs near the beginning or end of the range of valid inputs. That is, unlike the previous examples, which tend to slightly misprocess all inputs, these are cases where the code works fine on most inputs, but completely fails on a small subset near the limit. For example, this code attempts to print a baseball player's batting average to three decimal places. (The Python function str() converts a number to its string representation; string.zfill() pads a string on the left with zeros up to the width specified as the second argument.)


def print_average(hits, at_bats):

    average_string = str((1000 * hits) / at_bats)

    print "." + string.zfill(average_string, 3)

The code has a D.limit error. It works except when hits is equal to at_bats. In that case, it prints out ".1000" instead of "1.000".
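
A possible repair is to split the scaled value into its whole and fractional parts before formatting. This sketch is written for Python 3, where zfill() is a string method; the name format_average is invented here, and the code assumes at_bats is greater than zero.

```python
def format_average(hits, at_bats):
    """Return a batting average as text with three decimal places."""
    thousandths = (1000 * hits) // at_bats  # integer division, as in the listing
    whole, fraction = divmod(thousandths, 1000)
    digits = str(fraction).zfill(3)
    if whole:
        return str(whole) + "." + digits  # hits == at_bats gives "1.000"
    return "." + digits


print(format_average(1, 4))  # .250
print(format_average(3, 3))  # 1.000
```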

D.number

The D.number class of data errors relates to how numbers are stored on a computer. Knuth mentions floating-point rounding errors and calls them Language Lossage. However, D.number errors are usually not specific to a particular language, but rather to how a particular processor stores numbers (and because different machines use the same processor, the same type of error can occur in many languages on many machines).

I won't use any floating-point numbers in the examples, but certain types of errors occur because of how integers are stored in memory.

The most basic of these is an overflow, which is when a program attempts to store a number in an area of memory that is not large enough.

One form of an overflow error is an assignment between variables of different sizes:


long a_long;

short b_short;

b_short = a_long;

Assuming that a long holds 32 bits of data and a short holds 16 bits of data, this assignment truncates the value of a_long to 16 bits, which loses data if a_long holds more than 16 bits of information (and might cause a signed/unsigned error if a_long uses exactly 16 bits; see the discussion below). Most compilers give a warning about this or require that the programmer make the conversion explicit, for example, through a cast in C:


b_short = (short)a_long;

The cast does not change the problem of overflowing b_short. It merely quiets the compiler and hopefully forces the programmer to realize that something risky is occurring.
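
The effect of the truncation can be modeled in Python, whose integers do not overflow on their own. The sketch below assumes a 16-bit two's-complement short, as in the text, and mimics what typical compilers do; to_int16 is a made-up helper, not a library function.

```python
def to_int16(value):
    """Keep the low 16 bits of value and interpret them as a signed short."""
    low_bits = value & 0xFFFF
    if low_bits >= 0x8000:        # high bit set: the short reads as negative
        return low_bits - 0x10000
    return low_bits


print(to_int16(1234))   # 1234, fits in 16 bits and is unchanged
print(to_int16(70000))  # 4464, the bits above the low 16 are lost
print(to_int16(40000))  # -25536, fits in 16 bits only if treated as unsigned
```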

Another form of overflow can happen with types of the same size. In an expression such as the following


int c, d, e;

c = d + e;

if d and e added together are larger than the maximum number that can be stored in c, this causes an overflow, or possibly a signed/unsigned error. Compilers won't warn about this type of error. The programmer must be careful about storing values close to the size limit of a certain data type.

A signed/unsigned error happens because most computers store negative numbers in what is known as two's complement. To negate a number, invert all the bits (0 becomes 1, 1 becomes 0) and then add 1. For example, using 8-bit values, the number 11 is stored in binary as

00001011

and -11 is stored as

11110101

The problem is that the positive number 245 is also stored as 11110101.

A signed 8-bit variable can hold values from -128 to 127; an unsigned 8-bit variable can hold values from 0 to 255. For both signed and unsigned variables, the numbers from 0 to 127 are stored the same way, using values from 00000000 through 01111111. The signed versus unsigned difference is whether the values from 10000000 to 11111111 are interpreted as the range from 128 to 255 or the range from -128 to -1.

Thus, languages often require that a variable be declared as either signed or unsigned (with signed usually the default). The declaration doesn't affect how the data is stored. It is just a convention for how to interpret the bits when displaying them (and it affects some operations, such as extending a value to fit in a variable with a larger number of bits; for example, converting a short to a long). Writing the following


char j = -11;

unsigned char k = 245;

results in 11110101 being stored in both j and k.

As you can see, in signed notation, a negative number has the high bit (the leftmost bit in the binary representation) set to 1. A signed/unsigned error can happen when two signed numbers are added and the result has the incorrect value in the high bit. For example, adding 127 + 3 with unsigned 8-bit values results in the value 130, but with signed values, it results in the value -126. Negative numbers can improperly wind up positive: An 8-bit addition of the values -100 and -100 results in the value 56.
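
The arithmetic in this discussion can be checked with a short Python sketch that reinterprets the low 8 bits of a result. The helper names as_signed8 and as_unsigned8 are assumptions made for this example, and two's-complement storage is assumed throughout.

```python
def as_unsigned8(value):
    """Keep the low 8 bits and read them as a value from 0 to 255."""
    return value & 0xFF


def as_signed8(value):
    """Keep the low 8 bits and read them as a two's-complement value."""
    bits = value & 0xFF
    return bits - 256 if bits & 0x80 else bits


pattern = 0b11110101
print(as_unsigned8(pattern))     # 245
print(as_signed8(pattern))       # -11
print(as_signed8(127 + 3))       # -126
print(as_signed8(-100 + -100))   # 56
```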

Programmers sometimes have to be aware of another detail of number storage that concerns how the bytes in a number are arranged in memory. A machine that stores the least significant byte first is known as little-endian. Machines that store the most significant byte first are known as big-endian. That is, the 32-bit number whose hexadecimal representation is 0x12345678 would be stored on a little-endian machine as four consecutive bytes


0x78 0x56 0x34 0x12

while on a big-endian machine, it would be stored as follows:


0x12 0x34 0x56 0x78

Normally, these differences don't matter, but they become important in C code such as the following:


long l;

short s = *((short *)&l);

If you don't write code like that, you probably don't have to worry about little-endian versus big-endian.
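
The difference can be reproduced with Python's struct module, which lets a program choose the byte order explicitly. This sketch assumes a 32-bit long and a 16-bit short, as in the earlier examples; first_short is a hypothetical helper that stores a value and then reads back its first two bytes, much as the pointer cast does.

```python
import struct


def first_short(value, byte_order):
    """Store value as 4 bytes, then read the first 2 bytes as a short.

    byte_order is "<" for little-endian or ">" for big-endian.
    """
    stored = struct.pack(byte_order + "I", value)
    return struct.unpack(byte_order + "h", stored[:2])[0]


print(struct.pack("<I", 0x12345678).hex())  # 78563412
print(struct.pack(">I", 0x12345678).hex())  # 12345678
print(hex(first_short(0x12345678, "<")))    # 0x5678
print(hex(first_short(0x12345678, ">")))    # 0x1234
```

On a little-endian layout the cast picks up the low half of the number; on a big-endian layout it picks up the high half.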

Finally, errors can occur because of truncation or rounding, even with integers. During integer division, the remainder is not preserved, so a routine that attempts to compute an average by recalculating the "current total" for each iteration


integer count = 0;

integer avg = 0;

for (j = 0; j < array.length; j++) {

    integer tot = avg * count;

    count = count + 1;

    avg = (tot + array[j]) / count;

}

would likely generate an incorrect result because of the intermediate conversion of the division result back to the integer avg.
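
A small Python sketch makes the loss visible. It uses the // operator for integer division and sticks to non-negative values, because // rounds toward negative infinity while many other languages truncate toward zero. The function names and the sample data are invented for illustration.

```python
def running_average(values):
    """Rebuild the total from the truncated average on every iteration."""
    count = 0
    avg = 0
    for value in values:
        tot = avg * count          # remainders dropped earlier are gone
        count = count + 1
        avg = (tot + value) // count
    return avg


def average_from_total(values):
    """Keep an exact running total and divide only once at the end."""
    return sum(values) // len(values)


data = [10, 1, 1, 1]
print(running_average(data))     # 2
print(average_from_total(data))  # 3
```

Keeping the exact total and dividing once avoids compounding the truncation.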

D.memory

The D.memory error involves mismanaging memory. One way to cause this error is to attempt to access memory that is not accessible to the program, by improperly manipulating an array index or pointer:


int a[5];

int j = a[200];

Another way to cause this error is to access memory after it is freed. (The C functions malloc() and free() allocate and free memory; memcpy() copies bytes of data.)


char * k = malloc(200);

char * kcopy = k;

memcpy(k, buffer, 200);

// k is processed...

free(k);

// at some point later...

do_something(kcopy);

Both these examples are contrived and obviously incorrect at a quick glance. Real invalid memory bugs are better disguised and more difficult to find.

Instead of freeing memory too soon, programs can forget to free it, which causes a memory leak. (This is impossible in some languages where the programmer does not have the ability to explicitly allocate and deallocate memory.)

A section of code might leak all the memory it allocates


for (k = 0; k < buffer_count; k++) {

    void * temp_buffer = malloc(80);

    // some processing using temp_buffer

}

or it might leak only memory in certain situations:


for (k = 0; k < buffer_count; k++) {

    void * temp_buffer = malloc(80);

    // some processing using temp_buffer

    if (unexpected_endoffile())

        break;   // oops, don't free(temp_buffer)

    free(temp_buffer);

}

Code with memory leaks often works for a long time and then fails unexpectedly when new memory cannot be allocated. This is an unpredictable situation based on the hardware being used, what other applications are running, and other hard-to-predict factors.

A final way to mismanage memory is to use the same variable for two different reasons in different sections of code, but later discover that the logical scope of the two areas overlaps. One example of this is using the same loop counter in a nested loop:


for (i = 0; i < count; i++) {

    length = getnextbuffer(buf);

    for (i = 0; i < length; i++) {

        process(buf[i]);

    }

}

This error can occur when code is cut and pasted into the middle of a larger section of code that uses the same variable. It can also happen when programmers reuse the same variable name for different purposes on the theory that it saves memory. In fact, modern compilers can often figure out if two variables have scopes that do not intersect and reuse the same storage for them. Therefore, it is best to use separate variable names for separate purposes.
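
The effect of the shared counter can be sketched in Python. A Python for loop over range() reassigns its variable from the iterator each time around, so this sketch uses while loops to mirror the C-style counters. The function names and sample buffers are assumptions, with an iterator standing in for getnextbuffer().

```python
def count_processed_shared(buffers):
    """Buggy: the inner loop reuses i, clobbering the outer counter."""
    source = iter(buffers)
    count = len(buffers)
    processed = 0
    i = 0
    while i < count:
        buf = next(source)
        i = 0                      # inner loop resets the shared counter
        while i < len(buf):
            processed += 1
            i += 1
        i += 1                     # outer increment now starts from len(buf)
    return processed


def count_processed_separate(buffers):
    """Correct: the inner loop has its own counter j."""
    source = iter(buffers)
    count = len(buffers)
    processed = 0
    i = 0
    while i < count:
        buf = next(source)
        j = 0
        while j < len(buf):
            processed += 1
            j += 1
        i += 1
    return processed


sample = [[1, 2, 3], [4], [5]]
print(count_processed_shared(sample))    # 3, the outer loop stops early
print(count_processed_separate(sample))  # 5
```

With this input the shared counter ends the outer loop after the first buffer. With other buffer lengths, the outer loop can instead run more times than intended.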

Although D.memory errors can cause some of the hardest-to-find bugs in the real world, it is hard to create a short example with a non-obvious case. Therefore, none of the programs in this book have a D.memory error.
