
Simple Database

This program is a simple file-driven database. Entries in the database are identified by a name. The program reads a file that contains lines of the form:


put name value # assign entry "name" the value "value"

get name       # prints the value of entry "name"

delete name    # removes the entry "name"

length name    # prints the length of entry "name"

dump           # displays the entire database


To make processing easier, the names are defined to be "Perl words": they contain letters, digits, and/or underscores (the characters that the '\w' regular expression escape matches). Operators and names can be separated by any number of "Perl spaces": space, return, tab, or formfeed (the characters that the '\s' regular expression escape matches, except for newline, which delimits an entire line). For the "put" operation, "value" is whatever is left on the line after the spaces following "name" have been skipped; "value" might be empty if nothing more is specified on the line after "put name".

The program uses regular expression matches and, in particular, the memory variables: when a match succeeds, the text matched by each parenthesized group in the expression is stored in variables named $1, $2, and so on.
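As a minimal sketch of how the memory variables behave, the following splits a line into an operator and a name. The helper name parse_command and its pattern are hypothetical and are not part of the program in this section.

```perl
use strict;
use warnings;

# Hypothetical helper: capture the first two "Perl words" on a line.
# Returns an empty list if the line does not contain two words.
sub parse_command {
    my ($line) = @_;
    if ($line =~ /(\w+)\s+(\w+)/) {
        return ($1, $2);    # $1 holds the first group, $2 the second
    }
    return;
}

my ($op, $name) = parse_command("get Name\n");
print "operator=$op name=$name\n";
```

The variables $1 and $2 are only meaningful after a successful match, which is why the sketch reads them inside the if block.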

Entries in the database are stored in a hash with the name used as the key.

Recall that an if test containing only a regular expression matches it against $_ and that /i following a regular expression indicates that the match should be case insensitive.
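The following sketch illustrates both points: a bare pattern is tested against $_, and /i makes the match ignore case. The helper is_get is a hypothetical name introduced only for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: report whether a line looks like a "get" command.
sub is_get {
    local $_ = shift;                  # the bare pattern below tests $_
    return /get\s+\w+/i ? 1 : 0;       # /i: "get", "GET", "Get" all match
}

for my $line ("get Name", "GET Name", "put Name Smith") {
    printf "%-16s %s\n", $line, is_get($line) ? "matches" : "no match";
}
```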

The exists() function checks a hash for the existence of an element with a given key:


if (exists $table{$1})


delete() removes an element from a hash, given its key (it does nothing, silently, if no element exists for that key):


delete $table{$1};
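A short sketch of exists() and delete() on a small hash follows. The keys and values are made up for illustration. Note that exists() tests for the key, so an entry whose value is the empty string still exists.

```perl
use strict;
use warnings;

my %table = (ID => '');    # the key is present, but its value is empty

print "ID exists\n"    if exists $table{ID};
print "ID is empty\n"  if $table{ID} eq '';
print "Name missing\n" unless exists $table{Name};

delete $table{ID};         # removes the entry
delete $table{Name};       # no such key: silently does nothing

print scalar(keys %table), " entries left\n";
```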


The each() function iterates through the key-value pairs in a hash, as in the following code:


while ( ($key, $value) = each %table )


The loop exits after every element of %table has been iterated through, because each() then returns an empty list and the assignment evaluates as false. (This code also shows how the two-element list returned by each() is assigned to a two-element list containing two variables.)
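As a small, self-contained sketch of this loop, the following iterates over a two-entry hash with made-up contents and counts the pairs it visits. The variables are declared with my only because the sketch enables strict mode.

```perl
use strict;
use warnings;

my %table = (A => 'apple', B => 'banana');

my $count = 0;
while ( my ($key, $value) = each %table ) {
    print "$key: $value\n";    # the order of the pairs is not guaranteed
    $count++;
}
print "$count entries\n";
```

Because hashes are unordered, the two lines may print in either order from one run to the next.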

Source Code


 1.     # loop through stdin processing commands

 2.

 3.     while (<>) {

 4.         if (/get\s+(\w+)/i) {

 5.             if (exists $table{$1}) {

 6.                 print "$1: $table{$1}\n";

 7.             }

 8.         }

 9.         elsif (/put\s+(\w+)\s+(.+)/i) {

10.             $table{$1} = $2;

11.         }

12.         elsif (/dump/i) {

13.             while ( ($key, $value) = each %table ) {

14.                 print "$key: $value\n";

15.             }

16.         }

17.         elsif (/delete\s+(\w+)/i) {

18.             delete $table{$1};

19.         }

20.         elsif (/length\s+(\w+)/i) {

21.             if (exists $table{$1}) {

22.                 $len = length $table{$1};

23.                 print "length($1): $len\n";

24.             }

25.         }

26.         else {

27.             print("Syntax error: $_");

28.         }

29.     }


Suggestions

  1. Read through each regular expression carefully to be sure that you understand exactly which strings it will match.

  2. What happens if a name is "put" when it already exists in %table?

  3. Lines returned by <> still have the '\n' at the end. Does the program deal with this correctly?

  4. Does the program correctly ignore extra data at the end of an input line?

Hints

Walk through the program with the following inputs:

  1. Add one element, operate on it, and delete it:

    put Name Smith

    get Name

    length Name

    delete Name

  2. Add one element and then add it again blank:

    put ID 1234

    length ID

    put ID

    length ID

  3. Add two elements and then delete one:

    put A A

    put B B

    delete A

    dump

Explanation of the Bug

The statement on line 9 that matches the put operation


elsif (/put\s+(\w+)\s+(.+)/i) {


does not correctly handle the case where there is no value. The subexpression (.+) requires one or more characters, but nothing may remain on the line after the name. As a result, a line such as the following


put ID


is reported as a syntax error by the else clause on line 27.

The regular expression on line 9 needs to be changed:


elsif (/put\s+(\w+)\s+(.*)/i) {


This change allows the value to be an empty string. This bug could be classified as either D.limit or B.expression.
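To see the difference between the two patterns, the following sketch runs both against a line with a value and a line without one. The helper names put_original and put_corrected are hypothetical wrappers added for this comparison.

```perl
use strict;
use warnings;

# Hypothetical wrappers around the original and corrected patterns.
sub put_original  { return $_[0] =~ /put\s+(\w+)\s+(.+)/i ? 1 : 0 }
sub put_corrected { return $_[0] =~ /put\s+(\w+)\s+(.*)/i ? 1 : 0 }

for my $line ("put ID 1234\n", "put ID\n") {
    (my $shown = $line) =~ s/\n/\\n/;    # make the newline visible
    printf "%-16s original=%d corrected=%d\n",
        $shown, put_original($line), put_corrected($line);
}
```

In the corrected pattern, the second \s+ is satisfied by the newline at the end of "put ID", which leaves (.*) to match the empty string. A final line with no trailing newline would therefore still fail to match, an edge case this sketch does not address.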
