Calculate Student Test Averages
This function calculates averages from a file containing student test results.
The file contains sections in the following format:
testname: maximumscore
studentname1 score1
studentname2 score2
The "testname" line indicates the start of a new section. (Note that those lines have a colon (:) in them, while score lines do not.)
In the file, each testname and studentname must be a single word, and the scores must be numeric.
A student does not necessarily have a score for each test. If a student has no line in a test's section, it indicates that the student did not take that test.
When the program is done processing the file, for each student, it prints his or her average for taken tests, and his or her overall average (counting missed tests as 0).
The program uses memory variables: when a regular expression matches, the text matched by each parenthesized part of the pattern is automatically stored in variables named $1, $2, and so on.
The Perl built-in function split() splits a string into a list of fields, using a separator defined by a regular expression (in this case, the vertical bar character, escaped as \| because | is special in a regular expression):
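As a minimal sketch of memory variables, separate from the program itself, the following uses an illustrative pattern and a hypothetical helper named capture_pair (neither appears in the original):

```perl
use strict;
use warnings;

# Hypothetical helper: match "word=digits" somewhere in the text and
# return the two parenthesized parts, or an empty list if nothing matches.
sub capture_pair {
    my ($text) = @_;
    if ($text =~ /(\w+)=(\d+)/) {
        return ($1, $2);    # $1 is the first group, $2 is the second
    }
    return ();
}

my ($key, $value) = capture_pair("width=80");
print "key is $key, value is $value\n";    # key is width, value is 80
```

Note that $1 and $2 are only meaningful after a successful match, which is why the sketch reads them inside the if.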
@teststaken = split /\|/, $studenttesttaken{$_};
Because it matters in this particular program, note that split() does return empty fields at the beginning of the string it is splitting, so split(/:/, ":a:b") returns three elements ("", "a", and "b"). In contrast, by default, it does not return empty fields at the end of the string.
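The leading and trailing behavior described above can be checked with a small sketch. The helper name bar_fields and the sample strings are illustrative assumptions, not part of the program:

```perl
use strict;
use warnings;

# Hypothetical helper: split a string on the vertical bar and return the fields.
sub bar_fields {
    my ($text) = @_;
    return split /\|/, $text;
}

my @leading  = bar_fields("|a|b");    # leading empty field is kept: "", "a", "b"
my @trailing = bar_fields("a|b|");    # trailing empty field is dropped: "a", "b"

# $#array is the index of the last element, which is one less than the count.
print scalar(@leading),  " fields, last index ", $#leading,  "\n";  # 3 fields, last index 2
print scalar(@trailing), " fields, last index ", $#trailing, "\n";  # 2 fields, last index 1
```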
Source Code
1. # score file is of the form
2. # testname: maximumscore
3. # studentname score
4.
5. $currenttest = -1;
6.
7. while (<>) {
8.     $thisline = $_;
9.
10.     # does this line have a testname on it?
11.     if ($thisline =~ /\s*(\w+)\s*:\s*(\d+)/) {
12.         ++$currenttest;
13.         $testnames[$currenttest] = $1;
14.         $testmaximums[$currenttest] = $2;
15.         $thistestname = $1;
16.         $thistestmaximum = $2;
17.     }
18.
19.     # or does it have a score in it?
20.     if ($thisline =~ /\s*(\w+)\s*(\d+)/) {
21.         $studentpoints{$1} += $2;
22.         $studentmaximums{$1} += $thistestmaximum;
23.         $studenttesttaken{$1} .= "|$thistestname";
24.     }
25. }
26.
27. # find the total maximum score
28.
29. foreach (@testmaximums) {
30.     $totalmaximum += $_;
31. }
32.
33. # now print results
34.
35. foreach (keys %studentpoints) {
36.     @teststaken = split /\|/, $studenttesttaken{$_};
37.     $testcount = $#teststaken;
38.     print("student name: $_\n");
39.     print("total tests taken: $testcount\n");
40.
41.     $testaverage =
42.         $studentpoints{$_} / $studentmaximums{$_};
43.     $testoverall = $studentpoints{$_} / $totalmaximum;
44.     printf("average in tests taken: %d%%\n",
45.         $testaverage * 100);
46.     printf("average in all tests: %d%%\n",
47.         $testoverall * 100);
48. }
Suggestions
What happens to input lines that do not match either of the regular expressions on lines 11 or 20?
What happens if the program receives an empty file as input? What about a file with testname lines, but no student score lines?
Confirm that testcount is set correctly on lines 36-37.
How many different meanings is $_ used for in the program?
Hints
Walk through the program with the following input files:
One test, two students:
test1: 10
joe 5
susan 8

Two tests, one student per test:
test1: 10
joe 6
test2: 20
susan 15

Two students, different number of tests:
test1: 10
joe 6
susan 10
test2: 20
susan 18
Explanation of the Bug
The check for a score line on line 20
if ($thisline =~ /\s*(\w+)\s*(\d+)/) {
might also match a test line, which confuses the program into thinking that a new student has taken the test.
Keep in mind that the regular expression does not have to match the entire line, just a part of it (unless it is anchored to the ends with ^ and $).
A test line of the form
test: 20
matches the regular expression on line 20. The program would take the student name to be "2" and the score to be "0". With the input files given in the hints, the lines with "test1" or "test2" in them result in a student named "test" being credited with a score of 1 or 2.
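These captures can be confirmed with a short sketch that applies the line 20 pattern, copied unchanged, to a few sample lines. The helper name score_captures is a hypothetical wrapper added for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: apply the unanchored pattern from line 20 and
# return the two captures, or an empty list if the line does not match.
sub score_captures {
    my ($line) = @_;
    if ($line =~ /\s*(\w+)\s*(\d+)/) {
        return ($1, $2);
    }
    return ();
}

for my $line ("test: 20", "test1: 10", "joe 6") {
    my ($name, $score) = score_captures($line);
    print "'$line' -> name '$name', score '$score'\n";
}
# 'test: 20' -> name '2', score '0'
# 'test1: 10' -> name 'test', score '1'
# 'joe 6' -> name 'joe', score '6'
```

In the first case the match cannot start at "test" because a colon follows it, so the regular expression engine finds its match inside " 20" instead. In the second case \w+ gives back the trailing "1" of "test1" so that \d+ has a digit to match.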
This is an F.location error because the test for a score line should not be done if a line has already been determined to be a testname line. The if should be changed to an elsif. The regular expression on line 20 could also be tightened to require whitespace between the student name and the score (\s+ instead of \s*):
elsif ($thisline =~ /\s*(\w+)\s+(\d+)/) {
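To see the effect of the change in isolation, here is a minimal sketch that applies the testname pattern and the tightened score pattern in an if/elsif chain. The helper classify_line and its return values are assumptions made for illustration and are not part of the program:

```perl
use strict;
use warnings;

# Hypothetical helper: classify one input line using the corrected logic.
# Returns ("test", name, maximum), ("score", name, score), or ("other").
sub classify_line {
    my ($line) = @_;
    if ($line =~ /\s*(\w+)\s*:\s*(\d+)/) {
        return ("test", $1, $2);
    }
    elsif ($line =~ /\s*(\w+)\s+(\d+)/) {
        # only reached when the line is not a testname line
        return ("score", $1, $2);
    }
    return ("other");
}

for my $line ("test1: 10", "joe 6", "") {
    my @result = classify_line($line);
    print "'$line' -> @result\n";
}
# 'test1: 10' -> test test1 10
# 'joe 6' -> score joe 6
# '' -> other
```

With the elsif in place, a testname line can no longer be counted as a score line, whichever form of the score pattern is used.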