Tab Expansion
This program expands tabs in the text it reads from STDIN.
The program is passed a numeric tab stop as its argument, indicating how many columns separate each tab stop. Each tab character in the input is expanded to the number of spaces needed to move the line to the next tab stop.
In Perl, the character "\t" denotes a tab.
Perl's x operator repeats a string a specified number of times:
" " x ($tabwidth - ($cur % $tabwidth));
Therefore, this code repeats the left-side string argument (a single space, in this case) as many times as what's specified by the right side's numeric argument (which is the rest of the glop on that line).
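As a small illustration of that expression, the sketch below computes the padding for a few column positions, assuming a tab width of 8. The helper name spaces_to_next_stop is hypothetical and does not appear in the program itself.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the original program): returns the
# run of spaces needed to move from column $cur to the next tab stop.
sub spaces_to_next_stop {
    my ($cur, $tabwidth) = @_;
    return " " x ($tabwidth - ($cur % $tabwidth));
}

# Assumed tab width of 8, matching the hints later in this section.
for my $cur (0, 3, 7, 8) {
    my $count = length(spaces_to_next_stop($cur, 8));
    print "column $cur: $count space(s)\n";
}
```

With a tab width of 8, columns 0 and 8 both sit exactly on a tab stop, so each needs a full 8 spaces to reach the next one, while column 7 needs only 1.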
Source Code
 1. # tab stop is the argument to the program
 2.
 3. $tabwidth = shift @ARGV;
 4.
 5. while (<STDIN>) {
 6.
 7.     # loop through each character of the input line
 8.     foreach $cur (0..(length($_)-1)) {
 9.
10.         # examine current character
11.         $thischar = substr($_, $cur, 1);
12.
13.         # if a tab, replace with spaces
14.         if ($thischar eq "\t") {
15.             substr($_, $cur, 1) =
16.                 " " x ($tabwidth - ($cur % $tabwidth));
17.         }
18.     }
19.
20.     # print out the modified input line
21.     print;
22. }
Suggestions
What is the already solved input to this program? What can be said about $_ after one iteration of the foreach loop on lines 8-18? How does the program behave if $tabwidth is 8, and an input line has eight characters and then a tab? What is the value of $cur at line 15 when the program encounters the tab?
Hints
Walk through the program with the following input lines, assuming that $tabwidth is 8. (Note that each line of input is handled completely separately, so all the hints are only one line.) Tabs are denoted with \t:
One tab in the middle:
ABCDEFG\tH
Tab at the beginning of the line:
\tAB\tCD\tEF
Three tabs at the beginning of the line:
\t\t\tABCD
Tab after the eighth character ($tabwidth is 8):
ABCDEFGH\tABCDEFG
Explanation of the Bug
The program contains an A.logic error. The problem occurs in the way the foreach loop is defined on line 8:
foreach $cur (0..(length($_)-1)) {
This statement calculates a list from the specified range and then iterates through the list. The list is not changed after it is created. However, the length of $_ can change as tabs are expanded into spaces, which leaves unprocessed data at the end of the string.
For example, with an input line containing only two tab characters, length($_) will be 2 at the time the list is constructed on line 8, so the list controlling the foreach loop will be (0, 1). However, once the first tab has been replaced with spaces, the string is longer and the second tab is further out in the string. (For example, assuming a tab stop of 8, the second tab would now be the ninth character.) The foreach loop exits after the iteration in which $cur is equal to 1, and the second tab character is never expanded.
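The two-tab case can be reproduced in isolation. The following is a minimal sketch, assuming a tab width of 8 and using a lexical $line in place of $_; the @visited array is added only to record which indexes the loop reaches.

```perl
use strict;
use warnings;

my $line = "\t\t";    # two tabs, so length($line) is 2 when the loop starts
my @visited;          # bookkeeping for this demonstration only

# The range 0..1 is computed once, before the first iteration.
foreach my $cur (0 .. length($line) - 1) {
    push @visited, $cur;
    if (substr($line, $cur, 1) eq "\t") {
        substr($line, $cur, 1) = " " x (8 - ($cur % 8));
    }
}

print "indexes visited: @visited\n";
print "final length: ", length($line), "\n";
print "tab left over: ", ($line =~ /\t/ ? "yes" : "no"), "\n";
```

The loop visits only indexes 0 and 1, even though the string has grown to nine characters, so the tab now sitting at index 8 is still present when the loop ends.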
The solution is to replace the foreach on line 8 with a for loop that recalculates length($_) each time:
for ($cur = 0; $cur < length($_); $cur++) {
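To check the corrected loop against the hint lines, it can be wrapped in a subroutine. This is only a sketch: the name expand_tabs and the choice to return the string rather than print it are assumptions made for illustration, and the tab width of 8 is taken from the hints.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the corrected loop; the original program
# reads STDIN and prints each line instead of returning a string.
sub expand_tabs {
    my ($line, $tabwidth) = @_;
    # length($line) is re-evaluated before every iteration, so the
    # loop keeps going as the string grows.
    for (my $cur = 0; $cur < length($line); $cur++) {
        if (substr($line, $cur, 1) eq "\t") {
            substr($line, $cur, 1) = " " x ($tabwidth - ($cur % $tabwidth));
        }
    }
    return $line;
}

# Two of the hint lines, with an assumed tab width of 8.
for my $input ("\t\t\tABCD", "ABCDEFGH\tABCDEFG") {
    my $output = expand_tabs($input, 8);
    print length($output), " characters, tabs remaining: ",
        ($output =~ /\t/ ? "yes" : "no"), "\n";
}
```

Because every character to the left of $cur has already been expanded by the time the loop reaches it, $cur is also the current output column, which is why the same index can be used in the padding calculation.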