
Parse Numbers Written in English

This function, when passed a string such as "six hundred twenty two", should return the corresponding numerical value (622 in this case). It should handle numbers up to 999,999,999.

This function ignores the word "and" anywhere it appears in the input string.

Source Code


 1.     """ Define a dictionary mapping between English words
 2.         and the corresponding numbers.
 3.     """
 4.     digitmap = {
 5.         "zero" : 0,
 6.         "one" : 1,
 7.         "two" : 2,
 8.         "three" : 3,
 9.         "four" : 4,
10.         "five" : 5,
11.         "six" : 6,
12.         "seven" : 7,
13.         "eight" : 8,
14.         "nine" : 9,
15.         "ten" : 10,
16.         "eleven" : 11,
17.         "twelve" : 12,
18.         "thirteen" : 13,
19.         "fourteen" : 14,
20.         "fifteen" : 15,
21.         "sixteen" : 16,
22.         "seventeen" : 17,
23.         "eighteen" : 18,
24.         "nineteen" : 19,
25.         "twenty" : 20,
26.         "thirty" : 30,
27.         "forty" : 40,
28.         "fifty" : 50,
29.         "sixty" : 60,
30.         "seventy" : 70,
31.         "eighty" : 80,
32.         "ninety" : 90
33.     }
34.
35.
36.     """ These words act as multipliers for the numbers
37.         before them.
38.     """
39.     multipliermap = {
40.         "hundred" : 100,
41.         "thousand" : 1000,
42.         "million" : 1000000
43.     }
44.
45.     def parseNumber( numberString ):
46.         """ Convert a text string into a number, up to
47.             999,999,999.
48.
49.             numberString: The English form of a number,
50.                           with spaces between each part.
51.
52.             Returns: The integer value.
53.         """
54.
55.         retVal = 0
56.
57.         """ The function split() takes a string and parses
58.             it into substrings, using a specified delimiter
59.             character (or a space delimiter if none is
60.             specified). So this next call breaks the string
61.             into a list of individual words.
62.         """
63.
64.         numberList = numberString.split()
65.
66.         """ Walk through the list of words, but with the
67.             word "and" removed.
68.         """
69.
70.         for word in [ e for e in numberList if e != "and" ]:
71.
72.             """ If word is a number, add to running total.
73.             """
74.
75.             if word in digitmap:
76.
77.                 retVal = retVal + digitmap[word]
78.
79.             """ If word is a multiplier, multiply
80.                 running total by the multiplier.
81.             """
82.
83.             if word in multipliermap:
84.
85.                 retVal = retVal * multipliermap[word]
86.
87.         return retVal


Suggestions

  1. The declaration of digitmap and multipliermap is the kind of repetitive code that it is easy for the eyes to gloss over. Double-check that it is correct.

  2. Look at the for loop with the list comprehension on line 70. Make sure that you understand what the goal of numberList is at the beginning of the loop, what the for loop does, and whether it is correct.

  3. What are the trivial and empty cases for this function? Are they handled correctly?

  4. Pick one single parameter to the function that you feel would exercise all the code.
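For the first suggestion, one way to double-check a repetitive table mechanically is to rebuild it from ordered word lists and compare. The following is only a sketch: the names UNITS, TENS, expected_digitmap, and check_digitmap are hypothetical and do not appear in the listing, and the word lists themselves still have to be proofread by eye.

```python
# Hypothetical helper names; none of these appear in the original listing.
UNITS = ("zero one two three four five six seven eight nine ten eleven "
         "twelve thirteen fourteen fifteen sixteen seventeen eighteen "
         "nineteen").split()
TENS = "twenty thirty forty fifty sixty seventy eighty ninety".split()


def expected_digitmap():
    """Build the word-to-value table from the ordered word lists."""
    expected = {word: value for value, word in enumerate(UNITS)}
    expected.update({word: 20 + 10 * i for i, word in enumerate(TENS)})
    return expected


def check_digitmap(candidate):
    """Return (word, expected, actual) for every entry that disagrees.

    A missing entry on either side shows up as None.
    """
    expected = expected_digitmap()
    problems = []
    for word in sorted(set(expected) | set(candidate)):
        if expected.get(word) != candidate.get(word):
            problems.append((word, expected.get(word), candidate.get(word)))
    return problems


# Demonstration with a deliberately broken copy of the table.
broken = expected_digitmap()
broken["forty"] = 14
print(check_digitmap(broken))  # [('forty', 40, 14)]
```

Passing the digitmap from the listing to check_digitmap would report any entry whose value does not match its position in the word lists.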

Hints

Walk through the function with the following values for the numberString parameter:

  1. Three consecutive nonzero digits: "six hundred twenty two"

  2. A gap between nonzero digits: "four thousand and five"

  3. Zero digits at the end of the number: "four thousand five hundred"

Explanation of the Bug

The bug lies in the code that handles the case where word is in multipliermap; that is, where the if on line 83 is true. A numberString value such as "four thousand five hundred" (as shown in the third hint) exposes the problem: it results in a return value of 400500 instead of 4500.

This is a tricky A.logic problem: At the point where the main loop has iterated three times (having processed the words "four", "thousand", and "five"), the running total in retVal is correctly set to 4005. However, when the word "hundred" is seen next, retVal is multiplied by 100, which results in an incorrect value of 400500.
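The walk-through above can be reproduced with a short trace. This is a minimal sketch: trace_buggy is a hypothetical helper, and the two tables are trimmed to the words this one input needs. The loop body is the same as in the listing, bug included.

```python
# Trimmed copies of the tables from the listing; only the words this
# example needs are included (an assumption made to keep the sketch short).
digitmap = {"four": 4, "five": 5}
multipliermap = {"hundred": 100, "thousand": 1000}


def trace_buggy(numberString):
    """Run the original (buggy) loop and record retVal after each word.

    trace_buggy is a hypothetical helper, not part of the listing.
    """
    retVal = 0
    steps = []
    for word in [e for e in numberString.split() if e != "and"]:
        if word in digitmap:
            retVal = retVal + digitmap[word]
        if word in multipliermap:
            retVal = retVal * multipliermap[word]  # same as line 85
        steps.append((word, retVal))
    return steps


for word, total in trace_buggy("four thousand five hundred"):
    print(word, total)
# four 4
# thousand 4000
# five 4005
# hundred 400500
```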

The code that appears on line 85


retVal = retVal * multipliermap[word]


needs to be replaced. One solution is to apply a multiplier only to the part of retVal that is less than the multiplier. So, for example, when you apply a multiplier of 100, you set aside the part of retVal that is a whole multiple of 100 (the hundreds place and above), multiply what is left by 100, and then add the two together.

The following code does that. First, it saves the "high part" of retVal (the portion of the original number greater than 100, as an example), and then it calculates the "low part" (the portion less than 100), and then it adds them back, multiplying only the low part by the multiplier:


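multiplier = multipliermap[word]
# Use integer division (//); a plain / yields a float in Python 3,
# which would make highPart equal to retVal and lowPart zero.
highPart = (retVal // multiplier) * multiplier
lowPart = retVal - highPart
retVal = highPart + (lowPart * multiplier)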
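To see the replacement in context, here is a sketch of the whole function with those lines in place, run against the three hint inputs. The name parse_number_fixed is hypothetical, the tables are built from word lists rather than typed out as in the listing, and integer division (//) is assumed so that the arithmetic stays in integers under Python 3.

```python
# Tables equivalent to digitmap and multipliermap in the listing, built
# from word lists to keep this sketch short (an assumption, not the
# original layout).
units = ("zero one two three four five six seven eight nine ten eleven "
         "twelve thirteen fourteen fifteen sixteen seventeen eighteen "
         "nineteen").split()
tens = "twenty thirty forty fifty sixty seventy eighty ninety".split()
digitmap = {word: value for value, word in enumerate(units)}
digitmap.update({word: 20 + 10 * i for i, word in enumerate(tens)})
multipliermap = {"hundred": 100, "thousand": 1000, "million": 1000000}


def parse_number_fixed(numberString):
    """Hypothetical corrected version of parseNumber from the listing."""
    retVal = 0
    for word in numberString.split():
        if word == "and":
            continue
        if word in digitmap:
            retVal = retVal + digitmap[word]
        if word in multipliermap:
            multiplier = multipliermap[word]
            # Multiply only the part of retVal below the multiplier.
            highPart = (retVal // multiplier) * multiplier
            lowPart = retVal - highPart
            retVal = highPart + (lowPart * multiplier)
    return retVal


print(parse_number_fixed("six hundred twenty two"))      # 622
print(parse_number_fixed("four thousand and five"))      # 4005
print(parse_number_fixed("four thousand five hundred"))  # 4500
```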


This fix isn't perfect: it doesn't handle cases like "two thousand thousand", and it is a bit inelegant (you should probably be accumulating the numbers between multipliers in a separate variable). But it works well enough.
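The alternative mentioned in passing, accumulating the numbers between multipliers in a separate variable, might look something like the following. Treat it as one possible sketch rather than the intended solution: parse_number_grouped, total, and group are hypothetical names, and the choice to treat "hundred" as scaling the current group while "thousand" and "million" close it is an assumption about how such a design would work.

```python
units = ("zero one two three four five six seven eight nine ten eleven "
         "twelve thirteen fourteen fifteen sixteen seventeen eighteen "
         "nineteen").split()
tens = "twenty thirty forty fifty sixty seventy eighty ninety".split()
digitmap = {word: value for value, word in enumerate(units)}
digitmap.update({word: 20 + 10 * i for i, word in enumerate(tens)})

# Only the multipliers that end a three-digit group; "hundred" is
# handled separately because it applies inside a group.
groupmap = {"thousand": 1000, "million": 1000000}


def parse_number_grouped(numberString):
    """Hypothetical variant that keeps a separate per-group accumulator."""
    total = 0   # sum of the groups already closed by thousand/million
    group = 0   # value accumulated since the last thousand/million
    for word in numberString.split():
        if word == "and":
            continue
        if word in digitmap:
            group += digitmap[word]
        elif word == "hundred":
            group *= 100
        elif word in groupmap:
            total += group * groupmap[word]
            group = 0
    return total + group


print(parse_number_grouped("four thousand five hundred"))  # 4500
print(parse_number_grouped("six hundred twenty two"))      # 622
```

Because each multiplier only ever touches the current group, there is no need to split retVal into high and low parts. Like the patch above, this sketch still does not give a meaning to "two thousand thousand".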
