Previous Section Table of Contents Next Section

Convert a 64-Bit Number to a Decimal String

This function converts a 64-bit unsigned number to its decimal string equivalent.

Three parameters are pushed on the stack in order: the buffer that holds the resulting string, the low 32 bits of the number, and the high 32 bits of the number. The program should NULL-terminate the string it stores in the output buffer. Assume the buffer is large enough to hold the result.

The program works by producing the proper digit for each "place" (as in the ones place, tens place, hundreds place, and so on) in turn, moving from high to low place. It first calculates 10 to the appropriate power and then keeps subtracting that from the original number until it becomes less than it. The number of times the subtraction is done is the correct digit for that place.

The program "knows" that the largest 64-bit unsigned number is between 1019 and 1020.

The main procedure is to_decimal, which calls the helper procedure. mult10. to_decimal is responsible for preserving register values and cleaning parameters off the stack. It uses three local variables, all of which are bytes.

The shld instruction shifts a certain number of bits to the left, the same as shl, but it fills in the bits from another location rather than putting in 0. The effect is that the sequence


shld esi, ebx, 2

shl ebx, 2


shifts esi:ebx left by 2 bits.

Recall that adc allows multi-byte additions with carry, so that


add ebx, eax

adc esi, edx


adds edx:eax to esi:ebx. The subtraction equivalent is "subtract with borrow." The sbb instruction adds the carry flag to the subtrahend (the subtrahend in a subtraction operation is the number to the right of the minus sign). Thus, you calculate edx:eax -= esi:ebx with the following:


sub eax, ebx

sbb edx, esi


If you think it through, it works. Remember that after sub eax, ebx, the carry flag is set if ebx was less than eax. In that case, you want to "borrow" one from edx, which is what happens because in that case, sbb edx, esi becomes edx -= (esi+1).

Source Code


 1.     mult10:

 2.         ; multiply esi:ebx by 10, ecx times

 3.         jecxz multret

 4.

 5.         push eax

 6.         push edx  ; save these

 7.

 8.     multloop:

 9.         ; x * 10 == x * 8 + x * 2

10.         shld esi, ebx, 1

11.         shl ebx, 1

12.         mov eax, ebx

13.         mov edx, esi

14.         ; edx:eax is now 2 x the number

15.         shld esi, ebx, 2

16.         shl ebx, 2

17.         ; esi:ebx is now 8 x the number

18.         add ebx, eax

19.         adc esi, edx

20.         ; esi:ebx is now 10 x the number

21.         loop multloop

22.

23.         pop edx

24.         pop eax

25.     multret:

26.         ret

27.

28.

29.     to_decimal:

30.

31.         push ebp

32.         mov ebp, esp

33.         sub esp, 4    ; [ebp-4] place we are producing

34.                       ; [ebp-3] next digit

35.                       ; [ebp-2] on if number is nonzero

36.                       ; [ebp-1] unused

37.

38.         mov edx, [ebp+8]  ; high 32 bits

39.         mov eax, [ebp+12] ; low 32 bits

40.         mov edi, [ebp+16] ; buffer

41.

42.         mov byte ptr [ebp-4], 19   ; initial place is 19

43.

44.     mainloop:

45.         ; first put 10^place in esi:ebx

46.         mov ebx, 1

47.         mov esi, 0

48.         mov ecx, 0

49.         mov cl, byte ptr [ebp-4] ; place goes in ecx

50.         call mult10

51.

52.         ; subtract esi:ebx from edx:eax until it is less

53.         ; than esi:ebx. Track number of times in [ebp-3].

54.

55.         mov byte ptr [ebp-3], 0   ; digit starts at 0

56.     subloop:

57.         cmp edx, esi

58.         jb subdone

59.         ja subandloop

60.         cmp eax, ebx

61.         jb subdone

62.     subandloop:

63.         sub eax, ebx

64.         sbb edx, esi

65.         inc byte ptr [ebp-3]  ; increment digit

66.         jmp subloop

67.

68.     subdone:

69.         cmp byte ptr [ebp-3], 0   ; is digit 0?

70.         jne adddigit              ; no, so add it

71.         cmp byte ptr [ebp-2], 0  ; digit is 0, so add if a

72.                                  ; non-zero part of number

73.                                  ; has already been output

74.         jne adddigit

75.         jmp nextdigit

76.     adddigit:

77.         add byte ptr [ebp-3], '0'  ; convert digit to ASCII

78.         mov bl, byte ptr [ebp-3]

79.         mov byte ptr [edi], bl     ; store digit in buffer

80.         mov byte ptr [ebp-2], 1    ; nonzero is true now

81.         inc edi

82.

83.     nextdigit:

84.         cmp byte ptr [ebp-4], 0    ; did we finish place 0?

85.         je done                    ; if so, we are done

86.         dec byte ptr [ebp-4]       ; on to next-lower place

87.         jmp mainloop

88.

89.     done:

90.         cmp [ebp-2], 0       ; is nonzero still false?

91.         jne skipzero

92.         mov byte ptr [edi], '0'  ; output a single zero

93.         inc edi

94.

95.     skipzero:

96.         mov byte ptr [edi], 0    ; '\0'-terminate buffer

97.

98.         pop ebp

99.         pop ebp

100.        ret 12               ; pop 12 bytes off the stack


Suggestions

  1. Look at each instance where to_decimal uses one of its three 1-byte local variables, and use that to come up with names for each of them.

  2. Verify the algorithm and implementation of mult10.

  3. The procedure treats edx:eax and esi:ebx as 64-bit quantities. If you can show that these are always used properly, you can run through the algorithm without having to use a value greater than 32 bits as an input. In particular, determine what arithmetic comparison between edx:eax and esi:ebx is being tested on lines 58-62.

  4. What is the meaning of edi?

Hints

Walk through the program with the following inputs on the stack, with the current stack location at the bottom (assume the output buffer is correct and will hold the result):

  1. Test if zero is handled correctly:

    output_buffer

    0 [low 32 bits]

    0 [high 32 bits]

  2. A single-digit number:

    output_buffer

    9 [low 32 bits]

    0 [high 32 bits]

  3. Three digits, all in the low 32 bits:

    output_buffer

    234 [low 32 bits]

    0 [high 32 bits]

  4. One digit on in the high 32 bits:

    output_buffer

    0 [low 32 bits]

    1 [high 32 bits]

Explanation of the Bug

The "nonzero" variable stored at [ebp-2] is not initialized to false. It is not initialized at all, which is an F.init error. As a result, it randomly starts out as true or false based on what happens to be lying around on the stack. It needs to be initialized around line 43:


mov byte ptr [ebp-2], 0    ; nonzero starts as false


If the stack had a zero at that byte, the program works as expected. If not, it prints leading zeros in the output buffer.

    Previous Section Table of Contents Next Section