Recipe 3.4. Converting from One Base to Another
Problem
You need to convert strings
representing numbers in some base to numbers in another base.
Solution
This example provides a general solution for converting from any base
between 2 and 36 to any base in the same range. It uses two global
variables to encode the value of all characters in a base 36 system
as offsets into a string, one for uppercase encoding and the
other for lowercase:
<xsl:variable name="ckbk:base-lower"
    select="'0123456789abcdefghijklmnopqrstuvwxyz'"/>
<xsl:variable name="ckbk:base-upper"
    select="'0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ'"/>
<xsl:template name="ckbk:convert-base">
  <xsl:param name="number"/>
  <xsl:param name="from-base"/>
  <xsl:param name="to-base"/>
  <xsl:variable name="number-base10">
    <xsl:call-template name="ckbk:convert-to-base-10">
      <xsl:with-param name="number" select="$number"/>
      <xsl:with-param name="from-base" select="$from-base"/>
    </xsl:call-template>
  </xsl:variable>
  <xsl:call-template name="ckbk:convert-from-base-10">
    <xsl:with-param name="number" select="$number-base10"/>
    <xsl:with-param name="to-base" select="$to-base"/>
  </xsl:call-template>
</xsl:template>
This template reduces the general problem to two subproblems of
converting to and from base 10. Performing base 10 conversions is
easier because it is the native base of XPath numbers.
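As a rough illustration of how the ckbk:convert-base template might be called, the following driver stylesheet converts hexadecimal FF to binary. It is only a sketch: the file name base-conversion.xslt and the namespace URI are placeholders, and it assumes that all the templates and variables of this recipe live in that file with the ckbk prefix bound to the same URI.

```xml
<?xml version="1.0"?>
<!-- Hypothetical driver: base-conversion.xslt and the ckbk namespace URI
     are placeholders, not names taken from the recipe. The URI must match
     the one bound to ckbk in the included stylesheet. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:ckbk="http://example.com/ckbk">

  <xsl:include href="base-conversion.xslt"/>
  <xsl:output method="text"/>

  <!-- Any input document will do; it only triggers the call -->
  <xsl:template match="/">
    <xsl:call-template name="ckbk:convert-base">
      <xsl:with-param name="number" select="'FF'"/>
      <xsl:with-param name="from-base" select="16"/>
      <xsl:with-param name="to-base" select="2"/>
    </xsl:call-template>
  </xsl:template>

</xsl:stylesheet>
```

Run against any input document, this should emit 11111111: FF is first reduced to 255 in base 10, and 255 is then rendered in base 2.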
The template ckbk:convert-to-base-10 normalizes
the input number to lowercase. Thus, for example, you treat
ffff hex the same as FFFF hex,
which is the normal convention. Two error checks are performed to
make sure the base is in the range you can handle and that the number
does not contain illegal characters inconsistent with the base. The
trivial case of converting from base 10 to base 10 is also handled:
<xsl:template name="ckbk:convert-to-base-10">
  <xsl:param name="number"/>
  <xsl:param name="from-base"/>
  <!-- Normalize to lowercase so that FFFF and ffff are treated alike -->
  <xsl:variable name="num"
      select="translate($number,$ckbk:base-upper, $ckbk:base-lower)"/>
  <!-- The first $from-base characters of the table are the legal digits -->
  <xsl:variable name="valid-in-chars"
      select="substring($ckbk:base-lower,1,$from-base)"/>
  <xsl:choose>
    <!-- Base outside the supported range -->
    <xsl:when test="$from-base &lt; 2 or $from-base &gt; 36">NaN</xsl:when>
    <!-- Empty input, or characters that are not digits in this base -->
    <xsl:when test="not($num) or translate($num,$valid-in-chars,'')">NaN</xsl:when>
    <xsl:when test="$from-base = 10">
      <xsl:value-of select="$number"/>
    </xsl:when>
    <xsl:otherwise>
      <xsl:call-template name="ckbk:convert-to-base-10-impl">
        <xsl:with-param name="number" select="$num"/>
        <xsl:with-param name="from-base" select="$from-base"/>
        <xsl:with-param name="from-chars" select="$valid-in-chars"/>
      </xsl:call-template>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>
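To see the two error checks in action, a small driver such as the one below can feed ckbk:convert-to-base-10 an illegal digit and an unsupported base. This is a sketch only: the file name base-conversion.xslt and the namespace URI are placeholders, and it assumes the recipe's templates are stored in that file under the same ckbk namespace.

```xml
<?xml version="1.0"?>
<!-- Hypothetical driver: base-conversion.xslt and the ckbk namespace URI
     are placeholders and must agree with the included stylesheet. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:ckbk="http://example.com/ckbk">

  <xsl:include href="base-conversion.xslt"/>
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- G is not a hexadecimal digit, so the character check fails -->
    <xsl:call-template name="ckbk:convert-to-base-10">
      <xsl:with-param name="number" select="'G1'"/>
      <xsl:with-param name="from-base" select="16"/>
    </xsl:call-template>
    <xsl:text>&#10;</xsl:text>
    <!-- Base 37 is outside the range 2 to 36, so the range check fails -->
    <xsl:call-template name="ckbk:convert-to-base-10">
      <xsl:with-param name="number" select="'ff'"/>
      <xsl:with-param name="from-base" select="37"/>
    </xsl:call-template>
  </xsl:template>

</xsl:stylesheet>
```

Both calls should produce NaN, one per line. In the first, translate() strips the legal digits from g1 and leaves g, a non-empty string that tests as true.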
Once error checking is taken care of, you can defer the actual
conversion to another recursive template that does the work. This
template looks up the decimal value of each character as an offset
into the string of characters you obtained from the caller. The
recursion keeps multiplying the result by the base and adding in the
value of the first character until you are left with a string of
length 1:
<xsl:template name="ckbk:convert-to-base-10-impl">
  <xsl:param name="number"/>
  <xsl:param name="from-base"/>
  <xsl:param name="from-chars"/>
  <xsl:param name="result" select="0"/>
  <!-- The value of the leading digit is its offset into $from-chars -->
  <xsl:variable name="value"
      select="string-length(substring-before($from-chars,substring($number,1,1)))"/>
  <xsl:variable name="total" select="$result * $from-base + $value"/>
  <xsl:choose>
    <!-- Last digit: the running total is the answer -->
    <xsl:when test="string-length($number) = 1">
      <xsl:value-of select="$total"/>
    </xsl:when>
    <xsl:otherwise>
      <xsl:call-template name="ckbk:convert-to-base-10-impl">
        <xsl:with-param name="number" select="substring($number,2)"/>
        <xsl:with-param name="from-base" select="$from-base"/>
        <xsl:with-param name="from-chars" select="$from-chars"/>
        <xsl:with-param name="result" select="$total"/>
      </xsl:call-template>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>
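The recursion is easiest to follow on a short input. The driver below, a sketch that assumes the recipe's templates are available in a placeholder file named base-conversion.xslt under a placeholder ckbk namespace URI, converts hexadecimal 1A to decimal; the comments trace the value of $total at each step.

```xml
<?xml version="1.0"?>
<!-- Hypothetical driver: base-conversion.xslt and the ckbk namespace URI
     are placeholders and must agree with the included stylesheet. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:ckbk="http://example.com/ckbk">

  <xsl:include href="base-conversion.xslt"/>
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Trace inside ckbk:convert-to-base-10-impl after 1A becomes 1a:
         step 1: number = '1a', value = 1,  total = 0 * 16 + 1  = 1
         step 2: number = 'a',  value = 10, total = 1 * 16 + 10 = 26
         The string now has length 1, so 26 is emitted. -->
    <xsl:call-template name="ckbk:convert-to-base-10">
      <xsl:with-param name="number" select="'1A'"/>
      <xsl:with-param name="from-base" select="16"/>
    </xsl:call-template>
  </xsl:template>

</xsl:stylesheet>
```

The output should be 26.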
The other half of the problem requires a conversion from base 10 to
any base. Again, you separate error checking from the actual
conversion:
<xsl:template name="ckbk:convert-from-base-10">
  <xsl:param name="number"/>
  <xsl:param name="to-base"/>
  <xsl:choose>
    <!-- Base outside the supported range -->
    <xsl:when test="$to-base &lt; 2 or $to-base &gt; 36">NaN</xsl:when>
    <!-- NaN is the only value not equal to itself, so this detects non-numbers -->
    <xsl:when test="number($number) != number($number)">NaN</xsl:when>
    <xsl:when test="$to-base = 10">
      <xsl:value-of select="$number"/>
    </xsl:when>
    <xsl:otherwise>
      <xsl:call-template name="ckbk:convert-from-base-10-impl">
        <xsl:with-param name="number" select="$number"/>
        <xsl:with-param name="to-base" select="$to-base"/>
      </xsl:call-template>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>
The actual conversion is simply a matter of picking out each digit
from the $ckbk:base-lower table based on the
remainder (i.e., mod) obtained when dividing by the
$to-base. You recurse on the leftover integer
portion, concatenating the digit onto the front of the result.
Recursion ends when the number is less than the base:
<xsl:template name="ckbk:convert-from-base-10-impl">
  <xsl:param name="number"/>
  <xsl:param name="to-base"/>
  <xsl:param name="result"/>
  <!-- The remainder, plus 1 for 1-based indexing, selects the next digit -->
  <xsl:variable name="to-base-digit"
      select="substring($ckbk:base-lower,$number mod $to-base + 1,1)"/>
  <xsl:choose>
    <xsl:when test="$number &gt;= $to-base">
      <xsl:call-template name="ckbk:convert-from-base-10-impl">
        <xsl:with-param name="number" select="floor($number div $to-base)"/>
        <xsl:with-param name="to-base" select="$to-base"/>
        <xsl:with-param name="result" select="concat($to-base-digit,$result)"/>
      </xsl:call-template>
    </xsl:when>
    <xsl:otherwise>
      <xsl:value-of select="concat($to-base-digit,$result)"/>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>
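A short trace shows how the digits accumulate from right to left. The driver below is a sketch that assumes the recipe's templates are available in a placeholder file named base-conversion.xslt under a placeholder ckbk namespace URI; it converts decimal 26 to base 16, and the comments follow each recursive step.

```xml
<?xml version="1.0"?>
<!-- Hypothetical driver: base-conversion.xslt and the ckbk namespace URI
     are placeholders and must agree with the included stylesheet. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:ckbk="http://example.com/ckbk">

  <xsl:include href="base-conversion.xslt"/>
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Trace inside ckbk:convert-from-base-10-impl:
         step 1: number = 26, 26 mod 16 = 10, digit = 'a', result = 'a',
                 26 is not less than 16, so recurse on floor(26 div 16) = 1
         step 2: number = 1,  1 mod 16 = 1,  digit = '1',
                 1 is less than 16, so emit concat('1','a') -->
    <xsl:call-template name="ckbk:convert-from-base-10">
      <xsl:with-param name="number" select="26"/>
      <xsl:with-param name="to-base" select="16"/>
    </xsl:call-template>
  </xsl:template>

</xsl:stylesheet>
```

The output should be 1a. Digits come from $ckbk:base-lower, so results in bases above 10 are always lowercase.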
Discussion
Base conversions are a common programming task and most developers
already know how to perform them. Many languages have built-in
provisions for these conversions; XSLT does not. XPath 1.0 and
XSLT 1.0 provide no means of getting the integer value of a
Unicode character, which makes these conversions more cumbersome.
Hence, you must resort to playing tricks with strings that act like
lookup tables. These manipulations are inefficient, but reasonable
for most conversion needs. In XPath 2.0, you can use the functions
string-to-codepoints and codepoints-to-string instead.
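For comparison, here is a minimal sketch of how a digit's value might be computed from its code point in XSLT 2.0 instead of through a lookup string. The function name ckbk:digit-value and the namespace URI are hypothetical and are not part of the recipe; the sketch assumes the standard convention of 0 through 9 followed by a through z.

```xml
<?xml version="1.0"?>
<!-- Hypothetical XSLT 2.0 sketch: ckbk:digit-value and the ckbk
     namespace URI are illustrative names only. -->
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:ckbk="http://example.com/ckbk"
    exclude-result-prefixes="xs ckbk">

  <xsl:output method="text"/>

  <!-- Returns the value of a single digit character, or -1 if the
       character is not a digit in any base up to 36 -->
  <xsl:function name="ckbk:digit-value" as="xs:integer">
    <xsl:param name="char" as="xs:string"/>
    <!-- Code point of the first character, after lowercasing -->
    <xsl:variable name="cp"
        select="string-to-codepoints(lower-case($char))[1]"/>
    <xsl:sequence select="
        if ($cp ge 48 and $cp le 57) then $cp - 48
        else if ($cp ge 97 and $cp le 122) then $cp - 87
        else -1"/>
  </xsl:function>

  <xsl:template match="/">
    <xsl:value-of select="ckbk:digit-value('F')"/>
  </xsl:template>

</xsl:stylesheet>
```

With an XSLT 2.0 processor this should emit 15: the code point of f is 102, and 102 - 87 = 15.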
The code assumes that bases higher than 10 will use the standard
convention of assigning successive alphas to digits higher than 9. If
you work with an unconventional encoding, then you will have to
adjust the mapping strings accordingly. You can potentially extend
this code beyond base 36 by adding the characters used to
encode digits higher than Z to both mapping strings and raising the
upper limit of 36 in the two range checks.