As you saw in the past chapter, functions can be extremely useful in processing data stored in a database and can save you a lot of work when you deal with results returned by SQL queries. So far, most of the functions you've seen have been aggregate functions, but there are a lot more than that available to you.
The general syntax for calling a function is simply as follows:
FUNC(Parameters);
Here FUNC is the name of the function, and Parameters are the parameters of the function, which may be column names, literals, expressions, or even other functions. Where more than one parameter is required, they're separated by commas:
Functions can be used pretty much anywhere in a SQL statement because they can evaluate to numeric values, strings, Boolean values, and so on.
Unfortunately, functions tend to be very RDBMS specific, where each of the major RDBMSs has its own suite of functions—far too many to list here. However, there's some overlap, and you can examine some of the more commonly used functions without too much worry concerning compatibility issues.
One area where many functions are shared is with mathematical functions. Whatever RDBMS you're using, you're likely to have access to functions that do the following:
Return the absolute value of a value, typically called ABS or similar
Perform trigonometric calculations, such as SIN, COS, and so on
Calculate logarithms, such as LOG10 or LOG, to use base e
Calculate square roots, SQRT
Raise values to a given power, POWER
Generate random numbers, RAND
RDBMSs usually include functions for dealing with date and time values, working with strings (obtaining substrings, replacing characters, reversing character order, and so on), and much more. Typically, access to all the functionality of a database is achieved via functions, including security functionality, general database statistics, configuration, and most other things you might imagine.
As an example, consider the situation where you want to obtain the square root of the numbers in a column, which means using a function such as SQRT, as noted previously. In SQL Server, Oracle, DB2, and MySQL, you could use simply the following:
SELECT SQRT(ColumnName) AS RootOfColumnName FROM TableName;
However, Access doesn't have a SQRT function, so you have to use SQR instead:
SELECT SQR(ColumnName) AS RootOfColumnName FROM TableName;
The result is the same, but the name of the function is different. Sadly, this is typical and can mean checking the function list of the RDBMS you're using to find the exact syntax you need to use.
Because there are so many functions available, and because they're so database-specific, there's really no point in attempting to cover everything here. Instead you'll look at some of the main areas where you need functions—manipulating numbers, strings, and dates and times and converting between different data types—and you'll deal with other individual functions as you come across them in the remainder of the book. Later, at the end of the chapter, you'll also look at the CREATE FUNCTION statement that lets you define your own functions.
Previously, you saw integer values being used simply by inserting the numeric value, without any ' characters or other delimiting character:
2763
This holds for all numeric types, but there are also several other elements of notation you can use. First, you can include decimal points for floating-point numbers:
3.142
You can also include scientific notation to include the exponent of a number, for example:
7.34E3
You can use the - symbol to specify either a negative number or a negative exponent (or both):
-2.43E-15
You can also use the + symbol in a similar way although this isn't really necessary because numbers will be interpreted as positive by default.
SQL Server and Access have a specific money data type, which allows for constants including a currency symbol, with all the other options described previously:
-$300.50
Other RDBMSs include additional, more exotic string representations of numeric values, such as preceding a string literal with X in DB2 and MySQL to indicate a hexadecimal value.
The following is a DB2/MySQL hexadecimal number:
X'5A0F'
SQL Server, however, uses the prefix 0x for hexadecimal values, which aren't enclosed in quotes (this is also supported by the latest versions of MySQL):
0x5A0F
Note |
By default, MySQL will treat hexadecimal values as ASCII character codes and return the corresponding character(s), but you can ensure that they're treated as numbers by adding zero to them. For example, X'4A' returns 'J', but X'4A' + 0 returns 74. |
DB2, MySQL, and Access have a HEX() function that allows you to convert a numeric value into a hexadecimal string. For example, HEX(74) returns '4A'. You can achieve the same thing with Oracle using the TO_CHAR() function. You'll look at this function in a bit more detail later, but it's used to convert different data types to strings. As well as the data to convert, it can take a string that indicates how the string is to be formatted. An 'X' character in this string represents a hexadecimal digit, so you can convert the number 74 to hexadecimal using TO_CHAR(74, 'XX').
As an example of using mathematical functions in SQL, let's look at the functions used to round numbers up or down to the nearest integer or to a specific precision. There are three basic functions involved:
FLOOR(number): Rounds number down to the highest integer less than number. For example, FLOOR(57.4) and FLOOR(57.8) will both return 57.
CEILING(number) or CEIL(number): Rounds number up to the lowest integer greater than number. For example, FLOOR(57.4) and FLOOR(57.8) will both return 58.
ROUND(number, precision): Rounds number to the nearest integer or floating-point number with the specified precision. For example, ROUND(57.4, 0) returns 57 and ROUND(57.8, 0) returns 58. If the number is negative, the number will be rounded on the left side of the decimal point, so ROUND(57.4, -1) returns 60 and ROUND(57.4, -2) returns 100.
Note |
Access doesn't support either FLOOR() or CEILING(), but you can achieve the same effect using INT(), which works in much the same way as FLOOR() for numeric values, and ROUND(number + 0.5, 0) to replace CEILING(). Also, Access doesn't support negative precision with ROUND(). |
To see this in action, let's revisit the example from Chapter 4, "Summarizing and Grouping Data," where you calculated the average mark that a student scored in his or her exams. You'll show the same data (but for all students), but this time you'll present the average as a floating point, rounded up and rounded down and rounded to the nearest integer.
There are slight differences in the way you implement this in the various RDBMSs, so you'll look at these examples separately.
The same SQL query will run on both SQL Server and DB2, so you'll look at these together. The main issue here is that both SQL Server and DB2 will store the value of AVG(Mark) as an integer, so you need to convert the Mark to a float before calculating the average:
AVG(CAST(Mark AS FLOAT))
You'll look at data type conversion functions in more detail shortly. Also, notice that SQL Server supports only the CEILING() function, but DB2 has both CEIL() and CEILING().
![]() |
Enter this query and execute it against the InstantUniversity database:
SELECT StudentID, AVG(CAST(Mark AS FLOAT)) AS AverageMark, FLOOR(AVG(CAST(Mark AS FLOAT))) AS RoundDown, CEILING(AVG(CAST(Mark AS FLOAT))) AS RoundUp, ROUND(AVG(CAST(Mark AS FLOAT)), 0) AS ClosestInt FROM StudentExam GROUP BY StudentID;
This gives the following output:
StudentID AverageMark RoundDown RoundUp ClosestInt --------- ---------------- --------- ------- ---------- 1 64 64 64 64 2 48.6666666666667 48 49 49 3 80 80 80 80 4 63.3333333333333 63 64 63 5 52 52 52 52 6 68.3333333333333 68 69 68 7 71.3333333333333 71 72 71 8 51.6666666666667 51 52 52 9 68.6666666666667 68 69 69 10 73.5 73 74 74
![]() |
The query for Oracle is similar, but Oracle supports only CEIL(), not CEILING(). Also, you don't need to use the CAST() function (although it's supported, using the same syntax).
![]() |
Enter this query into SQL*Plus:
SELECT StudentID, AVG(CAST(Mark AS FLOAT)) AS AverageMark, FLOOR(AVG(CAST(Mark AS FLOAT))) AS RoundDown, CEIL(AVG(CAST(Mark AS FLOAT))) AS RoundUp, ROUND(AVG(CAST(Mark AS FLOAT)), 0) AS ClosestInt FROM StudentExam GROUP BY StudentID;
This gives the same output as for the SQL Server/DB2 example:
StudentID AverageMark RoundDown RoundUp ClosestInt ---------- ----------- ---------- ---------- ---------- 1 64 64 64 64 2 48.6666667 48 49 49 3 80 80 80 80 4 63.3333333 63 64 63 ... ... ... ... ...
![]() |
MySQL doesn't support the CAST() function and uses CEILING() rather than CEIL() . Otherwise, the query is the same as for Oracle.
![]() |
Enter this query and execute it against the InstantUniversity database:
SELECT StudentID, AVG(Mark) AS AverageMark, FLOOR(AVG(Mark)) AS RoundDown, CEILING(AVG(Mark)) AS RoundUp, ROUND(AVG(Mark), 0) AS ClosestInt FROM StudentExam GROUP BY StudentID;
This gives the following output:
StudentID AverageMark RoundDown RoundUp ClosestInt ---------- ----------- ---------- ---------- ---------- 1 64.0000 64 64 64 2 48.6667 48 49 49 3 80.0000 80 80 80 4 63.3333 63 64 63 ... ... ... ... ...
Note |
This example won't work against a MySQL server running in ANSI mode. Strict ANSI SQL demands that all columns not included in aggregate functions must be included in the GROUP BY clause. This doesn't make any sense (and isn't permitted) for your rounded columns, so this query will only work against MySQL running in standard mode. |
![]() |
The Access version of this query is significantly different because as you saw previously, Access doesn't support the FLOOR() or CEILING() function.
![]() |
The Access version of your query looks like this:
SELECT StudentID, AVG(Mark) AS AverageMark, INT(AVG(Mark)) AS RoundDown, ROUND(AVG(Mark) + .5, 0) AS RoundUp, ROUND(AVG(Mark), 0) AS ClosestInt FROM StudentExam GROUP BY StudentID;
Here you use INT() to round down the mark. INT() converts a non-integer type to an integer. For floating-point values, this effectively means that INT() works in the same way as FLOOR(). The ROUND() function has the same syntax as in the other RDBMSs. You also use this function to simulate CEILING(): If you add 0.5 to a value and then round this to the nearest integer, this has the effect of rounding up to the next integer.
Here are the results:
StudentID AverageMark RoundDown RoundUp ClosestInt ---------- ---------------- --------- --------- ---------- 1 64 64 64 64 2 48.6666666666667 48 49 49 3 80 80 80 80 4 63.3333333333333 63 64 63 ... ... ... ... ...
![]() |
As you've seen in earlier chapters, string literal values are enclosed in ' characters as follows:
'This is a string. Honest.'
One thing you haven't looked at yet is how you embed ' characters themselves in strings, which is a reasonably common occurrence. All you have to do is to replace any single quotes in the string with '':
'It''s cold outside.'
All other characters can be contained in the string literal value as normal.
It's also possible for a string literal to represent an empty string as follows:
''
This is simply two ' characters with nothing in between.
Although the ' character is the ANSI standard, some RDBMSs such as Access and MySQL (when not run in ANSI mode) allow the use of " characters instead. If you use this syntax, then ' characters can be used in the string without doubling up. However, if you want to use " characters, you must instead use "".
In addition, some RDBMSs use escape characters for certain characters such as tabs and carriage returns. MySQL uses syntax similar to that seen in many C++ (and C++ related) implementations, for example, \n for a new line character and \t for a tab. However, these escape codes are the exception rather than the norm.
Most RDBMSs provide a large number of functions that you can use to manipulate strings in various ways, for example, to extract substrings, determine the length of the string, find a character in the string, or convert the string to upper or lower case. Although the behavior of these functions is more or less standard, their names vary slightly from system to system, and in a few cases (such as SUBSTR), the parameter list can vary a little.
In general, these functions take the forms described in Table 5-2.
Function |
Description |
---|---|
LEFT(string, n) |
Returns the n leftmost characters of string. |
RIGHT(string, n) |
Returns the n rightmost characters of string. |
SUBSTRING(string, x, y) |
Returns the substring of string of length y that starts at position x. Oracle, DB2, and MySQL permit you to omit the last argument, in which case the substring from x to the end of the string will be returned. |
INSTR(string, substr) |
Returns the position of the first instance of the character or substring substr within string. |
LENGTH(string) |
Returns the length of string (that is, the number of characters in string). |
UPPER(string) |
Returns string converted to upper case. |
LOWER(string) |
Returns string converted to lower case. |
The names of these functions in the individual RDBMSs are as described in Table 5-3.
SQL Server |
DB2 |
Oracle |
MySQL |
Access |
---|---|---|---|---|
LEFT |
LEFT |
(Use SUBSTR) |
LEFT |
LEFT |
RIGHT |
RIGHT |
(Use SUBSTR) |
RIGHT |
RIGHT |
SUBSTRING |
SUBSTR |
SUBSTR |
SUBSTRING |
MID |
CHARINDEX |
POSSTR |
INSTR |
INSTR |
INSTR |
LEN |
LENGTH |
LENGTH |
LENGTH |
LEN |
UPPER |
UCASE/UPPER |
UPPER |
UCASE/UPPER |
UCASE |
LOWER |
LCASE/LOWER |
LOWER |
LCASE/LOWER |
LCASE |
Note that SQL Server's CHARINDEX() function takes its parameters in the opposite order of the INSTR() functions. So, CHARINDEX(' ', 'Apress LP') will return 7, the position of the space in 'Apress LP'.
To see more clearly how these functions are used, let's look at an example. In this example, you'll reverse the first and last names of the students in the InstantUniversity database (for example, you'll format 'John Jones' as 'Jones, John').
To do this, you'll find the first occurrence of the space in their name and retrieve the substring from the next character to the end of the name (this is the student's last name). Next, you'll concatenate a comma and a space to this using the concatenation techniques you learned in Chapter 2, "Retrieving Data with SQL." Finally, you'll add the first name, which you retrieve by extracting the substring from the start of the string to the character before the space.
Again, you'll examine the examples for each RDBMS separately.
Apart from the idiosyncratic CHARINDEX() function noted previously, the SQL Server version of the example is fairly straightforward.
![]() |
This is the SQL Server version of your query:
SELECT RIGHT(Name, LEN(Name) - CHARINDEX(' ', Name) + 1) + ', ' + LEFT(Name, CHARINDEX(' ', Name) - 1) AS StudentName FROM Student ORDER BY StudentName;
Here you use the RIGHT() function to retrieve the last name for each student and LEFT() to get the first name (although you could have used SUBSTRING()). You also use the LEN() function to calculate the length of the name, and you use this value to calculate the length of the last name. The first four rows output from this query are as follows:
StudentName --------------- Akbar, Mohammed Alaska, Steve Andrews, Vic Burton, Gary ...
![]() |
Oracle doesn't support LEFT() or RIGHT(), so you'll use SUBSTR() to extract the first and last names from the student's full name as stored in the database.
![]() |
The query looks like this in Oracle:
SELECT SUBSTR(Name, INSTR(Name, ' ') + 1) || ', ' || SUBSTR(Name, 1, INSTR(Name, ' ') - 1) AS StudentName FROM Student ORDER BY StudentName;
Oracle allows you to use the version of SUBSTR() with only two parameters, retrieving the whole of the remainder of the string from the specified position onward, so you'll use this to extract the last name (the whole string after the space). You retrieve the first name by retrieving the substring from position one (the first character) to the character before the space. Here are the results:
StudentName --------------- Akbar, Mohammed Alaska, Steve Andrews, Vic Burton, Gary ...
![]() |
DB2 does support LEFT() and RIGHT(), but because its version of SUBSTR() can take only two arguments, the query is actually no longer if you don't use them.
![]() |
This is how the query looks in DB2:
SELECT SUBSTR(Name, POSSTR(Name, ' ') + 1) || ', ' || SUBSTR(Name, 1, POSSTR(Name, ' ') - 1) AS StudentName FROM Student ORDER BY StudentName;
Apart from the fact that DB2 uses POSSTR() for the more usual INSTR(), this is the same as the Oracle version. The output from this query is the same as on SQL Server and Oracle:
One interesting side note is what happens if you try to perform the query using the LEFT() and RIGHT() functions:
SELECT RIGHT(Name, LENGTH(Name) - POSSTR(Name, ' ') + 1) || ', ' || LEFT(Name, POSSTR(Name, ' ') - 1) AS StudentName FROM Student ORDER BY StudentName;
What you get is this error:
An expression resulting in a string data type with a maximum length greater than 255 bytes is not permitted in: A SELECT DISTINCT statement A GROUP BY clause An ORDER BY clause A column function with DISTINCT A SELECT or VALUES statement of a set operator other than UNION ALL.
This is because the LEFT() and RIGHT() functions return results as varchar(4000), so although your formatted StudentName will never approach a length of 255 bytes, it's treated by DB2 as potentially greater than that. To correct that, you can set the length of the column using the VARCHAR() function, which you'll look at shortly when you learn about data type conversions.
![]() |
Like DB2, MySQL supports both LEFT() and RIGHT() and the version of SUBSTRING() with just two parameters.
![]() |
The query looks like this in MySQL:
SELECT CONCAT(RIGHT(Name, LENGTH(Name) - INSTR(Name, ' ') + 1), ', ', LEFT(Name, INSTR(Name, ' ') - 1)) AS StudentName FROM Student ORDER BY StudentName;
This query, using the CONCAT() operator, will work in both standard and ANSI modes. In ANSI mode, you can also use the || operator to perform concatenation, but that won't work in standard mode. Here are the results:
StudentName --------------- Akbar, Mohammed Alaska, Steve Andrews, Vic Burton, Gary ...
![]() |
Access supports LEFT() and RIGHT() but uses MID() instead of SUBSTRING().
![]() |
This is how the query looks in Access:
SELECT RIGHT(Name, LEN(Name) - INSTR(Name, ' ') + 1) & ', ' & LEFT(Name, INSTR(Name, ' ') - 1) AS StudentName FROM Student ORDER BY RIGHT(Name, LEN(Name) - INSTR(Name, ' ') + 1);
Remember that you can't use the alias column names of calculated columns in Access ORDER BY clauses. This means you have to repeat the formula used to calculate the column. To save space, here you've just sorted by the last name part; because no two of the sample students have the same last name, this won't affect the ordering:
StudentName --------------- Akbar, Mohammed Alaska, Steve Andrews, Vic Burton, Gary ...
![]() |
In general, datetime literal values are a special case of string values where the specific string is formatted in a certain way. Most RDBMSs will accept a wide variety of datetime specifications, but you do have to be careful to write datetime literal values in a form that will work internationally. For example, the following literal value might be considered unacceptable:
'05/02/2003'
This is because the locale settings on a server might mean that this is interpreted either as 'mm/dd/yyyy' or 'dd/mm/yyyy'.
However, reversing the values to 'yyyy/mm/dd' works in every locale. Note that the separator here, /, can be replaced with - or in some formats can be omitted entirely or replaced with a dot (.), with the date expressed as an eight-digit number. The standard form is to use dashes; the following literal, then, is unambiguous:
'2003-05-02'
Note |
Note that Access requires datetime literals to be enclosed in pound characters, so you'd write this as #2003-05-02# if you're using Access. |
In addition, you can be even clearer by using a more human readable format such as:
'May 2, 2003'
However, formatting a date in this way may take more fiddling around with strings in your code than you'd like.
Time constants are formatted as follows:
'hh:mm:ss.fff'
Here hh is hours, mm is minutes, ss is seconds, and fff is used to get even more precise with fractions of seconds, for example:
'14:59:02.129'
However, you needn't specify as much information as this. You can omit second data, for example:
'14:59'
You can also use a 12-hour clock:
'2:59 PM'
Finally, date and time values can be combined into a single string:
'2003-05-02 14:59:02.129'
Here a space separates the date and time values, and you can use any of the formats described previously for the date and time.
Note that there are certain restrictions placed on SQL date and time formats enforced by the various RDBMSs. For example, when using Oracle it may be necessary to include a DATE, TIME,or TIMESTAMP keyword before the string used to represent the date and time in order to specify how the string should be interpreted.
For example, you could represent a date using the following Oracle literal value:
Oracle also allows time intervals to be represented, using a similar syntax as the previous example with the addition of a requirement to specify what the first number in the interval string is (such as YEAR or DAY, with an optional single-digit precision specification saying how many digits are used for the value, for example, YEAR(4) for a four-digit amount of years) and, optionally, what the last number is (using similar nomenclature) using a suffix. There are two types of Oracle interval, known as YEAR TO MONTH and DAY TO SECOND, where the former has a precision of months or less and the latter seconds or less.
The following shows a DAY TO SECOND interval of five days and four hours:
INTERVAL '5:4' DAY TO HOUR
The following is a YEAR TO MONTH interval of 5,000 years:
INTERVAL '5000' YEAR(4)
For more details on this, consult your Oracle documentation.
The syntax and functionality of the date functions vary significantly from system to system, so we won't list them here. Instead, we'll present an example for each RDBMS where you calculate how long each student has been enrolled by working out the maximum difference between the current date and the date when they first enrolled.
You group the rows in the Enrollment table by StudentID and use the MAX aggregate function to get the amount of days since the first enrollment (because students may have enrolled in multiple classes, the MAX function finds the earliest). Finally, you order the resultant data by this calculated average to rank students.
In most cases, you can use some form of subtraction to work out the number of days since each student's first enrollment, but the exact syntax varies considerably.
To calculate the difference between two dates in SQL Server, you use the DATEDIFF() function. Subtracting one date from another is permitted in SQL Server, but the return value is itself a date and not very meaningful.
![]() |
Enter and execute this query against the InstantUniversity database:
SELECT StudentID, MAX(DATEDIFF(dd, EnrolledOn, GETDATE())) AS DaysEnrolled FROM Enrollment GROUP BY StudentID ORDER BY DaysEnrolled DESC;
DATEDIFF() calculates the time difference between two dates; the first parameter of this function selects the units to work with, in this case, days. You look at how many days have elapsed between the enrollment date stored in the EnrolledOn column and the current date, obtained using the GETDATE() function. The output is as follows:
StudentID DaysEnrolled ----------- ----------- 4 152 7 152 10 152 9 149 8 149 5 149 6 149 1 149 3 142 2 -216
Note that the last value is negative because some of the enrollment dates are in the future (at the time of writing). This wouldn't happen if you were using real data, of course, but it does demonstrate that DATEDIFF() returns a signed rather than an absolute value.
![]() |
With Oracle, you can perform a straightforward subtraction of two dates.
![]() |
The Oracle version of the query looks like this:
SELECT StudentID, FLOOR(MAX(CURRENT_DATE - EnrolledOn)) AS DaysEnrolled FROM Enrollment GROUP BY StudentID ORDER BY DaysEnrolled DESC;
![]() |
You retrieve the current date and time using CURRENT_DATE and subtract the value of the EnrolledOn column to find out how long each student has been enrolled for each class. This operation returns an INTERVAL value; because this can contain a time portion (expressed using decimals), you round this down to the nearest day using FLOOR() (you use FLOOR() rather than CEILING() to include only complete days). The output is the same as for the SQL Server version:
You again use subtraction to calculate the number of days each student has been enrolled with DB2, but you need first to convert the dates into a simple count of days by calling the DAYS() function.
![]() |
In DB2, the query looks like this:
SELECT StudentID, MAX(DAYS(CURRENT_DATE) - DAYS(EnrolledOn)) AS DaysEnrolled FROM Enrollment GROUP BY StudentID ORDER BY DaysEnrolled DESC;
Again, you retrieve the present date using CURRENT_DATE. The DAYS() function allows you to represent a date as a simple integer by returning the number of days since December 31, 1 BC, so you just need to calculate the difference between the DAYS() values for the current date and for the EnrolledOn date:
StudentID DaysEnrolled ----------- ------------ 4 152 7 152 10 152 ... ...
![]() |
The MySQL version of the query is similar to the DB2 version except that you use the TO_DAYS() function.
![]() |
This is the MySQL version of the query:
SELECT StudentID, MAX(TO_DAYS(CURRENT_DATE) - TO_DAYS(EnrolledOn)) AS DaysEnrolled FROM Enrollment GROUP BY StudentID ORDER BY DaysEnrolled DESC;
As in Oracle and DB2, you use CURRENT_DATE to retrieve the present date. TO_DAYS() works in a similar way to DB2's DAYS() function, but it returns the number of days since the start of a mythical "Year Zero" (this figure is 365 days greater than the figure returned by DAYS()):
StudentID DaysEnrolled ----------- ------------ 4 152 7 152 10 152 ... ...
![]() |
The Access version of the query is the simplest; all you need to do is perform a straightforward subtraction.
![]() |
The query looks like this in Access:
SELECT StudentID, MAX(DATE() - EnrolledOn) AS DaysEnrolled FROM Enrollment GROUP BY StudentID ORDER BY MAX(DATE() - EnrolledOn) DESC;
Here you use the DATE() function to get the current date. Simply subtracting one date from another returns the difference in days between the two dates. As in other Access examples, you can't use the alias of the calculated column in the ORDER BY clause, so you need to repeat the formula. Here are the results:
StudentID DaysEnrolled ----------- ------------ 4 152 7 152 10 152 ... ...
![]() |
You've already seen a few examples where you need to convert a value into a different data type. It may be necessary, for example, when you want to perform string concatenation including a numeric value or when you want to round a floating-point value to an integer for display. Most RDBMSs provide both generic functions that you can use to cast to any type and functions for casting to specific types such as strings or integers.
Let's look first at the functions that exist specifically for converting non-string types to strings. In most cases, you can also use these functions to alter the length of a string field.
In general, casts on SQL Server are performed using the CAST() or CONVERT() functions, which you'll look at shortly. However, you can use the STR() function to convert floating-point values into strings. This takes the following form:
STR(float, [length], [precision])
where float is the number you want to convert to a string, length is the length of the string, and precision is the number of decimal places to retain. Both the length and precision parameters are optional. If they aren't specified, length has a default value of 10, and precision a default of zero.
As an example, STR(43.2461, 5, 2) returns '43.25'. If the data is corrupted because the length isn't long enough to hold all the digits before the decimal point, the returned string will be filled with asterisks. For example, STR(4326.12, 3) returns '***'. If the length isn't sufficient to hold all the decimal places, these will be discarded and the value rounded according the space available, so STR(43.184, 4, 2) returns '43.2'.
Oracle allows you to convert character, numeric, and date types to strings using the TO_CHAR() and TO_NCHAR() functions. With dates and numeric types, you can also specify a format string. For example:
TO_CHAR(333, '0000.0') -- '0333.0' TO_CHAR(333.15, '0000.0') -- '0333.2' TO_CHAR(DATE '2003-05-23', 'DD MONTH YYYY') -- '23 MAY 2003'
DB2 is relatively strongly typed and has specific conversion functions for all its supported data types. These include the CHAR() and VARCHAR() functions, which you use for converting data to the CHAR and VARCHAR types, respectively. CHAR() can be used with most data types, but VARCHAR() is limited to character, date, and graphic types. If used with a character type, VARCHAR() can take a second parameter indicating the length of the string returned; otherwise, it takes only one parameter—the value to convert.
Depending on the type of data involved, CHAR() can also have a second parameter. If you're converting a date, you can specify one of the values ISO, USA, EUR, JIS, or LOCAL to indicate how to format the string. For example, CHAR(DATE('2003-02-24'), USA) returns '02/24/2003', but CHAR(DATE('2003-02-24'), EUR) returns '24.02.2003'.
You can also specify the length for character types or the decimal point character to use for floating-point types. For example, CHAR(32.47, ',') returns '32,47'.
MySQL is weakly typed and will generally try to convert values as necessary. If you need to convert a value explicitly to a string, you can simply concatenate the value with an empty string:
CONCAT(23.52, '') -- Returns '23.52'
Access supports the STR() function for converting dates and numeric types to strings:
STR(#2003-01-01#) ' Returns '01/01/2003', depending on locale STR(23.52) ' Returns '23.52'
Unlike the SQL Server version of this function, STR() in Access doesn't allow you to specify the length of the returned string.
Some RDBMSs also provide specific functions for casting non-numeric types to numeric types, but in many cases more flexibility is available with the generic conversion functions.
SQL Server doesn't have specific functions for converting to numeric types—use CAST()/CONVERT() instead.
Oracle supports the TO_NUMBER() function, which allows you to convert character types into numbers. As well as the string to convert, you can specify a format string that the function can use to work out how to interpret the string. This allows the string to contain characters such as hexadecimal and currency symbols that might not otherwise be permitted in a numeric value:
DB2 supports a range of functions for converting numeric and character values to different numeric types. Some of the more common are the following:
INTEGER()/INT(): Converts numeric or character data to an integer.
DOUBLE()/FLOAT(): Converts numeric or character data to a floating-point value.
DECIMAL()/DEC(): Converts numeric or character data to a decimal. As well as the data to convert, this function lets you specify the total number of digits and the number of digits after the decimal place; for example, DECIMAL(37.54, 3, 1) returns 37.5.
Because of its weak typing, MySQL allows you to convert strings or dates to numbers just by adding zero to them. For example, '23.5' + 0 returns 23.5 as a numeric type. If the string begins with a number but contains non-numeric characters, only that section that can be interpreted as a number will be returned; if the string begins with characters that can't be converted to a numeric value, zero will be returned. For example, '4.5 strings' returns 4.5, but 'strings 4 u' returns 0.
Two generic casting functions have fairly widespread support: CAST(), which is the ANSI standard and is supported by SQL Server, Oracle, DB2, and MySQL, and CONVERT(), which is the Open Database Connectivity (ODBC) form and is supported by SQL Server and MySQL.
Note |
Support for CAST() and CONVERT() was added in MySQL 4.0.2; in previous versions, you'll need to use the implicit casting methods mentioned previously. |
The syntax for CAST() is as follows:
CAST(expression AS data_type)
In SQL Server, Oracle, and DB2, the data_type can optionally include details of the size of the target data type and (for decimal values) the number of places after the decimal point:
CAST('3.521' AS DECIMAL(3,2)) -- Returns 3.52
Note |
Note that MySQL doesn't allow this syntax and only permits you to specify the data type itself—not size or precision details. Also, MySQL will only allow casting to one of the following types: BINARY, DATE, DATETIME, TIME, SIGNED [INTEGER],or UNSIGNED [INTEGER]. |
The syntax for CONVERT() in MySQL is as follows:
CONVERT(expression, data_type)
Again, data_type must be one of the types listed previously, and you aren't permitted to indicate the length of the type.
The SQL Server syntax for CONVERT() is as follows:
CONVERT(data_type [(length)], expression [, style])
The style parameter is a number that indicates how to format a datetime value if you're casting to a datetime or how a datetime is formatted if you're converting one to another type:
CONVERT(varchar, GETDATE(), 106) -- Returns current date in -- the format '27 Feb 2003'
See the Microsoft MSDN Web site at http://msdn.microsoft.com/library/en-us/tsqlref/ts_ca-co_2f3o.asp?frame=true for full details of the possible values of this number.
Now that you've seen how to use some of the most common built-in functions available in most RDBMSs, let's look at how you can define your own. ANSI SQL-99 defines a CREATE FUNCTION statement, which is implemented (with the inevitable differences) by most vendors and which allows you to create your own User-Defined Functions (UDFs). The basic syntax, common to most implementations, boils down to no more than this:
CREATE FUNCTION function_name [(parameter_list)] RETURNS data_type <SQL statement(s)>
To learn the actual syntax in the different RDBMSs, you'll look at an example. Notice that Access doesn't support UDFs, so there isn't an example for that system. For the sake of both clarity and conciseness, we won't present every option for each RDBMS, but we'll explain the main options. This example is based on the string manipulation example earlier in this chapter, where you reversed the order of the first and last names of each student.
Calling user-defined functions is the same as calling the built-in functions. Once you've created it, you can call your FormatName() function in the query:
SELECT FormatName(Name) AS StudentName FROM Student ORDER BY StudentName;
However, you may need to specify the schema where the function is stored in SQL Server (for example, dbo.FormatName(Name)).
This will return a list of the students, sorted according to their last names:
StudentName ----------------- Akbar, Mohammed Alaska, Steve Andrews, Vic Burton, Gary Fernandez, Maria Foster, Andrew Jones, John Jonsson, Isabelle Lee, Bruce Picard, Julia Scarlett, Emily Wolff, Anna
The basic syntax for creating a function in SQL Server is as follows:
CREATE FUNCTION function_name [(parameter_list)] RETURNS data_type AS BEGIN [<SQL statements>] RETURN expression END
The function can contain multiple SQL statements but mustn't perform any permanent changes to the data in the database.
![]() |
Here's the SQL Server version of the FormatName() UDF:
CREATE FUNCTION FormatName (@FullName varchar(50)) RETURNS varchar(50) AS BEGIN RETURN RIGHT(@FullName, LEN(@FullName) - CHARINDEX(' ', @FullName) + 1) + ', ' + LEFT(@FullName, CHARINDEX(' ', @FullName) - 1) END
The parameter list defined after the function name takes the following form:
(@FullName varchar(50))
As with the parameters and variables of stored procedures (which you'll look at in Chapter 9, "Using Stored Procedures"), the parameters and variables in UDFs in SQL Server are prefixed with an @ character, both when they're defined in the parameter list and when they're referenced in the body of the function.
Although you could have multiple SQL statements in the function, in this case you just return the formatted name; this expression returned from the function is exactly the same expression as you used in the early example.
![]() |
In general, the basic syntax for defining Oracle functions is similar to that for stored procedures, which you'll look at in Chapter 9, "Using Stored Procedures":
CREATE [OR REPLACE] FUNCTION function_name (parameter_list) RETURN data_type IS variable_list BEGIN [<SQL statements>] RETURN expression; END; /
There are a couple of differences to the standard syntax. First, as with many Oracle objects, you can indicate that you want to replace any existing object of the same name using the OR REPLACE option. Note that the return type is declared using the RETURN keyword rather than the standard RETURNS. Parameters can be marked as input (IN) or output (OUT) parameters, just as they can for stored procedures.
Second, you need to declare any variables used in the function before the BEGIN...END block. The final / is simply a device to let SQL*Plus know that you've reached the end of a block that includes a number of SQL statements (as the semicolon that usually marks the end of a statement is used within CREATE FUNCTION).
![]() |
The Oracle version of the FormatName() function is as follows:
CREATE OR REPLACE FUNCTION FormatName(FullName IN varchar) RETURN varchar IS FormattedName varchar(50); BEGIN FormattedName := SUBSTR(FullName, INSTR(FullName, ' ') + 1) || ', ' || SUBSTR(FullName, 1, INSTR(FullName, ' ') - 1); RETURN(FormattedName); END; /
Here you define just one input parameter—the name that you want to format—and no output parameters. You also declare a variable, FormattedName, just to show the syntax although you could actually manage without it for this function.
Within the body of the function, you set your FormattedName variable to the expression you want to return (using the special PL/SQL := operator) and then return this variable from the function using the RETURN keyword. You'll look at using variables in SQL in more detail in Chapter 9, "Using Stored Procedures."
![]() |
The basic syntax for creating a function in DB2 follows the ANSI standard quite closely and looks like this:
CREATE FUNCTION function_name(parameter_list) RETURNS data_type [LANGUAGE SQL] [DETERMINISTIC | NON DETERMINISTIC] [CONTAINS SQL | READS SQL DATA] [BEGIN ATOMIC] [<SQL statements>] RETURN expression; [END]
There are a number of options you can specify for the function. First, you can tell DB2 that your function is in SQL (rather than a programming language such as C++ or Java) using the LANGUAGE SQL option. Second, you can use the DETERMINISTIC keyword to indicate that the function will always return the same result for a given parameter (or you can specify NOT DETERMINISTIC if this isn't the case); knowing this allows DB2 to optimize the function. Finally, you can specify READS SQL DATA to indicate that your function reads data from a database or CONTAINS SQL to indicate that your function uses SQL keywords and functions to manipulate the data passed in, without retrieving further data from the database.
The function body can consist merely of a RETURN statement followed by an expression, or it can contain a block of SQL statements surrounded by the BEGIN ATOMIC...END keywords.
![]() |
In DB2, the FormatName() UDF looks like this:
CREATE FUNCTION FormatName(FullName varchar(50)) RETURNS varchar(50) LANGUAGE SQL DETERMINISTIC CONTAINS SQL BEGIN ATOMIC DECLARE FormattedName VARCHAR(50); SET FormattedName = SUBSTR(FullName, POSSTR(FullName, ' ') + 1) || ', ' || SUBSTR(FullName, 1, POSSTR(FullName, ' ') - 1); RETURN FormattedName; END
This function will always return the same value for a given parameter, so you declare it as DETERMINISTIC. You don't read any data from the database—you just work with the string passed into the function—so you also include the CONTAINS SQL clause (rather than READS SQL DATA).
For this example, you could write the body of the function as a single RETURN statement, but you want to show the syntax using a block of SQL statements, so you haven't. Instead, as in the Oracle example, you've defined a variable to store the formatted name and then returned that. Before you use the variable, you must explicitly declare it:
DECLARE FormattedName VARCHAR(50);
You'll look at using DB2 variables in more detail in Chapter 9, "Using Stored Procedures."
![]() |
This last example is by far the most complicated because MySQL doesn't support UDFs written in SQL. Instead, you need to write the function in C or C++ and then register it with MySQL using the CREATE FUNCTION statement. The CREATE FUNCTION syntax for MySQL is as follows:
CREATE [AGGREGATE] FUNCTION function_name RETURNS {STRING | REAL | INTEGER} SONAME shared_library_name;
The AGGREGATE keyword must be specified if your function aggregates rows like SQL aggregate functions such as COUNT. This has an impact on the way the function is coded. The SONAME clause indicates the shared library (SO or DLL) where the function is defined.
Because this isn't a SQL example and because the example will differ depending on what platform you're using, we'll go through it quickly. For full details, check the MySQL documentation and the udf_example example functions, which are available as part of the source distribution of MySQL.
![]() |
First, then, you need to write the C/C++ code for the function. This code must be compiled into a shared object (Linux) or dynamic-link library (Windows). The following is the code for the FormatName() example:
#include <string> #include <my_global.h> #include <my_sys.h> #include <mysql.h> using namespace std; char* FormatName(UDF_INIT *initid, UDF_ARGS *args, char *result, unsigned long *length, char *is_null, char *error) { // Retrieve the Name parameter from the args parameter char* fullName = new char[args->lengths[0]]; fullName = args->args[0]; // Convert from char* to string string strName = fullName; // Trim whitespace from end of string int i = strName.length(); while (strName[i] == ' ' || strName[i] == '\0') i--; string trimName = strName.substr(0, i + 1); // Arrange in 'LastName, FirstName' format int spaceIndex = trimName.find_first_of(" "); string firstName = trimName.substr(0, spaceIndex); string lastName = trimName.substr(spaceIndex + 1); string formattedName = lastName + ", " + firstName; // Convert back to char* and set the length argument char* fmtName = new char[]; formattedName.copy(result, string::npos); *length = static_cast<unsigned long>(i + 2); return result; }
Note that MySQL prescribes the signature for the function. There are three versions, depending on whether the function returns a string (as previously), a floating-point value, or an integer.
Before compiling the DLL/SO, you need to indicate to the linker that your function is to be exported so that MySQL can see it within the library. To do this, you can use a linker definition file (in this case called MySQLFunction.def):
LIBRARY "MySQLFunction" EXPORTS FormatName
Once you've compiled the library, you need to register the function with MySQL. To do this, you just need to execute this CREATE FUNCTION statement:
CREATE FUNCTION FormatName RETURNS STRING SONAME 'C:\\MySQL\\lib\\MySQLFunction.dll';
In this case, you've specified the exact path to the DLL on a Windows machine. On a Linux machine, you would instead copy the SO to a location where the Id program can find it or configure Id to look in the directory where the SO is stored. How to do this varies depending on the exact platform.
![]() |