With the introduction of the RANK() function in the previous section, you actually strayed from the realm of basic aggregate functions into the more complex topic of analytic functions. Oracle and DB2 provide a range of analytic functions that allow you to perform complex computation that would otherwise need to be performed outside of the SQL standard.
Of the databases covered in this book, analytic functions are currently only supported in Oracle and DB2, so this section won't tackle the subject in much detail. You should consult your database documentation to see exactly which analytic functions are supported and how they work. However, if you do happen to be using Oracle or DB2, you can at least learn how aggregate functions such as SUM and AVG can be used as analytic functions.
The basic syntax for analytic functions is as follows:
Function (<argument>) OVER (<Partition clause> <ORDER BY clause> <Windowing clause>)
It's the OVER keyword that identifies your function as an analytic function. The best way to get a feel for this is to see it in action. Let's start with the simplest example:
SELECT StudentID, Mark, AVG(Mark) OVER () Average_Mark FROM StudentExam ORDER BY StudentID, Mark;
The output from this query is as follows (the output is cropped after six rows):
STUDENTID MARK AVERAGE_MARK ---------- ---------- ---------------- 1 55 63.9310345 1 73 63.9310345 2 39 63.9310345 2 44 63.9310345 2 63 63.9310345 3 78 63.9310345 ...
In the absence of any clauses, the average mark is computed over every row. The resulting value displayed for each row is the same as that which would have been obtained by a simple SELECT AVG(Mark) from StudentExam; query. Let's see what happens when you add an ORDER BY clause:
SELECT StudentID, Mark, AVG(Mark) OVER (ORDER BY StudentID, Mark) Running_Average FROM StudentExam ORDER BY StudentID, Mark;
You now obtain the following:
STUDENTID MARK RUNNING_AVERAGE ---------- ------- --------------- 1 55 55 1 73 64 2 39 55.6666667 2 44 52.75 2 63 54.8 3 78 58.6666667 ...<output cropped>... 10 79 63.9310345
As you can see, the calculation is performed differently when an ORDER BY clause is specified. It applies ordering to a group of data and essentially tells the database to calculate the average of the current row and all preceding rows (in effect, this applies a default windowing clause, but more about that in a moment). Thus, you get a running average over the group of data. In this case the "group" is every row of data, so by the time you get to the last row, you arrive at the average over all rows—the same value that the previous query supplied.
Finally, let's add a partitioning clause:
SELECT StudentID, Mark, AVG(Mark) OVER (PARTITION BY StudentID ORDER BY StudentID, Mark) Running_Avg_by_Student FROM StudentExam ORDER BY StudentID, Mark;
The output should look as follows:
STUDENTID MARK RUNNING_AVG_BY_STUDENT ---------- ------- ---------------------- 1 55 55 1 73 64 2 39 39 2 44 41.5 2 63 48.6666667 3 78 78 3 82 80
The partitioning clause logically breaks the data down into groups, and the function is applied independently to each group. Thus, you obtain a running average by student.
To finish off, let's briefly look at the windowing clause. The syntax gets a little complex here, and you should definitely refer to your database manual for full details. However, it basically allows you to supply a specific range of data against which you should execute the function. The previous query is actually equivalent to this:
SELECT StudentID, Mark, AVG(Mark) OVER (PARTITION BY StudentID ORDER BY StudentID, Mark RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW ) Running_Avg_by_Student FROM StudentExam ORDER BY StudentID, Mark;
The highlighted code specifies that the window of data is the current row and all rows preceding it (which is the default). However, you can change this. For example, the following code would take the average over only the current row and the preceding row:
SELECT StudentID, Mark, AVG(Mark) OVER (PARTITION BY StudentID ORDER BY StudentID, Mark ROWS 1 preceding ) Running_Avg_by_Student FROM StudentExam ORDER BY StudentID, Mark;
This short section should have at least given you a feel for the potential power of analytic functions.