Table of Contents
Previous Section Next Section

Retrieving Catalog Information

Okay, after populating the data tables, it's time to do some querying against your data.

You'll start with some simple queries on the Product table. You'll continue by seeing how to associate products with their categories using the ProductCategory junction table, and finally you'll see how to associate the products with their departments.

What About the Products?

In the following sections, you'll look at queries that involve the Product table alone, not relating it to Category or Department. Let's start with one of the simplest meaningful queries that you can make of your database.

Getting the Entire List of Products Ordered by Name

A single SELECT statement combined with the ORDER BY clause does the trick in this case; this query should work the same for any existing relational database software:

SELECT Name, Description FROM Product ORDER BY Name;

As a result of this query, you should see the product names and descriptions you typed earlier when populating the Product table.

Getting the List of Products that are on Catalog Promotion

This isn't much more complicated than the previous query. This time, you filter the results based on the OnCatalogPromotion field, which needs to have a value of 1. Here's the query:

SELECT Name, Description
FROM Product
WHERE OnCatalogPromotion=1
ORDER BY Name;

Note that Access will store the OnCatalogPromotion value as -1, so you'll need to alter the WHERE clause appropriately to match the required records.

Based on your sample data, this query should return the following data:

   Name                  Description

   --------------------- -------------------------------------
   Beast Mask            Red-eyed and open-mouthed scary mask
                          guaranteed to scare!
   Devil Horn Boppers    These red glitter boppers are
                          guaranteed to attract attention. They
                          will soon be under your spell!
   Vampire Disguise      Vampire Set consisting of wicked wig,
                          fangs, and fake blood. Spinechilling!

Getting the Five Most Expensive Products

For this query, you need to use the ORDER BY clause to order the list of products by Price in descending order:

SELECT Name, Price FROM Product ORDER BY Price DESC;

Note that you can use multiple criteria with the ORDER BY clause. If, say, you want the products with the same price to be sorted by name in ascending order, you'd add a few bits to the query:

SELECT Name, Price FROM Product ORDER BY Price DESC, Name ASC;

The ASC keyword is optional because ascending is the default order anyway.

Now let's see how you get only the top five most expensive products. In practice, you need to limit the results you get from the previous query. This isn't handled in the same way by each RDBMS, so let's see how you can do that with each of them. In any case, you expect to get this list of results:

   Name                                       Price

   ----------------------------------- ----------------
   Miss Santa                               49.9900
   Cleopatra Kit                            14.9900
   Vampire Disguise                          9.9900
   Horned Devil Mask                         5.9900
   Beast Mask                                 5.9900

You may be wondering what would happen if you had asked for the top four most expensive items because you have two items priced at $5.99. Well, the way we've written the following queries, one of the $5.99 items would simply be omitted from the list. We're not going to worry too much about that here, but the way to get around this problem is described in detail, for each database, in Chapter 4, "Summarizing and Grouping Data."

SQL Server

SQL Server and Access support the TOP clause. To get the top five most expensive products, you need this command:

SELECT TOP 5 Name, Price
FROM Product
ORDER BY Price DESC, Name ASC;

With SQL Server, you can alternatively use SET ROWCOUNT to limit the number of returned rows. Here's an example:

SET ROWCOUNT 5;
SELECT Name, Price
FROM Product
ORDER BY Price DESC, Name ASC;

Also, you can limit by specifying what percent of the result set to return. The following command returns the top 15 percent of most expensive products:

SELECT TOP 15 PERCENT Name, Price
FROM Product
ORDER BY Price DESC, Name ASC;

MySQL

MySQL has the LIMIT keyword that has similar functionality. The following is the MySQL command that returns the first five most expensive products:

SELECT Name, Price
FROM Product
ORDER BY Price DESC, Name ASC
LIMIT 0,5;

As you can see, the LIMIT clause has two parameters. The first one specifies the index of the first returned record (starting with zero), and the second specifies how many rows to return.

Oracle

With Oracle, you use the ROWNUM pseudocolumn to limit the number of returned rows:

SELECT Name, Price FROM
      (SELECT Name, Price
       FROM Product
       ORDER BY Price DESC, Name ASC)
WHERE ROWNUM<=5;

This technique works only when you restrict the query for a ROWNUM that starts with one (such as ROWNUM <=5). If you try to restrict using another range, such as ROWNUM BETWEEN m and n, where m is greater than one, the query will return no results. You'll see a workaround to this limitation in the next exercise.

ROWNUM is a pseudocolumn that associates a row number to each record returned by a query. What's important to note here is that ROWNUM values are assigned before the ORDER BY clause executes. For this reason, you need to use a subselect instead of simply limiting by ROWNUM in the first place, like this:

SELECT Name, Price
FROM Product
WHERE ROWNUM<=5
ORDER BY Price DESC, Name ASC

This query doesn't return the expected results because the WHERE filter is applied first and takes the first five rows (in a random order). These rows are then ordered by Price and Name (which isn't what you intended to do).

DB2

Using DB2, you can list the top five most expensive products using FETCH FIRST n ROWS ONLY:

SELECT Name, Price
FROM Product
ORDER BY Price DESC, Name ASC
FETCH FIRST 5 ROWS ONLY;

Browsing Through Products

In real-world scenarios, you frequently need to present the user with lists of products, categories, and so on. When the list is long enough, a typical practice is to present a number of products at a time (five in these examples) and let the user browse the list using Next 5 Products and Previous 5 Products buttons.

In this exercise, you'll see how to retrieve the rows m to n (say, five to 10) in an ordered list of products. This time you'll order the products by their IDs.

At the first sight, it may seem that you can simply get the required products with a simple command such as this:

SELECT Name FROM Product WHERE ProductID BETWEEN 5 AND 10

However, this method isn't reliable because you aren't guaranteed to have the ProductIDs in sequential order, so you aren't guaranteed to get five products back. Additionally, this method is useless if the user wants the list of products sorted by any other column than ProductID.

For the purpose of these examples, you'll ask for the second group of five products (products size to 10), but because currently you have only eight products, this is what you should get:

   ProductID   Name

   ------------- -------------------------------------------
   6                   Sequinned Devil Horns
   7                   Devil Horn Boppers
   8                   Bug Eyed Glasses

Let's see how you can ask your databases to give you a specific portion of an ordered list of products. The solution is different for each RDBMS, so let's analyze each one individually.

MySQL

With MySQL, this problem turns out to be a piece of cake. With MySQL, you use the LIMIT keyword, just like in the previous example but using different parameters this time:

SELECT ProductID, Name
FROM Product
ORDER BY ProductID
LIMIT 5,5;

This time you specified 5 as the first parameter (instead of 0), mentioning that you want retrieve rows starting with the sixth record in the result set (remember that the row number is zero-based). The second parameter specifies how many rows you want to retrieve.

Oracle

With Oracle, you use a command similar to the one you saw in the previous example. However, there's a catch—you can't limit the result set based on the ROWNUM pseudocolumn being between m and n if m>1. If you execute the following query, no rows will be returned:

SELECT ProductID, Name FROM
     (SELECT ProductID, Name
      FROM Product
      ORDER BY ProductID)
WHERE ROWNUM BETWEEN 6 AND 10;

The workaround to this problem is as follows:

SELECT ProductID, Name FROM
     (SELECT ProductID, Name, ROWNUM AS rn FROM
          (SELECT ProductID, Name
           FROM Product
           ORDER BY ProductID)
     ) inner
WHERE inner.rn BETWEEN 6 AND 10;

Here you "saved" the ROWNUM values in a new row of the subquery, so you could filter correctly the results in the outer query.

Alternatively, and more elegantly, you can use the RANK analytic function, as described in Chapter 4, "Summarizing and Grouping Data":

SELECT ProductID, Name FROM
(
SELECT RANK() OVER (ORDER BY ProductID) As Ranking, ProductID, Name
FROM Product
ORDER BY PRODUCTID
)
WHERE Ranking BETWEEN 6 AND 8;
SQL Server

SQL Server doesn't provide you with a simple way to achieve your goal. The TOP clause works well as long as you're only interested in the top records of a query. There are a number of workarounds for this limitation.

Let's discuss the most obvious solution first. This uses the SQL-99 syntax (so with some tweaking it works with other databases as well) and uses correlated subqueries (correlated subqueries were discussed in Chapter 6, "Combining SQL Queries"):

SELECT ProductID, Name
FROM Product AS P1
WHERE
  (SELECT COUNT(*) FROM Product AS P2
   WHERE P1.ProductID >= P2.ProductID) BETWEEN 6 AND 10
ORDER BY ProductID;

Note 

This method only works if the list of products is ordered by a unique column.You can't use it, for example, if the products are ordered by price and if there's more than one product with the same price.

Remember that a correlated subquery executes once for every record of the outer query. In this example, for each product returned by the outer query, the subquery calculates how many products have a lower ID. You're looking for products for which that number is between six and 10.

The second method uses the TOP function. The idea is simple, but the SQL code will look a bit awkward at first. If you want to retrieve products between six and 10, here's a working strategy:

  • First, you get the TOP 10 products, ordered by ProductID (ascending). The products you're interested in are the bottom of this list.

  • You order the resulting list in descending order by ProductID (so the rows you're interested in will be at the TOP).

  • You get the TOP 5 products from list.

  • You finally sort the remaining products in ascending order by ProductID.

By using subqueries, you can implement all these steps in a single SQL command:

SELECT ProductID, Name
FROM
  (SELECT TOP 5 ProductID, Name
   FROM
     (SELECT TOP 10 ProductID, Name
      FROM Product
      ORDER BY ProductID) AS P1
   ORDER BY ProductID DESC) AS P2
ORDER BY ProductID

The last technique I'll present for SQL Server consists of using a temporary table with an IDENTITY column. This solution is versatile so that it can also be implemented for other databases as well. For more information about temporary tables and IDENTITY columns, please read Chapter 12. The code for this solution, presented below, is self-explanatory:

/* Create a temporary table named #Product, having an IDENTITY column */
CREATE TABLE #Product
(RowNumber SMALLINT NOT NULL IDENTITY(1,1),
 ProductID INT,
 Name VARCHAR(50))
/* Populate the temporary table with records from the table we browse through.
   The RowNumber column will generate consecutive numbers for each product.
   We're free to order the list of products by any criteria. */
INSERT INTO #Product (ProductID, Name)
SELECT ProductID, Name
FROM Product
ORDER BY Price DESC

/* Retrieve the requested group or records from the temporary table */
SELECT ProductID, Name
FROM #Product
WHERE RowNumber BETWEEN 6 and 10

/* Drop the temporary table */
DROP TABLE #Product
DB2

The first query that was presented for SQL Server (using correlated subqueries) works with no modifications with DB2. Here it is again, for reference:

SELECT ProductID, Name
FROM Product AS P1
WHERE
  (SELECT COUNT(*) FROM Product AS P2
   WHERE P1.ProductID >= P2.ProductID) BETWEEN 6 AND 10
ORDER BY ProductID;

The second solution is DB2 specific. It uses the ROWNUMBER function in a sub-query to generate a "rownumber" fabricated column, which is then used to filter and retrieve the specified range of products:

WITH temp AS
  (SELECT ProductID, Name,
           ROWNUMBER() OVER(ORDER BY ProductID) AS row
   FROM Product)
SELECT ProductID, Name FROM temp WHERE row>5 and row<=10

ROWNUMBER, RANK, and DENSERANK are three analytical functions supported by DB2 that come in handy when you need to perform tasks that imply record numbering or ranking. The functions RANK and DENSE_RANK are also supported by Oracle, along with several others. (Chapter 4, "Summarizing and Grouping Data," demonstrates how to use the RANK function.) You'll also see them in action again, later in this case study. For detailed information about the analytic functions supported and how they work, please refer to your vendor's SQL reference manual.

Associated Products and Categories

In the previous exercises, you asked for products without associating them with the existing categories or departments. In reality, most of your queries regarding products will also involve the categories to which they belong.

Selecting Products that Belong to a Specific Category

In real-world scenarios, it's likely that you'll need to gather a list of products that belong to a specific category. In this case, the category of interest will always be identified by its ID (its primary key). You're interested in getting both the Name and Description of the products that belong to a specific category.

If you want only the IDs of the products associated with a specific Category, a simple query like this on ProductCategory solves the problem:

SELECT ProductID FROM ProductCategory WHERE CategoryID=3;

However, to get the Name and Description, you need to perform an INNER JOIN between Product and ProductCategory. You learned how to join tables in Chapter 7, "Querying Multiple Tables." This query returns all the products that belong to the category with an ID of 3:

SELECT Name
FROM Product INNER JOIN ProductCategory
ON Product.ProductID = ProductCategory.ProductID
WHERE ProductCategory.CategoryID = 3;

As an alternative syntax, you could use the WHERE clause to join your two tables:

SELECT Name
FROM Product, ProductCategory
WHERE Product.ProductID = ProductCategory.ProductID
AND ProductCategory.CategoryID = 3;

The results of the query look like this:

   Name

   ---------------------
   Beast Mask
   Cleopatra Kit
   Horned Devil Mask
   Sequinned Devil Horns

Selecting Categories that Contain a Specific Product

Compared to the previous query, this section covers the opposite scenario. Now you know the product, and you need to find the categories associated with it. Here's the query that lists the categories that contain the product with ProductID of 6:

SELECT Category.CategoryID, Name
FROM Category INNER JOIN ProductCategory
ON Category.CategoryID = ProductCategory.CategoryID
AND ProductCategory.ProductID = 6;

Caution 

Access doesn't like this join syntax because it's a bit more picky about how joins are constructed (see Chapter 7, "Querying Multiple Tables" for more information). Instead of using the JOIN syntax, you should use the alternative subquery method supplied here or alter the join appropriately.

You can achieve the same result using subqueries. The subquery returns the IDs of categories associated with the product you want, and using these results you get the names and descriptions you're interested in:

SELECT CategoryID, Name
FROM Category
WHERE CategoryID IN
    (SELECT CategoryID
     FROM ProductCategory
     WHERE ProductCategory.ProductID = 6);


Caution 

Remember that MySQL doesn't support subqueries. None of the examples that include subqueries will run in MySQL.

Based on the sample data, there are two matching categories for the product for which you were looking:

   CategoryID           Name

   -------------- --------------------
   3                    Masks
   4                    Sweet Stuff

What Products Belong to the Same Categories as Another Product?

This kind of query is useful in situations where you want to find out which products are similar to a specified product. So how do you find out what products belong to the same categories as another product (remember that a product can belong to more than one category)?

After having run the previous two exercises, this should be fairly straightforward. First, remember how you extracted the products that belong to a category:

SELECT Name
FROM Product INNER JOIN ProductCategory
ON Product.ProductID = ProductCategory.ProductID
AND ProductCategory.CategoryID = 3;

Now you need to determine which products belong to one or more categories, so you'll need to do something like this:

SELECT Name
FROM Product INNER JOIN ProductCategory
ON Product.ProductID = ProductCategory.ProductID
WHERE ProductCategory.CategoryID IN (<list of categories>)

All you need to do is substitute <list of categories> for the list of categories in which you're interested.

Here's the query that alphabetically lists all the products that belong to the same categories as the product with the ID of 6:

SELECT DISTINCT Product.ProductID, Product.Name
FROM Product INNER JOIN ProductCategory
ON Product.ProductID = ProductCategory.ProductID
WHERE ProductCategory.CategoryID IN
   (SELECT Category.CategoryID
   FROM Category INNER JOIN ProductCategory
   ON Category.CategoryID = ProductCategory.CategoryID
   WHERE ProductCategory.ProductID = 6)
ORDER BY Name;

Here are the results:

   ProductID      Name

   ----------- ---------------------
   1              Beast Mask
   2              Cleopatra Kit
   7              Devil Horn Boppers
   3              Horned Devil Mask
   6              Sequinned Devil Horns

Note that you use SELECT DISTINCT to guard against duplicate results. You need this because the subquery might return more categories, and the products that belong to more than one of them would be listed more than once (because, remember, a product can belong to more than one category!).

Unfortunately, the previous statement won't execute on MySQL because MySQL doesn't support nested queries. Access also doesn't like this particular syntax, but in just a moment, you'll see an alternative syntax that does work on Access.

Getting the Same Results using Table Joins

The solution presented earlier is probably the most commonly used. However, as anticipated, you can obtain the same results using joins, with a much more elegant query. This solution also works with MySQL, which doesn't support subqueries:

SELECT DISTINCT P1.ProductID, P1.Name
FROM Product P1
  INNER JOIN ProductCategory PC1
    ON P1.ProductID = PC1.ProductID
  INNER JOIN ProductCategory PC2
    ON PC1.CategoryID = PC2.CategoryID
WHERE PC2.ProductID = 6
ORDER BY Name;

Tip 

Referring to Figure 14-6 will help you construct join queries. Just follow the path of the table joins in the figure, and everything will become clearer.

Notice that, in this example, we've used some table aliases to make the SQL code a bit shorter (ProductCategory is, after all, quite a long word to type!).

This is the same query, written with the alternate syntax (which works on Access):

SELECT DISTINCT P1.ProductID, P1.Name
FROM Product P1, ProductCategory PC1, ProductCategory PC2
WHERE P1.ProductID = PC1.ProductID
  AND PC1.CategoryID = PC2.CategoryID
  AND PC2.ProductID = 6
ORDER BY Name;

Wow, you have just joined three tables! And, to add to the confusion, you used two instances of the ProductCategory table. The action of joining two instances of the same table is called a self-join, and it works just as if they were two separate tables.

Joining the two ProductCategory instances on the CategoryID field generates a list of ProductID/ProductID pairs, containing products associated with the same category.

Now that you have this list of products, which are related by their category, the rest is simple: In the right side of the list, you filtered according to the ProductID that you were looking for (6 in this case). In this manner, in the left side of the list there remained only the products related to the product with an ID of 6. Finally, in the left side of the list you joined with the Product table in order to get the names and descriptions of the resulted products.

Note that you still need the DISTINCT keyword to filter duplicate records from products that belong to more than one of the resulting categories.

Getting a List of Categories and Their Products

Because you're expert in table joins by now, let's look at another simple exercise: Say you want to get a list containing categories and their products, like this:

   Category Name                       Product Name

   --------------------------------- -----------------------
   Exotic Costumes                     Cleopatra Kit
   Exotic Costumes                     Miss Santa
   Masks                               Horned Devil Mask
   Masks                               Cleopatra Kit
   Masks                               Sequinned Devil Horns
   Scary Costumes                      Vampire Disguise
   Scary Stuff                         Bug Eyed Glasses
   Sweet Stuff                         Sequinned Devil Horns
   Sweet Stuff                         Devil Horn Boppers

If you need to associate categories with products, the path followed by joins is quite clear. You have the ProductCategory table, which contains the IDs of associated categories and products. You join this table on its left and right sides to get the category and product names associated with the IDs. Here's the query that does this for you:

SELECT C.Name as "Category Name", P.Name as "Product Name"
FROM Product P
INNER JOIN ProductCategory PC ON P.ProductID = PC.ProductID
INNER JOIN Category C ON PC.CategoryID = C.CategoryID
ORDER BY C.Name, P.Name;

The same query, rewritten using WHERE statements instead of JOIN statements, looks like this (and works on Access):

SELECT C.Name as "Category Name", P.Name as "Product Name"
FROM Product P, ProductCategory PC, Category C
WHERE P.ProductID = PC.ProductID AND PC.CategoryID = C.CategoryID
ORDER BY C.Name, P.Name;

Getting the Same Results using Correlated Subqueries

This method is less obvious, but it's good to know that there are multiple ways to get the same results with SQL. In this particular scenario, the solution using table joins is likely to be faster, but in practice remember that only a good set of tests can show you what's the best solution for your database:

SELECT C.Name as "Category Name", Product.Name as "Product Name"
FROM Product, Category C
WHERE Product.ProductID IN
  (SELECT ProductID FROM ProductCategory
   WHERE ProductCategory.CategoryID = C.CategoryID)
ORDER BY C.Name, Product.Name;

This query is a bit more complicated than the previous ones—if you have problems understanding it, please reread the section on correlated subqueries in Chapter 6, "Combining SQL Queries."

What you did was to ask from the start for a category name and product name without specifying anything about the linking table, ProductCategory. In the outer query, you asked only for the columns you were interested in without specifying how they're related. However, for each category, the WHERE clause uses the correlated subquery to filter out only the products that are associated with it.

Remember that correlated subqueries execute for each record of the main query. In this case, for each category returned by the outer query, the correlated subquery executes and returns the products associated with that category. The outer query gets the results back and uses them to compose the list containing category and product names.

The important detail to understand is the way the categories from the outer query are used in the subquery:

WHERE ProductCategory.CategoryID = C.CategoryID)

This line does the whole trick: The category of the outer query (C.CategoryID) is used in a table join in the correlated subquery. The category ID from the outer query is referenced through the C alias defined for Category.

Getting the Most Expensive Products in Each Category

Say you want to get a list containing all existing categories, along with the most expensive two products for each category. Based on the data you have, the list should contain the following products:

   Category                Product                       Price

   --------------------- -----------------------------  ---------
   Exotic Costumes         Miss Santa                   49.9900
   Exotic Costumes         Cleopatra Kit                14.9900
   Masks                   Cleopatra Kit                14.9900
   Masks                   Beast Mask                    5.9900
   Scary Costumes          Vampire Disguise              9.9900
   Scary Stuff             Bug Eyed Glasses              2.7500
   Sweet Stuff             Sequinned Devil Horns         3.7500
   Sweet Stuff             Devil Horn Boppers            2.5000

Let's see how to obtain this list with SQL Server, Oracle, and DB2.

SQL Server

For SQL Server, here's the query that does the trick. You get the top two products for each category using a correlated subquery:

SELECT C.Name AS "Category", Product.Name AS "Product",
        Product.Price as "Price"
FROM Product, Category C
WHERE Product.ProductID IN
  (SELECT TOP 2 ProductID FROM ProductCategory
   WHERE ProductCategory.CategoryID = C.CategoryID
   ORDER BY Price DESC)
ORDER BY C.Name, Product.Price DESC;
DB2

DB2 doesn't allow you to use FETCH FIRST n ROWS ONLY in subqueries, so you can't simply translate the SQL Server version (which uses TOP) to a DB2 equivalent.

The DB2 solution uses the ROWNUMBER function, which you already saw in an earlier example. However, this time you use PARTITION BY in combination with ORDER BY to create a ranking for each product in descending order of its price, for each category in part:

WITH tmp (CategoryName, ProductName, ProductPrice, rank)
AS
    (SELECT C.Name, P.Name, P.Price,
            ROWNUMBER() OVER (PARTITION BY C.Name ORDER BY P.Price DESC)
            AS rank
    FROM Product P
    JOIN ProductCategory PC ON P.ProductID = PC.ProductID
    JOIN Category C ON C.CategoryID = PC.CategoryID)

SELECT CategoryName, ProductName, ProductPrice
FROM tmp
WHERE rank <= 2
ORDER BY CategoryName

PARTITION BY C.Name specifies that you want the products ranking created separately for each category. ORDER BY P.Price DESC specifies that, for each category, you want the products sorted in descending order of their price.

ROWNUMBER (or ROW_NUMBER) always returns a unique number for each returned row. DB2 also supports two similar functions, RANK and DENSERANK (or DENSE_RANK). RANK and DENSERANK return the same value for records that are equal in respect to the ORDER BY criterion—in this case, if two or more products of a category have the same price, they'll receive the same ranking. The difference between RANK and DENSERANK is that RANK leaves gaps in rankings when two or more products have the same ranking. Say, for example, if the first three most expensive products have the same price, they'll all receive a ranking of 1. The following product in the list will have a RANK of 4 but a DENSERANK of 2 because DENSERANK leaves no gaps.

Oracle

With Oracle, you apply a similar method as for DB2, except that the syntax is a little different. Oracle supports RANK and DENSE_RANK just like DB2, but it doesn't support ROW_NUMBER. For this reason the Oracle query will not be guaranteed to retrieve a maximum of two products for each category:

SELECT CategoryName, ProductName, ProductPrice
  FROM
    (SELECT C.Name as CategoryName,
             P.Name as ProductName,
             P.Price as ProductPrice,
             RANK() OVER (PARTITION BY C.Name ORDER BY P.Price DESC)
             AS rank
     FROM Product P
     JOIN ProductCategory PC ON P.ProductID = PC.ProductID
     JOIN Category C ON C.CategoryID = PC.CategoryID)
WHERE rank <= 2
ORDER BY CategoryName;

What About Products and Their Departments?

Here you take things one step further and analyze how you can get different product listings based on the departments to which they belong. The queries become a bit more complicated because a fourth table, Department, comes into play.

Selecting Products that Belong to a Specific Department

Say you know the ID of a department, and you want to get all the products that belong to it.

In the previous exercises, you learned how to extract the products in a specific category. Now, instead of looking for the products in a category, you're looking for the products that belong to a list of categories (more specifically, the list of categories in the department for which you're looking).

Using subqueries, you use the IN keyword to filter the results based on a list of category IDs. Here's the SQL query that does the trick for you:

SELECT Product.ProductID, Product.Name
FROM Product INNER JOIN ProductCategory
ON Product.ProductID = ProductCategory.ProductID
WHERE ProductCategory.CategoryID IN
   (SELECT CategoryID
    FROM Category
    WHERE DepartmentID = 1)
ORDER BY Product.Name;

This query outputs these records:

   ProductID                   Name

   ---------------------- ---------------------
   2                         Cleopatra Kit
   4                         Miss Santa
   5                         Vampire Disguise
Getting the Same Results using Table Joins

Using table joins, you need to join the Product, ProductCategory, and Category tables. You don't need to also join the Department table because you don't need anything from it (its name or description). You filter the results based on the DepartmentID field of the Category table:

SELECT Product.ProductID, Product.Name
FROM Product
   INNER JOIN ProductCategory
   ON Product.ProductID = ProductCategory.ProductID
   INNER JOIN Category
   ON ProductCategory.CategoryID = Category.CategoryID
WHERE Category.DepartmentID = 1
ORDER BY Product.Name;

The same query, without using JOIN, looks like this (note that this form will work on Access and pre-9i versions of Oracle):

SELECT Product.ProductID, Product.Name
FROM Product, ProductCategory, Category
WHERE Product.ProductID = ProductCategory.ProductID
   AND ProductCategory.CategoryID = Category.CategoryID
   AND Category.DepartmentID = 1
ORDER BY Product.Name;

Getting a List of the Departments and Their Products

Now, what if you want to see a list containing the departments and the products that belong to them: For example:

   Department                         Product

   --------------------        ---------------------
   Costume Accessories         Beast Mask
   Costume Accessories         Cleopatra Kit
   Costume Accessories         Horned Devil Mask
   Costume Accessories         Sequinned Devil Horns
   Full Costumes               Cleopatra Kit
   Full Costumes               Miss Santa
   Full Costumes               Vampire Disguise
   Jokes and Tricks            Bug Eyed Glasses
   Jokes and Tricks            Devil Horn Boppers
   Jokes and Tricks            Sequinned Devil Horns

This time, apart from the Product, ProductCategory, and Category tables, you'll also need the Department table. You need it to extract the department names.

Using table joins, you can extract this list using the following query. To get a list with products and departments, you make the usual trip starting with the Product table and ending with Department. This trip walks through the tables that have relationships: Product...ProductCategory...Category...Department. If you follow this road, what the query does becomes quite obvious:

SELECT Department.Name AS "Department",
        Product.Name AS "Product"
FROM Department
   INNER JOIN Category
   ON Department.DepartmentID = Category.DepartmentID
   INNER JOIN ProductCategory
   ON Category.CategoryID = ProductCategory.CategoryID
   INNER JOIN Product
   ON ProductCategory.ProductID = Product.ProductID
ORDER BY Department.Name, Product.Name;

You ordered departments by name, and for each department the products are also ordered by name.

Here is the same query using the alternate syntax that works on Access, as well as the other platforms:

SELECT Department.Name AS "Department",
        Product.Name AS "Product"
FROM Product, ProductCategory, Category, Department
WHERE Product.ProductID = ProductCategory.ProductID
  AND ProductCategory.CategoryID = Category.CategoryID
  AND Category.DepartmentID = Department.DepartmentID
ORDER BY Department.Name, Product.Name;
Getting the Same List with a Correlated Subquery

If you prefer using correlated subqueries, this is the way to go:

SELECT D.Name AS "Department", P.Name AS "Product"
FROM Product P, Department D
WHERE ProductID IN
     (SELECT ProductID
      FROM ProductCategory INNER JOIN Category
      ON ProductCategory.CategoryID = Category.CategoryID
      WHERE Category.DepartmentID = D.DepartmentID)
ORDER BY D.Name, P.Name;

On the outer query, you only specify the fields in which you're interested and enforce the relationships between them using the subquery. The correlated (inner) subquery is called for each department returned by the outer query, and it returns the ProductIDs associated with that department.

Note that you have flexibility to write the subquery in any number of ways, including using another subquery instead of table joins, like this:

SELECT D.Name AS "Department", Product.Name AS "Product"
FROM Product, Department D
WHERE ProductID IN
     (SELECT ProductID from ProductCategory
      WHERE ProductCategory.CategoryID IN
             (SELECT CategoryID FROM Category
              WHERE D.DepartmentID = DepartmentID))
ORDER BY D.Name, Product.Name;

You'll rarely want to complicate your queries like this, but it demonstrates how flexible SQL can be.

What About Departments that Have No Products?

In the previous queries, when you retrieved a list of departments and their products, you always assumed that each department had at least one product. Just for the fun of it, let's add a new department now:

INSERT INTO Department (Name) VALUES ('Books');

With this new department in place, run the previous SQL queries again. Maybe you'll be surprised to see that the new department doesn't show up, but this behavior is quite correct.

If you want to have all the departments listed, regardless of whether they have products, you'll need to modify the first query to use outer joins instead of inner joins. Chapter 7, "Querying Multiple Tables," presented outer joins. Here's the updated query:

SELECT Department.Name AS "Department",
        Product.Name AS "Product"
FROM Department
   LEFT JOIN Category
   ON Department.DepartmentID = Category.DepartmentID
   LEFT JOIN ProductCategory
   ON Category.CategoryID = ProductCategory.CategoryID
   LEFT JOIN Product
   ON ProductCategory.ProductID = Product.ProductID
ORDER BY Department.Name, Product.Name;

The result of this query is as follows:

   Department                         Product

   ----------------------          -----------------------
   Books                              NULL
   Costume Accessories                Beast Mask
   Costume Accessories                Cleopatra Kit
   Costume Accessories                Horned Devil Mask
   Costume Accessories                Sequinned Devil Horns
   Full Costumes                      Cleopatra Kit
   Full Costumes                      Miss Santa
   Full Costumes                      Vampire Disguise
   Jokes and Tricks                   Bug Eyed Glasses
   Jokes and Tricks                   Devil Horn Boppers
   Jokes and Tricks                   Sequinned Devil Horns

You used left outer joins in this example to make sure all departments are returned. If you used right outer joins instead, you would have to make that all products were listed instead-—including the ones that don't belong to any department.

Remember that an outer join always takes all the rows in one side of the join (left or right) and tries to match with rows from the other side of the join. If no match is made, NULL is assumed instead.

This is different from the inner join (the default type of join), which is exclusive—it only returns rows that have matches in both sides of the join. The opposite is the full join (FULL OUTER JOIN), which includes all the rows in both sides of the join.

How Many Products Belong to Each Department?

If you managed to find out how to get a list of products for each department, you're only one step away from counting how many products exist in each department. You do this using the GROUP BY clause and using the COUNT aggregate function:

SELECT Department.Name AS "Department",
        COUNT(Product.Name) AS "Products"
FROM Product
   INNER JOIN ProductCategory
   ON Product.ProductID = ProductCategory.ProductID
   INNER JOIN Category
   ON ProductCategory.CategoryID = Category.CategoryID
   INNER JOIN Department
   ON Category.DepartmentID = Department.DepartmentID
GROUP BY Department.Name
ORDER BY Department.Name;

Recall that we discussed the GROUP BY clause and aggregate functions in Chapter 4, "Summarizing and Grouping Data."

Here's the results list:

   Department                        Products

   --------------------            -----------
   Costume Accessories                   4
   Full Costumes                         3
   Jokes and Tricks                      3

If you wanted to count only those products that are on promotion for each department, you'd simply need to filter the products:

SELECT Department.Name AS "Department",
        COUNT(Product.Name) AS "Featured Products"
FROM Product
   INNER JOIN ProductCategory
   ON Product.ProductID = ProductCategory.ProductID
   INNER JOIN Category
   ON ProductCategory.CategoryID = Category.CategoryID
   INNER JOIN Department
   ON Category.DepartmentID = Department.DepartmentID
WHERE Product.OnDepartmentPromotion = 1
GROUP BY Department.Name
ORDER BY Department.Name;

The results should look like this:

   Department                     Featured Products

   --------------------            -----------------
   Costume Accessories                     1
   Full Costumes                           1
   Jokes and Tricks                        1
Getting the Same List with Correlated Subqueries

You can obtain the same results using correlated subqueries. There are two main ways you can rewrite the previous queries using correlated subqueries.

In the first example, you first group the products by department using GROUP BY and perform a COUNT for each group:

SELECT D.Name AS "Department", COUNT(P.Name) AS "Products"
FROM Department D, Product P
WHERE ProductID IN
     (SELECT ProductID
      FROM ProductCategory INNER JOIN Category
      ON ProductCategory.CategoryID = Category.CategoryID
      WHERE Category.DepartmentID = D.DepartmentID)
GROUP BY D.Name
ORDER BY D.Name;

In the second solution, you use a subquery to calculate the number of products for each department instead of using GROUP BY to group by departments. Note that the subquery is still a correlated subquery, and it executes once for each department selected by the outer query:

SELECT D.Name AS "Department",
     (SELECT COUNT(ProductID)
      FROM ProductCategory INNER JOIN Category
      ON ProductCategory.CategoryID = Category.CategoryID
      WHERE Category.DepartmentID = D.DepartmentID) AS "Products"
FROM Department D;

What is the Average Price of Products in Each Department?

Suppose you want to see the average price for all the products in each department. You can do this quickly now because it is similar to the query that counted the number of products in each department:

SELECT Department.Name AS "Department",
        AVG(Product.Price) AS "Average Price"
FROM Product
   INNER JOIN ProductCategory
   ON Product.ProductID = ProductCategory.ProductID
   INNER JOIN Category
   ON ProductCategory.CategoryID = Category.CategoryID
   INNER JOIN Department
   ON Category.DepartmentID = Department.DepartmentID
GROUP BY Department.Name;

After executing this query, you see that the Full Costumes department has the most expensive products:

   Department                     Average Price

   --------------------        ---------------------
   Costume Accessories               7.6800
   Full Costumes                    24.9900
   Jokes and Tricks                  3.0000

You can complicate things a bit more and ask, for example, for the average price in each department but only for those departments having at least three products. Also, you can request the results listed in descending order of the average department price:

SELECT Department.Name AS "Department",
        COUNT(Product.Name) AS "Products",
        AVG(Product.Price) AS "Average Price"
FROM Product
   INNER JOIN ProductCategory
   ON Product.ProductID = ProductCategory.ProductID
   INNER JOIN Category
   ON ProductCategory.CategoryID = Category.CategoryID
   INNER JOIN Department
   ON Category.DepartmentID = Department.DepartmentID
GROUP BY Department.Name
HAVING COUNT(Product.Price)>=3
ORDER BY AVG(Product.Price) DESC;

Because all departments have at least three products, you'll receive all of them back; this time, they're listed in descending order of their average price:

   Department                   Products           Average Price

   --------------------        -----------     ---------------------
   Full Costumes                   3                   24.9900
   Costume Accessories             4                    7.6800
   Jokes and Tricks                3                    3.0000

Table of Contents
Previous Section Next Section