Using Indexes

Indexes are database objects designed to increase the overall speed of database operations by creating internal "lookup" tables to allow for very quick searches. They work on the presumption that the vast majority of database operations are read and search operations. Having an index in a table is likely to improve SELECT operations on that table, but it could slow down DELETE, UPDATE, and INSERT operations. In most cases, the gains of having an index on a frequently accessed column will considerably outweigh the drawbacks.

You can create one or more indexes on a particular table, each index working on one column or on a set of columns. When a table is indexed on a specific column, its rows are either indexed or physically arranged based on the values of that column and of the type of index. This allows the database to find information quickly if you perform a search based on the indexed columns. For example, if you have a long friends list in the Friend table, having an index on the Name column would significantly improve searches based on the name. This happens because instead of performing a full-table scan, the index is used to quickly find the requested record.

Indexes can slow down INSERT, UPDATE, and DELETE operations because the internal index structures need to be updated (or the table rows rearranged) each time such an operation occurs. For this reason, having too many indexes can slow down the general performance of the database and therefore should applied carefully.

The general rule of thumb is to set indexes on columns that are frequently used in WHERE or ORDER BY clauses, that are used in table joins, or that have foreign key relationships with other tables. However, especially for big databases, a carefully designed performance-testing plan can reveal the best combination of indexes for your database.

Also, from a performance standpoint, it's worth placing indexes on columns that aren't going to have much duplicated data (they're said to have high selectivity) and on columns for which the SELECT statements are likely to be highly restrictive—for example, when you're searching for a particular row rather than reading half of the table (in which case having an index may actually slow down the query).

Indexes can be unique or non-unique. Unique indexes don't permit repeating values on the indexed columns.

Note

Unique indexes are always created for columns that have PRIMARY KEY or UNIQUE constraints set.

Indexes that are automatically created as a result of setting a constraint (PRIMARY KEY or UNIQUE) are removed when the constraint is removed, and they can't be removed separately. They're tied to the constraint that created them.

Creating Indexes

Let's consider the Friend table once again:

CREATE TABLE Friend (
   FriendID INT PRIMARY KEY NOT NULL,
   Name VARCHAR(50),
   PhoneNo VARCHAR(15) DEFAULT 'Unknown Phone');

If you're planning to do frequent searches on this table based on the name (such as, What is John's phone number?), placing an index on the Name column will improve the performance of your queries.

The command to create a non-unique index on Name is as follows:

CREATE INDEX NameIndex ON Friend (Name);

For unique indexes, you add the UNIQUE keyword:

CREATE UNIQUE INDEX NameIndex ON Friend (Name);

Make sure Friend is empty and execute the following INSERT statements:

INSERT INTO Friend (FriendID, Name, PhoneNo) VALUES (1,'Mike','1234567');
INSERT INTO Friend (FriendID, Name, PhoneNo) VALUES (2,'Mike','3333333');
INSERT INTO Friend (FriendID, Name, PhoneNo) VALUES (3,'Cath','7654321');

If you created a UNIQUE index, the second INSERT will be rejected because it duplicates an existing name.

Having an index on Name will now make queries such as the following much faster (supposing, of course, that the Friend table is filled with far more records than you inserted here):

SELECT FriendID, Name FROM Friend WHERE Name='Cath';

The same happens for queries returning multiple rows:

SELECT FriendID, Name FROM Friend WHERE Name LIKE 'M%';

However, there's one catch: Text columns are indexed from left to right. For this reason, queries such as the previous one use the index you created on Name. However, the index isn't used when you're searching for substrings that don't occur at the start of the string (that is, the first character of the search string is a wildcard, as in the following query). In such situations, a normal, full-table scan is performed to find the requested rows:

SELECT FriendID, Name FROM Friend WHERE Name LIKE '%ike';

Dropping Indexes

The SQL command to drop indexes is DROP INDEX. However, its exact syntax depends on the database vendor.

With SQL Server, you need to prefix the index name with the table name:

DROP INDEX Friend.PhoneNoIndex;

With Oracle and DB2, you just need to specify the index name:

DROP INDEX PhoneNoIndex;

With MySQL, you also need to specify both the table name and index name but use a different syntax:

DROP INDEX PhoneNoIndex ON Friend;