Adding New Data

Obviously, SQL wouldn't be much use if all it allowed you to do were to extract data. In order for it to be useful, you need to be able to add data as well, which you do using the INSERT keyword.

Performing a Single Row INSERT

The basic use of INSERT uses the following syntax:

INSERT INTO Table (Columns) VALUES (Values);

The word INTO is an optional keyword in some Relational Database Management System (RDBMS) implementations, notably SQL Server and MySQL. It is, however, part of the SQL-99 specification and a required keyword for Access, DB2, and Oracle. Including INTO also makes what you're doing a bit more obvious, so we've included it in the SQL statements in this section. The following continues your examination of the syntax:

Table is the table to add values to, and acceptable values for this part of the SQL statement are the same as for SELECT statements examined in the previous chapter.
Columns is a comma-delimited list of the columns in the table into which to insert data. This can be a complete list of the columns in a table, but it doesn't necessarily have to be, depending on the database configuration.
Values is a comma-delimited list of the values to insert into those columns, which must appear in the same order as the columns in Columns.

Let's look at a simple example:

INSERT INTO LineProducts (ProductName, ProductCost, ItemsInStock)
VALUES ('Plastic Asparagus Tips', 15.99, 10);

This example would add a single new row into the LineProducts table. Notice that the literal values to be inserted (one string, one float, and one int) are formatted in various ways. This formatting depends on the data type of the column and the RDBMS being used. In the previous chapter you saw several string literal values enclosed in ' characters, but other values such as integer and floating-point values don't have delimiters.

Note

Chapter 5, "Performing Calculations and Using Functions," explores literal values in more depth.

We discussed earlier that the list of columns provided doesn't necessarily have be a complete list of the columns in a table. For example, the LineProducts table may well contain a primary key column called ProductID, in which case you might expect to see an INSERT statement such as the following:

INSERT INTO LineProducts (ProductID, ProductName, ProductCost, ItemsInStock)
VALUES (8, 'Plastic Asparagus Tips', 15.99, 10);

However, you can omit ProductID from your INSERT statement if the RDBMS is configured to automatically insert a value into that column whenever a row is added to the table. Such columns are often referred to as identity columns. This is often the case for primary key columns because it's much easier and safer to let the RDBMS take responsibility for entering data into such columns (where each value entered into that column must be unique).

Note

The technique you use to instruct your RDBMS to automatically insert values into a primary key column varies from one RDBMS to another. Chapter 12, "Working with Database Objects," discusses these techniques in detail.

There are other reasons why you might be able to omit columns from your INSERT statements, so let's summarize them now:

The column value must be automatically created by the RDBMS, which is usually the case for primary key columns.
The column value may be automatically created by the RDBMS—that is, where a default column value exists and can be added by the RDBMS when no alternative is supplied.
The column is set to be a timestamp type, in which case some RDBMSs will insert the current date and time for the column value.
The column is set to allow NULL values, and none of the previous reasons apply, meaning that the column will simply not have any data in it for the inserted row.

The previous conditions are RDBMS dependent, and some simpler implementations may work slightly differently. For example, there may not be a timestamp data type. For the most part, though, the previous will hold true whichever RDBMS you use.

Let's look at a working example.

ADDING A SINGLE ROW TO A TABLE

Say you want to insert a single row into the Professor table of your InstantUniversity database. Table 3-1 shows the Professor table.

Table 3-1: The Professor Table

Column Name

Null?

Data Type

PROFESSORID

NOT NULL

NUMBER(38)

NAME

NOT NULL

VARCHAR2(50)

Table 3-1: The *Professor* Table
Column Name	Null?	Data Type
PROFESSORID	NOT NULL	NUMBER(38)
NAME	NOT NULL	VARCHAR2(50)

In this case, the ProfessorID column is the primary key column, but you haven't instructed the RDBMS to automatically insert values on this column because, as discussed, the way in which you do this varies from RDBMS to RDBMS and because you want the base database to work on each platform. So, in this case, you have to specify both the ProfessorID and Name columns in the INSERT statement.

Let's see what data is in the table:

SELECT ProfessorID, Name from Professor;

You'll get the following results:

   ProfessorID  Name

   -----------  ----------------
   1            Prof. Dawson
   2            Prof. Williams
   3            Prof. Ashby
   4            Prof. Patel
   5            Prof. Jones
   6            Prof. Hwa

Now that you know which values have already been used in the ProfessorID column, you can safely insert your new row:

INSERT INTO Professor (ProfessorID, Name)
VALUES (7, 'Snail at work');

If you now rerun your previous SELECT query, you'll see the following:

   ProfessorID  Name

   -----------  ----------------
   1            Prof. Dawson
   2            Prof. Williams
   3            Prof. Ashby
   4            Prof. Patel
   5            Prof. Jones
   6            Prof. Hwa
   7            Snail at work

You've added a new row to the Professor table in the InstantUniversity database.

If the ProfessorID column were an identity column, then you would have to omit it in the INSERT statement because the RDMBS would be responsible for adding this data automatically. If you did attempt to insert a value into such a column in an RDBMS such as SQL Server that includes such functionality, you'd receive the following error:

Server: Msg 544, Level 16, State 1, Line 1
Cannot insert explicit value for identity column in table 'ColumnName' when IDENTITY_INSERT is set to OFF.

This message implies that this behavior can be overridden, which is true although this book won't be covering how to do things such as this until later. Even then, it's difficult to think of a situation where supplying your own value to an automatically numbered column would be useful. One possible situation could be where data absolutely must have specific values for all columns, but of course then you run the risk of insert failure if duplicate identity values crop up.

Note
Incidentally, the sharp-eyed among you may notice that 'Snail at work' is an anagram, but you'll have to figure it out for yourself.

Before moving on, it's worth mentioning that it isn't strictly necessary to include Columns in your INSERT statements. If you know the names of each column in the table and the exact order in which they were entered into the database, then you can omit the list of columns. For example:

INSERT INTO Professors
VALUES (7, 'Snail at work');

However, this isn't practical for larger tables and, in any event, it's much clearer and safer to explicitly name the columns in your statements.

Note also that on certain RDBMSs, including SQL Server, you can create a new row based on default values that are defined in the table specification using the DEFAULT VALUES method in place of the VALUES section. Suppose you have a table like this:

 CREATE TABLE Author (
   AuthorID int IDENTITY PRIMARY KEY,
   Name varchar(100) DEFAULT 'Anonymous',
   Dates varchar(20) DEFAULT 'Unknown'
);

and you want to insert a new row with just the default values for each column and the auto-generated ID field. You can't do it using the standard INSERT...VALUES syntax, so instead you'd use this:

INSERT INTO Author DEFAULT VALUES;

And that does the trick. Using DEFAULT VALUES instead of the standard VALUES clause inserts the default value for every column (or an auto-generated value for identity columns). If there isn't a default value for a non-identity column, the value is treated as NULL. If any column in the table is NOT NULL, isn't an identity column, and doesn't have a default value, then an error will be raised.

This could be useful for keeping an activity log (inserting date/time information whenever a transaction occurs) or for entering a placeholder record (for example, an "anonymous" user as the default entry for posts to a forum where registration isn't required).

Performing a Multi-Row INSERT

The first thing to point out in this section is that SQL doesn't allow you to add multiple rows using literal values as in the previous section. However, it's possible to add multiple rows using values obtained from the results of an embedded SELECT query. In effect, what this means is that you copy data from one database table into another although of course the names and number of columns needn't be the same for the source and destination tables.

When using an INSERT statement in this way, there are two changes from the syntax used in the previous section:

You don't use the VALUES keyword; instead, you use SELECT to embed the query that obtains the data to insert.
The list of columns to insert data to is optional. If it's supplied, then there must be an equal number of columns of the right types and in the right order returned by the SELECT query that makes up the values for the INSERT statement. If it isn't supplied, then the columns in the SELECT query must match the columns in the target table. As a rule, it's always worth including column names because it'll make debugging much easier.
This makes the syntax for a multi-row INSERT statement as follows:
```
INSERT INTO Table (Columns) SELECT SelectStatement;
```

where Columns is optional.

This statement can be useful in moving large amounts of data around in a database without having to do much work, for example, in backing up data. For instance, you could create a backup table called StuffBackup and then execute the following:

INSERT INTO StuffBackup SELECT ThingName, ThingCost FROM Stuff;

Alternatively, you can easily reshape data and rename columns of rows using a statement such as this:

INSERT INTO StuffV2 (TheName, TheCost)
SELECT ThingName, ThingCost FROM Stuff;

ADDING MULTIPLE ROWS TO A TABLE

Execute the following SQL statement against the InstantUniversity database:

INSERT INTO Professor (ProfessorID, Name)
   SELECT StudentID + 7, Name
   FROM Student;

If you then query the table in the same way as before, you should see something like the following:

   ProfessorID  Name

   -----------  ---------------
   ...          ...
   6            Prof. Hwa
   7            Snail at work
   8            John Jones
   9            Gary Burton
   ...          ...

Here you've taken data from the StudentID and Name columns in the Student table and added the data to the similarly named columns in the Professor table. With one short SQL statement, you added 12 rows of data to the Shippers table.

Note that the data also underwent some modification on the way—you added seven to the StudentID value before assigning it to the ProfessorID column, simply to avoid overlapping non-unique values.

In addition to this method for inserting multiple rows into a database, in SQL Server it's also possible to call a stored procedure when performing an insert, which could result in multiple rows being inserted into a table. For example:

INSERT INTO Professor (ProfessorID, Name)
EXEC sp_RecruitProfessors;

In this case, the stored procedure you've used (sp_RecruitProfessors) could contain SQL to perform a similar operation to the inline SELECT you saw in the example, producing the same result.