Duplicate Records On Insert And Update

Sometimes its necessary to check if a record exists based on a few conditions before inserting a new record. For example you may want to check if a user with a particular email address exists before inserting a new record. To solve this problem you maybe tempted to do a look-up first, using code similar to listing 1.1 below.

Listing 1.1

  1. SELECT 1 AS user_exists FROM users WHERE email='john@mailxyz.com'

While the approach mentioned above will work, it's not very efficient since it would mean two separate requests to the MySQL server, wasting unnecessary resources.
Another approach is to add a unique index on the users table that will create a constraint such that values in the index must be distinct. In the case of the users table, we can add a unique key on the email field ensuring that duplicate records with the same email address don't get inserted. Listing 1.2 below shows a create table script. Notice it has a unique key on the email field also take note that there is an auto increment on the user_id.

Listing 1.2

  1. CREATE TABLE `users` (
  2. `user_id` int(11) NOT NULL AUTO_INCREMENT,
  3. `email` varchar(75) DEFAULT NULL,
  4. `first_name` varchar(45) DEFAULT NULL,
  5. `last_name` varchar(45) DEFAULT NULL,
  6. PRIMARY KEY (`user_id`),
  7. UNIQUE KEY `uni_email` (`email`)
  8. ) ENGINE=MyISAM

The above code will create a users table and create a constraint on the email field. To test the constraint, let's add some users.

  1. INSERT INTO users (email,first_name,last_name) VALUES
  2. ('j.smith@mailxyz.com', 'John', 'Smith'),
  3. ('s.baker@mailxyz.com', 'Suzan', 'Baker'),
  4. ('a.latif@mailxyz.com', 'Abdul', 'Latif')

 Now that we have a few users, let's try to add a user with a duplicate email address.

  1. INSERT INTO users (email,first_name,last_name) VALUES ('s.baker@mailxyz.com', 'Sarah', 'Baker')

If you try to execute the insert script above, MySQL will throw an error message similar to that shown in listing 1.3 below.

Listing 1.3


Error Code: 1062. Duplicate entry 's.baker@mailxyz.com' for key 'uni_email'


At this stage, we know that the constraint is working. All we need to do, is tell MySQL to ignore any inserts that will result in a duplicate entry. This can be done using the INSERT IGNORE syntax. Modify the insert statement from above to the following:

  1. INSERT IGNORE INTO users (email,first_name,last_name) VALUES ('s.baker@mailxyz.com', 'Sarah', 'Baker')

Now MySQL will no longer generate an error on duplicate entries but generate warning, the record being inserted will still be ignored and your application code will not result in an error. Also the auto increment value will be maintained, so you wont have gaps in your auto increment field. Using this approach ensures you don't have to use two separate queries (Look-up and Insert) and the integrity of your table will be maintained.

Return Insert Id On Duplicate Entry

In the previous examples, we looked at how to avoid duplicate entries using the IGNORE keyword, but the statement will not return the last insert id. This is an expected behavior since no record was actually inserted. In some cases however you may want to return the id of either the last insert id or the record id of the duplicate entry. The solution to this problem comes in the form of listing 1.4.

Listing 1.4

  1. INSERT INTO users (email,first_name,last_name) VALUES ('s.baker@mailxyz.com', 'Suzan', 'Baker')
  2. ON DUPLICATE KEY UPDATE user_id = LAST_INSERT_ID(user_id)

The code above simply does an update if a record matching the constraint has been found. You can update multiple columns using the normal update syntax, that is comma separate each column. Whats interesting in the example, is the use of the LAST_INSERT_ID function. The record id (user_id) is used to set the last insert id, which in tern is used to set the user_id in the update portion. Remember it is an update, so the update must be fulfilled.

No comments:

Post a Comment