MySQL ON DUPLICATE KEY UPDATE Explained: Syntax, Examples, Best Practices

目次

1. Introduction

When working with databases, one common challenge is handling duplicate data. For example, in a system that manages customer information, when registering a new customer, you must check whether the data already exists and update it if necessary. Managing this process manually can lead to errors and processing delays.

This is where MySQL’s ON DUPLICATE KEY UPDATE syntax becomes useful. By using this feature, you can automatically perform the appropriate action when duplicate data is detected. As a result, data management becomes more efficient and the workload on developers is reduced.

In this article, we will explain the basic syntax and usage examples of ON DUPLICATE KEY UPDATE, advanced techniques, and important points to keep in mind. By the end, developers from beginner to intermediate levels will be able to effectively use this feature in real-world projects.

2. What Is ON DUPLICATE KEY UPDATE?

In MySQL, ON DUPLICATE KEY UPDATE is a convenient clause that automatically updates existing data when an INSERT statement violates a primary key or unique key constraint. This allows you to handle both data insertion and updates efficiently within a single query.

Basic Concept

Normally, when inserting data with an INSERT statement, a duplicate primary key or unique key results in an error. However, by using ON DUPLICATE KEY UPDATE, you can perform the following actions:

  1. If the data being inserted is new, the INSERT operation executes normally.
  2. If the data being inserted conflicts with existing data, the specified columns are updated.

This enables efficient data manipulation while avoiding errors.

Basic Syntax

The basic syntax of ON DUPLICATE KEY UPDATE is as follows:

INSERT INTO table_name (column1, column2, column3)
VALUES (value1, value2, value3)
ON DUPLICATE KEY UPDATE column1 = value1, column2 = value2;
  • table_name: The target table name.
  • column1, column2, column3: The column names for insertion.
  • value1, value2, value3: The values to insert.
  • ON DUPLICATE KEY UPDATE: Specifies the update action when a duplicate key is detected.

Requirements

For this clause to work, the table must have at least one of the following constraints:

  • PRIMARY KEY: A column that holds unique values.
  • UNIQUE KEY: A column that does not allow duplicate values.

If none of these constraints exist, ON DUPLICATE KEY UPDATE will not function.

Example

As a simple example, consider inserting or updating data in a table that manages user information.

Table Definition

CREATE TABLE users (
    id INT PRIMARY KEY,
    name VARCHAR(50),
    email VARCHAR(100) UNIQUE
);

Using INSERT

The following query handles cases where the user ID or email address already exists:

INSERT INTO users (id, name, email)
VALUES (1, 'Taro', 'taro@example.com')
ON DUPLICATE KEY UPDATE name = 'Taro', email = 'taro@example.com';
  • If a user with ID 1 already exists, the name and email values are updated.
  • If not, a new record is inserted.

3. Basic Usage Examples

In this section, we introduce basic usage examples of ON DUPLICATE KEY UPDATE. We will explain both single-record and multiple-record operations.

Handling a Single Record

Let’s look at an example where a single record is inserted and updated if duplicate data exists.

Table Definition

CREATE TABLE products (
    id INT PRIMARY KEY,
    name VARCHAR(50),
    stock INT
);

Basic INSERT Statement

The following query inserts product data with ID 1. If it already exists, the stock value is updated.

INSERT INTO products (id, name, stock)
VALUES (1, 'Product A', 100)
ON DUPLICATE KEY UPDATE stock = 100;

How It Works

  • If product ID 1 does not exist, a new record is inserted.
  • If product ID 1 already exists, the stock column is updated to 100.

Handling Multiple Records

Next, let’s look at how to process multiple records in bulk.

Bulk Insert of Multiple Values

The following query inserts multiple product records at once:

INSERT INTO products (id, name, stock)
VALUES 
    (1, 'Product A', 100),
    (2, 'Product B', 200),
    (3, 'Product C', 300)
ON DUPLICATE KEY UPDATE stock = VALUES(stock);

How It Works

  • VALUES(stock) refers to the inserted values for each record (100, 200, 300).
  • If a product ID already exists, its stock is updated based on the inserted value.
  • If it does not exist, a new record is inserted.

Advanced: Updating Dynamic Values

You can also dynamically update values based on existing data. For example, adding to the existing stock:

INSERT INTO products (id, name, stock)
VALUES (1, 'Product A', 50)
ON DUPLICATE KEY UPDATE stock = stock + VALUES(stock);

How It Works

  • If product ID 1 already exists, 50 is added to the current stock value.
  • If it does not exist, a new record is inserted with stock set to 50.

Summary

  • You can efficiently process not only single records but also multiple records at once.
  • By using VALUES(), you can flexibly update columns based on inserted data.

4. Advanced Usage

By using ON DUPLICATE KEY UPDATE, you can go beyond basic insert/update operations and implement more flexible data handling. In this section, we explain advanced usage patterns such as conditional updates and combining this feature with transactions.

Conditional Updates

With ON DUPLICATE KEY UPDATE, you can update columns conditionally using CASE expressions or IF functions. This enables more flexible update logic depending on the situation.

Example: Update Stock Only When It Is Below a Threshold

The following example updates the stock value only when the current stock is below a certain number.

INSERT INTO products (id, name, stock)
VALUES (1, 'Product A', 100)
ON DUPLICATE KEY UPDATE stock = CASE 
    WHEN stock < 50 THEN VALUES(stock)
    ELSE stock
END;

How It Works

  • If product ID 1 exists and the current stock is less than 50, it is updated to the new value (100).
  • If the stock is 50 or more, it is not updated and the existing value is preserved.

Using Dynamic Updates

You can also perform dynamic calculations and update values based on the inserted data.

Example: Updating Cumulative Values

The following example adds the inserted stock value to the existing stock.

INSERT INTO products (id, name, stock)
VALUES (2, 'Product B', 50)
ON DUPLICATE KEY UPDATE stock = stock + VALUES(stock);

How It Works

  • If product ID 2 already exists, 50 is added to the existing stock value.
  • If it does not exist, a new record is inserted.

Combining with Transactions

By executing multiple INSERT statements (and other data operations) within a transaction, you can perform complex operations while maintaining data consistency.

Example: Batch Processing with a Transaction

The following example processes multiple records as a batch, and rolls back if an error occurs.

START TRANSACTION;

INSERT INTO products (id, name, stock)
VALUES 
    (1, 'Product A', 100),
    (2, 'Product B', 200)
ON DUPLICATE KEY UPDATE stock = VALUES(stock);

INSERT INTO products (id, name, stock)
VALUES 
    (3, 'Product C', 300)
ON DUPLICATE KEY UPDATE stock = stock + VALUES(stock);

COMMIT;

How It Works

  • Multiple queries are executed between START TRANSACTION and COMMIT.
  • If any query fails, the transaction is rolled back and no changes are applied to the database.

Practical Scenarios for Advanced Usage

Scenario 1: Inventory Management in an E-Commerce Site

When a product is purchased, you may want to decrease its stock count.

INSERT INTO products (id, name, stock)
VALUES (4, 'Product D', 100)
ON DUPLICATE KEY UPDATE stock = stock - 1;

Scenario 2: A User Points System

When adding points for an existing user:

INSERT INTO users (id, name, points)
VALUES (1, 'Taro', 50)
ON DUPLICATE KEY UPDATE points = points + VALUES(points);

Summary

  • By using CASE expressions and dynamic updates, you can implement complex conditional logic.
  • Combining transactions helps you perform safe operations while maintaining data consistency.
  • Applying this feature to practical scenarios enables more efficient data management.

5. Pitfalls and Best Practices

When using ON DUPLICATE KEY UPDATE, incorrect usage can lead to unexpected behavior or performance degradation. This section highlights key pitfalls and best practices to use it effectively.

Key Pitfalls

1. Interaction with AUTO_INCREMENT

  • Issue
    If the primary key uses AUTO_INCREMENT, the auto-increment value can increase even when a duplicate occurs.
    This happens because MySQL reserves a new ID at the time it attempts the INSERT.
  • Solution
    To avoid wasting IDs when an INSERT conflicts, rely on a unique key (not only AUTO_INCREMENT) and, if needed, use LAST_INSERT_ID() to retrieve the latest ID.
INSERT INTO products (id, name, stock)
VALUES (NULL, 'Product E', 50)
ON DUPLICATE KEY UPDATE stock = stock + VALUES(stock);

2. Deadlock Risk

  • Issue
    If multiple threads run ON DUPLICATE KEY UPDATE concurrently on the same table, deadlocks may occur.
  • Solution
  1. Standardize the execution order of queries.
  2. Use table locks when necessary (but be mindful of performance impact).
  3. Implement retry logic when deadlocks occur.

3. Proper Index Design

  • Issue
    If there is no primary key or unique key, ON DUPLICATE KEY UPDATE will not work. Also, poor indexing can severely degrade performance.
  • Solution
    Always define a primary key or unique key, and add appropriate indexes to frequently searched or updated columns.

Best Practices

1. Check Data in Advance

  • Use a SELECT statement before inserting to confirm whether the data exists and prevent unintended updates.
SELECT id FROM products WHERE id = 1;

2. Use Transactions

  • Use transactions to group multiple INSERT/UPDATE operations. This helps you safely maintain consistency.
START TRANSACTION;

INSERT INTO products (id, name, stock)
VALUES (1, 'Product A', 100)
ON DUPLICATE KEY UPDATE stock = stock + 50;

COMMIT;

3. Minimize Updated Columns

  • Limit the columns you update to improve performance and avoid unnecessary changes.
INSERT INTO products (id, name, stock)
VALUES (1, 'Product A', 100)
ON DUPLICATE KEY UPDATE stock = VALUES(stock);

4. Implement Error Handling

  • Prepare for deadlocks or failed inserts by implementing error handling, including retry or rollback logic.

Summary

  • Pitfalls: Be careful about AUTO_INCREMENT increments, deadlocks, and poor index design.
  • Best Practices: Use transactions and error handling to process data safely and efficiently.

6. Similar Features in Other Databases

MySQL’s ON DUPLICATE KEY UPDATE is a powerful feature that enables efficient data handling. However, it is specific to MySQL. Other database systems provide similar functionality, each with different characteristics. In this section, we compare similar features in PostgreSQL and SQLite.

PostgreSQL: ON CONFLICT DO UPDATE

In PostgreSQL, the equivalent feature is ON CONFLICT DO UPDATE. This clause provides a flexible way to handle duplicate data by specifying what action to take when a conflict occurs.

Basic Syntax

INSERT INTO table_name (column1, column2)
VALUES (value1, value2)
ON CONFLICT (column1) DO UPDATE
SET column2 = value2;
  • ON CONFLICT (column1): Specifies the conflict target (such as a primary key or unique key).
  • DO UPDATE: Defines the update action to execute when a conflict occurs.

Example

In the products table, update the stock if the product ID already exists:

INSERT INTO products (id, name, stock)
VALUES (1, 'Product A', 100)
ON CONFLICT (id) DO UPDATE
SET stock = EXCLUDED.stock;
  • EXCLUDED.stock: Refers to the value that was attempted to be inserted.

Key Characteristics

  • Difference from MySQL
    PostgreSQL allows you to explicitly define the conflict condition, making it more flexible when working with tables that have multiple unique constraints.
  • Advantages
    It supports advanced conditional logic and fine-grained control over which columns are updated.

SQLite: INSERT OR REPLACE / INSERT OR IGNORE

SQLite provides INSERT OR REPLACE and INSERT OR IGNORE, which differ slightly from MySQL and PostgreSQL syntax.

INSERT OR REPLACE

INSERT OR REPLACE deletes the existing row and inserts a new one when a duplicate is detected.

Basic Syntax

INSERT OR REPLACE INTO table_name (column1, column2)
VALUES (value1, value2);

Example

If a product ID already exists, delete the existing record and insert a new one:

INSERT OR REPLACE INTO products (id, name, stock)
VALUES (1, 'Product A', 100);

Key Characteristics

  • Behavioral Difference
    Unlike MySQL or PostgreSQL, SQLite removes the existing record before inserting the new one.
  • Caution
    Because the old record is deleted, delete triggers may fire. Be careful if triggers are defined.

INSERT OR IGNORE

INSERT OR IGNORE silently skips the operation if a duplicate exists, without raising an error.

Comparison Table

DatabaseSyntaxCharacteristics
MySQLON DUPLICATE KEY UPDATEUpdates specific columns when duplicates occur. Simple and efficient.
PostgreSQLON CONFLICT DO UPDATESupports advanced conditional logic and high flexibility.
SQLiteINSERT OR REPLACE / IGNOREREPLACE deletes then inserts. IGNORE skips errors.

Summary

  • MySQL’s ON DUPLICATE KEY UPDATE is simple and efficient for handling insert-or-update logic.
  • PostgreSQL’s ON CONFLICT DO UPDATE offers more flexibility and advanced control.
  • SQLite’s INSERT OR REPLACE deletes existing data before inserting, which may trigger delete actions.

7. Conclusion

In this article, we explored MySQL’s ON DUPLICATE KEY UPDATE from basic syntax to advanced use cases, important considerations, and comparisons with other database systems. By properly understanding and using this feature, you can make database operations more efficient and improve application performance and reliability.

Advantages of ON DUPLICATE KEY UPDATE

  1. Efficient Data Management
  • Insert and update operations can be handled in a single query, making processing concise and fast.
  1. Simplified Duplicate Handling
  • You can clearly define behavior for duplicate data and reduce the risk of errors.
  1. High Flexibility
  • Supports dynamic updates and conditional logic for more advanced scenarios.

Effective Usage Scenarios

  • Inventory Management Systems
  • Dynamically update product stock levels.
  • User Management Systems
  • Add or update user information.
  • Points Management Systems
  • Add or update user reward points.

In these scenarios, using ON DUPLICATE KEY UPDATE reduces code complexity and improves maintainability.

Review of Important Considerations

  1. AUTO_INCREMENT Considerations
  • If the primary key uses AUTO_INCREMENT, be aware that IDs may increase even when duplicates occur.
  1. Avoiding Deadlocks
  • Properly design query execution order and transaction structure.
  1. Importance of Index Design
  • Properly configure primary and unique keys to avoid errors and improve performance.

Comparison Highlights

  • PostgreSQL’s ON CONFLICT DO UPDATE supports flexible conflict targeting.
  • SQLite’s INSERT OR REPLACE deletes before inserting, which may affect triggers.

Final Recommendations

  • Use ON DUPLICATE KEY UPDATE proactively for simple insert/update operations.
  • For large-scale operations or complex logic, combine it with transactions and pre-checks to improve safety.

By using ON DUPLICATE KEY UPDATE appropriately, you can enhance both development efficiency and application reliability. Apply the concepts from this article to your own projects.

8. FAQ

This article has covered many aspects of MySQL’s ON DUPLICATE KEY UPDATE. In this section, we address frequently asked questions to provide additional practical insights.

Q1: Which versions of MySQL support ON DUPLICATE KEY UPDATE?

  • A1: It is available in MySQL 4.1.0 and later. However, some behaviors may differ by version, so always consult the official documentation for your specific version.

Q2: Does ON DUPLICATE KEY UPDATE work without a primary key?

  • A2: No. It works only on tables that have a primary key or at least one unique key defined.

Q3: What is the difference between ON DUPLICATE KEY UPDATE and REPLACE?

  • A3:
  • ON DUPLICATE KEY UPDATE updates specified columns when a duplicate is detected.
  • REPLACE deletes the existing record and then inserts a new one, which may trigger delete actions and affect data consistency.

Q4: How can I optimize performance when using ON DUPLICATE KEY UPDATE?

  • A4:
  1. Proper Index Design: Ensure primary and unique keys are correctly defined.
  2. Minimize Updated Columns: Update only necessary columns.
  3. Use Transactions: Batch operations to reduce database overhead.

Q5: Can I change the duplicate detection condition?

  • A5: To change the condition, you must modify the primary key or unique key definition. The behavior of ON DUPLICATE KEY UPDATE itself cannot be altered.

Q6: What causes a “Duplicate entry” error and how can I fix it?

  • A6:
  • Cause: Attempting to insert data that violates a primary key or unique key constraint.
  • Solution:
    1. Check the table schema and identify the column causing the duplication.
    2. Use a SELECT statement to check for existing data before inserting.
    3. Properly configure ON DUPLICATE KEY UPDATE to handle conflicts.

Q7: Do triggers affect ON DUPLICATE KEY UPDATE?

  • A7: Yes. Both INSERT and UPDATE triggers may fire when using ON DUPLICATE KEY UPDATE. Design trigger logic accordingly.

Q8: Can I use the same query in other databases?

  • A8: Other databases provide similar functionality, but syntax and behavior differ. For example:
  • PostgreSQL: ON CONFLICT DO UPDATE
  • SQLite: INSERT OR REPLACE

Summary

This FAQ addressed common questions about ON DUPLICATE KEY UPDATE. Understanding error causes and performance optimization strategies is especially valuable in production environments. If issues arise, refer to these guidelines for troubleshooting.

By mastering ON DUPLICATE KEY UPDATE, you can build efficient and reliable database operations.