- 1 1. Introduction
- 2 2. What Is the ROW_NUMBER() Function?
- 3 3. Practical Use Cases
- 4 4. Comparison with Other Window Functions
- 5 5. Alternatives for MySQL Versions Below 8.0
- 6 6. Caveats and Best Practices
- 7 7. Conclusion
1. Introduction
MySQL version 8.0 introduced many new features, and one of the most notable is support for window functions. In this article, we’ll focus on one of the most frequently used functions: ROW_NUMBER().
The ROW_NUMBER() function provides powerful capabilities for data analysis and reporting, making it easy to sort and rank data based on specific conditions. This article explains everything from basic usage and practical examples to alternative approaches for older MySQL versions.
Target readers
- Beginner to intermediate users with basic SQL knowledge
- Engineers and data analysts who process and analyze data using MySQL
- Anyone considering migrating to the latest MySQL version
Benefits of ROW_NUMBER()
This function lets you assign a unique number to each row based on specific conditions. For example, you can easily write queries like “create a ranking in descending order of sales” or “extract and organize duplicate data” in a concise way.
In older versions, you often had to write complex queries using user-defined variables. With ROW_NUMBER(), your SQL becomes simpler and more readable.
In this article, we’ll use concrete query examples and explain them in a beginner-friendly way. In the next section, we’ll take a closer look at the basic syntax and behavior of this function.
2. What Is the ROW_NUMBER() Function?
The ROW_NUMBER() function, newly added in MySQL 8.0, is a type of window function that assigns sequential numbers to rows. It can number rows by a specific order and/or within each group, which is extremely useful for data analysis and reporting. Here we’ll explain the basic syntax in detail with practical examples.
Basic syntax of ROW_NUMBER()
First, the basic format of ROW_NUMBER() is as follows.
SELECT
column_name,
ROW_NUMBER() OVER (PARTITION BY group_column ORDER BY sort_column) AS row_num
FROM
table_name;Meaning of each element
- ROW_NUMBER(): Assigns a sequential number to each row.
- OVER: Keyword used to define the window for a window function.
- PARTITION BY: Groups data by the specified column. Optional. If omitted, numbering is applied across all rows.
- ORDER BY: Defines the ordering used to assign numbers, i.e., the sorting criteria.
Basic example
For example, assume you have a table named “sales” with the following data.
| employee | department | sale |
|---|---|---|
| A | Sales Department | 500 |
| B | Sales Department | 800 |
| C | Development Department | 600 |
| D | Development Department | 700 |
To assign sequential numbers within each department in descending order of sales, use the following query.
SELECT
employee,
department,
sale,
ROW_NUMBER() OVER (PARTITION BY department ORDER BY sale DESC) AS row_num
FROM
sales;Result
| employee | department | sale | row_num |
|---|---|---|---|
| B | Sales Department | 800 | 1 |
| A | Sales Department | 500 | 2 |
| D | Development Department | 700 | 1 |
| C | Development Department | 600 | 2 |
From this result, you can see that rankings by sales within each department are displayed.
How to use PARTITION BY
In the example above, the data is grouped by the “department” column. This assigns a separate sequence for each department.
If you omit PARTITION BY, numbering is assigned across all rows as one sequence.
SELECT
employee,
sale,
ROW_NUMBER() OVER (ORDER BY sale DESC) AS row_num
FROM
sales;Result
| employee | sale | row_num |
|---|---|---|
| B | 800 | 1 |
| D | 700 | 2 |
| C | 600 | 3 |
| A | 500 | 4 |
Characteristics and caveats of ROW_NUMBER()
- Unique numbering: Even if values are the same, the assigned numbers are unique.
- Handling NULLs: If ORDER BY includes NULLs, they appear first in ascending order and last in descending order.
- Performance impact: For large datasets, ORDER BY can be expensive, so proper indexing is important.
3. Practical Use Cases
Here are practical scenarios using MySQL’s ROW_NUMBER() function. This function is useful in many real-world cases, such as ranking data and handling duplicates.
3-1. Ranking within each group
For example, consider the case where you want to “rank employees by sales within each department” using sales data. Use the following dataset as an example.
| employee | department | sale |
|---|---|---|
| A | Sales Department | 500 |
| B | Sales Department | 800 |
| C | Development Department | 600 |
| D | Development Department | 700 |
Query example: Sales ranking by department
SELECT
employee,
department,
sale,
ROW_NUMBER() OVER (PARTITION BY department ORDER BY sale DESC) AS rank
FROM
sales;Result:
| employee | department | sale | rank |
|---|---|---|---|
| B | Sales Department | 800 | 1 |
| A | Sales Department | 500 | 2 |
| D | Development Department | 700 | 1 |
| C | Development Department | 600 | 2 |
In this way, each department gets its own sequence in descending order of sales, making it easy to generate rankings.
3-2. Extracting the Top N rows
Next, let’s look at a case where you want to “extract the top 3 employees by sales within each department.”
Query example: Extract Top N rows
WITH RankedSales AS (
SELECT
employee,
department,
sale,
ROW_NUMBER() OVER (PARTITION BY department ORDER BY sale DESC) AS rank
FROM
sales
)
SELECT
employee,
department,
sale
FROM
RankedSales
WHERE
rank <= 3;Result:
| employee | department | sale |
|---|---|---|
| B | Sales Department | 800 |
| A | Sales Department | 500 |
| D | Development Department | 700 |
| C | Development Department | 600 |
This example retrieves only the top 3 rows by sales within each department. As you can see, ROW_NUMBER() is suitable not only for ranking but also for filtering top results.
3-3. Finding and removing duplicate data
Databases sometimes contain duplicate records. In such cases, you can also handle them easily using ROW_NUMBER().
Query example: Detect duplicates
SELECT *
FROM (
SELECT
employee,
sale,
ROW_NUMBER() OVER (PARTITION BY employee ORDER BY sale DESC) AS rank
FROM
sales
) tmp
WHERE rank > 1;This query detects duplicates when multiple records exist for the same employee name.
Query example: Delete duplicates
DELETE FROM sales
WHERE id IN (
SELECT id
FROM (
SELECT
id,
ROW_NUMBER() OVER (PARTITION BY employee ORDER BY sale DESC) AS rank
FROM
sales
) tmp
WHERE rank > 1
);Summary
ROW_NUMBER() is useful in a variety of scenarios, such as:
- Ranking within each group
- Extracting Top N rows
- Detecting and deleting duplicates
This makes complex data processing and analysis simpler and more efficient.
4. Comparison with Other Window Functions
In MySQL 8.0, in addition to ROW_NUMBER(), there are window functions like RANK() and DENSE_RANK() that can be used for ranking and position calculations. While they have similar roles, their behavior and results differ. Here we’ll compare each function and explain when to use them.
4-1. RANK() function
The RANK() function assigns ranks, giving the same rank to equal values and skipping the next rank number.
Basic syntax
SELECT
column_name,
RANK() OVER (PARTITION BY group_column ORDER BY sort_column) AS rank
FROM
table_name;Example
Using the following data, calculate sales ranks.
| employee | department | sale |
|---|---|---|
| A | Sales Department | 800 |
| B | Sales Department | 800 |
| C | Sales Department | 600 |
| D | Sales Department | 500 |
Query example: Using RANK()
SELECT
employee,
sale,
RANK() OVER (ORDER BY sale DESC) AS rank
FROM
sales;Result:
| employee | sale | rank |
|---|---|---|
| A | 800 | 1 |
| B | 800 | 1 |
| C | 600 | 3 |
| D | 500 | 4 |
Key points:
- A and B with the same sales amount (800) are both treated as rank “1”.
- The next rank “2” is skipped, so C becomes rank “3”.
4-2. DENSE_RANK() function
The DENSE_RANK() function also assigns the same rank to equal values, but it does not skip the next rank number.
Basic syntax
SELECT
column_name,
DENSE_RANK() OVER (PARTITION BY group_column ORDER BY sort_column) AS dense_rank
FROM
table_name;Example
Using the same data as above, try the DENSE_RANK() function.
Query example: Using DENSE_RANK()
SELECT
employee,
sale,
DENSE_RANK() OVER (ORDER BY sale DESC) AS dense_rank
FROM
sales;Result:
| employee | sale | dense_rank |
|---|---|---|
| A | 800 | 1 |
| B | 800 | 1 |
| C | 600 | 2 |
| D | 500 | 3 |
Key points:
- A and B with the same sales amount (800) are both treated as rank “1”.
- Unlike RANK(), the next rank starts at “2”, so ranking continuity is preserved.
4-3. How ROW_NUMBER() differs
The ROW_NUMBER() function differs from the other two in that it assigns a unique number even when values are the same.
Example
SELECT
employee,
sale,
ROW_NUMBER() OVER (ORDER BY sale DESC) AS row_num
FROM
sales;Result:
| employee | sale | row_num |
|---|---|---|
| A | 800 | 1 |
| B | 800 | 2 |
| C | 600 | 3 |
| D | 500 | 4 |
Key points:
- Even if values are the same, each row gets a unique number, so there are no duplicate ranks.
- This is useful when you need strict ordering control or per-row uniqueness.
4-4. Quick use-case summary
| Function | Ranking behavior | Typical use case |
|---|---|---|
| ROW_NUMBER() | Assigns a unique number | When you need sequential numbering or unique identification per row |
| RANK() | Same rank for ties; skips the next rank number | When you want rankings with gaps reflecting ties |
| DENSE_RANK() | Same rank for ties; does not skip rank numbers | When you want continuous ranks without gaps |
Summary
ROW_NUMBER(), RANK(), and DENSE_RANK() should be used appropriately depending on the scenario.
- ROW_NUMBER() is best when you need unique numbers per row.
- RANK() is useful when you want ties to share a rank and you want to emphasize rank gaps.
- DENSE_RANK() is suitable when you want continuous ranks without gaps.

5. Alternatives for MySQL Versions Below 8.0
In versions earlier than MySQL 8.0, ROW_NUMBER() and other window functions are not supported. However, you can achieve similar behavior using user-defined variables. This section explains practical alternatives for MySQL versions below 8.0.
5-1. Sequential numbering using user-defined variables
In MySQL 5.7 and earlier, you can use user-defined variables to assign sequential numbers per row. Let’s look at the following example.
Example: Sales ranking by department
Sample data:
| employee | department | sale |
|---|---|---|
| A | Sales Department | 500 |
| B | Sales Department | 800 |
| C | Development Department | 600 |
| D | Development Department | 700 |
Query:
SET @row_num = 0;
SET @dept = '';
SELECT
employee,
department,
sale,
@row_num := IF(@dept = department, @row_num + 1, 1) AS rank,
@dept := department
FROM
(SELECT * FROM sales ORDER BY department, sale DESC) AS sorted_sales;Result:
| employee | department | sale | rank |
|---|---|---|---|
| B | Sales Department | 800 | 1 |
| A | Sales Department | 500 | 2 |
| D | Development Department | 700 | 1 |
| C | Development Department | 600 | 2 |
5-2. Extracting the Top N rows
To retrieve the Top N rows, you can use user-defined variables in a similar way.
Query:
SET @row_num = 0;
SET @dept = '';
SELECT *
FROM (
SELECT
employee,
department,
sale,
@row_num := IF(@dept = department, @row_num + 1, 1) AS rank,
@dept := department
FROM
(SELECT * FROM sales ORDER BY department, sale DESC) AS sorted_sales
) AS ranked_sales
WHERE rank <= 3;Result:
| employee | department | sale | rank |
|---|---|---|---|
| B | Sales Department | 800 | 1 |
| A | Sales Department | 500 | 2 |
| D | Development Department | 700 | 1 |
| C | Development Department | 600 | 2 |
This query assigns ranks by department and then extracts only rows within the top 3.
5-3. Detecting and deleting duplicates
You can also handle duplicate data using user-defined variables.
Query example: Detect duplicates
SET @row_num = 0;
SET @id_check = '';
SELECT *
FROM (
SELECT
id,
name,
@row_num := IF(@id_check = name, @row_num + 1, 1) AS rank,
@id_check := name
FROM
(SELECT * FROM customers ORDER BY name, id) AS sorted_customers
) AS tmp
WHERE rank > 1;Query example: Delete duplicates
DELETE FROM customers
WHERE id IN (
SELECT id
FROM (
SELECT
id,
@row_num := IF(@id_check = name, @row_num + 1, 1) AS rank,
@id_check := name
FROM
(SELECT * FROM customers ORDER BY name, id) AS sorted_customers
) AS tmp
WHERE rank > 1
);5-4. Caveats when using user-defined variables
- Session dependency
- User-defined variables are valid only within the current session. They cannot be reused across different queries or sessions.
- Dependency on processing order
- User-defined variables depend on execution order, so setting ORDER BY properly is critical.
- SQL readability and maintainability
- Queries can become complex, so in MySQL 8.0 and later, using window functions is recommended.
Summary
In MySQL versions below 8.0, you can use user-defined variables to implement sequential numbering and ranking in place of window functions. However, because queries tend to become more complex, it’s best to consider migrating to a newer version whenever possible.
6. Caveats and Best Practices
MySQL’s ROW_NUMBER() function and variable-based alternatives are very convenient, but there are important points to keep in mind to run them accurately and efficiently. This section explains practical caveats and best practices for performance optimization.
6-1. Performance considerations
1. ORDER BY cost
ROW_NUMBER() is always used with ORDER BY. Since it requires sorting, processing time can increase significantly for large datasets.
Mitigation:
- Use indexes: Add indexes to the columns used in ORDER BY to speed up sorting.
- Use LIMIT: Retrieve only the number of rows you actually need to reduce the amount of data processed.
Example:
SELECT
employee,
sale,
ROW_NUMBER() OVER (PARTITION BY department ORDER BY sale DESC) AS rank
FROM
sales
LIMIT 1000;2. Increased memory usage and disk I/O
Window functions are processed using temporary tables and memory. As data volume grows, memory consumption and disk I/O can increase.
Mitigation:
- Split queries: Break processing into smaller queries and extract data step-by-step to reduce load.
- Use temporary tables: Store extracted data in a temporary table and run aggregation from there to distribute the workload.
6-2. Query tuning tips
1. Check the execution plan
In MySQL, you can use EXPLAIN to check the query execution plan. This helps you verify whether indexes are being used correctly.
Example:
EXPLAIN
SELECT
employee,
ROW_NUMBER() OVER (PARTITION BY department ORDER BY sale DESC) AS rank
FROM
sales;Example output:
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
|---|---|---|---|---|---|---|---|---|---|
| 1 | SIMPLE | sales | index | NULL | sale | 4 | NULL | 500 | Using index |
If you see Using index, it indicates that the index is being used appropriately.
2. Optimize indexes
Be sure to add indexes to columns used in ORDER BY and WHERE. Pay special attention to the following.
- Single-column indexes: Good for simple sort conditions
- Composite indexes: Effective when multiple columns are involved in conditions
Example:
CREATE INDEX idx_department_sale ON sales(department, sale DESC);3. Use batch processing
Instead of processing a huge dataset all at once, you can reduce load by processing data in batches.
Example:
SELECT * FROM sales WHERE department = 'Sales Department' LIMIT 1000 OFFSET 0;
SELECT * FROM sales WHERE department = 'Sales Department' LIMIT 1000 OFFSET 1000;6-3. Maintaining data consistency
1. Updates and recalculation
When rows are inserted or deleted, numbering can change. Build a mechanism to recalculate numbers as needed.
Example:
CREATE VIEW ranked_sales AS
SELECT
employee,
sale,
ROW_NUMBER() OVER (PARTITION BY department ORDER BY sale DESC) AS rank
FROM
sales;Using a view helps you keep rankings current based on the latest data.
6-4. Best-practice query example
Below is an example of best practices that consider performance and maintainability.
Example: Extract Top N rows
WITH RankedSales AS (
SELECT
employee,
department,
sale,
ROW_NUMBER() OVER (PARTITION BY department ORDER BY sale DESC) AS rank
FROM
sales
)
SELECT *
FROM RankedSales
WHERE rank <= 3;This structure uses a common table expression (CTE) to improve readability and reusability.
Summary
When using ROW_NUMBER() or its alternatives, keep these points in mind:
- Improve speed through index optimization.
- Identify bottlenecks by checking the execution plan.
- Plan for data updates and maintain consistency.
- Use batch processing and CTEs to distribute load.
Applying these best practices will enable efficient processing for large-scale data analysis and reporting.
7. Conclusion
In this article, we focused on MySQL’s ROW_NUMBER() function, explaining everything from basic usage and practical examples to alternatives for older versions, plus caveats and best practices. In this section, we’ll recap the key points and summarize practical takeaways.
7-1. Why ROW_NUMBER() is useful
The ROW_NUMBER() function is especially convenient for data analysis and reporting in the following ways:
- Sequential numbering within groups: Easily create sales rankings by department or category-based rankings.
- Extracting Top N rows: Efficiently filter and extract data based on specific conditions.
- Detecting and deleting duplicates: Useful for data cleanup and organization.
Because it simplifies complex queries, it significantly improves SQL readability and maintainability.
7-2. Comparison with other window functions
Compared to window functions such as RANK() and DENSE_RANK(), ROW_NUMBER() differs in that it assigns a unique number even for identical values.
| Function | Feature | Use case |
|---|---|---|
| ROW_NUMBER() | Assigns a unique sequential number to each row | Best when you need unique identification or ranking with no duplicates |
| RANK() | Same rank for ties; skips the next rank number | When you need tie-aware rankings and rank gaps matter |
| DENSE_RANK() | Same rank for ties; does not skip rank numbers | When you want continuous ranking while handling ties |
Choosing the right function:
Selecting the best function for your purpose enables efficient data processing.
7-3. Handling older MySQL versions
For environments below MySQL 8.0, we also introduced approaches using user-defined variables. However, you should consider these caveats:
- Reduced readability due to more complex SQL
- Query optimization can be more difficult in some cases
- Additional handling may be required to maintain data consistency
If possible, strongly consider migrating to MySQL 8.0 or later and using window functions.
7-4. Key points for performance optimization
- Use indexes: Add indexes to columns used in ORDER BY to improve speed.
- Check execution plans: Validate performance in advance with EXPLAIN.
- Adopt batch processing: Process large datasets in smaller chunks to distribute load.
- Use views and CTEs: Improve reusability and simplify complex queries.
By applying these techniques, you can achieve efficient and stable data processing.
7-5. Final notes
ROW_NUMBER() is a powerful tool that can significantly improve the efficiency of data analysis.
In this article, we covered everything from basic syntax and practical examples to caveats and alternatives.
We encourage you to run the queries yourself while following along with this article. Improving your SQL skills will help you tackle more complex data analysis and reporting with confidence.
Appendix: Reference resources
- Official documentation: MySQL Window Functions
- Online SQL environment: SQL Fiddle (a tool that lets you run and test SQL online)


