MySQL ROW_NUMBER() Explained (MySQL 8.0): Ranking, Top-N Queries, and Deduplication

1. Introduction

MySQL version 8.0 introduced many new features, and one of the most notable is support for window functions. In this article, we’ll focus on one of the most frequently used functions: ROW_NUMBER().

The ROW_NUMBER() function provides powerful capabilities for data analysis and reporting, making it easy to sort and rank data based on specific conditions. This article explains everything from basic usage and practical examples to alternative approaches for older MySQL versions.

Target readers

  • Beginner to intermediate users with basic SQL knowledge
  • Engineers and data analysts who process and analyze data using MySQL
  • Anyone considering migrating to the latest MySQL version

Benefits of ROW_NUMBER()

This function lets you assign a unique number to each row based on specific conditions. For example, you can easily write queries like “create a ranking in descending order of sales” or “extract and organize duplicate data” in a concise way.

In older versions, you often had to write complex queries using user-defined variables. With ROW_NUMBER(), your SQL becomes simpler and more readable.

In this article, we’ll use concrete query examples and explain them in a beginner-friendly way. In the next section, we’ll take a closer look at the basic syntax and behavior of this function.

2. What Is the ROW_NUMBER() Function?

The ROW_NUMBER() function, newly added in MySQL 8.0, is a type of window function that assigns sequential numbers to rows. It can number rows by a specific order and/or within each group, which is extremely useful for data analysis and reporting. Here we’ll explain the basic syntax in detail with practical examples.

Basic syntax of ROW_NUMBER()

First, the basic format of ROW_NUMBER() is as follows.

SELECT
    column_name,
    ROW_NUMBER() OVER (PARTITION BY group_column ORDER BY sort_column) AS row_num
FROM
    table_name;

Meaning of each element

  • ROW_NUMBER() : Assigns a sequential number to each row.
  • OVER : Keyword used to define the window for a window function.
  • PARTITION BY : Groups data by the specified column. Optional. If omitted, numbering is applied across all rows.
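  • ORDER BY : Defines the order in which numbers are assigned, i.e., the sorting criteria. It is syntactically optional, but without it the numbering order is not deterministic, so in practice you almost always specify it.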

Basic example

For example, assume you have a table named “sales” with the following data.

| employee | department | sale |
|----------|------------|------|
| A | Sales Department | 500 |
| B | Sales Department | 800 |
| C | Development Department | 600 |
| D | Development Department | 700 |
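
To run the queries in this article yourself, the sample table can be set up roughly as follows. This is only a sketch: the column types and the auto-increment id column are assumptions (the article specifies only employee, department, and sale, and the id column matters only for the deduplication examples later on).

```sql
-- Hypothetical schema for the sample data; column types are assumptions.
CREATE TABLE sales (
    id         INT AUTO_INCREMENT PRIMARY KEY,  -- assumed surrogate key
    employee   VARCHAR(50)  NOT NULL,
    department VARCHAR(100) NOT NULL,
    sale       INT          NOT NULL
);

-- The four sample rows shown in the table above.
INSERT INTO sales (employee, department, sale) VALUES
    ('A', 'Sales Department',       500),
    ('B', 'Sales Department',       800),
    ('C', 'Development Department', 600),
    ('D', 'Development Department', 700);
```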

To assign sequential numbers within each department in descending order of sales, use the following query.

SELECT
    employee,
    department,
    sale,
    ROW_NUMBER() OVER (PARTITION BY department ORDER BY sale DESC) AS row_num
FROM
    sales;

Result

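| employee | department | sale | row_num |
|----------|------------|------|---------|
| B | Sales Department | 800 | 1 |
| A | Sales Department | 500 | 2 |
| D | Development Department | 700 | 1 |
| C | Development Department | 600 | 2 |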

From this result, you can see that rankings by sales within each department are displayed.

How to use PARTITION BY

In the example above, the data is grouped by the “department” column. This assigns a separate sequence for each department.

If you omit PARTITION BY, numbering is assigned across all rows as one sequence.

SELECT
    employee,
    sale,
    ROW_NUMBER() OVER (ORDER BY sale DESC) AS row_num
FROM
    sales;

Result

| employee | sale | row_num |
|----------|------|---------|
| B | 800 | 1 |
| D | 700 | 2 |
| C | 600 | 3 |
| A | 500 | 4 |

Characteristics and caveats of ROW_NUMBER()

  • Unique numbering : Even if values are the same, the assigned numbers are unique.
  • Handling NULLs : If ORDER BY includes NULLs, they appear first in ascending order and last in descending order.
  • Performance impact : For large datasets, ORDER BY can be expensive, so proper indexing is important.
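
Because ROW_NUMBER() must still pick an order among rows with equal sort values, which of the tied rows receives the lower number is not guaranteed. A minimal sketch of making the numbering repeatable by adding a tiebreaker to ORDER BY, assuming employee values are unique in the sales table:

```sql
-- Ties on sale are broken by employee, so the numbering is repeatable.
-- Assumes employee values are unique; otherwise use a unique key column instead.
SELECT
    employee,
    sale,
    ROW_NUMBER() OVER (ORDER BY sale DESC, employee ASC) AS row_num
FROM
    sales
ORDER BY
    row_num;
```

The outer ORDER BY is separate from the one inside OVER: the first controls how rows are numbered, the second controls the order in which they are returned.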

3. Practical Use Cases

Here are practical scenarios using MySQL’s ROW_NUMBER() function. This function is useful in many real‑world cases, such as ranking data and handling duplicates.

3-1. Ranking within each group

For example, suppose you want to “rank employees by sales within each department” using sales data. We'll use the following dataset as an example.

| employee | department | sale |
|----------|------------|------|
| A | Sales Department | 500 |
| B | Sales Department | 800 |
| C | Development Department | 600 |
| D | Development Department | 700 |

Example query: sales ranking by department

-- RANK is a reserved word in MySQL 8.0, so the alias is named sales_rank.
SELECT
    employee,
    department,
    sale,
    ROW_NUMBER() OVER (PARTITION BY department ORDER BY sale DESC) AS sales_rank
FROM
    sales;

Result:

| employee | department | sale | sales_rank |
|----------|------------|------|------------|
| B | Sales Department | 800 | 1 |
| A | Sales Department | 500 | 2 |
| D | Development Department | 700 | 1 |
| C | Development Department | 600 | 2 |

In this way, each department gets its own sequence in descending order of sales, which makes it easy to generate rankings.

3-2. Extracting the top N rows

Next, let's look at a case where you want to “extract the top 3 employees by sales within each department.”

Example query: extract the top N rows

-- RANK is a reserved word in MySQL 8.0, so the alias is named sales_rank.
WITH RankedSales AS (
    SELECT
        employee,
        department,
        sale,
        ROW_NUMBER() OVER (PARTITION BY department ORDER BY sale DESC) AS sales_rank
    FROM
        sales
)
SELECT
    employee,
    department,
    sale
FROM
    RankedSales
WHERE
    sales_rank <= 3;

Result:

| employee | department | sale |
|----------|------------|------|
| B | Sales Department | 800 |
| A | Sales Department | 500 |
| D | Development Department | 700 |
| C | Development Department | 600 |

This query keeps at most the top 3 rows by sales in each department. Because the sample data has only two employees per department, all four rows are returned. As you can see, ROW_NUMBER() is useful not only for ranking but also for filtering the top results.
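
A window function cannot be referenced directly in a WHERE clause, which is why the numbering is computed in a CTE first and filtered in the outer query. The same pattern works for any N. As a small sketch, the variant below keeps only the single top seller per department, using row_num as an illustrative alias:

```sql
-- Number the rows per department, then keep only the first row of each group.
WITH RankedSales AS (
    SELECT
        employee,
        department,
        sale,
        ROW_NUMBER() OVER (PARTITION BY department ORDER BY sale DESC) AS row_num
    FROM
        sales
)
SELECT
    employee,
    department,
    sale
FROM
    RankedSales
WHERE
    row_num = 1
ORDER BY
    department;
```

With the four sample rows, this returns D (700) for the Development Department and B (800) for the Sales Department.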

3-3. Finding and removing duplicate data

Databases sometimes contain duplicate records. In such cases, you can handle them easily with ROW_NUMBER().

Example query: detect duplicates

SELECT *
FROM (
    SELECT
        employee,
        sale,
        ROW_NUMBER() OVER (PARTITION BY employee ORDER BY sale DESC) AS row_num
    FROM
        sales
) AS tmp
WHERE row_num > 1;

This query detects duplicates when multiple records exist for the same employee name. Within each employee, the row with the highest sale is numbered 1, so every row returned here is an extra copy.

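Example query: delete duplicates

-- Assumes the sales table has a unique id column.
-- For each employee, the row with the highest sale is kept; the rest are deleted.
DELETE FROM sales
WHERE id IN (
    SELECT id
    FROM (
        SELECT
            id,
            ROW_NUMBER() OVER (PARTITION BY employee ORDER BY sale DESC) AS row_num
        FROM
            sales
    ) AS tmp
    WHERE row_num > 1
);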
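
Since a DELETE is hard to undo, it can help to preview the affected rows and run the deletion inside a transaction. The following is a sketch under two assumptions that the article does not state: the sales table has a unique id column, and it uses a transactional storage engine such as InnoDB. The extra id in ORDER BY is a tiebreaker added here so that the preview and the DELETE pick the same rows.

```sql
-- Sketch only: assumes sales has a unique id column and uses InnoDB.
START TRANSACTION;

-- 1. Preview the rows that would be removed.
SELECT id, employee, sale
FROM (
    SELECT
        id,
        employee,
        sale,
        ROW_NUMBER() OVER (PARTITION BY employee ORDER BY sale DESC, id) AS row_num
    FROM
        sales
) AS tmp
WHERE row_num > 1;

-- 2. Delete them using the same numbering.
DELETE FROM sales
WHERE id IN (
    SELECT id
    FROM (
        SELECT
            id,
            ROW_NUMBER() OVER (PARTITION BY employee ORDER BY sale DESC, id) AS row_num
        FROM
            sales
    ) AS tmp
    WHERE row_num > 1
);

-- 3. Check what remains, then keep or discard the change.
SELECT COUNT(*) AS remaining_rows FROM sales;

ROLLBACK;    -- discard the deletion
-- COMMIT;   -- or run this instead to make it permanent
```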

Summary

ROW_NUMBER() is useful in a variety of situations, such as:

  1. Ranking within each group
  2. Extracting the top N rows
  3. Detecting and deleting duplicates

This makes complex data processing and analysis simpler and more efficient.

4. Comparison with Other Window Functions

In MySQL 8.0, in addition to ROW_NUMBER(), there are window functions such as RANK() and DENSE_RANK() that can be used for ranking and position calculations. While they serve similar purposes, their behavior and results differ. Here we compare the functions and explain when to use each one.

4-1. RANK() function

The RANK() function assigns ranks, giving the same rank to equal values and skipping the rank numbers that follow a tie.

Basic syntax

-- RANK is a reserved word in MySQL 8.0, so it cannot be used as an unquoted alias.
SELECT
    column_name,
    RANK() OVER (PARTITION BY group_column ORDER BY sort_column) AS rank_num
FROM
    table_name;

Example

Using the following data, calculate sales ranks.

| employee | department | sale |
|----------|------------|------|
| A | Sales Department | 800 |
| B | Sales Department | 800 |
| C | Sales Department | 600 |
| D | Sales Department | 500 |

Example query: using RANK()

SELECT
    employee,
    sale,
    RANK() OVER (ORDER BY sale DESC) AS sales_rank
FROM
    sales;

Result:

| employee | sale | sales_rank |
|----------|------|------------|
| A | 800 | 1 |
| B | 800 | 1 |
| C | 600 | 3 |
| D | 500 | 4 |

Key points:

  • A and B, who have the same sales amount (800), both receive rank 1.
  • The next rank, 2, is skipped, so C receives rank 3.
4-2. DENSE_RANK() function

The DENSE_RANK() function also gives the same rank to equal values, but it does not skip the next rank number.

Basic syntax

-- DENSE_RANK is a reserved word in MySQL 8.0, so it cannot be used as an unquoted alias.
SELECT
    column_name,
    DENSE_RANK() OVER (PARTITION BY group_column ORDER BY sort_column) AS dense_rank_num
FROM
    table_name;

Example

Using the same data as above, try the DENSE_RANK() function.

Example query: using DENSE_RANK()

SELECT
    employee,
    sale,
    DENSE_RANK() OVER (ORDER BY sale DESC) AS dense_sales_rank
FROM
    sales;

Result:

| employee | sale | dense_sales_rank |
|----------|------|------------------|
| A | 800 | 1 |
| B | 800 | 1 |
| C | 600 | 2 |
| D | 500 | 3 |

Key points:

  • A and B, who have the same sales amount (800), both receive rank 1.
  • Unlike RANK(), the next rank is 2, so the ranks stay consecutive.

4-3. How ROW_NUMBER() differs

The ROW_NUMBER() function differs from the other two because it assigns a unique number even when values are equal.

Example

SELECT
    employee,
    sale,
    ROW_NUMBER() OVER (ORDER BY sale DESC) AS row_num
FROM
    sales;

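Result:

| employee | sale | row_num |
|----------|------|---------|
| A | 800 | 1 |
| B | 800 | 2 |
| C | 600 | 3 |
| D | 500 | 4 |

Key points:

  • Even when values are equal, each row gets a unique number, so no number is repeated. Which of the tied rows (A or B) receives 1 is not guaranteed unless ORDER BY includes a tiebreaker column.
  • This is useful when you need strict control over ordering or a unique number for each row.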

4-4. Quick usage summary

| Function | Ranking behavior | Typical use case |
|----------|------------------|------------------|
| ROW_NUMBER() | Assigns a unique number | When you need sequential numbering or unique identification per row |
| RANK() | Same rank for ties; skips the next rank number | When you want rankings with gaps reflecting ties |
| DENSE_RANK() | Same rank for ties; does not skip rank numbers | When you want continuous ranks without gaps |
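
To see the three behaviors side by side, all three functions can be evaluated over the same window in a single query. The sketch below assumes the tie data from section 4-1 (A and B at 800, C at 600, D at 500); the window name w and the column aliases are illustrative choices.

```sql
-- One named window shared by all three functions.
SELECT
    employee,
    sale,
    ROW_NUMBER() OVER w AS row_num,
    RANK()       OVER w AS sales_rank,
    DENSE_RANK() OVER w AS dense_sales_rank
FROM
    sales
WINDOW w AS (ORDER BY sale DESC)
ORDER BY
    sale DESC,
    employee;
```

For the tied rows A and B, sales_rank and dense_sales_rank are both 1, while row_num is 1 for one of them and 2 for the other. C then gets sales_rank 3 but dense_sales_rank 2.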

Summary

ROW_NUMBER(), RANK(), and DENSE_RANK() should each be used according to the situation.

  1. ROW_NUMBER() is best when you need a unique number for each row.
  2. RANK() is useful when you want ties to share a rank and want the gaps in the ranking to be visible.
  3. DENSE_RANK() is suitable when you need consecutive ranks without gaps.

5. Alternatives for MySQL Versions Earlier Than 8.0

In MySQL versions earlier than 8.0, ROW_NUMBER() and other window functions are not supported. However, you can achieve similar behavior using user-defined variables. This section describes practical alternatives for MySQL versions earlier than 8.0.

5-1. Sequential numbering using user-defined variables

In MySQL 5.7 and earlier, you can use user-defined variables to assign sequential numbers to each row. Let's look at the following example.

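Example: sales ranking by department

Sample data:

| employee | department | sale |
|----------|------------|------|
| A | Sales Department | 500 |
| B | Sales Department | 800 |
| C | Development Department | 600 |
| D | Development Department | 700 |

Query:

SET @row_num = 0;
SET @dept = '';

SELECT
    employee,
    department,
    sale,
    -- Restart at 1 whenever the department changes; otherwise increment.
    @row_num := IF(@dept = department, @row_num + 1, 1) AS sales_rank,
    -- Remember the current department for the next row.
    @dept := department AS dept_tracker
FROM
    (SELECT * FROM sales ORDER BY department, sale DESC) AS sorted_sales;

Result (the helper column dept_tracker is omitted; rows follow the ORDER BY department, sale DESC of the subquery):

| employee | department | sale | sales_rank |
|----------|------------|------|------------|
| D | Development Department | 700 | 1 |
| C | Development Department | 600 | 2 |
| B | Sales Department | 800 | 1 |
| A | Sales Department | 500 | 2 |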

5-2. Extracting the top N rows

To get the top N rows, you can use user-defined variables in the same way.

Query:

SET @row_num = 0;
SET @dept = '';

SELECT *
FROM (
    SELECT
        employee,
        department,
        sale,
        @row_num := IF(@dept = department, @row_num + 1, 1) AS sales_rank,
        @dept := department AS dept_tracker
    FROM
        (SELECT * FROM sales ORDER BY department, sale DESC) AS sorted_sales
) AS ranked_sales
WHERE sales_rank <= 3;

Result (the helper column dept_tracker is omitted):

| employee | department | sale | sales_rank |
|----------|------------|------|------------|
| D | Development Department | 700 | 1 |
| C | Development Department | 600 | 2 |
| B | Sales Department | 800 | 1 |
| A | Sales Department | 500 | 2 |

This query assigns ranks per department and then extracts the rows that fall within the top 3.

5-3. Detecting and deleting duplicate data

You can also handle duplicate data using user-defined variables.

Example query: detect duplicate data

SET @row_num = 0;
SET @id_check = '';

SELECT *
FROM (
    SELECT
        id,
        name,
        @row_num := IF(@id_check = name, @row_num + 1, 1) AS dup_num,
        @id_check := name AS name_tracker
    FROM
        (SELECT * FROM customers ORDER BY name, id) AS sorted_customers
) AS tmp
WHERE dup_num > 1;

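Example query: delete duplicate data

-- Reset the variables first; they keep their values from earlier statements.
SET @row_num = 0;
SET @id_check = '';

DELETE FROM customers
WHERE id IN (
    SELECT id
    FROM (
        SELECT
            id,
            @row_num := IF(@id_check = name, @row_num + 1, 1) AS dup_num,
            @id_check := name AS name_tracker
        FROM
            (SELECT * FROM customers ORDER BY name, id) AS sorted_customers
    ) AS tmp
    WHERE dup_num > 1
);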

5-4. Caveats when using user-defined variables

  1. Session dependence
  • User-defined variables belong to the current session and keep their values between statements. Reinitialize them with SET before every query, and note that they are not visible from other sessions.
  2. Dependence on processing order
  • The result depends on the order in which rows are processed, so specifying ORDER BY appropriately is essential. MySQL does not guarantee the evaluation order of expressions that both read and assign a user variable, so verify the output on your server version.
  3. SQL readability and maintainability
  • Queries tend to become complex, so in MySQL 8.0 and later, window functions are recommended. Assigning user variables inside expressions (the := syntax used here) is deprecated in MySQL 8.0.
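
If the caveats above are a concern, one variable-free alternative on older versions is a correlated subquery that counts how many rows in the same group have a higher value. This is a different technique from the user-variable approach described in this section, shown here only as a sketch. It behaves like RANK() rather than ROW_NUMBER() (tied rows share a number), and it can be slow on large tables because the subquery runs once per row.

```sql
-- Rank = 1 + number of rows in the same department with a strictly higher sale.
-- Works without window functions or user variables; ties share the same rank.
SELECT
    s.employee,
    s.department,
    s.sale,
    (
        SELECT COUNT(*)
        FROM sales AS s2
        WHERE s2.department = s.department
          AND s2.sale > s.sale
    ) + 1 AS sales_rank
FROM
    sales AS s
ORDER BY
    s.department,
    sales_rank;
```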

Summary

In MySQL versions earlier than 8.0, you can use user-defined variables to implement sequential numbering and ranking in place of window functions. However, because the queries become more complex, it is best to consider migrating to a newer version whenever possible.

6. Caveats and Best Practices

MySQL's ROW_NUMBER() function and the variable-based workarounds are very convenient, but there are important points to keep in mind to make them run correctly and efficiently. This section explains practical caveats and best practices for performance optimization.

6-1. Performance considerations

1. Cost of ORDER BY

ROW_NUMBER() is almost always used together with ORDER BY. Because this requires sorting, processing time can increase significantly for large datasets.

Solutions:

  • Use indexes: Add indexes on the columns used in ORDER BY to speed up sorting.
  • Use LIMIT: Fetch only the number of rows you need to reduce the amount of data returned. Note that LIMIT is applied after the window function has been evaluated, so it does not reduce the number of rows that are sorted; use WHERE to narrow the input rows.

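Example:

-- RANK is a reserved word in MySQL 8.0, so the alias is named sales_rank.
-- The outer ORDER BY makes it predictable which 1000 rows are returned.
SELECT
    employee,
    sale,
    ROW_NUMBER() OVER (PARTITION BY department ORDER BY sale DESC) AS sales_rank
FROM
    sales
ORDER BY
    department, sales_rank
LIMIT 1000;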
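
LIMIT only trims the rows that are returned after the numbering has already been computed. To reduce the number of rows that are actually sorted and numbered, the input can be narrowed with WHERE, which is applied before the window function. A minimal sketch, using the department value from the sample data:

```sql
-- The WHERE clause is applied before the window function,
-- so only one department's rows are sorted and numbered.
SELECT
    employee,
    sale,
    ROW_NUMBER() OVER (ORDER BY sale DESC) AS row_num
FROM
    sales
WHERE
    department = 'Sales Department'
ORDER BY
    row_num
LIMIT 1000;
```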

2. Increased memory usage and disk I/O

Window functions are processed using temporary tables and memory. As the data grows, memory usage and disk I/O can increase.

Solutions:

  • Split queries: Break the processing into smaller queries and fetch data step by step to reduce the load.
  • Use temporary tables: Store the extracted data in a temporary table and run further processing from there to spread the workload.

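6-2. Query tuning tips

1. Check the execution plan

In MySQL, you can use EXPLAIN to check a query's execution plan. This helps you verify whether indexes are being used effectively.

Example:

EXPLAIN
SELECT
    employee,
    ROW_NUMBER() OVER (PARTITION BY department ORDER BY sale DESC) AS sales_rank
FROM
    sales;

Sample output:

| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
|----|-------------|-------|------|---------------|-----|---------|-----|------|-------|
| 1 | SIMPLE | sales | index | NULL | sale | 4 | NULL | 500 | Using index |

If you see Using index in the Extra column, the query is being answered from the index alone. Using filesort, by contrast, indicates that MySQL performs an additional sort. The actual output depends on your schema and indexes.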
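
Beyond the tabular output, MySQL 8.0 offers two other forms of EXPLAIN that can make the sort and window steps easier to see. The sketch below assumes MySQL 8.0.16 or later for FORMAT=TREE and 8.0.18 or later for EXPLAIN ANALYZE; the alias row_num is an illustrative choice.

```sql
-- Tree-format plan: shows the sort and window steps as a nested outline.
EXPLAIN FORMAT=TREE
SELECT
    employee,
    ROW_NUMBER() OVER (PARTITION BY department ORDER BY sale DESC) AS row_num
FROM
    sales;

-- EXPLAIN ANALYZE actually executes the query and adds measured timings
-- and row counts to each step.
EXPLAIN ANALYZE
SELECT
    employee,
    ROW_NUMBER() OVER (PARTITION BY department ORDER BY sale DESC) AS row_num
FROM
    sales;
```

Because EXPLAIN ANALYZE runs the query, it is best tried on a test copy of large tables first.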

2. Optimize indexes

Make sure to add indexes on the columns used in ORDER BY and WHERE. Pay particular attention to the following.

  • Single-column indexes: Good for simple sort conditions
  • Composite indexes: Effective when multiple columns are involved in the condition

Example:

CREATE INDEX idx_department_sale ON sales(department, sale DESC);

Descending index keys such as sale DESC are supported in MySQL 8.0 and later; earlier versions accept the DESC keyword but ignore it.

3. Use batch processing

Instead of processing a large dataset all at once, you can reduce the load by processing the data in batches.

Example:

-- ORDER BY keeps the batches stable between queries; assumes a unique id column.
SELECT * FROM sales WHERE department = 'Sales Department' ORDER BY id LIMIT 1000 OFFSET 0;
SELECT * FROM sales WHERE department = 'Sales Department' ORDER BY id LIMIT 1000 OFFSET 1000;
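
With OFFSET, the server still has to read and discard all the skipped rows, so later batches get progressively slower. A commonly used alternative, shown here as a sketch rather than as the method of this article, is to continue from the last key seen (keyset pagination). It assumes the sales table has a unique, indexed, positive id column, which the sample data does not specify.

```sql
-- First batch: start from the beginning of the id range.
SELECT id, employee, department, sale
FROM sales
WHERE department = 'Sales Department'
  AND id > 0
ORDER BY id
LIMIT 1000;

-- Next batch: replace 1000 with the largest id returned by the previous batch.
SELECT id, employee, department, sale
FROM sales
WHERE department = 'Sales Department'
  AND id > 1000
ORDER BY id
LIMIT 1000;
```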

6-3. Maintaining data consistency

1. Updates and renumbering

When rows are inserted or deleted, the numbers can change. Build a mechanism to recompute the numbers as needed.

Example:

-- RANK is a reserved word in MySQL 8.0, so the column is named sales_rank.
CREATE VIEW ranked_sales AS
SELECT
    employee,
    department,
    sale,
    ROW_NUMBER() OVER (PARTITION BY department ORDER BY sale DESC) AS sales_rank
FROM
    sales;

Using a view helps keep the rankings current, because the numbering is recomputed from the latest data every time the view is queried.
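
The effect can be checked with a small experiment. The sketch below defines its own view under a hypothetical name (department_sales_ranking) so that it can be run on its own, and it assumes the four sample rows from section 2 are in the sales table. Employee E is a made-up row added purely for illustration.

```sql
-- Hypothetical view name; the numbering is computed each time the view is read.
CREATE OR REPLACE VIEW department_sales_ranking AS
SELECT
    employee,
    department,
    sale,
    ROW_NUMBER() OVER (PARTITION BY department ORDER BY sale DESC) AS row_num
FROM
    sales;

-- Before the insert: B is numbered 1 and A is numbered 2.
SELECT * FROM department_sales_ranking
WHERE department = 'Sales Department'
ORDER BY row_num;

-- Add an illustrative row with a higher sale.
INSERT INTO sales (employee, department, sale)
VALUES ('E', 'Sales Department', 900);

-- After the insert: E is numbered 1, and B and A shift down to 2 and 3.
SELECT * FROM department_sales_ranking
WHERE department = 'Sales Department'
ORDER BY row_num;
```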

6-4. Best-practice query example

Below is a best-practice example that takes performance and maintainability into account.

Example: extract the top N rows

WITH RankedSales AS (
    SELECT
        employee,
        department,
        sale,
        ROW_NUMBER() OVER (PARTITION BY department ORDER BY sale DESC) AS sales_rank
    FROM
        sales
)
SELECT *
FROM RankedSales
WHERE sales_rank <= 3;

This pattern uses a common table expression (CTE) to improve readability and reusability.

Summary

When using ROW_NUMBER() or its alternatives, keep these points in mind:

  1. Improve speed through index optimization.
  2. Identify bottlenecks by checking the execution plan.
  3. Plan for data updates and maintain consistency.
  4. Use batch processing and CTEs to spread the load.

Applying these best practices enables efficient processing for large-scale data analysis and reporting.

7. Conclusion

In this article, we focused on MySQL's ROW_NUMBER() function, covering everything from basic usage and practical examples to alternatives for older versions, along with caveats and best practices. In this section, we review the main points and summarize the key practical takeaways.

7-1. Why ROW_NUMBER() is useful

The ROW_NUMBER() function is especially convenient for data analysis and reporting in the following ways:

  1. Sequential numbering within groups: Easily create sales rankings by department or category-based rankings.
  2. Extracting the top N rows: Filter and extract data efficiently based on specific conditions.
  3. Detecting and deleting duplicate data: Helps clean up and organize data.

Because it simplifies complex queries, it greatly improves the readability and maintainability of SQL.

7-2. Comparison with other window functions

Compared with window functions such as RANK() and DENSE_RANK(), ROW_NUMBER() differs in that it assigns a unique number even to equal values.

| Function | Feature | Use case |
|----------|---------|----------|
| ROW_NUMBER() | Assigns a unique sequential number to each row | Best when you need unique identification or ranking with no duplicates |
| RANK() | Same rank for ties; skips the next rank number | When you need tie-aware rankings and rank gaps matter |
| DENSE_RANK() | Same rank for ties; does not skip rank numbers | When you want continuous ranking while handling ties |

Choosing the right function:
Choosing the function that best fits your purpose enables efficient data processing.

7-3. Handling older MySQL versions

For environments running versions earlier than MySQL 8.0, we also presented approaches based on user-defined variables. However, you should keep these caveats in mind:

  • Reduced readability due to more complex SQL
  • Query optimization can be more difficult in some cases
  • Additional care may be needed to maintain data consistency

If possible, strongly consider migrating to MySQL 8.0 or later and using window functions.

7-4. Key points for performance optimization

  1. Use indexes: Add indexes on the columns used in ORDER BY to improve speed.
  2. Check execution plans: Verify performance in advance using EXPLAIN.
  3. Adopt batch processing: Handle large datasets in smaller chunks to spread the load.
  4. Use views and CTEs: Improve reuse and simplify complex queries.

By applying these techniques, you can achieve efficient and stable data processing.

7-5. Final notes

ROW_NUMBER() is a powerful tool that can greatly improve the efficiency of data analysis.
In this article, we covered everything from the basic syntax and practical examples to caveats and alternatives.

We encourage you to run the queries yourself as you follow along with this article. Improving your SQL skills will help you tackle complex data analysis and reporting with confidence.
