MySQL COUNT(DISTINCT) Explained: How to Count Unique Values Efficiently

1. Introduction

When managing a database, you may encounter situations such as, “How many different countries have been registered?” or “How many unique email addresses are there?”
In such cases, you can use MySQL's COUNT(DISTINCT column_name) to count the unique values in a column, with duplicates and NULLs left out.

This article explains the following topics in detail:

  • The basics of COUNT() and DISTINCT
  • The correct usage of COUNT(DISTINCT column_name)
  • How to count unique values across multiple columns
  • How to improve the performance of COUNT(DISTINCT)

Even beginners can easily understand this guide, as we explain everything with practical examples and SQL queries. Be sure to read through to the end.

2. Basics of Counting Data in MySQL (COUNT)

When analyzing data in a database, the most fundamental function is COUNT().
Let’s first understand how COUNT() works.

2.1 Difference Between COUNT(*) and COUNT(column_name)

The COUNT() function in MySQL can be used in the following two ways:

COUNT Function     | Description
COUNT(*)           | Counts all records in the table (including NULL values)
COUNT(column_name) | Counts non-NULL values in a specific column

2.2 Basic COUNT() Examples

Here, we will use the following users table as an example:

id | name   | email              | country
1  | Taro   | taro@example.com   | Japan
2  | Hanako | hanako@example.com | Japan
3  | John   | NULL               | United States
4  | Tanaka | tanaka@example.com | Japan

① Retrieve the Total Number of Records in the Table

SELECT COUNT(*) FROM users;

→ Result: 4 (Total number of records)

② Retrieve the Number of Non-NULL Values in a Specific Column

SELECT COUNT(email) FROM users;

→ Result: 3 (Number of non-NULL email values)

💡 Key Points:

  • COUNT(*) counts every row, even rows in which some columns are NULL.
  • COUNT(email) excludes NULL values when counting.
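To try the queries in this article, the sample table can be recreated with a short script. This is only a sketch: the column types and constraints are assumptions, since the article shows the data but not the table definition, and the script expects that no table named users exists yet.

```sql
-- Assumed schema for the sample table (types are illustrative guesses).
CREATE TABLE users (
  id      INT PRIMARY KEY,
  name    VARCHAR(50)  NOT NULL,
  email   VARCHAR(255) NULL,      -- NULL is allowed, as in John's row
  country VARCHAR(50)  NOT NULL
);

INSERT INTO users (id, name, email, country) VALUES
  (1, 'Taro',   'taro@example.com',   'Japan'),
  (2, 'Hanako', 'hanako@example.com', 'Japan'),
  (3, 'John',   NULL,                 'United States'),
  (4, 'Tanaka', 'tanaka@example.com', 'Japan');

-- Both counts in one statement:
-- total_rows = 4, rows_with_email = 3 (John's NULL email is skipped).
SELECT COUNT(*)     AS total_rows,
       COUNT(email) AS rows_with_email
FROM users;
```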

3. Retrieving Data Without Duplicates (DISTINCT)

When aggregating data, there are many cases where you want to retrieve only unique values.
In such situations, DISTINCT is very useful.

3.1 Basics of DISTINCT

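DISTINCT removes duplicate rows from the result set. When a single column is selected, each value of that column is returned once; when several columns are selected, DISTINCT applies to the combination of all of them.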

Basic Syntax

SELECT DISTINCT column_name FROM table_name;

3.2 Example of Using DISTINCT

By executing the following SQL query, you can retrieve a list of unique country names registered by users.

SELECT DISTINCT country FROM users;

→ Result:

country
Japan
United States

3.3 Difference Between DISTINCT and GROUP BY

Feature | DISTINCT                    | GROUP BY
Purpose | Retrieve unique values      | Perform aggregation by group
Usage   | SELECT DISTINCT column_name | SELECT column_name, COUNT(*) GROUP BY column_name
Example | Retrieve unique countries   | Count users per country

💡 Key Points:

  • DISTINCT simply removes duplicate data.
  • GROUP BY groups data and is used together with aggregate functions.
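The difference can be seen by running both forms against the sample users table from section 2.2. The following is a minimal sketch; the alias user_count is an illustrative name, and ORDER BY is added only to make the row order predictable.

```sql
-- DISTINCT: one row per unique country, nothing else.
SELECT DISTINCT country
FROM users
ORDER BY country;

-- GROUP BY without an aggregate returns the same set of countries.
SELECT country
FROM users
GROUP BY country
ORDER BY country;

-- GROUP BY with an aggregate adds a per-group figure:
-- Japan -> 3, United States -> 1.
SELECT country, COUNT(*) AS user_count
FROM users
GROUP BY country
ORDER BY country;
```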

4. How to Use COUNT(DISTINCT column_name)

By using COUNT(DISTINCT column_name), you can retrieve the number of unique values.

4.1 Basics of COUNT(DISTINCT)

Basic Syntax

SELECT COUNT(DISTINCT column_name) FROM table_name;

4.2 Example of COUNT(DISTINCT)

SELECT COUNT(DISTINCT country) FROM users;

→ Result: 2 (Two types: “Japan” and “United States”)

4.3 Using COUNT(DISTINCT) with Conditions

SELECT COUNT(DISTINCT email) FROM users WHERE country = 'Japan';

→ Result: 3 (Number of unique email values registered in Japan; Taro, Hanako, and Tanaka each have a different address)

💡 Key Points:

  • COUNT(DISTINCT column_name) excludes NULL values and counts only unique data.
  • By using a WHERE clause, you can count records that meet specific conditions.
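When the same figure is needed for every country at once, COUNT(DISTINCT) can also be combined with GROUP BY instead of running one WHERE query per country. A small sketch against the sample users table from section 2.2 (the alias unique_emails is an illustrative name):

```sql
-- Unique, non-NULL email addresses per country.
-- Japan -> 3; United States -> 0, because John's email is NULL
-- and COUNT(DISTINCT ...) ignores NULL values.
SELECT country,
       COUNT(DISTINCT email) AS unique_emails
FROM users
GROUP BY country
ORDER BY country;
```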

5. Using COUNT(DISTINCT) with Multiple Columns

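In MySQL, COUNT(DISTINCT column1, column2) can be used directly: MySQL accepts a list of expressions and counts the distinct combinations in which no expression is NULL. Because this is a MySQL extension that many other database systems reject, a common portable workaround is to combine columns using CONCAT() and treat them as a single value.

5.1 How COUNT(DISTINCT column1, column2) Behaves

MySQL documents the syntax as COUNT(DISTINCT expr, [expr...]). Rows in which any of the listed expressions is NULL are not counted.
Standard SQL allows only a single expression inside COUNT(DISTINCT ...), which is why the CONCAT() approach is so widely shown.

5.2 How to Count Unique Combinations Across Multiple Columns

To count unique combinations of multiple columns, you can either list the columns directly in COUNT(DISTINCT ...) or, for portability, combine the columns using CONCAT() and then apply COUNT(DISTINCT) to the result. The example below uses the CONCAT() form and assumes that the users table also has a city column.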

Example: Count Unique Combinations of Country and City

SELECT COUNT(DISTINCT CONCAT(country, '-', city)) FROM users;

💡 Key Points:

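  • Using CONCAT(column1, '-', column2) combines multiple columns into a single value. Choose a separator that cannot appear in the data; otherwise two different combinations can produce the same string.
  • CONCAT() returns NULL if any argument is NULL, so rows with a NULL in either column are not counted.
  • COUNT(DISTINCT CONCAT(...)) returns the number of unique combinations across multiple columns.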
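MySQL's COUNT() also accepts a list of expressions after DISTINCT, so the built-in form and the CONCAT() form can be compared side by side. The sketch below is self-contained and assumes MySQL 8.0 or later (it uses a common table expression); the locations data is hypothetical and includes two rows chosen to collide on the '-' separator.

```sql
-- Hypothetical sample data; not part of the article's users table.
WITH locations (country, city) AS (
  SELECT 'Japan',         'Tokyo' UNION ALL
  SELECT 'Japan',         'Tokyo' UNION ALL   -- exact duplicate
  SELECT 'Japan',         'Osaka' UNION ALL
  SELECT 'United States', NULL    UNION ALL   -- NULL city
  SELECT 'a-b',           'c'     UNION ALL   -- these two rows both
  SELECT 'a',             'b-c'               -- concatenate to 'a-b-c'
)
SELECT
  -- MySQL extension: distinct combinations with no NULL member -> 4
  COUNT(DISTINCT country, city)              AS native_pairs,
  -- CONCAT form: the separator collision merges two combinations -> 3
  COUNT(DISTINCT CONCAT(country, '-', city)) AS concat_pairs
FROM locations;
```

Both forms skip the row with the NULL city. The difference in the two results comes only from the separator collision, which is the main risk to check for before relying on the CONCAT() form.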

6. Performance Tuning for COUNT(DISTINCT)

COUNT(DISTINCT) can impact performance, so optimization may be necessary.
When working with large datasets, it is recommended to consider using indexes or alternative approaches.

6.1 Why COUNT(DISTINCT) Can Be Slow

  • MySQL often scans all records to apply DISTINCT.
  • If indexes are not properly configured, query execution becomes slower.
  • The more rows and distinct values there are, the more work (and possibly temporary storage) MySQL needs to remove duplicates.

6.2 Index Optimization to Speed Up COUNT(DISTINCT)

When handling large amounts of data, you can improve query performance by adding an index to the target column.

How to Add an Index

ALTER TABLE users ADD INDEX (country);

Check the Query Execution Plan Using an Index

EXPLAIN SELECT COUNT(DISTINCT country) FROM users;

💡 Key Points:

  • Using EXPLAIN allows you to check how MySQL processes a query.
  • Applying an index may help avoid full table scans and improve search performance.
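The index can also be given an explicit name, which makes it easier to find and drop later. A sketch, assuming the sample users table and a hypothetical index name idx_users_country; use this in place of the unnamed ALTER TABLE statement rather than in addition to it, so that the same column is not indexed twice.

```sql
-- Create a named index on the column used in COUNT(DISTINCT ...).
CREATE INDEX idx_users_country ON users (country);

-- List the indexes on the table to confirm it exists.
SHOW INDEX FROM users;

-- Inspect the plan. If the optimizer chose the index, its name should
-- appear in the "key" column of the EXPLAIN output. The exact plan
-- depends on the MySQL version, table size, and statistics.
EXPLAIN SELECT COUNT(DISTINCT country) FROM users;
```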

6.3 Alternative Method: GROUP BY + COUNT

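Depending on the aggregation requirement, using GROUP BY may provide better performance. Note that the query below returns one row per country with its user count rather than a single number; the number of rows returned equals the number of unique countries.

Example: Count Rows per Unique Value Using GROUP BY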

SELECT country, COUNT(*) FROM users GROUP BY country;

💡 Key Points:

  • GROUP BY may offer better performance compared to COUNT(DISTINCT) in some cases.
  • It is especially useful when you need to group and aggregate data simultaneously.
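If only the number of unique values is needed, not the per-country breakdown, the grouped result can be wrapped in a derived table and counted. This is a sketch against the sample users table from section 2.2; the aliases grouped and unique_countries are illustrative, and whether it runs faster than COUNT(DISTINCT) should be checked with EXPLAIN on real data.

```sql
-- Count the groups instead of using COUNT(DISTINCT country).
-- The WHERE clause mirrors COUNT(DISTINCT), which ignores NULL values.
-- Result for the sample data: 2 (Japan and United States).
SELECT COUNT(*) AS unique_countries
FROM (
  SELECT country
  FROM users
  WHERE country IS NOT NULL
  GROUP BY country
) AS grouped;
```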

7. Common Errors and Solutions for COUNT(DISTINCT)

When using COUNT(DISTINCT), you may encounter several common errors.
Here, we introduce typical issues and their solutions.

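7.1 Error 1: COUNT(DISTINCT column1, column2) Fails on Other Databases

Cause of the Error

MySQL accepts COUNT(DISTINCT column1, column2) as an extension to standard SQL and counts the distinct combinations in which no listed column is NULL.
Database systems that follow the standard allow only one expression inside COUNT(DISTINCT ...), so the same query may fail when it is run outside MySQL.

Solution: Use CONCAT()

For a query that does not rely on the MySQL extension, combine the columns and apply COUNT(DISTINCT) to the result.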

SELECT COUNT(DISTINCT CONCAT(country, '-', city)) FROM users;

💡 Key Points:

  • Using CONCAT(column1, '-', column2) builds a single value from multiple columns; pick a separator that does not occur in the data.
  • COUNT(DISTINCT CONCAT(...)) returns the number of unique combinations.

7.2 Error 2: Unexpected Results When NULL Values Are Included

Cause of the Error

  • COUNT(DISTINCT column_name) ignores NULL values, which may cause unexpected results if the column contains NULLs.

Solution: Use IFNULL()

You can replace NULL with another default value (e.g., '' or 'unknown') to ensure proper counting.

SELECT COUNT(DISTINCT IFNULL(email, 'unknown')) FROM users;

💡 Key Points:

  • By using IFNULL(column_name, 'default_value'), you can handle NULL values appropriately.
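The effect of IFNULL() is easiest to see by putting the counts next to each other. A sketch against the sample users table from section 2.2; the column aliases are illustrative, and the placeholder 'unknown' is assumed not to occur as a real email value (if it did, it would be merged with the NULL rows).

```sql
SELECT
  -- NULL emails are ignored -> 3
  COUNT(DISTINCT email)                    AS unique_emails,
  -- all NULL emails collapse into one 'unknown' bucket -> 4
  COUNT(DISTINCT IFNULL(email, 'unknown')) AS unique_emails_incl_null,
  -- number of rows that have no email at all -> 1
  SUM(email IS NULL)                       AS rows_without_email
FROM users;
```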

7.3 Error 3: COUNT(DISTINCT) Is Slow

Cause of the Error

  • COUNT(DISTINCT) scans all data, which can result in slow performance with large datasets.

Solution: Use Indexes

ALTER TABLE users ADD INDEX (country);

💡 Key Points:

  • Adding indexes may improve query performance.
  • Use EXPLAIN to check query optimization status.

EXPLAIN SELECT COUNT(DISTINCT country) FROM users;

By applying these measures, you can enhance the practicality of COUNT(DISTINCT) and avoid performance issues.

8. Frequently Asked Questions (FAQ)

Here are some frequently asked questions about COUNT(DISTINCT).

8.1 What Is the Difference Between COUNT(*) and COUNT(DISTINCT column_name)?

Key Differences

Function                    | Description
COUNT(*)                    | Counts all records (including NULL values)
COUNT(DISTINCT column_name) | Counts unique values (excluding NULL values)

Example Usage

SELECT COUNT(*) FROM users;
SELECT COUNT(DISTINCT email) FROM users;

💡 Key Points:

  • COUNT(*) counts all records.
  • COUNT(DISTINCT column_name) retrieves the number of unique values (excluding NULL).

8.2 What Is the Difference Between DISTINCT and GROUP BY?

Feature | DISTINCT                    | GROUP BY
Purpose | Retrieve unique values      | Perform aggregation by group
Usage   | SELECT DISTINCT column_name | SELECT column_name, COUNT(*) GROUP BY column_name
Example | Retrieve unique countries   | Count users per country

Example Usage

-- Using DISTINCT
SELECT DISTINCT country FROM users;

-- Using GROUP BY
SELECT country, COUNT(*) FROM users GROUP BY country;

💡 Key Points:

  • DISTINCT simply removes duplicate data.
  • GROUP BY groups data and can be combined with aggregate functions.

8.3 Is COUNT(DISTINCT) Slow?

Issue

  • COUNT(DISTINCT) scans all data, so performance may degrade with large datasets.

Solution: Use Indexes

ALTER TABLE users ADD INDEX (country);

Alternative Approach: Use GROUP BY

SELECT country, COUNT(*) FROM users GROUP BY country;

💡 Key Points:

  • Applying indexes may improve search performance.
  • Using GROUP BY may produce faster results than COUNT(DISTINCT) in some cases.

8.4 How Can I Use COUNT(DISTINCT column1, column2)?

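Answer

  • MySQL supports COUNT(DISTINCT column1, column2) directly; it counts the distinct combinations in which neither column is NULL.
  • The syntax is a MySQL extension, so it may not work on other database systems.

Portable Alternative: Use CONCAT()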

SELECT COUNT(DISTINCT CONCAT(country, '-', city)) FROM users;

💡 Key Points:

  • Using CONCAT(column1, '-', column2) builds a single value from multiple columns; pick a separator that does not occur in the data.
  • COUNT(DISTINCT CONCAT(...)) returns the number of unique combinations.

By referring to these questions, you can use COUNT(DISTINCT) more efficiently.

9. Conclusion

In this article, we explained in detail how to use MySQL’s COUNT(DISTINCT) function.
Let’s review the key takeaways.

9.1 What You Learned in This Article

How to count records in MySQL

  • COUNT(*) retrieves the total number of records
  • COUNT(column_name) counts values excluding NULL
  • COUNT(DISTINCT column_name) retrieves the number of unique values

The difference between DISTINCT and COUNT(DISTINCT)

  • DISTINCT retrieves data with duplicates removed
  • COUNT(DISTINCT column_name) counts the number of unique values

How to use COUNT(DISTINCT) with multiple columns

  • MySQL supports COUNT(DISTINCT column1, column2) directly; use CONCAT() when the query must also run on databases that lack this extension

Performance optimization techniques

  • Apply indexes to improve search performance
  • Use GROUP BY + COUNT for faster queries when appropriate

9.2 What You Can Do with This Knowledge

By applying this knowledge, you can perform the following types of data aggregation:
🔹 Count unique users
🔹 Retrieve record counts based on specific conditions
🔹 Count unique data across multiple columns
🔹 Optimize queries for large datasets

When performing data aggregation and optimization in MySQL, be sure to use this guide as a reference!