Home
How to Check Duplicates in Excel: 5 Ways That Actually Work
Duplicate data often sneaks into spreadsheets through manual entry errors, system exports, or merging multiple files. Left unchecked, these redundancies lead to skewed analysis, double-billing, and general reporting chaos. Modern Excel provides several built-in tools and advanced functions to identify these outliers without manually scanning thousands of rows.
Selecting the right method depends on whether you need a quick visual highlight, a dynamic list of unique values, or a deep dive into complex datasets spanning multiple sheets. This walkthrough covers the most reliable techniques available in current versions of Excel, including Microsoft 365 and Excel 2024.
1. Using Conditional Formatting for Instant Visuals
For most everyday tasks, conditional formatting is the fastest way to check duplicates in Excel. It doesn't move or delete data; it simply changes the cell color so you can spot repeats immediately.
The Standard Method
To apply the basic highlight:
- Select the range of cells you want to investigate. This could be a single column like "Email Addresses" or a whole table.
- Navigate to the Home tab on the ribbon.
- Click Conditional Formatting in the Styles group.
- Hover over Highlight Cells Rules and select Duplicate Values....
- In the popup dialog, ensure the left dropdown says "Duplicate". On the right, choose a formatting style (e.g., Light Red Fill with Dark Red Text).
- Click OK.
This method is highly effective for a single column of unique identifiers, such as ID numbers or SKUs. However, if you select multiple columns, Excel highlights every cell that has a duplicate anywhere in that range, which might not be what you want if you are looking for duplicate rows.
Pro-Tip: Highlighting Only the Second Occurrence
Sometimes you want the first entry to remain normal and only subsequent duplicates to turn red. This requires a custom formula rule:
- Select your range (e.g., A2:A100).
- Go to Conditional Formatting > New Rule > Use a formula to determine which cells to format.
- Enter the following formula:
=COUNTIF($A$2:A2, A2)>1. - Set your format and apply.
This specific formula uses a "mixed reference" ($A$2:A2). As Excel checks each row, the range expands, effectively telling the program: "Count how many times the current value has appeared from the start of the list down to this row. If that count is greater than one, highlight it."
2. Using Formulas for More Control (COUNTIF and COUNTIFS)
While visual highlights are great for humans, formulas are better for building automated systems or helper columns that flag data for further processing.
Single Column Checks with COUNTIF
Creating a "Helper Column" is a standard practice for data cleaning. If your data is in Column A, go to Column B and enter:
=COUNTIF(A:A, A2)
Drag this formula down. Any result greater than 1 indicates a duplicate. You can then filter this column to show only values higher than 1, allowing you to review them in bulk.
Multi-Column (Row-Level) Checks with COUNTIFS
Often, a duplicate isn't just a single cell; it’s an entire row where the First Name, Last Name, and Date of Birth all match. To check duplicates in Excel across multiple criteria, use COUNTIFS (the plural version):
=COUNTIFS(A:A, A2, B:B, B2, C:C, C2)
This formula counts how many rows have the same value in Column A and Column B and Column C. If the result is 2 or more, you have a duplicate record. This is significantly more accurate than checking just one column, especially in large databases where two different people might share a last name.
3. The Modern Approach: UNIQUE and FILTER Functions
If you are using Microsoft 365 or Excel 2021/2024, the dynamic array engine has revolutionized how we handle duplicates. Instead of just marking them, you can extract a clean list instantly.
Extracting Unique Values
To see a list of everything except the duplicates, use:
=UNIQUE(A2:A500)
This formula "spills" the results into as many cells as needed. If you change the source data, the unique list updates in real-time.
Finding Only the Duplicates
If your goal is to generate a separate list of the problem entries, you can combine FILTER, UNIQUE, and COUNTIF:
=UNIQUE(FILTER(A2:A500, COUNTIF(A2:A500, A2:A500) > 1))
This logic works as follows:
COUNTIFidentifies the frequency of every item.FILTERkeeps only those with a frequency higher than 1.UNIQUEnarrows that list down so each duplicate is only listed once in your report.
4. How to Check Duplicates in Excel Across Multiple Sheets
Checking for duplicates within one sheet is straightforward, but verifying if a value in Sheet1 already exists in Sheet2 or Sheet3 is a common challenge for inventory management or master contact lists.
The INDIRECT Workaround for Conditional Formatting
Excel’s standard conditional formatting has a historical limitation: it doesn't always play nice with direct references to other worksheets. To get around this, the INDIRECT function is necessary.
Suppose you want to highlight values in Sheet1 that already exist in Sheet2 (Column A).
- Select the cells in
Sheet1(e.g., A2:A100). - Open Conditional Formatting > New Rule > Use a formula.
- Enter:
=COUNTIF(INDIRECT("'Sheet2'!$A:$A"), A2) > 0.
Note on Syntax: If your sheet names contain spaces (e.g., "Master List"), you must wrap the name in single quotes inside the double quotes: "'Master List'!$A:$A". Without these quotes, the formula will return an error.
The VLOOKUP or MATCH Method
For a more permanent indicator in a helper column, use ISNUMBER combined with MATCH:
=ISNUMBER(MATCH(A2, 'Sheet2'!A:A, 0))
This returns TRUE if the value in A2 exists anywhere in Sheet2, and FALSE if it doesn't. This is often faster for the calculation engine than COUNTIF when dealing with very large datasets across sheets.
5. Professional Data Cleaning with Power Query
When your data reaches tens of thousands of rows, traditional formulas can slow Excel to a crawl. Power Query is the built-in tool designed for this scale. It’s a "non-destructive" way to check duplicates, meaning your original data remains untouched while you create a cleaned version elsewhere.
Steps to Identify Duplicates in Power Query:
- Select your data range and go to the Data tab.
- Click From Table/Range. This opens the Power Query Editor.
- Select the column(s) you want to check for duplicates. You can hold
Ctrlto select multiple columns for row-level checking. - Right-click the column header and select Keep Duplicates.
Wait, why "Keep Duplicates"? This is a great diagnostic tool. It filters the entire table to show only the rows that are repeated, allowing you to investigate why they exist before you commit to deleting them.
If you simply want to remove them:
- Instead of "Keep Duplicates," select Remove Duplicates.
- Click Close & Load to return the cleaned data to a new worksheet.
Power Query is particularly useful because it remembers these steps. Next month, when you paste new data into the original table, you can simply click Data > Refresh, and the duplicates will be automatically identified or removed based on your previous settings.
Addressing "Hidden" Duplicates: Why Methods Fail
Sometimes you follow the steps perfectly, but Excel fails to find a duplicate that you can clearly see with your own eyes. This is almost always due to "dirty data."
Leading and Trailing Spaces
Excel treats "Apple" and "Apple " (with a space at the end) as two entirely different values. To fix this, use the TRIM function before running your duplicate checks:
=TRIM(A2)
Non-Printing Characters
Data imported from web systems often contains hidden characters (like non-breaking spaces). The CLEAN function can help remove these:
=CLEAN(TRIM(A2))
Case Sensitivity
By default, COUNTIF and MATCH are not case-sensitive. "EXCEL" and "excel" are treated as duplicates. If you need a case-sensitive check, you must use the EXACT function. For example:
=SUMPRODUCT(--EXACT(A2, $A$2:$A$100)) > 1
This formula compares A2 against the range using EXACT, which respects capitalization. The double-negative (--) converts TRUE/FALSE results into 1s and 0s so SUMPRODUCT can add them up.
Choosing the Right Tool for the Job
Not every method is appropriate for every situation. Consider these guidelines for your next project:
- Small lists (under 5,000 rows): Use Conditional Formatting. It’s visual and requires no extra columns.
- Dynamic Reports: Use the UNIQUE function. It’s perfect for dashboards where the source data changes frequently.
- Audit Trails: Use a Helper Column with COUNTIF. This allows you to keep the original data while clearly labeling which rows are problematic.
- Cross-Workbook/Sheet Verification: Use MATCH or INDIRECT. These are more robust for linking data across the file.
- Large-Scale Data Cleaning: Use Power Query. It handles millions of rows more efficiently than any cell-based formula and provides a repeatable workflow.
Before performing any major de-duplication, particularly when using the Data > Remove Duplicates button (which permanently deletes data), it is prudent to create a backup copy of your worksheet. While the methods described above are safe, having a recovery point ensures that an accidental over-deletion doesn't result in the loss of critical information.
By mastering these different ways to check duplicates in Excel, you move beyond basic data entry and into the realm of data integrity, ensuring that your insights are based on clean, accurate, and unique records.
-
Topic: find and remove duplicates - microsoft supporthttps://support.microsoft.com/en-us/office/find-and-remove-duplicates-00e35bea-b46a-4d5d-b28e-66a552dc138d
-
Topic: I have a spread sheet that has multiple sheets and want to check if any of the values in column A are duplicated in any of the other sheets. - Microsoft Q& Ahttps://learn.microsoft.com/en-au/answers/questions/5854466/i-have-a-spread-sheet-that-has-multiple-sheets-and
-
Topic: The Ultimate Guide to Finding and Removing Duplicates in Excelhttps://www.wps.com/blog/the-ultimate-guide-to-finding-and-removing-duplicates-in-excel/