Tuesday 29 May 2018

How To Find Duplicate Values in SQL Server



Create Tables-
Duplicates happen a lot in the database because of many reasons. Finding duplicate values is one of the important tasks that you must deal with when working with the databases.
For the demonstration, we will create a table named contacts with four columns: idfirst_namelast_name, and email.

CREATE TABLE #contacts (
    id INT PRIMARY KEY  IDENTITY (1,1),
    first_name VARCHAR(50) NOT NULL,
    last_name VARCHAR(50) NOT NULL,
    email VARCHAR(255) NOT NULL
)


INSERT INTO #contacts (first_name,last_name,email)
VALUES ('Carine ','Schmitt','carine.schmitt@verizon.net'),
       ('Jean','King','jean.king@me.com'),
       ('Peter','Ferguson','peter.ferguson@google.com'),
       ('Janine ','Labrune','janine.labrune@aol.com'),
       ('Jonas ','Bergulfsen','jonas.bergulfsen@mac.com'),
       ('Janine ','Labrune','janine.labrune@aol.com'),
       ('Susan','Nelson','susan.nelson@comcast.net'),
       ('Zbyszek ','Piestrzeniewicz','zbyszek.piestrzeniewicz@att.net'),
       ('Roland','Keitel','roland.keitel@yahoo.com'),
       ('Julie','Murphy','julie.murphy@yahoo.com'),
       ('Kwai','Lee','kwai.lee@google.com'),
       ('Jean','King','jean.king@me.com'),
       ('Susan','Nelson','susan.nelson@comcast.net'),
       ('Roland','Keitel','roland.keitel@yahoo.com');
      

Find duplicate values in one column

The find duplicate values in a table based on one column, you use the following statement:
     
 SELECT
    email,
    COUNT(email)
FROM
    #contacts
GROUP BY email
HAVING COUNT(email) > 1;

Find duplicate values in multiple columns

Sometimes, you want to find duplicate based on multiple columns instead of one. In this case, you can use the following query:
For example, to find rows in the contacts table with duplicate values in first_namelast_name, and email column, you use the following query: 

SELECT
    first_name, COUNT(first_name),
    last_name,  COUNT(last_name),
    email,      COUNT(email)
FROM
    #contacts
GROUP BY
    first_name ,
    last_name ,
    email
HAVING  COUNT(first_name) > 1
    AND COUNT(last_name) > 1

    AND COUNT(email) > 1;