SQL Finds and Deletes Duplicated Rows From a Table

A table contains some rows with duplicated information caused by bugs. Now the bugs have been fixed, and you want to delete the duplicates to clean up the table.

For example, from the OrderAudit table below, you want to remove the duplicated Book and DVD rows(notice the comment column contains different values).

id	orderId	 productName	quantity	comment
1	100	 Book      	5	        aaa       
2	100	 Book      	5	        bbb       
3	200	 DVD       	10	        ccc       
4	200	 DVD       	10	        ddd       
5	300	 Game      	20	        eee       

First let’s find if there is any records with duplicated orderId, productName and quantity.

select count(*),orderid, productname, quantity from OrderAudit
group by orderid, productname, quantity
having count(*) > 1

The result will show you how many duplicates are there in the table.

Then we can keep only one of them and delete the rest.

delete from OrderAudit
where id not in
  select min(id) from OrderAudit
  group by orderid, productname,quantity

After executing the query, the 2nd and 4th records are now deleted.

This entry was posted in Database and tagged , . Bookmark the permalink.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )


Connecting to %s