How to eliminate duplicate records from a database table?

December 20, 2008 05:00 by akashheranjal


Description of the problem:

A database table by name MyTable has 3 data fields MyID, MyName, MyAddress.
• MyID is Identity and primary key (PK) with Int data type
• MyName is a VarChar(200) data type
• MyAddress is TEXT data type

Accidently, the PK and Identity is removed and there are duplicate records in the table. Now we need to introduce the PK back on MyID by cleaning the database table. There are 10,000,000 records in the table and we really do not know which record has got duplicated and how many times. The focus is to eliminate duplicate records from the database table and introduce the PK back on MyID field.

Solutions/Approach:

• Introduce an additional column by name RecID and make it Identity column
• Run the below query
Delete from MyTable Where RecID Not In
(Select Min(RecID) from MyTable Group By MyID)


General Syntax:
Delete from [TableName] Where [NewIdentityCol] Not In
(Select Min([NewIdentityCol]) from [TableName] Group By [ColNameHavingDuplicate])


• Remove the new column RecID
• Introduce Identity and PK back on MyID field

 


Currently rated 5.0 by 2 people

  • Currently 5/5 Stars.
  • 1
  • 2
  • 3
  • 4
  • 5

Comments

Add comment


 

  Country flag

biuquote
  • Comment
  • Preview
Loading