Tuesday, May 6, 2008

How to delete records where some duplication occurs

I have a table of information of beneficiaries and I need to have one record per SSN, but there were multiple records per SSN due to typos in name or different addresses. I really didn't care which record I kept, just need one record per SSN. This is how I did it:

First see how many SSNs have more than one record
SELECT DISTINCT
SSN
FROM
(SELECT SSN,COUNT(*) FROM SI08005101 GROUP BY SSN HAVING COUNT(*) > 1)

Then delete the unwanted records
DELETE FROM
table_name A
WHERE
A.rowid >
ANY (SELECT B.rowid
FROM
table_name B
WHERE
A.SSN = B.SSN
)

No comments: