MSSQL Query Syntax -- Deleting Duplicate Records?
I have a MSSQL database with 15million + records. I have just learned that the parser I'm using to enter data is putting duplicate records (as many as 20+) in because there are records with identical serial numbers. Can anyone write a query to scan and then delete all except one records that have identical serial numbers? It is important to leave one instance of a particular serial number in, but delete all other instances.
DB = test_database
table = tradeMark
column = SerialNumber
10 points guaranteed if it works and ur first!
Thanks for the suggestions guys. Colanth, I'm not sure I quite understand, but it seems like that is going to be a lot of manual work. I need a query that will delete all duplicates in one fell swoop, as there are 15 million records.
Raja, I will try this tomorrow, but how does the new table only contain duplicates? Is that achieved through the "group by SerialNumber" portion of the query?
Thank you very much KGS, that query makes the most sense from them all. We will maybe test today and then I will come back to give best answer!
- Anonymous1 decade agoFavorite Answer
No need to use any temporary table...
I hope this will be useful for u...
DELETE FROM (SELECT sno,name,ROW_NUMBER() OVER(PARTITION BY sno) AS rank FROM tradeamrk )WHERE rank > 1;
The subquery "SELECT sno,name,ROW_NUMBER() OVER(PARTITION BY sno) AS rank FROM tradeamrk"
results will be like this
sno name rank
11 gopal 1
11 gopal 2
12 nitin 1
12 nitin 2
12 nitin 3
13 naran 1
14 gopi 1
In this subquery u have to mention all the columns from the table trademark
ROW_NUMBER() gives the row number for the rows for every different sno column (ie., partition by sno)
So, required required query is
DELETE FROM (SELECT sno,name,ROW_NUMBER() OVER(PARTITION BY sno ORDER BY SNO) AS rank FROM tradeamrk )WHERE rank > 1;
This query is tested in DB2 9 Express Edition.
Its 100% working......
- Anonymous1 decade ago
Yes, Raja is right.
U cannot delete duplicate record.
If u delete, then both of records = del.
1. create new table with same structure, and add 1 field = autonumber
2. create query to append all record to new table
3. seek duplicate record. And u see, the duplicate rec have diffr autonumber
4. Now del 1 of the duplicate
5. create query to move new table to old table, but first delete all rec on old tabel.
- 1 decade ago
I didn’t test with mssql for below queries, but I tested with mysql. Can you use the below query and remove the duplicate records.
1. Create temporary table, “create table temp_tradeMark as select * from trademark group by SerialNumber”
2. Now the temp table contain records without duplicate entries. Double check with new table, backup the old table, drop it.
3. Rename the temp table to old one, or create table with select query.
Rename table temp_tradeMark to trademark;
Create table trademark as select * from temp_tradeMark;
4. drop the temp table
“drop table temp_tradeMark”
- Anonymous1 decade ago
If the records are identical, the best you can do is select distinct, so you get a record of all the duplicate records, delete them all and insert them properly.
If they have differences, delete the records you don't want, based on the serial number and the field that's different. (You can delete WHERE a field is not equal to the record you want to keep.)
How you do which depends on the language you're using to do it.