MSSQL Query Syntax -- Deleting Duplicate Records?

I have an MSSQL database with 15 million+ records. I have just learned that the parser I'm using to enter data is inserting duplicate records (as many as 20+), because there are records with identical serial numbers. Can anyone write a query to scan for and then delete all but one of the records that share a serial number? It is important to leave one instance of each serial number in, but delete all other instances.

DB = test_database

table = tradeMark

column = SerialNumber

10 points guaranteed if it works and you're first!

Update:

Thanks for the suggestions, guys. Colanth, I'm not sure I quite understand, but it seems like that would be a lot of manual work. I need a query that will delete all duplicates in one fell swoop, as there are 15 million records.

Raja, I will try this tomorrow, but how does the new table end up without duplicates? Is that achieved through the "group by SerialNumber" portion of the query?

Thanks

Update 2:

Thank you very much, KGS; that query makes the most sense of them all. We will probably test it today, and then I will come back to give best answer!

4 Answers

  • Anonymous
    1 decade ago
    Favorite Answer

    No need to use any temporary table...

    I hope this will be useful for you...

    Eg:

    SNo  Name
    ----------
    11   gopal
    12   nitin
    13   naren
    14   gopi
    11   gopal
    12   nitin
    12   nitin

    query:

    DELETE FROM (SELECT sno, name, ROW_NUMBER() OVER (PARTITION BY sno ORDER BY sno) AS rank FROM tradeMark) WHERE rank > 1;

    required output:

    sno  name
    ----------
    11   gopal
    12   nitin
    13   naren
    14   gopi

    Explanation:

    The subquery "SELECT sno, name, ROW_NUMBER() OVER (PARTITION BY sno ORDER BY sno) AS rank FROM tradeMark"
    returns results like this:

    sno  name   rank
    -----------------
    11   gopal  1
    11   gopal  2
    12   nitin  1
    12   nitin  2
    12   nitin  3
    13   naren  1
    14   gopi   1

    In this subquery you have to list all the columns from the tradeMark table.

    ROW_NUMBER() assigns a running row number within each group of rows that share the same sno value (i.e., PARTITION BY sno).

    So, the required query is:

    DELETE FROM (SELECT sno, name, ROW_NUMBER() OVER (PARTITION BY sno ORDER BY sno) AS rank FROM tradeMark) WHERE rank > 1;

    This query was tested in DB2 9 Express Edition, where it works; an MSSQL adaptation is sketched below.
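    SQL Server does not accept a DELETE FROM (subquery) written like the one above, but the same ROW_NUMBER() idea works there through a common table expression. A rough, untested sketch using the table and column names from the question (dbo.tradeMark, SerialNumber), which keeps one arbitrary row per serial number:

    -- Number each row within its SerialNumber group, then delete every row
    -- except the first of each group. Deleting through the CTE removes the
    -- underlying rows from tradeMark itself.
    WITH numbered AS (
        SELECT SerialNumber,
               ROW_NUMBER() OVER (PARTITION BY SerialNumber
                                  ORDER BY SerialNumber) AS rn
        FROM dbo.tradeMark
    )
    DELETE FROM numbered
    WHERE rn > 1;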

  • Anonymous
    1 decade ago

    Hm....

    Yes, Raja is right.

    You cannot delete just one of the duplicate records directly.

    If you delete by value, all copies of the record are deleted.

    Algorithm:

    1. Create a new table with the same structure, and add one extra autonumber field.

    2. Create a query to append all the records to the new table.

    3. Look for the duplicate records. You will see that the duplicates now have different autonumber values.

    4. Now delete all but one of each set of duplicates.

    5. Create a query to move the records from the new table back to the old table, but first delete all records in the old table.

    6. Finished. (A SQL Server sketch of these steps follows.)
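    In SQL Server, those steps might look roughly like the sketch below. It is only an illustration: the staging table name and the RowId column are made up, tradeMark is assumed to have no identity column of its own, and the real column list must be filled in for the final copy back.

    -- Steps 1-2: copy everything into a staging table, adding an autonumber column.
    SELECT IDENTITY(INT, 1, 1) AS RowId, *
    INTO tradeMark_staging
    FROM dbo.tradeMark;

    -- Steps 3-4: duplicates now have different RowId values, so keep only the
    -- lowest RowId per SerialNumber and delete the rest.
    DELETE s
    FROM tradeMark_staging AS s
    WHERE s.RowId NOT IN (
        SELECT MIN(RowId)
        FROM tradeMark_staging
        GROUP BY SerialNumber
    );

    -- Step 5: empty the old table and move the de-duplicated rows back, then clean up.
    -- (List tradeMark's real columns here; they are not given in the question.)
    -- TRUNCATE TABLE dbo.tradeMark;
    -- INSERT INTO dbo.tradeMark (SerialNumber /* , other columns */)
    -- SELECT SerialNumber /* , other columns */ FROM tradeMark_staging;
    -- DROP TABLE tradeMark_staging;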

  • 1 decade ago

    I didn't test the queries below with MSSQL, but I tested them with MySQL. You can use them to remove the duplicate records.

    1. Create a temporary table: “create table temp_tradeMark as select * from trademark group by SerialNumber”

    2. Now the temp table contains the records without duplicate entries. Double-check the new table, back up the old table, and then drop it.

    3. Rename the temp table to the old name, or create the table with a select query:

    Rename table temp_tradeMark to trademark;

    Or

    Create table trademark as select * from temp_tradeMark;

    4. If you created a new table instead of renaming, drop the temp table:

    “drop table temp_tradeMark”
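    Note that “create table ... as select * ... group by SerialNumber” relies on MySQL's relaxed GROUP BY rules and will not run as-is on SQL Server. Here is a hedged T-SQL sketch of the same table-swap idea; any column other than SerialNumber is a placeholder, since the question does not list the table's other columns:

    -- Build a de-duplicated copy: one row per SerialNumber.
    SELECT SerialNumber /* , MIN(OtherColumn) AS OtherColumn, ... */
    INTO temp_tradeMark
    FROM dbo.tradeMark
    GROUP BY SerialNumber;

    -- Double-check temp_tradeMark and back up tradeMark, then swap the names.
    -- EXEC sp_rename 'tradeMark', 'tradeMark_backup';
    -- EXEC sp_rename 'temp_tradeMark', 'tradeMark';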

  • Anonymous
    1 decade ago

    If the records are completely identical, the best you can do is a SELECT DISTINCT: pull out one copy of each of the duplicated records, delete them all from the table, and then insert them back properly.

    If they have differences, delete the records you don't want, based on the serial number and the field that's different. (You can delete WHERE a field is not equal to the record you want to keep.)

    How you do either of these depends on the language you're using to do it. (A T-SQL sketch of the first approach follows.)
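    As a concrete illustration of the “select distinct, delete, re-insert” idea for the identical-records case, here is a T-SQL sketch. It assumes the duplicate rows match on every column and that tradeMark has no identity column:

    -- Copy one distinct version of every duplicated row into a temp table.
    SELECT DISTINCT *
    INTO #dupes
    FROM dbo.tradeMark
    WHERE SerialNumber IN (
        SELECT SerialNumber
        FROM dbo.tradeMark
        GROUP BY SerialNumber
        HAVING COUNT(*) > 1
    );

    -- Delete every copy of those serial numbers, then put one copy of each back.
    DELETE FROM dbo.tradeMark
    WHERE SerialNumber IN (SELECT SerialNumber FROM #dupes);

    INSERT INTO dbo.tradeMark
    SELECT * FROM #dupes;

    DROP TABLE #dupes;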
