Wednesday, March 7, 2012

Help to deleting n numbers of records

Hi
One of our users have by mistake created a number of dublicate records in
the database. Now I need to delete the "faulty" records from the table and
just keep one of them. The records are linked to a customer table, so I have
a Recordid column in the table that I can use to group on. E.g. if I in the
table have 10 records with RecordID 1, I need to delete 9 of them and keep
1, 8 records with RecordID 2 I need to delete 7 and keep 1 etc.
I'll have to search the customer table to find the recordID's that is used
to link to the child table, so my plan is to use a cursor to find these and
put them into a variable. I'll then use this to find the record(s) in the
child table, get the number of records and then either delete them one by
one until I've only 1 pr. RecordID left or use SELECT Top x based on the
number of records found with each recordid and then delete all but 1 record.
It's not a huge number of records I need to delete so from a practical
and/or performance point of wiev it's not critical how I do it. It's more
that I'm currious to hear if my approach is the best one or if any of you
have any other ideas?
TIA
Regards
SteenTo answer this properly we'll need to know the keys and constraints in
your tables. Please post DDL (CREATE TABLE) and some sample data
(INSERTs) if you want help with the actual code.
You missed out one critical step from your solution: Add a new unique
constraint so that this can't happen again.
David Portas
SQL Server MVP
--|||Hi
First of all, it's a vendor application, so I can't change anything in the
database or application. I this case it's not a problem though, since it's
ok to create several records as they did, but in this case it's just because
they had some problems with a printer and therefore thay ran a wizard
several times which generated a number of records that where baiscally same.
It's these records thay now want to get deleted.
The 2 tables that's involved is called Lejer and Note.
CREATE TABLE [Lejer] (
[EjendomNr] [EjendomNr] NOT NULL ,
[LejemaalNr] [LejemaalNr] NOT NULL ,
[LejerNr] [LejerNr] NOT NULL ,
[LejerID] [RecordID] IDENTITY (1, 1) NOT NULL ,
....and about 200 more column definitions
CREATE TABLE [Note] (
[NoteID] [RecordID] NOT NULL ,
[Tabelnavn] [TabelNavn] NOT NULL ,
[RecordID] [RecordID] NOT NULL ,
[Dato] [Dato] NOT NULL ,
[NoteType] [KodeId] NOT NULL ,
....and some more column definitions.
The fields that links the tables are Lejer.LejerID and Note.RecordID.
Below is ans example of sample data
Lejer:
Ejendomnr Lejemaalnr Lejernr LejerID
1 1 1 1000
1 2 2 1001
1 3 3 1002
2 1 1 1003
2 2 2 1004
3 1 1 1005
3 2 2 1006
Note
NoteID RecordID NoteType
1 1000 29000
2 1000 29000
3 1000 29000
4 1000 29000
5 1001 29000
6 1001 29000
7 1001 29000
8 1002 29000
9 1002 29000
10 1002 29000
11 1003 29000
12 1003 29000
I'd like to find the notes where the type is e.g. 29000 and are linked a
record in the Lejer table with the Ejendomnr of e.g. 1.
and then delete all notes but 1.
In the above example, it means that if I use Ejendomnr = 1, I'll have the
Note records with NoteID 1 to 10. Out of these I'd like to delete 3 of the
ones with RecordID 1000, 2 of them with RecordID 1001 and 2 of them with
RecordID 1002.
As mentioned in my original post, I could get all the RecordID's into a
cursor and then e.g. count the number of occurences for each of them and
then do a delete n-1 times. I just don't know if that's the smartest way to
do it or if there's a better approach?
Regards
Steen
David Portas wrote:
> To answer this properly we'll need to know the keys and constraints in
> your tables. Please post DDL (CREATE TABLE) and some sample data
> (INSERTs) if you want help with the actual code.
> You missed out one critical step from your solution: Add a new unique
> constraint so that this can't happen again.
> --
> David Portas
> SQL Server MVP|||How about
Delete Note
Where NoteType = '29000'
And NoteId <> (
Select Min(N1.NoteId)
From Note As N1
Where Note.RecordId = N1.RecordId
And N1.NoteType = Note.NoteType
)
Here I'm assuming that a given Note.RecordId must exist in Lejer.LejerId.
However, it is possible that this is not the case and you want to ensure tha
t
the record does exist in the Lejer table then you could add an Exists clause
like so:
Delete Note
Where NoteType = '29000'
And Exists(
Select *
From Lejer As L1
Where L1.LegerId = Note.RecordId
)
And NoteId <> (
Select Min(N1.NoteId)
From Note As N1
Where Note.RecordId = N1.RecordId
And N1.NoteType = Note.NoteType
)
Obviously, you should execute this code carefully to ensure that it is produ
ce
the results you want before you commit against production data.
HTH
Thomas
"Steen Persson" <SPE@.REMOVEdatea.dk> wrote in message
news:OfARdISYFHA.2348@.TK2MSFTNGP14.phx.gbl...
> Hi
> First of all, it's a vendor application, so I can't change anything in the
> database or application. I this case it's not a problem though, since it's
ok
> to create several records as they did, but in this case it's just because
they
> had some problems with a printer and therefore thay ran a wizard several t
imes
> which generated a number of records that where baiscally same. It's these
> records thay now want to get deleted.
> The 2 tables that's involved is called Lejer and Note.
>
> CREATE TABLE [Lejer] (
> [EjendomNr] [EjendomNr] NOT NULL ,
> [LejemaalNr] [LejemaalNr] NOT NULL ,
> [LejerNr] [LejerNr] NOT NULL ,
> [LejerID] [RecordID] IDENTITY (1, 1) NOT NULL ,
> .....and about 200 more column definitions
>
> CREATE TABLE [Note] (
> [NoteID] [RecordID] NOT NULL ,
> [Tabelnavn] [TabelNavn] NOT NULL ,
> [RecordID] [RecordID] NOT NULL ,
> [Dato] [Dato] NOT NULL ,
> [NoteType] [KodeId] NOT NULL ,
> ....and some more column definitions.
> The fields that links the tables are Lejer.LejerID and Note.RecordID.
>
> Below is ans example of sample data
> Lejer:
> Ejendomnr Lejemaalnr Lejernr LejerID
> 1 1 1 1000
> 1 2 2 1001
> 1 3 3 1002
> 2 1 1 1003
> 2 2 2 1004
> 3 1 1 1005
> 3 2 2 1006
> Note
> NoteID RecordID NoteType
> 1 1000 29000
> 2 1000 29000
> 3 1000 29000
> 4 1000 29000
> 5 1001 29000
> 6 1001 29000
> 7 1001 29000
> 8 1002 29000
> 9 1002 29000
> 10 1002 29000
> 11 1003 29000
> 12 1003 29000
>
> I'd like to find the notes where the type is e.g. 29000 and are linked a
> record in the Lejer table with the Ejendomnr of e.g. 1.
> and then delete all notes but 1.
> In the above example, it means that if I use Ejendomnr = 1, I'll have the
Note
> records with NoteID 1 to 10. Out of these I'd like to delete 3 of the ones
> with RecordID 1000, 2 of them with RecordID 1001 and 2 of them with Record
ID
> 1002.
> As mentioned in my original post, I could get all the RecordID's into a cu
rsor
> and then e.g. count the number of occurences for each of them and then do
a
> delete n-1 times. I just don't know if that's the smartest way to do it or
if
> there's a better approach?
> Regards
> Steen
>
> David Portas wrote:
>|||Hi Thomas
Thanks for you input. That was another way of doing it than I had in my
mind. I'll try it out in my test db.
Regards
Steen
Thomas Coleman wrote:
> How about
> Delete Note
> Where NoteType = '29000'
> And NoteId <> (
> Select Min(N1.NoteId)
> From Note As N1
> Where Note.RecordId = N1.RecordId
> And N1.NoteType = Note.NoteType
> )
> Here I'm assuming that a given Note.RecordId must exist in
> Lejer.LejerId. However, it is possible that this is not the case and
> you want to ensure that the record does exist in the Lejer table then
> you could add an Exists clause like so:
> Delete Note
> Where NoteType = '29000'
> And Exists(
> Select *
> From Lejer As L1
> Where L1.LegerId = Note.RecordId
> )
> And NoteId <> (
> Select Min(N1.NoteId)
> From Note As N1
> Where Note.RecordId = N1.RecordId
> And N1.NoteType = Note.NoteType
> )
> Obviously, you should execute this code carefully to ensure that it
> is produce the results you want before you commit against production
> data.
> HTH
>
> Thomas
> "Steen Persson" <SPE@.REMOVEdatea.dk> wrote in message
> news:OfARdISYFHA.2348@.TK2MSFTNGP14.phx.gbl...

No comments:

Post a Comment