i need help with this task i can't think how to do better.
I have a table with MyField1, MyField2, Cost, SomeOtherField
For each couple of MyField1, MyField2 i have several rows.
I need to delete some rows so that:
- for each couple of MyField1, MyField2 i have at most @.MaxNumber rows remaining and
- for each pair MyField1,MyField2 the Cost in the remaining rows is never more than a certain coefficient @.K times the cost of the least costly row of the pair (each MyField1,MyField2 has its own minimum cost).
I did this easily with a Cursor but i needed better performances.
I tried with this:
delete from MyTable from MyTable as p1 where
0 = (select (1 + sign(@.MaxNumber -1-count(*))) * (1 + sign(min(p2.Cost) * @.K- p1.Cost) ) 
from MyTable p2 where p2.MyField1 = p1.MyField1 and p2.MyField2 = p1.MyField2 and p2.Cost< p1.Cost) 
What's happening here ? for each row i count how many rows are there with the same MyFields that have cost less than @.K*MinValue , i multiply the two difference between the value i need and the real value and only when one of these values are zero i know that the product is zero and so i can choose the row to be deleted. Note that the double "FROM" in the query is no mistake.
This is terribly involuted, it works better than the cursor but i am sure there must be a better way to do this !!
Could You please help me ?
Thankx
Wentuhi,
can you post a sample data?|||
Try This...
Code Snippet
Delete
From mytable
From mytable as Main
Join
(
Select p1.*
From mytable p1
Join mytable p2 Onp2.MyField1 = p1.MyField1
and p2.MyField2 = p1.MyField2
and p2.Cost< p1.Cost
Group By
p1.MyField1, p1.MyField2, p1.Cost
Having
(1 + sign(@.MaxNumber -1-count(*))) * (1 + sign(min(p2.Cost) * @.K- p1.Cost)) =0
) as Sub
OnSub.MyField1 = Main.MyField1 and Sub.MyField2 = Main.MyField2 and Sub.Cost = Main.Cost
|||Manni's query looks better; nonetheless, I put together a mockup. The mockup used the SMALL_ITERATOR and DBO.RAND() objects that can be found here:
http://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=1330536&SiteID=1
The table definition that I used is:
create table dbo.myTable
( rid integer,
myField1 integer,
myField2 integer,
cost numeric (9,2),constraint pk_myTable primary key (myField1, myField2, cost, rid)
)
goinsert into dbo.myTable
select iter,
1 + (iter/35),
5*dbo.rand(),
33*dbo.rand() + 33*dbo.rand()
+ 34*dbo.rand()
from small_iterator (nolock)
where iter <= 40
select * from dbo.myTable
And the query is:
|||Wentu,--select * from myTable where myField1 = 1 and myField2 = 1 order by cost
/*
rid myField1 myField2 cost
-- -- --
8 1 1 32.97
6 1 1 42.40
25 1 1 44.39
3 1 1 50.12
34 1 1 56.03
5 1 1 58.37
17 1 1 64.41
13 1 1 65.66
24 1 1 75.05
*/declare @.maxRows integer set @.maxRows = 4
declare @.boundary numeric (6,2) set @.boundary = 1.5;with theMinCost as
( select myField1,
myField2,
@.boundary * min(cost) as maxCost
from myTable
group by myField1, myField2
), seqTable as
( select row_number () over
( partition by myField1, myField2
order by cost, rid
) as seq,
myField1,
myField2,
cost,
rid
from myTable
), candidates as
( select a.myField1,
a.myField2,
a.seq,
a.cost,
b.maxCost,
a.rid
from seqTable a
inner join theMinCost b
on a.myField1 = b.myField1
and a.myField2 = b.myField2
and ( a.seq > @.maxRows or
a.cost > b.maxCost
)
)--select * from theMinCost where myField1=1 and myField2=1
/*
myField1 myField2 maxCost
-- -- -
1 1 49.4550
*//*
select * from candidates
where myField1 = 1
and myField2 = 1myField1 myField2 seq cost maxCost rid
-- -- -- -- --
1 1 4 50.12 49.4550 3
1 1 5 56.03 49.4550 34
1 1 6 58.37 49.4550 5
1 1 7 64.41 49.4550 17
1 1 8 65.66 49.4550 13
1 1 9 75.05 49.4550 24
*/delete from myTable
from myTable a
inner join candidates b
on a.myField1 = b.myField1
and a.myField2 = b.myField2
and a.rid = b.ridselect * from myTable where myField1 = 1 and myField2 = 1 order by cost
/*
rid myField1 myField2 cost
- -- -- --
8 1 1 32.97
6 1 1 42.40
25 1 1 44.39
*/
If you are using SQL Server 2005, try this. It may be efficient and relatively easy to understand and maintain. (I'm using Kent's table structure.)
Code Snippet
with WithMins(rid, myField1, myField2, minCost, cost) as (
 select
 rid,
 myField1,
 myField2,
 min(cost) over (
 partition by myField1, myField2
 ) as minCost,
 cost
 from myTable
), Cheaps(rid, myField1, myField2, cost, minCost, n) as (
 select
 rid,
 myField1,
 myField2,
 cost,
 minCost,
 row_number() over (
 partition by myField1, myField2
 order by myField1, myField2, cost
 -- cost <= @.K*minCost rows are numbered lowest
 ) as n
 from WithMins
)
 delete from Cheaps
 where cost > @.K*minCost
 or n > @.MaxNumber
Steve Kass
Drew University
http://www.stevekass.com
|||:-) (Admiring the work)|||Hi all
let me understand all Your codes, try them , and i'll let You know.
I want to thankx all of You for Your efforts, time and help !
c ya soon
Wentu
 
No comments:
Post a Comment