Thursday, March 29, 2012

Help with complicated splitting and inserting

Ok, I don't come here unless for desperation since my head is splitting thinking about this and not knowing SQL enough to so something this hidious, I need some help! Being a C# developer and NOT a DBA, I'm about to blow my head off. Excuse my french but it's a bad day when I have to do crap like this that should have been designed right the first time :).

Ok, bear with me.

Story: We have a Product table. Stupidly, before I was here at this company, in the product table we have a ProductDescription. In the product Description, there is HTML and text. the HTML contains the actual child ProductIDs that are related to the product record. Yea, and so here it goes.

Goal: Split out the product and it's related child IDs into our new ProductRelationship table which we should have done 1.5 years ago

Shortened Schema for posting purposes goes like this:

Product
-
ProductID
ProductDescription


ProductRelationship
-
ProductID
RelatedProductID
Description

Ok, so in the Product table, here's an example from ProductDescription:

'These fully-orchestrated royalty free music tracks invoke the spirit of some of the great themes from 1970's and 1980's television and film productions and offer majestic brass, string and guitar melodies that will make a memorable addition to projects as background music and production music.<br><br><span class='product-name-no-link'>You can also purchase the individual tracks:</span><br><br>01. American Plains <a href="http://links.10026.com/?link=ProductInfo.aspx?ProductID=105234">MP3</a> | <a href="http://links.10026.com/?link=ProductInfo.aspx?ProductID=105235">WAV</a><br>02. Sultry Summer Night <a href="http://links.10026.com/?link=ProductInfo.aspx?ProductID=105236">MP3</a> | <a href="http://links.10026.com/?link=ProductInfo.aspx?ProductID=105237">WAV</a><br>03. Ocean Skyline <a href="http://links.10026.com/?link=ProductInfo.aspx?ProductID=105238">MP3</a> | <a href="http://links.10026.com/?link=ProductInfo.aspx?ProductID=105239">WAV</a><br>04. Wistful Lover <a href="http://links.10026.com/?link=ProductInfo.aspx?ProductID=105240">MP3</a> | <a href="http://links.10026.com/?link=ProductInfo.aspx?ProductID=105241">WAV</a><br>05. Final Choice <a href="http://links.10026.com/?link=ProductInfo.aspx?ProductID=105242">MP3</a> | <a href="http://links.10026.com/?link=ProductInfo.aspx?ProductID=105243">WAV</a><br>06. Fun and Free <a href="http://links.10026.com/?link=ProductInfo.aspx?ProductID=105244">MP3</a> | <a href="http://links.10026.com/?link=ProductInfo.aspx?ProductID=105245">WAV</a><br>07. Wayward Strangers <a href="http://links.10026.com/?link=ProductInfo.aspx?ProductID=105246">MP3</a> | <a href="http://links.10026.com/?link=ProductInfo.aspx?ProductID=105247">WAV</a><br>08. Savored Moments <a href="http://links.10026.com/?link=ProductInfo.aspx?ProductID=105248">MP3</a> | <a href="http://links.10026.com/?link=ProductInfo.aspx?ProductID=105249">WAV</a><br>09. Endless Searcher <a href="http://links.10026.com/?link=ProductInfo.aspx?ProductID=105250">MP3</a> | <a href="http://links.10026.com/?link=ProductInfo.aspx?ProductID=105251">WAV</a><br>10. Bach Piano <a href="http://links.10026.com/?link=ProductInfo.aspx?ProductID=105252">MP3</a> | <a href="http://links.10026.com/?link=ProductInfo.aspx?ProductID=105253">WAV</a><br>11. Fog Bound Mornings <a href="http://links.10026.com/?link=ProductInfo.aspx?ProductID=105254">MP3</a> | <a href="http://links.10026.com/?link=ProductInfo.aspx?ProductID=105255">WAV</a><br>'

Ok, so, as you can see, the fu**ing product IDs are in the damn HTML! I need to take the current ProductID and it's corresponding ProductIDs found in this memo field and put into the ProductRelationship table like they sh ould have been in the first place!

In other words, I should get these records (assuming a ProductID of 100000 for this record) put into my ProductRelationship table for this example:

ProductID RelatedProductID
100000 105234
100000 105235
100000 105236
100000 105237
100000 105238
100000 105239
100000 105240
100000 105241
100000 105242
100000 105243
100000 105244
100000 105245
100000 105246
100000 105247
100000 105248
100000 105249
100000 105250
100000 105251
100000 105252
100000 105253
100000 105254
100000 105255
…..next product

How the hell do I even attack this? Should I just use C#?!?!?!? How would I do this in SQL first then maybe I'll try attacking this using maybe a C# form or console program. Hell if I know.

Hello,

You've sort of answered your own question I think. This is obviously going to involve string manipulation, so c# or similar would be better suited (especially if .net 2 has some sort of html parser you can use). This is not to say you can't do this in tsql, but it will be slow and ugly.

I see the ProductDescription also has single and double quotes embedded in the html, so will make working with tsql even more cumbersome.

My vote's in for c#...

Cheers,
Rob

|||

alright, thanks. I guess what I'll do is first to this in SQL as it's gonna take me longer to setup & Code in C# since I've in the past mostly done ASP.NET, not client-side data manipulation to this extent

so any help with my T-SQL here?

|||


-- Using a table of number as in
-- http://sqlserver2000.databases.aspfaq.com/why-should-i-consider-using-an-auxiliary-numbers-table.html
-- you can do this. Note that this assumes ProductID=NNNNN" and won't work if the format differs in any way.
-- This sort of thing is better done in your C# client.


SELECT ProductID,
SUBSTRING(ProductDescription,
Number+LEN('ProductID='),
CHARINDEX('"',ProductDescription,Number+LEN('ProductID='))-Number-LEN('ProductID=')) as RelatedProductID
FROM Product
INNER JOIN Numbers ON Number BETWEEN 1 AND LEN(ProductDescription)
WHERE SUBSTRING(ProductDescription,Number,LEN('ProductID='))='ProductID='


|||

Ok, thanks but I need to loop through my entire set of records and for each pull out the ProductID and it's ChildIDs for that record and repeat for each record. Insert as stated for each ProductID and it's corresponding Childs in our new ProductRelationship Table.

I possibly see 2 loops here. One iterates through my recordset and an internal iteration that then spits each ProductID/chid ProductID (from description) into our table.

but 2 nested loops is probably an inefficient way to code this.

No comments:

Post a Comment