Thursday, March 29, 2012

Help with cupe partitioning/processing

Hello guys

I have an issue with one of our cubes which i hoped someone could help me with... The problem is a historical cube which is currently growing quite drastically, and includes daily history from 2004 until today. The cube is just processed full every single day, and this means processing 80+ million rows every day, when actually most of the data is not changed at all. I decided I want to split my cube in data for 2004, 2005, 2006, 2007 and then the current month. Once the current month is over then i would merge it with the year to date partition and create a new partition for the new month.

My problem is how to implement the processing in a simple way? Right now the processing is done in SSIS by just processing full the db, so I guessed i have to make 2 tasks, one to process full all the dimensions first , and one to process full the cubes afterwards in which i process for this specific cube only the latest partition. But this means at the end of the month I would have to change the package manually, since i need to merge some partitions and then process full only the newest partition again...

Is there any simpler way of doing this? Is my approach correct in the partitioning way and processing full of only the newest partition, except at the end of the month when i would need to merge and process the ytd partition?

I think you are headed down the right path, but I'd suggest you read through the partitioning white paper from Project REAL. It provides some good examples and thoughts related to partitioning within SSAS (and within SQL Server) and the ETL implementation for the solution there. I think you could probably use some of the same ideas and design patterns to solve your problem...

The partitioning white paper can be found at the following URL:

http://www.microsoft.com/technet/prodtechnol/sql/2005/realpart.mspx

Good luck!

Dave Fackler

No comments:

Post a Comment