Building dimensions and loading data-separate TI better?

Post Reply
telula
Posts: 99
Joined: Tue Nov 18, 2008 5:40 am

Building dimensions and loading data-separate TI better?

Post by telula »

Hello,
I remember reading somewhere on this forum that it is better to build dimensions before loading the cube and not as we are loading the cube. Is this the common understanding or did I remember incorrectly?
Would the situation be different if the source of your dimension (text file) is the same source as your data load?That is, can we build the dimension in the metadata tab and load in the Data tab since the source file is the same?
tomok
MVP
Posts: 2836
Joined: Tue Feb 16, 2010 2:39 pm
OLAP Product: TM1, Palo
Version: Beginning of time thru 10.2
Excel Version: 2003-2007-2010-2013
Location: Atlanta, GA
Contact:

Re: Building dimensions and loading data-separate TI better?

Post by tomok »

This may be just personal preference but I almost always separate metadata processes (dimension maintenance, attributes, etc.) from data loading processes. This is because I find there are many times I want to just load/reload a cube without changing dimensionality. If you create a kitchen-sink type TI process then you can't do this. It does lead to process proliferation but there are ways to help manage this with chores and/or nested processes.
Tom O'Kelley - Manager Finance Systems
American Tower
http://www.onlinecourtreservations.com/
ajain86
Community Contributor
Posts: 132
Joined: Thu Oct 15, 2009 7:45 pm
OLAP Product: TM1
Version: 9.4.1 9.5 9.5.1
Excel Version: 2003 2007

Re: Building dimensions and loading data-separate TI better?

Post by ajain86 »

For the most part, I prefer to do dimension maintenance as a separate process prior to the data load.

I do, however, still have steps in the metadata for the load process to add any new members to avoid any errors during the load. This section has a check to first make sure if the element does not exist.
Ankur Jain
lotsaram
MVP
Posts: 3702
Joined: Fri Mar 13, 2009 11:14 am
OLAP Product: TableManager1
Version: PA 2.0.x
Excel Version: Office 365
Location: Switzerland

Re: Building dimensions and loading data-separate TI better?

Post by lotsaram »

My opinion is also to always separate dimension updates from cube loads. (Generally the data source will be different anyway, being a full fact table or hierarchy extract for dimension updates versus summarised transactional data for cube loads.) Also making sure meta data is updated first saves significant time versus processing data load extracts twice.
ajain86 wrote:For the most part, I prefer to do dimension maintenance as a separate process prior to the data load.

I do, however, still have steps in the metadata for the load process to add any new members to avoid any errors during the load. This section has a check to first make sure if the element does not exist.
Personally I would only ever do this as a last resort, why process data twice when you don't have to. Whether this is desirable or even feasible depends on data load volumes, for small data sets such as GL sure it's no problem (but how often does a new cost centre or account get set up and posted to in between dimension updates?), but if the data set is very large, say 10s of millions of records, and/or if the data loads is multi-threaded then any code on the meta data tab adds to much processing time overhead, or would cause other issues such as locking.
User avatar
Martin Ryan
Site Admin
Posts: 1989
Joined: Sat May 10, 2008 9:08 am
OLAP Product: TM1
Version: 10.1
Excel Version: 2010
Location: Wellington, New Zealand
Contact:

Re: Building dimensions and loading data-separate TI better?

Post by Martin Ryan »

I take a case by case basis. If, as lotsa mentions, your dimension source is different from your data source then it certainly makes sense to have them separately rather than doing messy joins to make them one data source.

If it's all in one place though I'd do it all in one process. This is mostly to cut down on the proliferation of TI processes. Lately I've seen a couple of relatively small TM1 models with 50+ TI processes and it makes it hard to find what you're looking for (which is why I wrote some VBA that's coming out in the new TM1 Tools edition very soon to help find things).

Technically speaking I don't think it really makes much difference. Sure you could do some error catching between two processes, but for me I'd rather have my model a little smaller instead.

Martin
Please do not send technical questions via private message or email. Post them in the forum where you'll probably get a faster reply, and everyone can benefit from the answers.
Jodi Ryan Family Lawyer
ajain86
Community Contributor
Posts: 132
Joined: Thu Oct 15, 2009 7:45 pm
OLAP Product: TM1
Version: 9.4.1 9.5 9.5.1
Excel Version: 2003 2007

Re: Building dimensions and loading data-separate TI better?

Post by ajain86 »

I agree with Martin in that it is a case by case basis.

In my limited experience, I have noticed that regular dimension maintenance has been of a lower priority and most dimension updates are actually caught during data loads. For this, I create an unmapped bucket in the dimension and add any new members there. This way, the new members can be easily identified and be properly mapped for the future. This also helps the person doing tieouts after the load as they no longer have to worry about any record not being loaded.

This does increase processing time but I have found it to be safer specially when there is little to none regular maintenance done by the customer.
Ankur Jain
Post Reply