Dimension Order

John Hobson · Post by **John Hobson** » Tue Feb 03, 2009 7:35 pm

Can someone just remind me - dimension order - In my head I have it as short to long then sparse to dense - is that right - having a blonde (ok grey) moment.

The documentation just tells you how to change it, not why you should change it.

J

Mike Cowie · Post by **Mike Cowie** » Tue Feb 03, 2009 8:29 pm

Hi John,

The short answer is that I've tended to go with short -> long as the primary rule, and then sparse -> dense as a second rule. Often the two rules of thumb contradict each other since dimensions like SKU/customer can have fairly sparse data population throughout the cube (compared to something dense like a time period dimension) yet be very long. It usually ends up working out that my time and account dimensions are toward the bottom and any measures/string dimension is always at the very end. Version, region and other much smaller dimensions tend to be toward the very top.

I've always wanted to invest some energy in better understanding what what matters wrt dimension order and whether the smallest possible cube size is what I want or if that doesn't necessarily mean the cube will calculate optimally. The dimension re-order wizard can give some helpful/surprising guidance as far as cube RAM footprint, but that's about the only tool I've really used in this regard. I know the optimal order can change as more data is populated in a cube, and I'd imagine that there is also some smaller fluctuation in optimal order across TM1 versions.

Regards,
Mike

John Hobson · Post by **John Hobson** » Tue Feb 03, 2009 8:44 pm

Thanks Mike

Alan Kirk · Post by **Alan Kirk** » Tue Nov 17, 2009 10:32 pm

I'm not sure how I missed this thread but I'd been thinking of posting one about dimension order to add to the FAQ. (I for one often find it faster to get information via the FAQ than to dig through the manuals for it.) Thanks to Lotsaram referring to it.

For the information of new users who may not be familiar with the "sparse/dense" concept, this extract from the 9.4 developer's manual explains:

As a first step toward ordering dimensions, divide the dimensions into two groups: sparse and dense dimensions. A dense dimension has a high percentage of values for its elements. You can estimate the density by answering this question: If one element in the dimension has a value, keeping the elements of the other dimensions constant, what is the probability that the other elements in the dimension have values?

For example, if you have a budget in January for a given account and region, you probably also have a value for the remaining months. Therefore, the Month dimension is probably dense. Similarly, if you have a budget value for a given month, account, and region, you probably also have an actual value, making ActVsBud a dense dimension.

However, in a worldwide sales cube, you probably do not sell every product in every region. Therefore, you would treat Product and Region as sparse dimensions.

We generally recommend that you order the dimensions as follows: smallest sparse to largest sparse, followed by smallest dense to largest dense.

(Yes, that differs from Mike Cowie's method, but you pays your money and you takes your chances.)

You would think that Iboglix would have provided a cube optimiser, but of course they haven't; the only known one is a commercial product, discussed here.

One thing to watch out for is that as your cube evolves, the optimal order of the dimensions may change. There's really no way of predicting that in advance, no matter which rules you go with.

Tilo · Post by **Tilo** » Wed Mar 26, 2014 8:00 pm

Hello everyone,

the dimension reordering wizard - as far as I can see it - optimizes the usage of RAM.
But what if I do not care about RAM?
My primary goal is to optimize the performance and not the memory usage.

How do both goals interact when changing dimension order?
How do I change the order for best performance?

Thanks in advance
Tilo

Tilo · Post by **Tilo** » Wed Mar 26, 2014 8:06 pm

I forgot to add something:
I know that parts of the cube are sparse and some are not - eg. the past years are dense and the actual years are not.
The usage of cube cells could have a correlation to that density/sparsity - eg. the actual year is used for input and showing actual figures, the past years are used to report figures.

If i create a copy cube without the data that is not used, would an optimization of that cube transferred to my real cube make it faster in praxi although the wizard would show me a different optimization order?

Tilo · Post by **Tilo** » Wed Mar 26, 2014 8:08 pm

Best Practice Guide by Bedrock for Desinging Cubes:
http://www.bedrocktm1.org/resources/fil ... Design.pdf

Tilo · Post by **Tilo** » Wed Mar 26, 2014 8:21 pm

Sorry, again something came to my mind afterwards.

How about optimizing dimension order in system cubes?
According to the proposition of "system" our large element attribute cubes could save 40-80% of memory...
Is there a risk/gain optimizing there dimensions order?

Thanks
Tilo

declanr · Post by **declanr** » Wed Mar 26, 2014 8:37 pm

Tilo wrote:Hello everyone,

the dimension reordering wizard - as far as I can see it - optimizes the usage of RAM.
But what if I do not care about RAM?
My primary goal is to optimize the performance and not the memory usage.

How do both goals interact when changing dimension order?
How do I change the order for best performance?

Thanks in advance
Tilo

To answer your original question; at a basic concept level - a cube when reordered to take less RAM will be able to do so by doing less work (for consolidations etc); less work will mean that view retrieval will be quicker.
View retrieval times equate to what most people would commonly consider "optimal performance" but it might be best to specify what YOU consider optimal performance; how you measure it and so such.

lotsaram · Post by **lotsaram** » Wed Mar 26, 2014 11:36 pm

Tilo wrote:I forgot to add something:
I know that parts of the cube are sparse and some are not - eg. the past years are dense and the actual years are not.
The usage of cube cells could have a correlation to that density/sparsity - eg. the actual year is used for input and showing actual figures, the past years are used to report figures.

If i create a copy cube without the data that is not used, would an optimization of that cube transferred to my real cube make it faster in praxi although the wizard would show me a different optimization order?

If you don't have data then by definition an empty cube is 100% sparse. The data in the cube determines the density (or inversely sparsity), ergo without data you cannot optimize properly as all you could do is smallest -> largest. As data patterns in a cube change over time so can the optimal dimension order. Optimizing an empty cube simply does not make sense.

Tilo wrote:the dimension reordering wizard - as far as I can see it - optimizes the usage of RAM.
But what if I do not care about RAM?
My primary goal is to optimize the performance and not the memory usage.

How do both goals interact when changing dimension order?

There is an direct relationship between RAM footprint, disk footprint and query time. Memory goes down then query time goes down (that is performance gets better). There can be some exceptions with rule calculations but generally that's it. Someone with hardcore programming experience might be able to explain why by describing how the data is structured with arrays, trees and vectors and the efficiency gains with the correct dimension order. But the simple explanation less memory good = faster query has always been enough for me.

Tilo wrote:Sorry, again something came to my mind afterwards.

How about optimizing dimension order in system cubes?
According to the proposition of "system" our large element attribute cubes could save 40-80% of memory...
Is there a risk/gain optimizing there dimensions order?

No. This would be absolute nonsense. System cubes like attributes have only 2 dimensions and so are dense to start with. Anyway they cannot be optimized (the last dimension is fixed since attributes are strings and that leaves no dimensions to move).

PS how about for next time hold off for 5 minutes to allow some thinking time to avoid the string of "I forgot ...", "It just occurred to me ...", "something came to mind afterwards ..."

Tilo · Post by **Tilo** » Thu Mar 27, 2014 11:47 am

Hello lotsaram and declanr,

thank you for your answers.

****************
lotsaram rote:
If you don't have data then by definition an empty cube is 100% sparse. The data in the cube determines the density (or inversely sparsity), ergo without data you cannot optimize properly as all you could do is smallest -> largest. As data patterns in a cube change over time so can the optimal dimension order. Optimizing an empty cube simply does not make sense.
****************
I did not mean "empty" but wanted to leave out “the data that is not used".
As data patterns change I thought it could make sense to optimize a copy cube without the data that is not (or seldomly) used and then overtake the optimized dimension order to the original cube.

****************
lotsaram rote:
No. This would be absolute nonsense. System cubes like attributes have only 2 dimensions and so are dense to start with. Anyway they cannot be optimized (the last dimension is fixed since attributes are strings and that leaves no dimensions to move).
****************
It seems that the definition of density that I overtook from IBM is not yours? Or I misunderstand IBMs.
You say a cube with two dimensions is "dense to start with".
According to IBM a Cube with 2 dimensons respectively the dimensions itself in such a cube can be sparse or dense.

As far as I know attributes can be numbers and also reordering of string dimensions is possible in system cubes.
IBM states that you should not do so - they mean normal cubes in the documentation I think. (http://www-01.ibm.com/support/docview.w ... wg27035784) But that does not mean that it is right for system cubes, too.

IBM states "when you optimize the order of dimensions in a cube, you should not move the string dimensions FROM the last position, nor move the string dimensions TO the last position."
I just changed the dimensions orders for several system cubes, also with string attributes.
Or TM1 is just saying that it did so but did not do so - Who knows.
I guess if you change dimension order in a cube so that always a string dimension is the last one, that could work then.

According to IBM strings will be treated as numeric when they are not in the last dimension of the cube - but this seems to be said for dimension order when creating cubes – if reordering has the same effect I do not know. IBM states that creation order and system order are different things.
If it is true for reordering, too, then reordering could do harm, I guess – eg. Treating S-Elements as numbers causing errors.
I have }ElementAttributes-Cubes that have only N-Elements in both dimensions although all attributes are strings - I can enter values without problems (How this works? - I do not know).

IBM-Definition of Density as I found it:

An informal definition of density is as follows:
A dimension D is dense in a cube C if, "most" of the D-siblings of each nonempty cell in C, are also nonempty. Where two cells are D-siblings if they differ only in dimension D.
A more precise definition and computational algorithm is as follows:
To determine the density of dimension D, sort all nonempty (simple) cells in the cube so that D is the lowest order sort key. Run through the resulting "file" and for each group Gk of cells where the cells differ only in dimension D, count the number of cells in the group N(Gk). Let M be the number of simple elements in dimension D. The density of D would be given as:

(Sum over k of N(Gk)*N(Gk)/M) / (total number of nonempty cells)
In an example where you have 100 accounts, 100 divisions, and 12 months, and you populate one account and one division for 12 months then the results are as follows:
For the months dimension, there is only one group with 12 cells, the density is:
(12*12/12)/12 = 1 = 100%.

For the accounts and division dimensions there are 12 groups, each with only one cell, their density is:
(1*1/100 + 1*1/100 +...+ 1*1/100)/12 or
12*(1/100)/12 or
1/100 = 1%

I always question the documentations and technotes of IBM and often there are contradicting or unclear statements.
So it is in this case here I think.
(Just yesterday I got one documentation clarification ticket answered.)

Sorry for posting several times.
Often things come to my mind in the moment when I ask for them or when I send the mail/post or when I move when playing chess. I try to work around it (automatic delay rule with a minute delay for sending mails helps for instance) but thinking for hours does not change that. I hate that when playing chess...

Nevertheless I am sorry as it seems to me it made you angry and you are right about the outcome of additional posts that could have been included into the first one.
But maybe you think that answering when being angry can also cause unhelpful answers.

Cheers
Tilo

lotsaram · Post by **lotsaram** » Mon Mar 31, 2014 4:59 pm

Tilo wrote:It seems that the definition of density that I overtook from IBM is not yours? Or I misunderstand IBMs.
You say a cube with two dimensions is "dense to start with".
According to IBM a Cube with 2 dimensons respectively the dimensions itself in such a cube can be sparse or dense.

Hi Tilo,
Es ist ja möglich etwas zu verstehen ohne es richtig kapiert zu haben!
um zu erweitern ...
A 2 dimensional system attribute cube even if sparsely populated will still be relatively very densely populated compared to a typical multi-dimensional cube. Consider a sales cube with dimensions customer, product, sales rep, region, year, week, measure. Even with modest dimension sizes the data will naturally be very, very sparse. If additional product or customer based dimensions are added as they often are (brand, size, life-cycle, etc.) that do not add additional data points due to 1:1 relationship but add considerably potential cube size then sparsity blows out again by several more orders of magnitude. Sparsity in a TM1 cube is often 1 x 10e-6 or considerably greater still. When we talk about sparsity for TM1 cubes we are not talking about 10% or 1% or even 0,1% of leaf cells containing data, all of these scenarios would still be dense in relative terms.

Yes attribute cubes can contain numeric data but attribute cubes are kind of "special", for instance numeric attribute data can be populated to consolidated nodes. Really numeric attributes are best thought of as "strings in disguise".

HTH

Wim Gielis · Post by **Wim Gielis** » Mon Mar 31, 2014 5:51 pm

lotsaram wrote:Hi Tilo,
Es ist ja möglich etwas zu verstehen ohne es richtig kapiert zu haben!
um zu erweitern ...

This person MUST be located in a German-speaking country

TM1 Forum

Dimension Order

Dimension Order

Re: Dimension Order

Re: Dimension Order

Re: Dimension Order

Re: Dimension Order

Re: Dimension Order

Re: Dimension Order

Re: Dimension Order

Re: Dimension Order

Re: Dimension Order

Re: Dimension Order

Re: Dimension Order

Re: Dimension Order