Multiple Hierarchies - expensive?

Posted: Mon Aug 16, 2010 11:06 pm
by GPC
Let's say we have an Employee Details cube with 100,000 employees, 2 salary band dimensions (2 different types of salary band) and, say, 10 measures.
The number of cells is 100,000 * 10 = 1,000,000.
Now let's say we replace the 2 salary band dimensions with 1 dimension that has 2 hierarchies.
The number of cells is then 100,000 * 10 * 2 = 2,000,000.
So, while the design is more "correct", the cost is double the number of cells, with a corresponding increase in the amount of memory used, etc.
Is this right?

cheers,

Gregory

Re: Multiple Hierarchies - expensive?

Posted: Tue Aug 17, 2010 3:43 am
by Martin Ryan
TM1 doesn't store blanks, so if you have a hyper-sparse cube (lots of zeroes), the number of potential cells doesn't matter. Populated cells matter, not potential cells.
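
A rough way to picture the difference between potential and populated cells is a toy sparse store in Python. This is only a sketch of the idea, not TM1's actual storage engine; the band counts, element names and values are assumptions made up for illustration.

```python
from math import prod

# Hypothetical dimension sizes. Only the employee and measure counts come
# from the thread; the number of bands per dimension is an assumption.
DIMENSION_SIZES = {
    "Employee": 100_000,
    "Salary Band A": 10,
    "Salary Band B": 10,
    "Measure": 10,
}


def potential_cells(sizes):
    """Number of leaf intersections the cube could hold if fully dense."""
    return prod(sizes.values())


class SparseCube:
    """Toy sparse store: only non-zero cells occupy an entry."""

    def __init__(self):
        self.cells = {}

    def write(self, coords, value):
        if value == 0:
            self.cells.pop(coords, None)  # zeroes are not stored
        else:
            self.cells[coords] = value

    def populated_cells(self):
        return len(self.cells)


cube = SparseCube()
cube.write(("E000001", "A3", "B2", "Base Salary"), 52_000)
cube.write(("E000001", "A3", "B2", "Bonus"), 0)  # zero, so nothing is stored
cube.write(("E000002", "A1", "B1", "Base Salary"), 30_000)

print(potential_cells(DIMENSION_SIZES))  # size of the fully dense cube
print(cube.populated_cells())            # entries actually held in memory
```

In this model, memory grows with the number of entries in `cells`, not with the product of the dimension sizes.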

I'd think the increased flexibility of three dimensions would vastly outweigh the minor performance gains you might get with a 2d approach.

Martin

Re: Multiple Hierarchies - expensive?

Posted: Tue Aug 17, 2010 8:56 am
by mce
If each employee belongs to at most one salary band, it might be a good idea to treat salary band as a text-formatted data item in your measure dimension, rather than treating it as a dimension.
If the salary band is not expected to change for an employee, you may even consider treating it as an attribute and a parallel hierarchy in the Employee dimension.
With this approach you may be able to handle it in a cheaper and more effective way.

Re: Multiple Hierarchies - expensive?

Posted: Wed Aug 18, 2010 11:07 am
by Jeroen Eynikel
Your math seems very weird to me.

Keep in mind that far fewer new cells than you imagine will be created. At the leaf level no new elements will be added, so the only new cells would be the 'consolidated' cells of your second hierarchy.

Re: Multiple Hierarchies - expensive?

Posted: Wed Aug 18, 2010 1:26 pm
by lotsaram
In a TM1 cube, data is only stored in leaf cells. As consolidations are calculated on the fly, no exhaustive pre-calculation of all possible intersections is required, i.e. TM1 is immune from "OLAP data explosion". Therefore additional hierarchies cost nothing, other than the extra effort to maintain them and some additional complexity for users to navigate. The second point is more often than not a good thing: users want additional hierarchies because they reflect the real-life complexity of business data.
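
The idea of leaf-only storage with consolidations computed on demand can be sketched in a few lines of Python. This is an illustrative model under simplifying assumptions (a single dimension plus a measure, made-up element names and values), not a description of TM1's internals.

```python
# Leaf cells only: (element, measure) -> value. Values are invented.
leaf_cells = {
    ("Band 1", "Headcount"): 3,
    ("Band 2", "Headcount"): 5,
    ("Band 3", "Headcount"): 2,
}

# Two alternate rollups over the same leaves (hypothetical names).
# Adding the second rollup adds entries here, not in leaf_cells.
children = {
    "Total by Grade": ["Low", "High"],
    "Low": ["Band 1", "Band 2"],
    "High": ["Band 3"],
    "Total by Review Group": ["Group X", "Group Y"],
    "Group X": ["Band 1", "Band 3"],
    "Group Y": ["Band 2"],
}


def value(element, measure):
    """Return a leaf value, or sum the children on demand for a consolidation."""
    if element in children:
        return sum(value(child, measure) for child in children[element])
    return leaf_cells.get((element, measure), 0)


print(value("Total by Grade", "Headcount"))
print(value("Group X", "Headcount"))
print(len(leaf_cells))  # stored cells are unchanged by the extra hierarchy
```

Both rollups read the same three stored cells; only the `children` mapping grows when a hierarchy is added.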

Re: Multiple Hierarchies - expensive?

Posted: Thu Aug 19, 2010 2:29 am
by GPC
Perhaps I didn't make the question clear enough.

What I am considering is the relative cost, in populated cells and memory usage, of 2 design alternatives:

1. An employee cube with 2 different Salary Band dimensions and
2. The same cube but only 1 Salary Band dimension instead of 2, and with 2 hierarchies to reflect the 2 different types of salary bands.

Now, in the 1st design, if there are 100,000 employees and 10 measures are populated for each, the usage is 100,000 * 10 = 1,000,000 cells.
In the 2nd design, there are the same 100,000 employees and 10 measures are populated for each. However, because each measure has to be populated for each of the 2 Salary Band hierarchies, there are now 100,000 * 10 * 2 populated cells.

So the cost is double IF all measures are populated for each hierarchy.
I've actually since made the change from 2 dims to 1 and the result was exactly as above, which means, yes, it is expensive.
The only way to reduce the number of populated cells is to not write all measures against both hierarchies.
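
The counts in this post can be reproduced with a small Python model of the two designs. It is only a sketch: the three employees, the "S1"/"S2" element prefixes and the cell values are hypothetical, and the doubling arises here because the bands of both scales are separate leaf elements of one dimension.

```python
MEASURES = [f"Measure {i}" for i in range(1, 11)]  # 10 measures, as in the post

# Hypothetical employees, each classified under both award scales.
# The "S1"/"S2" prefixes are an assumption to keep element names unique.
employees = {
    "E1": ("S1 $8000-$36000", "S2 $34500-$45000"),
    "E2": ("S1 $36000-$48000", "S2 $34500-$45000"),
    "E3": ("S1 $8000-$36000", "S2 $8000-$34500"),
}


def two_dimension_design(employees):
    """One cell per employee and measure; both bands are coordinates of that cell."""
    return {
        (emp, band_1, band_2, measure): 1.0
        for emp, (band_1, band_2) in employees.items()
        for measure in MEASURES
    }


def one_dimension_design(employees):
    """Band leaves of both scales share one dimension, so each value is written twice."""
    cells = {}
    for emp, (band_1, band_2) in employees.items():
        for measure in MEASURES:
            cells[(emp, band_1, measure)] = 1.0
            cells[(emp, band_2, measure)] = 1.0
    return cells


print(len(two_dimension_design(employees)))  # employees * measures
print(len(one_dimension_design(employees)))  # employees * measures * 2
```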

cheers,

Gregory

Re: Multiple Hierarchies - expensive?

Posted: Thu Aug 19, 2010 3:08 am
by lotsaram
GPC wrote:In the 2nd design, there are the same 100,000 employees and 10 measures are populated for each. However, because each measure has to be populated for each of the 2 Salary Band hierarchies, there are now 100,000 * 10 * 2 populated cells.

So the cost is double IF all measures are populated for each hierarchy.
Greg

I think you misunderstand the meaning of the term "hierarchy" in TM1 dimensions. As TM1 supports multiple alternate hierarchies, the base elements (the N-level leaves) do not need to be uniquely defined for each hierarchy. Rather, you just assign the same leaf element to multiple parents. Therefore there is no need for duplication of data.

Does this apply to your case?

Re: Multiple Hierarchies - expensive?

Posted: Thu Aug 19, 2010 4:58 am
by GPC
Hi Lotsaram,

each hierarchy has unique base elements e.g.

Hierarchy 1, Elements; "$8000-$36000", "$36000-$48000" etc.
Hierarchy 2, Elements; "$8000-$34500", "$34500-$45000" etc.

as they are based on different Salary Award scales.

cheers,

Gregory

Re: Multiple Hierarchies - expensive?

Posted: Thu Aug 19, 2010 5:00 am
by Alan Kirk
GPC wrote:Hi Lotsaram,

each hierarchy has unique base elements e.g.

Hierarchy 1, Elements; "$8000-$36000", "$36000-$48000" etc.
Hierarchy 2, Elements; "$8000-$34500", "$34500-$45000" etc.

as they are based on different Salary Award scales.
Surely only one award applies to one individual?

Re: Multiple Hierarchies - expensive?

Posted: Thu Aug 19, 2010 11:52 pm
by GPC
Hi Alan,

certainly a single hierarchy would work if that was the case, but alas, no, they are different Types of awards and each individual needs to be classified according to Each Type. Hence they both start at $8000 but the banding is different.

cheers,

Gregory

Re: Multiple Hierarchies - expensive?

Posted: Fri Aug 20, 2010 1:28 am
by Alan Kirk
GPC wrote: certainly a single hierarchy would work if that was the case, but alas, no, they are different Types of awards and each individual needs to be classified according to Each Type. Hence they both start at $8000 but the banding is different.
This isn't really a question of hierarchy memory consumption, it's just a design issue; and frankly having two bands, where each individual has to fit into two separate elements within one dimension, would be a bad design. The reason for the huge memory jump is that you would be duplicating the data in the 10 measures for what is essentially the same person. It doesn't matter whether the system is relational or OLAP, that kind of duplication is bad.

As Lotsaram indicated, this is not at all the same thing as having two hierarchies. If this were a hierarchy based solution then you'd have a dimension with the person's actual salary as the element, not the band that it falls into. You would then have two hierarchies of consolidations, each of which consolidates all of the salary elements which fall into that particular band. This wouldn't be any more expensive than having the two dimensions since you'd still have one "row" per person and would have their 10 measures stored only once. Not that this method is without its drawbacks either, but storing the same values for 10 measures twice in the one cube is definitely not the way to go since it involves both data redundancy and the risk of update anomalies.

Re: Multiple Hierarchies - expensive?

Posted: Fri Aug 20, 2010 6:09 am
by Michel Zijlema
I totally agree with Alan here. In addition to the above comments: if you want to be able to cross-filter on both bands (select band X from the first band and band Y from the second band), you're out of luck with the single dimension (double hierarchy) approach.
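
To make the cross-filtering point concrete, here is a toy Python model (not TM1 itself). The employees and the "S1"/"S2" element prefixes are hypothetical, each cell holds a headcount of 1, and the single-dimension case assumes the design in which the band elements of both scales are leaves of one dimension.

```python
# Hypothetical employees with their band under each award scale.
employees = {
    "E1": ("S1 $8000-$36000", "S2 $34500-$45000"),
    "E2": ("S1 $36000-$48000", "S2 $34500-$45000"),
    "E3": ("S1 $8000-$36000", "S2 $8000-$34500"),
}

# Two-dimension design: every cell carries both band coordinates.
two_dim = {(emp, b1, b2): 1 for emp, (b1, b2) in employees.items()}

# Single-dimension design: each cell carries only one band coordinate.
one_dim = {}
for emp, (b1, b2) in employees.items():
    one_dim[(emp, b1)] = 1
    one_dim[(emp, b2)] = 1


def cross_filter_two_dim(band_1, band_2):
    """Employees whose cell matches band_1 on one dimension and band_2 on the other."""
    return sorted(
        emp for (emp, b1, b2) in two_dim if b1 == band_1 and b2 == band_2
    )


def cross_filter_one_dim(band_1, band_2):
    """A cell has a single band coordinate, so it cannot match two different bands."""
    return sorted(
        emp for (emp, band) in one_dim if band == band_1 and band == band_2
    )


print(cross_filter_two_dim("S1 $8000-$36000", "S2 $34500-$45000"))
print(cross_filter_one_dim("S1 $8000-$36000", "S2 $34500-$45000"))
```

In the second function no cell can satisfy both conditions, which is the limitation described above.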

Michel

Re: Multiple Hierarchies - expensive?

Posted: Mon Aug 23, 2010 2:11 am
by GPC
Hi Alan (& others),

Thank you for your analysis.
Alan Kirk wrote: You would then have two hierarchies of consolidations, each of which consolidates all of the salary elements which fall into that particular band.
I'm not aware of a way that TM1 can consolidate into "Bands". If it can, I would be very interested to hear how.

It seems, then, that multiple dimensions, while they would seem to be logically incorrect (as they both "band" the same data item), are the more economical solution as far as cell usage and memory consumption are concerned.

In the multiple hierarchy design (with multiple base elements), at least consumption can be reduced by only storing certain required measures against the additional hierarchies.

Michel, we have no need to "cross-filter" across the 2 bands. We always look at the data in terms of 1 band at a time.

Re: Multiple Hierarchies - expensive?

Posted: Mon Aug 23, 2010 3:01 am
by Alan Kirk
GPC wrote:
You would then have two hierarchies of consolidations, each of which consolidates all of the salary elements which fall into that particular band.
I'm not aware of a way that TM1 can consolidate into "Bands". If it can, I would be very interested to hear how.
Like that:
[Image: SalaryBands.jpg]
Joe Smith has a salary of $70K. His data is stored against the element 70000. He therefore falls into Award 1 Band 66K to 85K and Award 2 Band $56K to $70K. You can select either of those elements, and if every other selection is the same you still end up with good old Joe's values.

Sally Brown has a salary of $80K. Her data is stored against the element 80000. If you select Award 1 Band 66K to 85K you'll get both Joe and Sally's numbers, but if you select Award 2 Band $56K to $70K you get Joe's alone. In terms of being able to select the bands that you want, there's no real difference between that and having the bands as N level elements themselves.
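
The Joe and Sally example can be modelled in Python as follows. This is a minimal sketch, assuming a cube keyed by employee, salary element and measure; the measure name and its values are invented, and only the two bands named above are filled in.

```python
# Leaf cells keyed by (employee, salary element, measure).
# "Leave Days" and its values are invented for illustration.
cells = {
    ("Joe Smith", "70000", "Leave Days"): 12,
    ("Sally Brown", "80000", "Leave Days"): 7,
}

# Consolidations over the salary elements. Only the two bands mentioned
# in the post are listed; the rest of each hierarchy is omitted.
children = {
    "Award 1 Band 66K to 85K": ["70000", "80000"],
    "Award 2 Band $56K to $70K": ["70000"],
}


def employees_in(band):
    """Employees whose salary element rolls up into the given band."""
    members = set(children[band])
    return sorted({emp for (emp, salary, _measure) in cells if salary in members})


def band_total(band, measure):
    """Sum a measure over the leaf cells that roll up into the given band."""
    members = set(children[band])
    return sum(
        value
        for (_emp, salary, m), value in cells.items()
        if salary in members and m == measure
    )


print(employees_in("Award 1 Band 66K to 85K"))
print(employees_in("Award 2 Band $56K to $70K"))
print(band_total("Award 1 Band 66K to 85K", "Leave Days"))
print(len(cells))  # each person's measure is stored once
```

Each person's value is held against a single salary element, and both award hierarchies read from that same cell.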

As Lotsaram was alluding to, it would be worthwhile to go through the manuals to get a better handle on what hierarchies actually are.
GPC wrote: It seems, then, that multiple dimensions, while they would seem to be logically incorrect (as they both "band" the same data item), are the more economical solution as far as cell usage and memory consumption are concerned.
As I said, it has nothing to do with the number of dimensions. The only thing it has to do with is whether you're duplicating data or not. If you are, you use more memory. If you're not, you don't.

Re: Multiple Hierarchies - expensive?

Posted: Tue Aug 24, 2010 1:07 am
by GPC
Alan,

of course, if the Salaries were all multiples of $5,000 as in your example, we could use simple alternate hierarchies. Unfortunately they are not; there are thousands of discrete individual salaries.

cheers,

Gregory

Re: Multiple Hierarchies - expensive?

Posted: Tue Aug 24, 2010 1:15 am
by Alan Kirk
GPC wrote: of course, if the Salaries were all multiples of $5,000 as in your example, we could use simple alternate hierarchies. Unfortunately they are not; there are thousands of discrete individual salaries.
That makes what difference, exactly? So Joe's salary is 72426.85. That's the name of your element instead of 70000. Such an element can still consolidate into two different hierarchies. All of the thousands of salaries can still consolidate into bands. All you need to do is have the TI that uploads the data work out which bands they fall into and assign the elements accordingly. It's hardly an excessively complex task.
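
The band assignment step could look something like the sketch below. It is written in Python as a stand-in for the logic a TI load process would need, not as TI code. The band edges are taken from the two example bands per scale quoted earlier in the thread; treating the lower edge as inclusive and the upper edge as exclusive is an assumption, as are the function names.

```python
from bisect import bisect_right

# Edges from the example bands quoted in the thread. Further bands
# ("etc." in the post) are deliberately not filled in.
SCALE_EDGES = {
    "Award 1": [8000, 36000, 48000],
    "Award 2": [8000, 34500, 45000],
}


def band_for(salary, edges):
    """Return the band label containing salary, or None if outside the known edges.

    Assumption: the lower edge is inclusive and the upper edge exclusive.
    """
    i = bisect_right(edges, salary) - 1
    if i < 0 or i >= len(edges) - 1:
        return None
    return f"${edges[i]}-${edges[i + 1]}"


def assign_parents(salaries):
    """Map each salary element name to one parent band per award scale."""
    parents = {}
    for salary in salaries:
        element = f"{salary:.2f}"  # the salary itself is the element name
        parents[element] = {
            scale: band_for(salary, edges) for scale, edges in SCALE_EDGES.items()
        }
    return parents


parents = assign_parents([35000.00, 36000.00, 72426.85])
for element, bands in parents.items():
    print(element, bands)
```

A salary outside the edges listed here, such as 72426.85, comes back as `None` only because the higher bands are not defined in this sketch.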

Re: Multiple Hierarchies - expensive?

Posted: Tue Aug 24, 2010 4:36 am
by GPC
Hi Alan,

we have not been storing individual actual salaries up to this point but clearly it's more efficient to do so, so that's what I'll do.

Thank you for your help.

cheers,

Gregory