TM1 Forum

Posted: **Tue Jul 10, 2012 5:16 pm**

All,

Using 9.5.2 FP2 I have come across something a little odd.

I have a TI process that builds six broadly similar copies of the same dimension from a flat file.

It’s fairly complicated: several while loops and many attributes being sprayed around.

I’ve been experiencing some differences in performance depending on how I was running the code.

If I run the code with a DimensionDeleteAllElements against each of the dimensions I’m building then the code runs in about 55 seconds.

If I run the code without a DimensionDeleteAllElements then the code takes 3.5 minutes to run.

All other code is identical.

Even stranger, the performance hit is in the attribute update rather than on the Metadata tab. (Logging is forced off on the attribute cubes in the TI.)

I’m at a loss to explain this and can’t figure out a way to spoof my way round it. Anyone have a clue what might be going on under the hood? It all seems rather counter-intuitive to me...

I refuse to use DimensionDeleteAllElements in TIs that are going to get released into the wild so I guess I’ll have to take the performance hit…

Cheers,
Steve

Posted: **Tue Jul 10, 2012 5:22 pm**

When you run the TI without the DimensionDeleteAllElements, have you manually deleted the elements beforehand?

If you are running the process to update dims that already contain the elements, are you presumably running some sort of check to see if they are already there (e.g. a Dimix)?

Posted: **Tue Jul 10, 2012 7:13 pm**

Hi,

No I don't delete the elements manually.

Yes I do a Dimix before I do an element insert, but:
1. I would expect the TI to run slower on the MD tab, since my Dimix fails more often and I'm doing more element inserts (I've not tested this explicitly), but
2. It's the attribute adding that runs slowly on the D tab if I don't do a dimension delete all elements. Again there is a Dimix here, whichever way I run it. So with the delete in place I'm running the same number of Dimix requests but writing more attributes and getting much faster performance.... hence my confusion...

Cheers,

Posted: **Tue Jul 10, 2012 8:15 pm**

Steve Rowe wrote:Hi,

No I don't delete the elements manually.

Yes I do a Dimix before I do an element insert but.

I wouldn't bother with that unless your objective is to flag duplicates; otherwise it's a waste of processor cycles. If the element exists then DimensionElementInsert just ignores it.

Steve Rowe wrote: 1. I would expect the TI to run slower on the MD tab since my Dimix fails more often and I'm doing more element inserts, I've not tested explicitly but
2. It's the attribute adding that runs slowly on the D tab if I don't do a dimension delete all elements. Again there is a Dimix here, whichever way I run it. So with the delete in place I'm running the same number of Dimix request but writing more attributes and getting much faster performance.... hence my confusion...

This is just a WAG[1], but are those attributes aliases? If so then somewhere down in the bowels of the functions that the TI calls there would need to be some sort of routine which validates uniqueness (across all aliases as well as the principal name). In all probability this involves a scan of the dimension and its aliases. If you blow away all of the elements and recreate in metadata then the only things it needs to check are the principal names and the aliases which had already been added by the data tab of that process; all of the other values will be known to be null. If you don't, then there are probably preexisting aliases that it needs to check for every element. That would take substantially longer since there's the potential for a shedload of string comparisons to need to be done. (Probably even longer than normal since TM1 is mostly case and space insensitive, so the comparison wouldn't be a simple binary one.)

Of course if they're plain ol' text or numeric attributes, that won't account for it.

[1] Wild A$$ed Guess

Posted: **Tue Jul 10, 2012 8:38 pm**

Another WAG: perhaps it has something to do with validating the rules as it goes. I have seen an occasional error where a change to the dimension fails because the rule doesn't like it so obviously TI does check with the rules as it goes. It may be that starting from a clean slate enables this to be done faster.

Perhaps try removing the rules in the cubes that this dimension uses and see if that makes any difference.

Posted: **Tue Jul 10, 2012 9:18 pm**

I ran into something similar recently updating a dimension from an ODBC source.

Alan Kirk wrote:This is just a WAG[1], but are those attributes aliases? If so then somewhere down in the bowels of the functions that the TI calls there would need to be some sort of routine which validates uniqueness (across all aliases as well as the principal name). In all probability this involves a scan of the dimension and its aliases.

This is roughly what I concluded. Updating just a single alias caused the data tab to run approximately 100 times slower on a dimension of ~50000 elements. Interestingly, I managed to work around it by setting a string attribute via TI which was referenced in a rule to populate the alias. I was a little bit hesitant about this approach but haven't had any problems.

Martin Ryan wrote:Perhaps try removing the rules in the cubes that this dimension uses and see if that makes any difference.

I did try this actually but it had no effect. I'd considered whether it might be that I was somehow triggering a load of feeders too but this didn't seem to be the case.

Posted: **Wed Jul 11, 2012 7:57 am**

Good idea on the alias I'll try and narrow it down a bit further.

asutcliffe wrote:This is roughly what I concluded. Updating just a single alias caused the data tab to run approximately 100 times slower on a dimension of ~50000 elements. Interestingly, I managed to work around it by setting a string attribute via TI which was referenced in a rule to populate the alias. I was a little bit hesitant about this approach but haven't had any problems.

Thanks for taking the time out to confirm Alan's WAG(!). I'm not sure I like the ruled approach, since strings aren't cached and I'd be concerned that I'm swapping a dimension build performance issue for a small lag in alias referencing. If I have the time to figure out an alternative approach I'll let you know.

Deleting and reinserting the alias is a non-starter as it will break any subsets that reference the alias. I guess the other thing to try is to VZO the aliases before I start writing to them. This should mean that the uniqueness testing is "like" it is when the dimension has just been created.

In tracking down that it was the dimension delete that was causing the issue I'd already deleted all the cubes on the server so there are no rules.

Alan Kirk wrote:
Steve Rowe wrote:
Hi,

No I don't delete the elements manually.

Yes I do a Dimix before I do an element insert but.
I wouldn't bother with that unless your objective is to flag duplicates; otherwise it's a waste of processor cycles. If the element exists then DimensionElementInsert just ignores it.

I've done some testing on this and doing a Dimix is faster than doing DimensionElementInsert, even if the element already exists. At a guess I would say that a DimensionElementInsert is not ignored for elements that already exist; it's just that, as a user, you can't tell the difference between re-insertion and ignore for an element that exists. Looking up an element via Dimix vs updating a dimension has a significant time difference.

In the special case where you build a dimension from scratch then the Dimix is wasted I agree, in the more general case where you are applying small incremental updates there is a performance benefit from doing a Dimix before your insert.

Anyway, thanks to the forum as ever, sounds like I have an answer. I'll post my findings.

Cheers,

Posted: **Wed Jul 11, 2012 8:37 am**

Hi Steve,

The downside of using DIMIX in the Metadata tab is that it won't update the hierarchy if elements move between consolidations, so chances are you will end up either:
1) Leaving the existing element orphaned, assuming you have unwound the consolidation it rolls up to
2) Having the same element rolling up in two different places, in the absence of an unwind
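One way to keep the DIMIX guard and still refresh parentage might look like the sketch below. It is only an illustration: sDim, vElement and vParent are hypothetical names (sDim would be set in the Prolog), and it assumes each element has exactly one parent in the source, so any alternate roll-ups would be removed too.

```
# Metadata tab. sDim, vElement and vParent are hypothetical names:
# the target dimension and two variables from the data source.

# Insert the leaf only when it is missing.
IF( DIMIX( sDim, vElement ) = 0 );
  DimensionElementInsert( sDim, '', vElement, 'N' );
ENDIF;

# Remove the element from every existing parent other than the source parent.
# Comparing DIMIX values avoids relying on exact string equality of names.
nPar = ELPARN( sDim, vElement );
WHILE( nPar > 0 );
  sOldParent = ELPAR( sDim, vElement, nPar );
  IF( DIMIX( sDim, sOldParent ) <> DIMIX( sDim, vParent ) );
    DimensionElementComponentDelete( sDim, sOldParent, vElement );
  ENDIF;
  nPar = nPar - 1;
END;

# Make sure the source parent exists, then (re)attach the element to it.
DimensionElementInsert( sDim, '', vParent, 'C' );
DimensionElementComponentAdd( sDim, vParent, vElement, 1 );
```

This is effectively a targeted unwind per element: the insert is skipped for elements that already exist, but the parent/child link is still rebuilt from the source on every run.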

Unfortunately I can't offer a solution to your problem, other than perhaps trying to load the attributes through a separate TI; I just wanted to share my passing thoughts on the issue.

Posted: **Wed Jul 11, 2012 9:02 am**

Steve Rowe wrote:Thanks for taking the time out to confirm Alan's WAG(!), not sure I like the ruled approach since strings aren't cached and I'd be concerned that I'm swapping a dimension build performance issue for a small lag in alias referencing. If I have the time to figure an alternative approach I'll let you know.

No problem. I almost posted about it the other day but was trying to remember whether alias setting had always suffered from poor performance.

FWIW, I haven't noticed any significant lag, albeit in a relatively simple model. My concern was more that I might introduce duplicate aliases if I was somehow bypassing the uniqueness test. I'm concatenating the element name and a description though so pretty comfortable it will be unique. I'll look at it a bit more thoroughly before putting in production. I'd be interested if you do come up with a better alternative.

Posted: **Wed Jul 11, 2012 9:21 am**

Well doing a VZO in a separate TI had an impact. It appeared to increase the processing time from 3.5 min to 4.5 mins…. This excludes any overhead from the VZO itself.

Strange.

Tried to do an AttrInsert of a string attribute with the same name as the alias to see if I could convert the alias to an attribute and then do the update and convert it back to an alias. The AttrInsert ran but didn’t change the alias to a string attribute so that didn’t work.

Anyway I don’t have the time to do any more investigation of this. Thanks for the pointers.

(Amin good point on using Dimix on the dimension updates, you do need to have a lot of confidence in your source)

Cheers,

Posted: **Thu Jul 12, 2012 3:01 am**

Steve Rowe wrote:
Alan Kirk wrote:
Steve Rowe wrote:
Yes I do a Dimix before I do an element insert but.
I wouldn't bother with that unless your objective is to flag duplicates; otherwise it's a waste of processor cycles. If the element exists then DimensionElementInsert just ignores it.
I've done some testing on this and doing a Dimix is faster than doing DimensionElementInsert, even if the element already exists. At a guess I would say that a DimensionElementInsert is not ignored for elements that already exist; it's just that as a user you can't tell the difference between re-insertion and ignore for an element that exists. Looking at a dimension via Dimix vs updating a dimension has a significant time difference.

In the special case where you build a dimension from scratch then the Dimix is wasted I agree, in the more general case where you are applying small incremental updates there is a performance benefit from doing a Dimix before your insert.

I'm not sure that I'd say significant but yes, I agree that my tests (under 10.1) confirm yours. I'm surprised as it's not just a DimIx but an If() as well which I would have expected to slow things down a bit. That's why I prefer to avoid them if not absolutely necessary. (That may just be my VBA background kicking in, though.)

I created a chore which:
- Process 0: Deleted and recreated a dimension with only a single element;
- Process 1: Inserted elements (1 million elements in the first test, 2 million in the second; the numbers being chosen mainly to slow it down enough to time it);
- Process 2: Inserted the same elements;
- Process 3: Wrapped the insert command inside an If DimIx = 0 block.

In all cases these were done in the Prolog to take any data source speed issues out of the equation. Aside from the If block in process 3 the code in all 3 processes was identical.

Test 1 was 11 seconds, 11 seconds, 9 seconds for processes 1, 2 and 3 respectively.
Test 2 was 22 seconds, 22 seconds, 19 seconds.

So the difference isn't huge, and if you're expecting that the majority of the elements won't be there already (a variation on the special case that you mention) you'd probably save a few cycles and code lines by not checking... but for the far more common case of refreshing an existing dimension it may be worth it just to squeeze that little bit of extra speed out of it.
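For anyone wanting to repeat the comparison, the guarded version (process 3) would look something like the sketch below. The dimension name, element naming scheme and loop bound are placeholders, not the ones used in the actual test.

```
# Prolog. 'TestDim', the 'El' prefix and nMax are hypothetical placeholders.
sDim = 'TestDim';
nMax = 1000000;

i = 1;
WHILE( i <= nMax );
  sEl = 'El' | NumberToString( i );
  # Only call the insert when the element is not already in the dimension.
  IF( DIMIX( sDim, sEl ) = 0 );
    DimensionElementInsert( sDim, '', sEl, 'N' );
  ENDIF;
  i = i + 1;
END;
```

Dropping the IF / ENDIF lines gives the unguarded version used in processes 1 and 2. On the timings above, the guard saved 2 of 11 seconds (about 18%) in the first test and 3 of 22 seconds (about 14%) in the second.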

Posted: **Thu Jul 12, 2012 2:03 pm**

Steve

I had a similar issue with 9.5.1 where a big dimension was taking a long time to update. Again the MetaData was taking seconds but the Data tab was taking minutes and sometimes it was taking hours.

I did a ViewZeroOut of all the Attributes on the Prolog and we didn't get the problem after that. Perhaps it needs to be done in the Prolog rather than in a separate Process? I have had some strange results when mixing and matching dimension activity in a process with updates in processes called from the main process. In those cases I had to change things so that the main process didn't do any dimension updates and only called other processes.

If you have 10.1 then you could try removing the code from the Metadata tab altogether and then using the new DimensionElementInsertDirect function within an IF( DIMIX ) test on the Data tab.
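A rough sketch of that Data tab approach, using the 10.1 function documented as DimensionElementInsertDirect. The names sDim, vElement, vDescription and the 'Description' attribute are made up for illustration, and sDim is assumed to be set in the Prolog.

```
# Data tab (10.1 or later). All names here are hypothetical.
# The Direct function edits the dimension itself rather than a working copy,
# so no Metadata tab code is needed for the insert.
IF( DIMIX( sDim, vElement ) = 0 );
  DimensionElementInsertDirect( sDim, '', vElement, 'N' );
ENDIF;

# The attribute write can follow in the same pass.
AttrPutS( vDescription, sDim, vElement, 'Description' );
```

Whether this helps with the slowdown in this thread is untested; it only shows the shape of the suggestion.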

Regards

Paul Simon

Posted: **Thu Jul 12, 2012 4:16 pm**

Thanks for the tip Paul,
Tested with the VZO in the prolog and it did not improve things, didn't make things worse though.

@Alan,
I'm pretty sure that the difference was less marginal than you describe but I don't remember what the numbers were or the version I tested on, maybe they made inserts faster....

Cheers,

Posted: **Thu Jul 12, 2012 7:44 pm**

Steve Rowe wrote: @Alan,
I'm pretty sure that the difference was less marginal than you describe but I don't remember what they were or the version I tested on, maybe they made inserts faster....

Or If() blocks slower, possibly by wrapping them inside Java code.

Posted: **Mon Jul 16, 2012 5:03 pm**

Steve

The other time that I have seen this issue is when reading from Oracle. The process was reading from a very complex Oracle View. For some reason it was not too bad the first time it read from the view on the MetaData tab but this caused some sort of locking in Oracle and it was then very slow the second time the View was read on the Data Tab. Switching to a materialised view (actually a table that was effectively a materialised view) cured the problem.

Is it possible that you have a similar issue at the database end? It was counter-intuitive that it was slower the second time we read the SQL View but that is what was happening. An Oracle expert did explain why, but I was more interested in the way around it than the reason, so I don't remember the details.

Regards

Paul Simon

Posted: **Mon Jul 16, 2012 9:20 pm**

Hi Paul, Thanks for the tip but in this case the source was another TM1 dimension.
Cheers

Posted: **Wed Sep 18, 2013 9:09 pm**

Just wanted to update this thread to say that a colleague (Kuljeet Dhillon) tracked down the cause of the slow performance.

Basically the original TI was looping through multiple dimensions in the Data tab whilst updating the attributes, and this was causing the hit. We're not 100% sure why, but when the code was changed so that just one dimension was being addressed at a time the performance was normal.
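The post doesn't say how the code was restructured. One possible arrangement, offered purely as a sketch, is a driver process that calls a parameterised child process once per dimension, so that each Data tab pass only ever writes to one dimension. The process name, parameter name, dimension names and attribute name below are all hypothetical.

```
# Driver process, Prolog: run the attribute update once per dimension copy.
nDims = 6;
i = 1;
WHILE( i <= nDims );
  sDim = 'Product Copy ' | NumberToString( i );
  ExecuteProcess( 'Dim.Update.Attributes', 'pDim', sDim );
  i = i + 1;
END;
```

```
# Child process 'Dim.Update.Attributes', Data tab.
# pDim is the single dimension addressed by this run.
IF( DIMIX( pDim, vElement ) > 0 );
  AttrPutS( vDescription, pDim, vElement, 'Description' );
ENDIF;
```

The trade-off is that the source is read six times instead of once, which may or may not matter depending on its size.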

Cheers,

TI Speed in Dimension creation
