DimensionElementDeleteDirect vs DimensionElementDelete

Posted: Mon Mar 23, 2015 2:56 pm
by elee123
Hi Everyone,
What's the difference between DimensionElementDeleteDirect and DimensionElementDelete?
Is one better than the other?
Thank you.

Re: DimensionElementDeleteDirect vs DimensionElementDelete

Posted: Mon Mar 23, 2015 3:04 pm
by qml
The 'Direct' functions (there are others too) make a quick change to the dimension without recompiling it. Unlike the standard metadata manipulation functions, they can be used on the Data tab and are intended for cases where you have relatively few changes to the dimension and you don't want to execute the Metadata routine for some reason (e.g. it might take too long). Direct changes can leave the dimension organized in a suboptimal way, so they should not be abused, but they definitely have their uses.

Re: DimensionElementDeleteDirect vs DimensionElementDelete

Posted: Mon Mar 23, 2015 3:08 pm
by elee123
So does that mean it doesn't matter which one we use if we're in the Metadata or Prolog tab?
It'll only make a difference when using it in the Data tab?

Re: DimensionElementDeleteDirect vs DimensionElementDelete

Posted: Mon Mar 23, 2015 3:33 pm
by qml
elee123 wrote:So does that mean it doesn't matter which one we use if we're in the Metadata or Prolog tab?
It'll only make a difference when using it in the Data tab?
Not quite. It means that unless you know exactly why you are doing something different, go with DimensionElementDelete and use it either in Prolog or in Metadata.

Re: DimensionElementDeleteDirect vs DimensionElementDelete

Posted: Mon Mar 23, 2015 7:55 pm
by lotsaram
qml wrote:The 'Direct' functions (there are others too) make a quick change to the dimension without recompiling it. Unlike the standard metadata manipulation functions, they can be used on the Data tab and are intended for cases where you have relatively few changes to the dimension and you don't want to execute the Metadata routine for some reason (e.g. it might take too long). Direct changes can leave the dimension organized in a suboptimal way, so they should not be abused, but they definitely have their uses.
I don't know that you can make a change to a dimension without recompiling. The traditional DimensionElementInsert and ..ComponentAdd keep a temporary copy of the dimension, which is then compiled and committed on completion of the (Metadata or Prolog) tab. I think that the Direct functions instead instantly commit or recompile any changes on each function call, regardless of which code tab they are on.

If changes are few and the data source is large, then the Direct functions can make sense because they save the overhead of processing the data source twice. But if there are frequent and plentiful metadata changes, then processing is likely to be slower using the Direct functions, due to the repeated saves of the dimension on each change. In that case, stick to the traditional metadata processing followed by data.

Re: DimensionElementDeleteDirect vs DimensionElementDelete

Posted: Mon Mar 23, 2015 8:49 pm
by qml
lotsaram wrote:The traditional DimensionElementInsert and ..ComponentAdd keep a temporary copy of the dimension, which is then compiled and committed on completion of the (Metadata or Prolog) tab. I think that the Direct functions instead instantly commit or recompile any changes on each function call, regardless of which code tab they are on.
I believe the Direct functions work differently than the standard ones in more ways than that. It seems to be evidenced by the existence of DimensionUpdateDirect. From this function's description I conclude that the entire dimension is not recompiled/rewritten/rebuilt (pick whichever term you like, IBM seem to favour 'whole-copy editing' which focuses on the copy-change-commit aspect) when using Direct functions and that it can have some negative effects when changes made in this fashion accumulate. Of course I'm always happy to be proved wrong.

Re: DimensionElementDeleteDirect vs DimensionElementDelete

Posted: Mon Mar 23, 2015 10:23 pm
by lotsaram
qml wrote:I believe the Direct functions work differently than the standard ones in more ways than that. It seems to be evidenced by the existence of DimensionUpdateDirect. From this function's description I conclude that the entire dimension is not recompiled/rewritten/rebuilt (pick whichever term you like, IBM seem to favour 'whole-copy editing' which focuses on the copy-change-commit aspect) when using Direct functions and that it can have some negative effects when changes made in this fashion accumulate. Of course I'm always happy to be proved wrong.
I don't have the faintest clue what that is even supposed to mean :?
(x2 the IBM function documentation and what you wrote)

Re: DimensionElementDeleteDirect vs DimensionElementDelete

Posted: Mon Mar 23, 2015 11:45 pm
by paulsimon
Hi

As I understand it the main reason for introducing the Direct Dimension set of functions was for the situation that often arises where you are loading data, and you encounter a small number of elements in the source data that are new. The fact that you can use the DimensionElementInsertDirect function on the Data Tab means that you can insert these new elements immediately and then load data into them.

Without this, you would need to put the older DimensionElementInsert statement on the Metadata tab. This would then force TM1 to do two passes over the source data, one on the Metadata Tab and one on the Data Tab. At the end of the Metadata tab, there would also be a full recompile of the dimension, before the second pass for the Data Tab started.
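
A minimal sketch of that traditional two-pass arrangement, assuming a hypothetical cube SomeCube with two dimensions and hypothetical source variables vMeasure and vValue (SomeDim and vSomeElem are the same placeholders used in the snippet further down). Each tab is a separate code window in the process editor; they are shown together here only for comparison.

```
# --- Metadata tab: executed once per source record (first pass) ---
# The insert goes into the working copy of the dimension, which is
# compiled and committed when the Metadata tab finishes.
IF( DIMIX( SomeDim, vSomeElem ) = 0 );
  DimensionElementInsert( SomeDim, '', vSomeElem, 'N' );
ENDIF;

# --- Data tab: executed once per source record (second pass) ---
# By now the new elements exist, so the values can be written.
CellPutN( vValue, SomeCube, vSomeElem, vMeasure );
```

Because both tabs contain code, the data source is read twice: once to collect the new elements and once to load the values.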

Particularly when you are reading data from a SQL source, and particularly where this involves network traffic and group by operations, the saving of a second pass over the data can be a considerable performance benefit.

The other advantage is from the point of view of locking. If you use the old style DimensionElementInsert function on the Metadata tab, then the process will lock the dimension throughout the Metadata and Data Tabs.

By comparison the DimensionElementInsertDirect statement on the Data Tab would typically be enclosed in an IF test along the lines of

# Insert the element only if it is not already in the dimension
IF( DIMIX( SomeDim, vSomeElem ) = 0 );
  DimensionElementInsertDirect( SomeDim, '', vSomeElem, 'N' );
ENDIF;

and therefore the DimensionElementInsertDirect statement might not need to be executed at all. I believe this means that no locking will occur unless the data load encounters an element that doesn't already exist in one of the dimensions.

I think that the rule of thumb from IBM is along the lines of, if you are updating more than 90% of the elements in the dimension, then it is more efficient to use the Metadata Tab with DimensionElementInsert, but if you are updating a smaller set of elements then use just DimensionElementInsertDirect on the Data Tab.

So typically you would use DimensionElementInsert on the Metadata Tab when doing a full dimension rebuild from e.g. a Parent-Child table, which might be done only overnight or on an ad hoc basis as new accounts, etc., are added. By comparison you would use DimensionElementInsertDirect on the Data Tab when you encounter new elements during a data load, when you are perhaps frequently refreshing the cube during the day as entries are posted into a General Ledger.

As for the original question about DimensionElementDeleteDirect, that is slightly different to Insert. I mostly use this when I have to update very large dimensions in a 24x5 operation. In that case I do a full update on a Temp dimension, and then compare this to the real dimension, and use DimensionElementDelete to remove consolidations that no longer exist in the Temp dimension from the real dimension. The full update on the Temp dimension doesn't cause locking as it is not used in any cubes. By only updating changes on the real dimension, locking is kept to a minimum.
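
The compare-and-delete step described above could look roughly like the sketch below. The dimension names 'Account' and 'Account_Temp' and the variable names are hypothetical. The paragraph names DimensionElementDelete for the removal; the sketch assumes the Direct variant was meant, given that the paragraph opens by discussing DimensionElementDeleteDirect.

```
# Hypothetical dimension names; the temp dimension has already been fully rebuilt.
sRealDim = 'Account';
sTempDim = 'Account_Temp';

# Walk the real dimension from the last index to the first, so that a
# deletion never shifts the position of an element still to be checked.
nIndex = DIMSIZ( sRealDim );
WHILE( nIndex >= 1 );
  sElem = DIMNM( sRealDim, nIndex );
# Remove only consolidations that are missing from the temp dimension.
  IF( DTYPE( sRealDim, sElem ) @= 'C' & DIMIX( sTempDim, sElem ) = 0 );
    DimensionElementDeleteDirect( sRealDim, sElem );
  ENDIF;
  nIndex = nIndex - 1;
END;
```

Only the elements that differ are touched on the real dimension, which is the point of the approach: the expensive full rebuild happens on a dimension that no cube uses.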

As I understand it, the downside of using the Direct functions is that they add to the size of the Dimension which can use more memory and make things slower. However, if you are loading data and updating a dimension of 1000 elements and only encountering 3 new elements then this is not going to be significant.

You can use the DimensionUpdateDirect statement in the Epilog to compress the dimension, and perhaps run this every 2 hours, or alternatively, you can wait until the overnight full dimension update runs using the Non-Direct statements and the Metadata tab which will also have the effect of compressing the dimension down.
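
As a sketch, and reusing the SomeDim placeholder from the snippet above, the Epilog call would be a single line. How often it is worth running is a judgment call that depends on how many Direct changes accumulate.

```
# Epilog: rebuild the dimension after a series of Direct edits
# so that its in-memory footprint is compacted again.
DimensionUpdateDirect( SomeDim );
```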

Regards

Paul Simon

Re: DimensionElementDeleteDirect vs DimensionElementDelete

Posted: Tue Mar 24, 2015 5:16 pm
by qml
lotsaram wrote:I don't have the faintest clue what that is even supposed to mean :?
(x2 the IBM function documentation and what you wrote)
I sincerely apologise for being unclear. My only excuse, even if it's a weak one, is that English is not my first language and clearly my grasp of it leaves a lot to be desired.

Dimensions updated using Direct functions take up more RAM than equivalent dimensions updated using only non-Direct functions (this is easy to test). Therefore there must be something different in the way the dimension itself is updated i.e. it is not fully rewritten when Direct functions are used. Also, the *.dim file on disk is not updated right away (and neither is *.dim$) each time when a Direct function is used, but in the commit phase of the process in Epilog. Whatever you wish to call what happens when you update a dimension using any of the Direct functions, it is not the same as the dimension recompilation you get between the Metadata and Data tabs when using non-Direct functions only.

Interestingly, the structure of the *.dim file is not affected by the use of Direct functions (i.e. functionally identical dimensions will have identical *.dim files regardless of which functions you use to build them), but the in-memory structure of the dimension is clearly affected.

Re: DimensionElementDeleteDirect vs DimensionElementDelete

Posted: Tue Mar 24, 2015 7:04 pm
by lotsaram
Well Kamil, my Polish is non-existent and I think your grasp of English is pretty top notch. The detail from you and Paul is pretty interesting. I never looked into the Direct functions that deeply, since when I first tried to use them in 10.1 there was a bug with "TopElements" which meant doing DimensionElementComponentAdd didn't work for the top level consolidation of a dimension, which meant I had to abandon that approach and stick with Metadata tab updates. I'm not sure if this is fixed in 10.2 or if it is a "built in feature" of the function.

Re: DimensionElementDeleteDirect vs DimensionElementDelete

Posted: Tue Mar 24, 2015 7:23 pm
by BrianL
Here's my understanding of the "direct" functions versus the "standard" functions.

Standard functions need to be used in the Prolog/Metadata tabs. They will do a full copy and recompile of the entire dimension. Since the copy and the subsequent recompile (at the end of the Metadata tab) can be expensive for large dimensions, this isn't always optimal.

The direct functions skip the copy and recompile and instead make incremental edits. The primary reasons to do this are because it's cheaper for small updates to large dimensions and because it can be done on any tab of the TI. The disadvantages (as noted by qml) are that the dimension consumes more memory and isn't as efficient at doing bulk updates.

The DimensionUpdateDirect function will trigger the full recompile of the dimension. Again, this can be expensive, but it will shrink the memory footprint down to the same level you'd see if you avoided the direct style functions in the first place.

Re: DimensionElementDeleteDirect vs DimensionElementDelete

Posted: Wed Mar 25, 2015 1:08 pm
by elee123
Thank you all for your contributions. I appreciate it.

Re: DimensionElementDeleteDirect vs DimensionElementDelete

Posted: Sat Apr 04, 2015 1:24 pm
by BariAbdul
by paulsimon » Mon Mar 23, 2015 11:45 pm

As I understand it the main reason for introducing the Direct Dimension set of functions was for the situation that often arises where you are loading data, and you encounter a small number of elements in the source data that are new. The fact that you can use the DimensionElementInsertDirect function on the Data Tab means that you can insert these new elements immediately and then load data into them.

Without this, you would need to put the older DimensionElementInsert statement on the Metadata tab. This would then force TM1 to do two passes over the source data, one on the Metadata Tab and one on the Data Tab. At the end of the Metadata tab, there would also be a full recompile of the dimension, before the second pass for the Data Tab started.
Hi Paul/Gurus, what I understand is that when there is a DimensionElementInsert statement on the Metadata tab, TM1 makes one pass over the source data to apply the metadata changes for each source record, and the changes are committed at the end of the Metadata tab. The second pass would then make the data changes for the new elements inserted through the Metadata tab.

I would appreciate it if you could please clarify. Thanks a lot.