Compare subsets of 2 dimensions

Helmar · Post by **Helmar** » Tue Aug 18, 2015 9:09 am

Hello,

in a TI process, I'd like to compare the elements of a subset (ss1) against a subset (ss2) of another dimension aiming at filtering missing elements in ss2.
Already tried the mdx 'EXCEPT' function, but this works for subsets of the same dimension only (according to TI error message).

Another approach was to eliminate those elements in ss1, which are identical in ss2, by using SubsetElementDelete function. As a result, missing ss2 elements then would remain in the ss1 subset.
However, SubsetElementDelete function requests the index of the element to be deleted from the subset (instead of the element name itself), which makes the procedure complicated. As the 2 subsets to be compared might contain thousands of elements, a 'brute force' procedure like this (based on the 'subix' function introduced to this forum recently) would be pretty much timeconsuming...

I would be most grateful for other ideas or solutions.
BRGDS, Helmar

Using TM1 version 10.2.2

declanr · Post by **declanr** » Tue Aug 18, 2015 9:24 am

Even across big dimensions you will find just looping the subsets pretty quick; below code shows a way to delete elements in subset 1 that exist in subset 2. You could do something similar with just 1 loop by setting a TI with the datasource of 1 of your giving subsets etc.

Code: Select all

sDim1= 'Dimension1';
sSubset1 = 'Subset1';
sDim2='Dimension2';
sSubset2 = 'Subset2';
nSubSiz2 = SubsetGetSize ( sDim2, sSubset2 );
iCount = SubsetGetSize ( sDim1, sSubset1 );
While ( iCount > 0 );
          sElement1 = SubsetGetElementName ( sDim1, sSubset1, iCount );
          iCountInner = nSubSiz2;
          While ( iCountInner > 0 );
                    sElement2 = SubsetGetElementName ( sDim2, sSubset2, iCountInner );
                    If ( sElement2 @= sElement1 );
                            SubsetElementDelete ( sDim1, sSubset1, iCount );
                            iCountInner = 0;
                    Else;
                            iCountInner = iCountInner - 1;
                    EndIf;
           End;
           iCount = iCount -1;
End;

Other options are pushing out to a DB and doing a Where count = 1 there and reimporting (but that sounds over-engineered to me.)

The subix function you mentioned I believe is still just a "we would like" function as opposed to an actual one (stand to be corrected if I have missed something though) personally I understand people's desire for something like that but since you can have the same element a million times in a single subset I'm not convinced it will ever come to be and if you did have multiples of the same element; say if it always returned the first index - you would still have to do another check within a loop until it returned 0 to say they were all gone... as such I'm happy just sticking with loops (although I am sure many forumers will be disgusted that I have said that

)

Edit - usual caveats of above code being knocked up quickly and not put into a TI window so it could be full of typos - if you do use it make sure to check that the loops work as to not get caught in an eternal loop of doom.

lotsaram · Post by **lotsaram** » Tue Aug 18, 2015 11:09 am

The other tried and true alternative to looping on a subset in order to find an element is to use the subset as a data source to create a temporary dimension. This step is fast. You can then use DIMIX on the temp dimension to test if the element exists in the subset. This is more likely going to be faster but due to creating and destroying a temp dimension is more prone to causing lock contention on a busy server.

(note: if elements exist multiple times in the subset then dimix != subix)

Helmar · Post by **Helmar** » Tue Aug 18, 2015 12:21 pm

Thanks a lot to both of you for the quick responses,

first approach ('loops') is more or less the one I already had in my mind, when speaking of 'brute force'

Using temporary dimensions to compare the subsets is another smart idea, although the locking might be an issue.

I am just working on another solution, based on an auxiliary numeric attribute in the reference dimension (dim1), which is generated (and deleted when finished) in the TI.
Then simply processing the subset to be checked (in dim2), by writing a different attr value (e.g. 1) to each element when found in dim1.
Those elements in the subset, which are missed, will remain to show an attr value of '0', which can be put to another 'resulting' subset and finally output to a file.

It's generally working, at least for smaller subset sizes. Hope the ATTRPUTN will also work in a reasonable time even for bigger subset sizes...

Thanks again and best regards!
Helmar

TM1 Forum

Compare subsets of 2 dimensions

Compare subsets of 2 dimensions

Re: Compare subsets of 2 dimensions

Re: Compare subsets of 2 dimensions

Re: Compare subsets of 2 dimensions