
DimensionElementDelete performance

Posted: Thu Feb 23, 2017 1:37 pm
by vladkon
Hi,
I am trying to improve performance on dimension element deletion. I have a fairly large cube (around 60 GB), with one dimension of about 600k elements. I need to delete all the children of one particular consolidation, about 250k elements. I have a simple process running on the "All" subset and checking on the metadata tab whether an element is a child of the consolidation in question, and if so, performing DimensionElementDelete. The problem is that it takes hours - about an hour per 1000 elements. I tried using DimensionElementDeleteDirect to no significant effect. CubeSetLogChanges is set to zero in the prolog. Are there any other ways to increase the deletion speed under these circumstances? Any ideas greatly appreciated.

Re: DimensionElementDelete performance

Posted: Thu Feb 23, 2017 2:40 pm
by gtonkin
I would create a static subset of the elements to be deleted - use the subset editor or MDX to build it, then save it without the expression. Not sure if you can do this in your setup, but that would be my first port of call.
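In TI, a minimal sketch of that approach (dimension, subset and parent names here are placeholders) could be:

Code: Select all

sDim = 'YourDimension';
sSub = 'Elements to delete';
IF(SubsetExists(sDim, sSub) = 1);
  SubsetDestroy(sDim, sSub);
ENDIF;
# Build the subset from an MDX expression...
SubsetCreateByMDX(sSub, '{[' | sDim | '].[Your Parent].Children}');
# ...then clear the expression so the subset is saved as static
SubsetMDXSet(sDim, sSub, '');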

Re: DimensionElementDelete performance

Posted: Thu Feb 23, 2017 3:37 pm
by vladkon
Created a static subset, but it seems to be even a bit slower. Has anyone tried editing the dimension on a separate server and then copying it back to the original server?

Re: DimensionElementDelete performance

Posted: Thu Feb 23, 2017 8:44 pm
by lotsaram
vladkon wrote:... The problem is that it takes hours to do it - about an hour for 1000 elements. I tried using DimensionElementDeleteDirect to no significant effect. ...
I think you need to post some code here. If you are doing this right, there is no way it should take that long - a minute or two maybe, but not more. Neither the cube nor the dimension is particularly huge.

The elements being deleted - are they C elements? If they are type N, you would be losing data.
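If in doubt, a guard on the element type (variable names here are placeholders) would make the process delete only consolidated elements:

Code: Select all

# DTYPE returns 'C' for consolidated elements, 'N' for numeric leaves
IF(DTYPE(sDim, sElement) @= 'C');
  DimensionElementDelete(sDim, sElement);
ENDIF;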

Re: DimensionElementDelete performance

Posted: Fri Feb 24, 2017 7:46 am
by pandinus
My suggestion is to keep this as simple as possible: use the following code in the prolog of your process, instead of using a subset as the data source and looping through it on the metadata tab.

I estimate this will not take longer than 5 seconds to run. Of course you could perform additional checks on when to delete.

Code: Select all

vDim = 'Dimension name';
vParent = 'Parent to delete from';

# Loop from last to first so deletions do not invalidate the index
i = DIMSIZ(vDim);
WHILE(i > 0);
  vElement = DIMNM(vDim, i);
  # ELISANC also catches descendants further down the hierarchy
  IF(ELISANC(vDim, vParent, vElement) = 1);
    DimensionElementDelete(vDim, vElement);
  ENDIF;
  i = i - 1;
END;


Re: DimensionElementDelete performance

Posted: Sun Feb 26, 2017 7:42 am
by vladkon
That is what I am using:

Code: Select all

#Prolog

DimName = 'ZZZZZ';
CubeName = 'YYYYY';

CubeSetLogChanges(CubeName, 0);

MDX = '{TM1FILTERBYLEVEL({TM1DRILLDOWNMEMBER({[' | DimName | '].[' | Parent_Element | ']},ALL,RECURSIVE)}, 0)}';

SubName = GetProcessName() | '_Source';

IF(SubsetExists(DimName, SubName) = 1);
	SubsetDestroy(DimName, SubName);
	SubsetCreate(DimName, SubName);
ELSE;
	SubsetCreate(DimName, SubName);
ENDIF;

SubsetMDXSet( DimName, SubName, MDX); 

DatasourceDimensionSubset= SubName;


#Metadata

IF(ELISPAR(DimName, Parent_Element, Fin_Customers) = 1);

	DimensionElementDeleteDirect('Fin_Customers', Fin_Customers);

ENDIF;

#Epilog

CubeSetLogChanges(CubeName, 1);

Tried pandinus' suggestion - it has already been running for an hour with no end in sight.

Could it be that the same dimension being used in several other small cubes degrades performance?

Re: DimensionElementDelete performance

Posted: Sun Feb 26, 2017 2:22 pm
by Wim Gielis
Hello,

I agree with pandinus that the code should be kept as simple as possible.
That said, I am sure it will not be done in seconds, nor in a couple of minutes as lotsaram suggests.
Based on past experience, it can take quite some time, sad but true.

I assume this is an exercise that should be done only once?
Can you describe the cube that is about 60 GB in size?
Do you delete data on N elements, C elements, or a mix?
Do you perhaps have MDX-based subsets on the same dimension?
What about rules and feeders - or is it primarily base-level data that is loaded?
Other big dimensions in the same cube?
Does it help if you zero out the data first, then remove the elements?
What if you replicate the cube and its dimensions to a separate server and perform the same operations there? Just to compare / isolate things. Also with and without rules, if applicable.
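For the zero-out test, a sketch along these lines (cube, view and element names here are placeholders) could be used before deleting the elements:

Code: Select all

sCube = 'YourCube';
sDim = 'YourDimension';
sView = 'tmp_ZeroOut';
IF(ViewExists(sCube, sView) = 1);
  ViewDestroy(sCube, sView);
ENDIF;
IF(SubsetExists(sDim, sView) = 1);
  SubsetDestroy(sDim, sView);
ENDIF;
ViewCreate(sCube, sView);
SubsetCreate(sDim, sView);
# Restrict the view to the consolidation whose children will be removed
SubsetElementInsert(sDim, sView, 'Your Parent', 1);
ViewSubsetAssign(sCube, sView, sDim, sView);
ViewZeroOut(sCube, sView);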

Re: DimensionElementDelete performance

Posted: Sun Feb 26, 2017 2:51 pm
by Wim Gielis
Also, in the code you posted above, you do not loop over a static subset.
To me, this is all the code you need:

Code: Select all

DimName = 'ZZZZZ';
SubName = GetProcessName() | '_Source';
Parent_Element = '';

IF(SubsetExists(DimName, SubName) = 1);
  SubsetDestroy(DimName, SubName);
ENDIF;
SubsetCreateByMDX(SubName, '{TM1FilterByLevel( {[' | DimName | '].[' | Parent_Element | '].Children}, 0)}');
# Clearing the MDX expression makes the subset static
SubsetMDXSet(DimName, SubName, '');

i = SubsetGetSize(DimName, SubName);
WHILE(i > 0);
  DimensionElementDelete(DimName, SubsetGetElementName(DimName, SubName, i));
  i = i - 1;
END;

Re: DimensionElementDelete performance

Posted: Sun Feb 26, 2017 3:13 pm
by vladkon
1. Correct, this needs to be done only once
2. Cube has sales data on customer and product dimensions. 20 dimensions in total. The purpose is to bring the cube to a more manageable size.
3. Deleted elements are all n-level
4. Yes, there are MDX subsets on the same dimension
5. There are rules and feeders on the cube, mostly for currency translations
6. Next largest dimension is Product dimension - ~ 17K elements.
7. Zeroout does not help (at least not in a noticeable way)
8. Doing it right now. Removed the rules, does not seem to be helping.
On another note, I modified pandinus' code to add an AsciiOutput inside the loop, to monitor what is going on. It runs about 2.5 hours over all 250K elements and exits the loop (the .csv contains all the relevant elements), but then continues to run (already 2 hours and counting). Probably saving the changes?

Re: DimensionElementDelete performance

Posted: Sun Feb 26, 2017 3:15 pm
by gtonkin
There are at least three things here that still concern me about this requirement:
1) Number of elements - 250k plus. What I have noticed is that as the number of elements increases, so does the time to deal with them, and the relationship seems to be geometric/exponential rather than linear. You can see the same effect if you try to page down in the subset editor in a small versus a very large subset/dimension.
2) The MDX in your recent post creates a dynamic MDX subset, not a static one.
3) If the MDX is effectively getting the children of the parent in question, why would you need to do an ELISPAR/ELISANC check again? You already have the N-level elements to delete.

So the code I used (and just saw Wim's post along the exact same lines) is as follows:

Code: Select all

MDX = '{TM1FILTERBYLEVEL({TM1DRILLDOWNMEMBER({[' | DimName | '].[' | Parent_Element | ']},ALL,RECURSIVE)}, 0)}';

SubName = GetProcessName() | '_Source';
IF(SubsetExists(DimName, SubName) = 1);
   SubsetDestroy(DimName, SubName);
ENDIF;

#--Create subset via MDX then make it Static
SubsetCreateByMDX(SubName, MDX);
SubsetMDXSet( DimName, SubName,'');

#--Looping from last to first - not needed here as we only have N levels, but habit
iCount = SubsetGetSize(DimName, SubName);
WHILE(iCount > 0);
  sElement = SubsetGetElementName(DimName, SubName, iCount);
  DimensionElementDelete(DimName, sElement);
  iCount = iCount - 1;
END;
I then created a process to add elements based on a set of tranches and then delete them again to test what I referred to in point 1 above. I just tweaked the original MDX to limit the number of elements to delete i.e.
MDX = 'TOPCOUNT({TM1FILTERBYLEVEL({TM1DRILLDOWNMEMBER({[' | DimName | '].[' | Parent_Element | ']},ALL,RECURSIVE)}, 0)},1000)';

The results are per this attachment
LargeDimensionDelete.PNG
My conclusion is that the larger the dimension, the more patience you will need (and probably a weekend / overtime to run these cleanups).
Not sure that it helps your plight, but at least it explains things.

Re: DimensionElementDelete performance

Posted: Sun Feb 26, 2017 3:16 pm
by vladkon
Wim Gielis wrote:Also, in the code you posted above, you do not loop over a static subset.
I used:
SubsetMDXSet( DimName, SubName, MDX);
AFAIK it does create a static subset. Am I wrong?

Re: DimensionElementDelete performance

Posted: Sun Feb 26, 2017 3:24 pm
by gtonkin
vladkon wrote:... AFAIK it does create a static subset. Am I wrong?
It does not, AFAIK, which is why Wim and I probably used the same approach: create the subset from MDX, then make it static by setting an empty expression.

Re: DimensionElementDelete performance

Posted: Sun Feb 26, 2017 3:30 pm
by vladkon
Thanks guys - I reread the reference guide; my mistake, I should not pass the MDX expression to SubsetMDXSet when making the subset static. Anyway, I'll leave pandinus' code running overnight and report back tomorrow. Thanks all for your input.

Re: DimensionElementDelete performance

Posted: Sun Feb 26, 2017 3:56 pm
by Wim Gielis
My guess is that the MDX-based subsets are also part of the durations you are seeing.
In such a large cube, why do you use rules (and feeders) to do currency conversions?
I agree with George about the non-linear relationship among the number of elements to delete, the total number of elements in the dimension, and the total processing time.

Re: DimensionElementDelete performance

Posted: Mon Feb 27, 2017 10:40 am
by vladkon
So, reporting back the results. I deleted all the history to make the cube smaller, leaving only the current year - around 8 GB. Even after that, deleting 250k (actually closer to 300k) elements took 11.5 hours. I will try to remove all MDX subsets and rerun the deletion to check whether it improves performance.

Regarding Wim's question - the cube structure is far from ideal, and one of the planned improvements is to get rid of the feeders. That is planned for the next stage. I was thinking of going the consolidation route, making local currency a child of the reporting currencies.