Running a TI only if the data has changed
Posted: Tue Apr 02, 2013 7:01 am
Hi all. I am on v10.1...
Apologies for the long post. If you haven't got time to read it all, skip to the question right at the bottom! If you have got time, you might be able to point out whether I'm asking the wrong question - perhaps there's a design fault.
Setting the scene:
I have 2 cubes - an input cube and a reporting cube. The input cube has only 3 dims - version / geographic area / measures. The inputters are global. There is element-level security on all three dimensions - so inputters can only input against the "Current Month" version, only have write access to their own geographic area, and ownership of the elements within the measures dim is split between 2 user groups (although both groups need READ access to the other group's elements).
Some of the measures are strings (some free text, but mostly pick lists used to categorise data), whilst others are numeric.
The reporting cube has 9 dimensions. Some of the measures in the input cube (the string measures that have pick lists against them) are turned into dimensions in the reporting cube. For example, in the input cube a measure called Project Manager has a pick list containing a list of project managers; in the reporting cube, there is a dimension for project manager. The main reason for this is that there are certain measures (from the input cube) that end users wanted to be able to filter on in the BI reporting that sits on top of the reporting cube - for example, all data relating to one project manager.

I am a TM1 developer, not a BI report developer, but when the BI developer and I worked together to get the BI reporting off the TM1 reporting cube, it became apparent that it is difficult to filter string measures in BI (it expects a measure to be numeric). The only way around this was to dimensionalise any measure that the users wanted to filter on in the reports. This all works well, and the BI reporting from the TM1 reporting cube performs very well (when all values are static).
My problem - how do I get data from the input cube to the reporting cube?
Rule
Originally I used a rule to get the data from the input cube to the reporting cube - purely because "real time reporting" was a requirement. The dimension order of the cube was optimal and the rule-derived content was fed efficiently (in my opinion - perhaps it could have been even more efficient). However, the BI reporting didn't perform well, due to the calculation time of the cube. I reasoned that since the data wouldn't change that often, the users were suffering a performance hit for real-time data unnecessarily.
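For illustration, the rule was along these lines (a simplified sketch - the cube, dimension and element names here are made up, and I've collapsed the 9 dimensions of the reporting cube down to four: Version, GeogArea, ProjectManager and Measure):

# Rule in the Reporting cube: a cell belongs to a ProjectManager element
# only if that element matches the pick-list value held in the Input cube.
['Amount'] = N:
    IF(DB('Input', !Version, !GeogArea, 'Project Manager') @= !ProjectManager,
       DB('Input', !Version, !GeogArea, 'Amount'),
       CONTINUE);

# Feeder in the Input cube: the string cell itself locates the target
# ProjectManager element, so only the matching intersection is fed.
['Amount'] => DB('Reporting', !Version, !GeogArea,
                 DB('Input', !Version, !GeogArea, 'Project Manager'),
                 'Amount');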
TI - version 1
I changed from a rule to a TI to get the data from the input cube to the reporting cube. Because none of the cells in the reporting cube were rule-derived (all values were static), the BI reporting off the cube performed very well - much faster.
The first version of this was based on the following:
1. The end user inputs data into the input cube in TM1 Web
2. They click an action button that runs a TI (and then wait for it to complete)
3. The TI does the following (sketched below):
a) Deletes all numeric and string data in the reporting cube for the geographic area that the end user is inputting data for
b) Copies data from the input cube to the reporting cube for that geographic area
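In sketch form, version 1 looked something like this (again with made-up, simplified names; pGeogArea is the parameter passed in by the action button):

# Prolog: clear the numeric data for the user's area in one call by
# proportionally spreading zero from a top-level consolidated cell
# (this only touches numeric leaves - strings are handled separately).
CellPutProportionalSpread(0, 'Reporting',
    'Current Month', pGeogArea, 'All Project Managers', 'All Measures');

# Data tab: the data source is a view of the Input cube zoomed in to
# pGeogArea, with the view variables renamed vVersion, vGeogArea,
# vMeasure and vValue. The Project Manager coordinate is read from the
# pick-list cell - this is how the string measure becomes a dimension element.
sPM = CellGetS('Input', vVersion, vGeogArea, 'Project Manager');
IF(DTYPE('Measure', vMeasure) @= 'S');
    CellPutS(vValue, 'Reporting', vVersion, vGeogArea, sPM, vMeasure);
ELSE;
    CellPutN(StringToNumber(vValue), 'Reporting', vVersion, vGeogArea, sPM, vMeasure);
ENDIF;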
This didn't work for the following reasons:
1. It took too long - roughly 30 seconds (because the TI runs on demand when the user clicks the action button, anything more than 5-10 seconds is not really acceptable). CubeClearData couldn't be used, because it wipes the whole cube and the user would not have sufficient security to copy all of the input cube's data back into the reporting cube. CellPutProportionalSpread could be used to clear the numeric data, but the code to clear the string data has to use a WHILE loop (see the sketch after this list), which takes longer.
2. The user would never have the correct security access to clear out all the data in the measures dim of the reporting cube, due to the element security needed.
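The string-clearing code from reason 1 was along these lines - and the nested WHILE loops over the dimensions are exactly what made it slow (same made-up names as the earlier sketch):

# Clear the string measures for the user's area cell by cell.
i = 1;
WHILE(i <= DIMSIZ('ProjectManager'));
    sPM = DIMNM('ProjectManager', i);
    j = 1;
    WHILE(j <= DIMSIZ('Measure'));
        sMeasure = DIMNM('Measure', j);
        # only the string measures need clearing this way
        IF(DTYPE('Measure', sMeasure) @= 'S');
            CellPutS('', 'Reporting', 'Current Month', pGeogArea, sPM, sMeasure);
        ENDIF;
        j = j + 1;
    END;
    i = i + 1;
END;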
TI - version 2
The second version of this was based on the following:
1. The end user inputs data into the input cube in TM1 Web - they don't have to run a TI, so there's no waiting for it to complete.
2. A chore runs every 2 minutes to run a TI that does the following (sketched after this list):
a) CubeClearData
b) Copies ALL data from the input cube to the reporting cube
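The prolog therefore boils down to a single line - the data tab is the same copy logic as the version 1 sketch, just fed by a view covering every geographic area:

# Prolog of the version 2 TI: wipe the whole reporting cube in one call.
# Because this runs from a scheduled chore (i.e. with admin rights) rather
# than from the end user's action button, the security problems go away.
CubeClearData('Reporting');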
This runs very quickly, since CubeClearData is much faster than clearing the data out piecemeal the way the first version of the TI did - it takes about 10 seconds to complete. The end user only has to wait a maximum of 2 minutes to see the latest changed data in the reports, so it is practically "real time reporting".
However, 95% of the time the TI doesn't need to run, because no data has changed. The transaction logs quickly build up, which means the SaveDataAll chore that I have running every 30 minutes takes longer and longer.
So the question is:
How can I make it so that the TI only runs if the data has changed in the input cube?
I know that I cannot interrogate the current transaction log as it is in use, but I could interrogate the last transaction log after a SaveDataAll has completed. I'm not sure that would help, though, given the time lag it would introduce for "real time reporting".
Perhaps I need to go back to a rule and suffer the performance hit on the BI reporting side.
Thanks to anyone who has taken the time to read the above - any suggestions would be most welcome.
Kind regards.