Read a TM1 Cube through ODBO

normanbobo
Posts: 50
Joined: Tue Jun 15, 2010 3:14 pm
OLAP Product: TM1, PowerPlay, MSAS
Version: TM1 9.4.x 9.5.x 10.1
Excel Version: 2003 2007 2010

Read a TM1 Cube through ODBO

Post by normanbobo »

All,

We have two TM1 remote servers running on the same admin server. We would like to read the data from a cube in remote server A and store it into a cube on Remote Server B.

At first, we considered replication. There are several issues with that: 1) the replication architecture is less than 100% stable and reliable, according to the scuttlebutt on this Forum and others, and 2) we really don't want to clutter up remote server B with copies of objects from remote server A. Some of the source cubes are extremely large (multiple GB) and we don't want to use up that much memory either. All we want to do is read the data (actually just a few slices) and store it in a cube on remote server B. (BTW, the cubes share common elements, and where they differ, we have mappings we will use to translate.)

We've been playing around with TI ODBO. We've been able to build a TI process on remote server B which connects to a cube on remote server A and reads the data, but we're having issues with it.

As some of you may know, the settings on the first three tabs of the TI ODBO process (Connection, Load ODBO Cube and Cube Dimension) drive the contents of the fourth tab (MDX Query) and also create generated code in the Prolog, Metadata, Data and Epilog tabs of the TI process. Some settings also cause the dimensions and cubes from the source remote server (A) to be re-created in the target remote server (B), which we want to avoid. What we envision is reading a slice of a source cube, creating some logic in the Data tab to perform element mappings, and then posting the data to the target cube. This means we need the MDX query to return one row of data per combination of leaf-level elements.
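To make the intent concrete, here is a minimal sketch of the kind of Data tab logic we have in mind, simplified to two dimensions; the variable names, the 'TargetAccount' mapping attribute and the cube name are all invented for illustration:

Code:

# Data tab (sketch only). vSrcAccount, vPeriod and vValue stand in for variables
# coming from the MDX data source; mapping via an attribute is an assumption.
vTgtAccount = ATTRS( 'QuestAccount', vSrcAccount, 'TargetAccount' );
IF( vTgtAccount @<> '' );
    CellPutN( vValue, 'TargetCube', vTgtAccount, vPeriod );
ENDIF;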

From what I have been able to learn by playing around with the settings (the documentation leaves much to be desired), the only way to avoid getting a lot of generated code, and also avoid re-creating objects in the target server, is to use the "No Action" setting on the Load ODBO Cube tab and "Data Load Only" on the Cube Dimensions tab.

The default MDX statement initially generated by the process is as follows:

Code:

WITH 
 SET [QuestScenario Set] AS 
        '{ [QuestScenario].DEFAULTMEMBER }' 
 SET [QuestAccount Set] AS 
        '{ [QuestAccount].DEFAULTMEMBER }' 
 SET [QuestActivityCenter Set] AS 
        '{ [QuestActivityCenter].DEFAULTMEMBER }' 
 SET [QuestEmployeeOpCenter Set] AS 
        '{ [QuestEmployeeOpCenter].DEFAULTMEMBER }' 
 SET [QuestPeriod Set] AS 
        '{ [QuestPeriod].DEFAULTMEMBER }' 
 SET [QuestProgram Set] AS 
        '{ [QuestProgram].DEFAULTMEMBER }' 
 SET [QuestProjectOpCenter Set] AS 
        '{ [QuestProjectOpCenter].DEFAULTMEMBER }' 
 SET [QuestProjectType Set] AS 
        '{ [QuestProjectType].DEFAULTMEMBER }' 
 SET [QueSTView Set] AS 
        '{ [QueSTView].DEFAULTMEMBER }' 
 SET [QuestSubmissionVersion Set] AS 
        '{ [QuestSubmissionVersion].DEFAULTMEMBER }' 
 SELECT NON EMPTY { 
  [QuestScenario Set]  *  [QuestAccount Set]  *  [QuestActivityCenter Set]  *  [QuestEmployeeOpCenter Set]  *  [QuestPeriod Set]  *  [QuestProgram Set]  *  [QuestProjectOpCenter Set]  *  [QuestProjectType Set]  *  [QueSTView Set] 
 *  [QuestSubmissionVersion Set] 
 } ON COLUMNS FROM [QueSTAnalysis]
We added some statements to the Data tab to count the number of records processed, which was only 1. This is what we expected, since the default code used "DEFAULTMEMBER" on each set. To read records at the leaf levels (essentially the cells of the source cube which contain values), we modified the MDX as follows:

Code:

WITH 
 SET [QuestScenario Set] AS 
        '{ FILTER( { [QuestScenario].MEMBERS }, ISLEAF( [QuestScenario].CURRENTMEMBER ) ) }' 
 SET [QuestAccount Set] AS 
        '{ FILTER( { [QuestAccount].MEMBERS }, ISLEAF( [QuestAccount].CURRENTMEMBER ) ) }' 
 SET [QuestActivityCenter Set] AS 
        '{ FILTER( { [QuestActivityCenter].MEMBERS }, ISLEAF( [QuestActivityCenter].CURRENTMEMBER ) ) }' 
 SET [QuestEmployeeOpCenter Set] AS 
        '{ FILTER( { [QuestEmployeeOpCenter].MEMBERS }, ISLEAF( [QuestEmployeeOpCenter].CURRENTMEMBER ) ) }' 
 SET [QuestPeriod Set] AS 
        '{ FILTER( { [QuestPeriod].MEMBERS }, ISLEAF( [QuestPeriod].CURRENTMEMBER ) ) }' 
 SET [QuestProgram Set] AS 
        '{ FILTER( { [QuestProgram].MEMBERS }, ISLEAF( [QuestProgram].CURRENTMEMBER ) ) }' 
 SET [QuestProjectOpCenter Set] AS 
        '{ FILTER( { [QuestProjectOpCenter].MEMBERS }, ISLEAF( [QuestProjectOpCenter].CURRENTMEMBER ) ) }' 
 SET [QuestProjectType Set] AS 
        '{ FILTER( { [QuestProjectType].MEMBERS }, ISLEAF( [QuestProjectType].CURRENTMEMBER ) ) }' 
 SET [QueSTView Set] AS 
        '{ FILTER( { [QueSTView].MEMBERS }, ISLEAF( [QueSTView].CURRENTMEMBER ) ) }' 
 SET [QuestSubmissionVersion Set] AS  
        '{ [QuestSubmissionVersion].[FY10 Provisional Submission Pass 3 Official] }' 
 SELECT NON EMPTY {  [QuestScenario Set]  *  [QuestAccount Set]  *  [QuestActivityCenter Set]  *  [QuestEmployeeOpCenter Set]  *
                                  [QuestPeriod Set]  *  [QuestProgram Set]  *  [QuestProjectOpCenter Set]  *  [QuestProjectType Set]  *  
                                  [QueSTView Set]  *  [QuestSubmissionVersion Set] 
 } ON COLUMNS FROM [QueSTAnalysis]
However, with this code, we still get only one record in the output.
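The record counting mentioned above is nothing more than an accumulator spread across the tabs; a minimal sketch, with the variable and file names being our own:

Code:

# Prolog
nRecCount = 0;

# Data - executed once per row returned by the MDX query
nRecCount = nRecCount + 1;

# Epilog
ASCIIOutput( 'RecordCount.txt', NumberToString( nRecCount ) );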

All of this to ask the following questions ...

1) Are we using the right settings on the TM1 ODBO tabs?

2) How can we get one record per cell value in the MDX query? We need each cell value from the source so that we can process the values through the mappings and then post the values to the target cube.

3) We have set local variables in the Prolog tab to control the data source settings for ODBC sources. ODBO sources have similar local variables with "ODBO" in their names. However, I cannot find a variable which allows you to set the MDX query string. Does an ODBO TI process use the same DataSourceQuery variable for MDX queries that is used in ODBC TI processes?

4) Anyone have any alternate techniques to achieve the same overall results?
lotsaram
MVP
Posts: 3667
Joined: Fri Mar 13, 2009 11:14 am
OLAP Product: TableManager1
Version: PA 2.0.x
Excel Version: Office 365
Location: Switzerland

Re: Read a TM1 Cube through ODBO

Post by lotsaram »

The oldest trick in the book on this one is to place an IF test that will always evaluate to False around the wizard generated code then write your own code below.

Code:

IF( 1 = 0 );
#Generated Statement Start
....rubbish wizard generated code here that you want to skip
#Generated Statement End
EndIF;
On using ODBO: it's not 100% reliable, and for large queries it also isn't that fast. Although it seems clunky, triggering a text file dump from one server to another, then picking up the results and loading them in from a flat file, is much faster and more robust, and can be just as automated. It's just the extra moving parts that make it seem clunky.
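For what it's worth, the export side of such a dump can be a one-liner in the Data tab of a view-driven process; a minimal sketch, with all object, variable and file names invented:

Code:

# Data tab of an export process on server A, using a view of the source cube
# as the data source. vScenario, vAccount, vPeriod, vValue are the view variables.
ASCIIOutput( '\\shared\exports\EXPT0001.csv', vScenario, vAccount, vPeriod, NumberToString( vValue ) );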
normanbobo
Posts: 50
Joined: Tue Jun 15, 2010 3:14 pm
OLAP Product: TM1, PowerPlay, MSAS
Version: TM1 9.4.x 9.5.x 10.1
Excel Version: 2003 2007 2010

Re: Read a TM1 Cube through ODBO

Post by normanbobo »

lotsaram wrote:The oldest trick in the book on this one is to place an IF test that will always evaluate to False around the wizard generated code then write your own code below.

Code:

IF( 1 = 0 );
#Generated Statement Start
....rubbish wizard generated code here that you want to skip
#Generated Statement End
EndIF;
I found myself laughing about this. Very clever. And very useful. I will have to try this. You get a 8-) for that one (whether original to you or not).
lotsaram wrote:On using ODBO: it's not 100% reliable, and for large queries it also isn't that fast. Although it seems clunky, triggering a text file dump from one server to another, then picking up the results and loading them in from a flat file, is much faster and more robust, and can be just as automated. It's just the extra moving parts that make it seem clunky.
OK ... replication not that reliable/fast. ODBO not that reliable/fast. Dumping to flat files is clunky and has extra moving parts. Running out of options .... :(

Was there anything wrong with the MDX query? Should it not have worked?

About your suggestion ... how does one automate moving data from one server to another through flat file dumps/reads?
Martin Ryan
Site Admin
Posts: 1988
Joined: Sat May 10, 2008 9:08 am
OLAP Product: TM1
Version: 10.1
Excel Version: 2010
Location: Wellington, New Zealand

Re: Read a TM1 Cube through ODBO

Post by Martin Ryan »

normanbobo wrote:About your suggestion ... how does one automate moving data from one server to another through flat file dumps/reads
The export is straightforward enough; just create a chore that dumps out the file. You might want to make use of consolidations/attributes to summarise the data a little at this stage.

For the import, you want a chore that runs every so often and checks whether there is an extract file; if there is, it loads it.

This is along the lines of the prolog in a process I have that does this. Obviously it will require a bit of modification for your situation.

Code:

vPath = '\\path\to\export\directory\';
vArchivePath = '\\path\to\archive\directory\';
vmove = '\\path\to\some\script\that\moves\files';
vFileStyle = 'EXPT????.csv';

vPathname = vPath | vFileStyle;
vPrior = '';

# Load each extract file found, then archive it so it is not picked up again
vFile = WildcardFileSearch( vPathname, vPrior );
WHILE( vFile @<> '' );
    ExecuteProcess( 'ImportData', 'Fpath', vPath, 'Fname', vFile );
    ExecuteCommand( vmove | ' ' | vPath | vFile | ' ' | vArchivePath | vFile, 1 );
    # Search again now that the processed file has been moved out of the directory
    vFile = WildcardFileSearch( vPathname, vPrior );
END;
Please do not send technical questions via private message or email. Post them in the forum where you'll probably get a faster reply, and everyone can benefit from the answers.
lotsaram
MVP
Posts: 3667
Joined: Fri Mar 13, 2009 11:14 am
OLAP Product: TableManager1
Version: PA 2.0.x
Excel Version: Office 365
Location: Switzerland

Re: Read a TM1 Cube through ODBO

Post by lotsaram »

Rather than a chore running at very regular intervals scanning a directory for a file, my preference would be to use a batch file (which TI could generate on the fly if you really wanted) combined with a command-line exe to call a process. That way the export on one server and the import on the other can be co-ordinated via a single action.
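One shape this can take, as a sketch only: the Epilog of the export process on server A shells out to a command-line runner so the import on server B fires as soon as the extract is written. Every path and name below is a placeholder, and RunProcess.exe stands in for whatever runner utility you have:

Code:

# Epilog of the export process on server A (sketch only; RunProcess.exe is a
# placeholder for your command-line process runner, not a shipped utility)
vCmd = '\\tools\RunProcess.exe ServerB ImportData \\shared\exports\EXPT0001.csv';
ExecuteCommand( vCmd, 0 );   # 0 = don't wait for the command to finish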

I don't take credit for the IF statement to skip the wizard code. It's an old trick and a necessary evil as some data source options (ODBO and Package Connector come to mind) don't allow for an option of skipping variable mapping and wizard code.
normanbobo
Posts: 50
Joined: Tue Jun 15, 2010 3:14 pm
OLAP Product: TM1, PowerPlay, MSAS
Version: TM1 9.4.x 9.5.x 10.1
Excel Version: 2003 2007 2010

Re: Read a TM1 Cube through ODBO

Post by normanbobo »

lotsaram wrote:Rather than a chore running at very regular intervals scanning a directory for a file, my preference would be to use a batch file (which TI could generate on the fly if you really wanted) combined with a command-line exe to call a process. That way the export on one server and the import on the other can be co-ordinated via a single action.
We will soon be reaching the point where we need to start scheduling our loads. The TM1 scheduler, though functional, does not have enough features (and it probably shouldn't -- leave it up to the enterprise schedulers!). I have been doing some research on how to call a TI process from the command line (or, more precisely, from a scheduler). As you know, there is no "out-of-the-box" solution for doing this (truly frustrating and surprising). What have you found to be the easiest, most reliable way to execute a TI process and/or chore? A pointer to another forum link, if you know of a good one, is all we need.
lotsaram wrote:I don't take credit for the IF statement to skip the wizard code. It's an old trick and a necessary evil as some data source options (ODBO and Package Connector come to mind) don't allow for an option of skipping variable mapping and wizard code.
You may not have developed it, but you get credit for being on the forum and "paying it forward"!

I have been asked to continue researching /trying the ODBO options. There is another issue involved...two different teams are developing the two applications. We really do not want to modify the source application in any way if possible.

So, my question to everyone ... what's up with the MDX? Why are we getting only one "row" (cell) of data read through the process?
Mike Cowie
Site Admin
Posts: 482
Joined: Sun May 11, 2008 7:07 pm
OLAP Product: IBM TM1/PA, SSAS, and more
Version: Anything thru 11.x
Excel Version: 2003 - Office 365
Location: Alabama, USA

Re: Read a TM1 Cube through ODBO

Post by Mike Cowie »

normanbobo wrote:We will soon be reaching the point where we need to start scheduling our loads. The TM1 scheduler, though functional, does not have enough features (and it probably shouldn't -- leave it up to the enterprise schedulers!). I have been doing some research on how to call a TI process from the command line (or, more precisely, from a scheduler). As you know, there is no "out-of-the-box" solution for doing this (truly frustrating and surprising). What have you found to be the easiest, most reliable way to execute a TI process and/or chore? A pointer to another forum link, if you know of a good one, is all we need.
Hi:

There are a few options mentioned within the Useful Code/Tips forum (http://www.tm1forum.com/viewforum.php?f=21):
http://www.tm1forum.com/viewtopic.php?f=21&t=3486
http://www.tm1forum.com/viewtopic.php?f=21&t=548 (a bit older)

Either of these posts has some EXEs that could potentially work with enterprise schedulers. Feel free to PM me if you have specific questions about them - as posted they are free, as-is and not officially supported, but we've had interest in, and are willing to provide, an officially supported version (not for free).

Regards,
Mike
Mike Cowie
QueBIT Consulting, LLC

Are you lost without Print Reports in Planning Analytics for Excel (PAfE)? Get it back today, for free, with Print Reports for IBM Planning Analytics for Excel!
Marcus Scherer
Community Contributor
Posts: 126
Joined: Sun Jun 29, 2008 9:33 am
OLAP Product: TM1
Version: 10.2.2
Excel Version: 2016
Location: Karlsruhe

Re: Read a TM1 Cube through ODBO

Post by Marcus Scherer »

I don't know what the TI wizard is doing; I never use it. You may alternatively use ODBO "MDX Query" as the source type instead of ODBO Cube, then write a freehand MDX query in the "MDX Query" tab, similar to the following example for a 3D cube:

Code:

SELECT 
{[model].[total].CHILDREN} * {[region].[world].MEMBERS} ON ROWS,
{[account1].MEMBERS} ON COLUMNS
FROM [CUBE]

You will get more than one row, plus one column for each measure (here account1). This layout may be useful for further actions in the advanced tabs.

But you need to write very simple statements; the whole Microsoft MDX language is not available when you are querying TM1 objects. Also, TM1 functions like TM1FILTERBYLEVEL are ignored in this TI interface - maybe experiment yourself.

I wouldn't use MDX in a scenario like yours. For example, there would be no control over rule-calculated cells on the source side.

regards,
Andy Key
MVP
Posts: 351
Joined: Wed May 14, 2008 1:37 pm
OLAP Product: TM1
Version: 2.5 to PA
Excel Version: Lots
Location: Sydney

Re: Read a TM1 Cube through ODBO

Post by Andy Key »

normanbobo wrote:As you know, there is no "out-of-the-box" solution for doing this
And further to Mike's reply, TM1RunTI.exe is now included with 9.5.2 HF1.
Andy Key
Mike Cowie
Site Admin
Posts: 482
Joined: Sun May 11, 2008 7:07 pm
OLAP Product: IBM TM1/PA, SSAS, and more
Version: Anything thru 11.x
Excel Version: 2003 - Office 365
Location: Alabama, USA

Re: Read a TM1 Cube through ODBO

Post by Mike Cowie »

Andy Key wrote:
normanbobo wrote:As you know, there is no "out-of-the-box" solution for doing this
And further to Mike's reply, TM1RunTI.exe is now included with 9.5.2 HF1.
Andy:

Looks interesting - spot any documentation for it (or notes on 9.5.2 HF1) yet? Not all the command options are that obvious from the command-line help that it spits back...

In case anyone's interested, this EXE may work in prior TM1 versions - I copied it into my 9.5.1 BIN folder and it seems to run just fine.

Some initial observations: on a simple process with no params it seems to succeed (based on the server message log), but note that it doesn't report back (in the command prompt) that it succeeded either... hopefully there's a return code that can be picked up by other apps when executing this. I also tried to run an existing process (with params), but I skipped/passed in invalid parameters - unfortunately, I got no error messages in the command prompt and the process did not execute. Again, I sure hope there's a return code indicating this failure. If you get something like a server or process name wrong, it does appear to return some error messages in the command prompt, FWIW.
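The invocation I've been testing is along these lines; the option names are as best I can tell from the command-line help it spits back, and the server, process, credentials and parameter are all placeholders:

Code:

tm1runti -process ImportData -server ServerB -adminhost localhost -user admin -pwd apple pFile=EXPT0001.csv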

That's enough playing for one day... hopefully there's some documentation forthcoming.

Regards,
Mike
Mike Cowie
QueBIT Consulting, LLC

Are you lost without Print Reports in Planning Analytics for Excel (PAfE)? Get it back today, for free, with Print Reports for IBM Planning Analytics for Excel!
ParisHilton
Posts: 73
Joined: Fri Apr 23, 2010 11:35 am
OLAP Product: Tm1
Version: 9.5
Excel Version: 2007 2010

Re: Read a TM1 Cube through ODBO

Post by ParisHilton »

If it helps, I've got a simple .net console app that can start a TM1 process.
There's a funky way of kicking it into action from a remote server too.

If the existing apps can't do what you want, let me know and I'll find some way of getting it to you.

P xx

ps. I'm a huge fan of dumping text files and reading them in again.
“The way I see it, you should live everyday like its your birthday”


TM1 9.5 Cognos BI 8.4 Excel 2007. 128GB Ram. E7450@2.40Ghz -24 core. Fluffy white dog.
Andy Key
MVP
Posts: 351
Joined: Wed May 14, 2008 1:37 pm
OLAP Product: TM1
Version: 2.5 to PA
Excel Version: Lots
Location: Sydney

Re: Read a TM1 Cube through ODBO

Post by Andy Key »

Mike Cowie wrote: Looks interesting - spot any documentation for it (or notes on 9.5.2 HF1) yet? Not all the command options are that obvious from the command-line help that it spits back...
Documentation...? Er, no, just the command line help.

I originally saw a passing reference to TM1RunTI being included in HF1 in a technote that came up whilst I was searching for something else a couple of weeks ago. Of course, the IBM Web Labyrinth being what it is, I can't find that reference again, and HF1 wasn't on the FTP site at the time anyway.
Andy Key
normanbobo
Posts: 50
Joined: Tue Jun 15, 2010 3:14 pm
OLAP Product: TM1, PowerPlay, MSAS
Version: TM1 9.4.x 9.5.x 10.1
Excel Version: 2003 2007 2010

Re: Read a TM1 Cube through ODBO

Post by normanbobo »

All ... thanks for your continued help. Returning to the original topic ... we are eager to make the ODBO approach work, but are having trouble with the MDX statements.

After much trial, we have been able to get the following query (and queries similar to it) to work:

Code:


SELECT 
               {[QuestScenario].[Actual-to-date plus Forecast]} 
               ON COLUMNS, 
               CROSSJOIN
                   ( CROSSJOIN 
                         ( CROSSJOIN 
                              (  FILTER( { [QuestAccount].MEMBERS}, ISLEAF ([QuestAccount].CURRENTMEMBER)) 
                               , FILTER( { [QuestActivityCenter].MEMBERS}, ISLEAF ([QuestActivityCenter].CURRENTMEMBER)) 
                              )
                          , FILTER( { [QuestEmployeeOpCenter].MEMBERS}, ISLEAF ([QuestEmployeeOpCenter].CURRENTMEMBER))
                         )
                     ,  { [QuestPeriod].[FY 2010].CHILDREN }
                    )
               ON ROWS 
FROM     [QuestAnalysis] 
WHERE [QueSTView].[View All Org/SSIn/SSOut]

Here are some comments about this query:

1) What we need from the query: one row of data for each cell of data in the source cube. Each row needs to contain the dimensional elements for that cell (allowing us to map the dimensional elements to the target cube).
2) The MDX above returns multiple columns of dimensional elements (ON ROWS) along with a final column containing a measure element (ON COLUMNS).
3) It appears from our testing that TM1 requires the MDX query to have at least one set ON COLUMNS. We selected one of our non-measure dimensions in which we wanted only one element (though many sources we read indicated that ON COLUMNS was actually optional). Under normal circumstances, one might use the measure dimension ON COLUMNS. However, the source cube's measure dimension requires some mapping logic to match the measures to the target cube, so we really needed those dimensional elements to be returned like the other dimensions.
4) We are using the CROSSJOIN because we want every combination of dimensional elements.
5) The alternate crossjoin syntax of using an * between the sets did not seem to work with TM1. We tried many different queries and never got it to work. (That does not mean it won't; there may have been other issues with our queries.)
6) Another alternate syntax is to list all of the sets in a single CROSSJOIN statement. However, TM1 would never allow us to add more than two sets to the CROSSJOIN statement. That is when we discovered you could nest CROSSJOINs, which got us to three levels of nesting.
7) For each set ON ROWS, we used the ISLEAF function to limit the results to one row per leaf. If we had not done this, we would have had rows for consolidated elements as well, and we wanted rows for only leaf elements. The ISLEAF function seems to work correctly, in that we get one row per leaf, but with a bit of a twist. At first we thought this would return just one column per dimension (the leaf elements listed in a single column). Instead, TM1 returns one column per level in the source dimension: a 4-level dimension returns 4 columns, and a 5-level dimension returns 5 columns. Across the three dimensions ON ROWS, we are getting 11 columns of dimensional elements and one column with the amount (the one element from the dimension ON COLUMNS).
8) Our source cube has ragged dimensions -- the leaf elements are at different levels. (This is a pet peeve of mine ... I think cube designers should normalize hierarchies so that all leaf elements are at the same level, to simplify queries and reports ... but I stray from the main topic.) This means that our leaf elements are in different columns. If a leaf element is at level 4 in a 5-level dimension, it appears in the 4th column and the 5th column is empty. We can handle this in the TI logic by simply checking for blanks in the last column and then choosing to process the element in the 4th column instead (see the sketch below). But it sure would be nice to have all of the leaf elements in the same column.
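A minimal sketch of that blank-column check in the Data tab; vLevel4, vLevel5 and vLeaf are our own illustrative names, standing in for the two deepest level columns of one dimension:

Code:

# Take the deepest non-blank column as the leaf element for this dimension
IF( TRIM( vLevel5 ) @<> '' );
    vLeaf = vLevel5;
ELSE;
    vLeaf = vLevel4;
ENDIF;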

As an FYI, we did notice the queries taking longer and longer to execute as we increased the number of CROSSJOINs. This was not surprising: the source cube is fairly substantial (a couple of GB) and we are retrieving a large number of cells. The Preview button returns results in a few minutes on the three-level queries.

We were greatly encouraged by our progress to that point. We needed to add just one more level of nesting to the CROSSJOIN to get what we needed. Alas! We could not get it to work. The following query executes (or seems to):

Code:


SELECT 
               {[QuestScenario].[Actual-to-date plus Forecast]} 
               ON COLUMNS, 
               CROSSJOIN
                   ( CROSSJOIN
                         ( CROSSJOIN 
                               ( CROSSJOIN 
                                    (  FILTER( { [QuestAccount].MEMBERS}, ISLEAF ([QuestAccount].CURRENTMEMBER)) 
                                     , FILTER( { [QuestActivityCenter].MEMBERS}, ISLEAF ([QuestActivityCenter].CURRENTMEMBER)) 
                                    )
                                    , FILTER( { [QuestEmployeeOpCenter].MEMBERS}, ISLEAF ([QuestEmployeeOpCenter].CURRENTMEMBER))
                               )
                              , FILTER( { [QuestProjectType].MEMBERS}, ISLEAF ([QuestProjectType].CURRENTMEMBER)) 
                         )
                         ,  { [QuestPeriod].[FY 2010].CHILDREN }
                   )
               ON ROWS
FROM     [QuestAnalysis] 
WHERE [QueSTView].[View All Org/SSIn/SSOut]

Notice that the only difference is the number of CROSSJOINs. Not all of the dimensions appear in the results; only three of them do. The missing dimension varied: sometimes it was one of the inner nested columns, other times it was the last nested column. We tried changing the order of the nested sets; there did not appear to be a pattern. Other times the Preview returned no results at all. We are slowly reaching the conclusion that TM1 is limiting us to only three levels of CROSSJOINs. But this just doesn't make sense.

Any suggestions? Ideas? Better ways to form an MDX query to get the results we need?
Marcus Scherer
Community Contributor
Posts: 126
Joined: Sun Jun 29, 2008 9:33 am
OLAP Product: TM1
Version: 10.2.2
Excel Version: 2016
Location: Karlsruhe

Re: Read a TM1 Cube through ODBO

Post by Marcus Scherer »

normanbobo,
you're really tough on this approach...
4) We are using the CROSSJOIN because we want every combination of dimensional elements.
I wouldn't use the CROSSJOIN at all. Look at your first post: if even the wizard isn't using it, I wouldn't use it. See the alternative in my earlier post. I repeat that I would only use very simple statements that TM1-MDX is able to interpret.
7) For each set ON ROWS, we used the ISLEAF function to limit the results to one row per leaf. If we had not done this, we would have had rows for consolidated elements as well, and we wanted rows for only leaf elements. The ISLEAF function seems to work correctly, in that we get one row per leaf, but with a bit of a twist. At first we thought this would return just one column per dimension (the leaf elements listed in a single column). Instead, TM1 returns one column per level in the source dimension: a 4-level dimension returns 4 columns, and a 5-level dimension returns 5 columns. Across the three dimensions ON ROWS, we are getting 11 columns of dimensional elements and one column with the amount (the one element from the dimension ON COLUMNS).
8) Our source cube has ragged dimensions -- the leaf elements are at different levels. (This is a pet peeve of mine ... I think cube designers should normalize hierarchies so that all leaf elements are at the same level, to simplify queries and reports ... but I stray from the main topic.) This means that our leaf elements are in different columns. If a leaf element is at level 4 in a 5-level dimension, it appears in the 4th column and the 5th column is empty. We can handle this in the TI logic by simply checking for blanks in the last column and then choosing to process the element in the 4th column instead. But it sure would be nice to have all of the leaf elements in the same column.
If ragged hierarchies are coming into play now, remember where MDX comes from. In e.g. SSAS, all leaf elements are at the same level (seen from above); you can only hide intermediate levels for end-user purposes and thus present a "ragged" hierarchy view. It's different in TM1 with its bottom-up level view approach.
Thus, from an MS-MDX viewpoint, nothing is wrong with the leaf elements of one dimension appearing in different columns.

happy Easter to all forumers!
normanbobo
Posts: 50
Joined: Tue Jun 15, 2010 3:14 pm
OLAP Product: TM1, PowerPlay, MSAS
Version: TM1 9.4.x 9.5.x 10.1
Excel Version: 2003 2007 2010

Re: Read a TM1 Cube through ODBO

Post by normanbobo »

All....

We had to take a bit of a sabbatical from this issue to focus on something else...but we're back .... with a vengeance!! We are going to tame this beast!

Lessons learned since the last post:

1) You can use TM1-specific MDX functions (e.g., the functions available when you create dynamic subsets) in the MDX you pass through ODBO. TM1 receives and correctly interprets the functions. THIS IS HUGE!
2) You can use named subsets in the MDX. This, too, is HUGE! It adds much flexibility.
3) NON EMPTY screws everything up. It seems that TM1 breaks the query into two parts: the first part grabs all of the dimensional elements; the second part returns the cell values. The two parts are then "stitched" together to form the final results. NON EMPTY seems to affect the cell part of the query but not the dimensional part. If you have zeros in one or more cells, you will get more dimensional elements than values, and the values will be matched with the wrong dimensional elements.

We are obtaining the correct results from the following query:

Code:


SELECT  {[QuestScenario].[Actual-to-date plus Forecast]}    
         ON COLUMNS,    
        (
              {TM1FILTERBYLEVEL( { HIERARCHIZE( {TM1SUBSETALL( [QuestProjectType] )} ) }, 0)}
               )
               ON ROWS     
FROM     [QuestAnalysis]     
WHERE ([QueSTView].[View All Org/SSIn],[QueSTSubmissionVersion].[FY10 Provisional Submission Pass 3 Official],[QuestAccount].[FTEs], [QuestPeriod].[FY 2010])

The TM1FILTERBYLEVEL gives us the same result as using ISLEAF. We are not sure if the HIERARCHIZE is needed; it's just a habit from building dynamic subsets. We are now pushing forward to get a query that returns all dimensional combinations. In a correction to a prior post: we have learned that the '*' crossjoin notation does work; we had used incorrect syntax when attempting it before. We will let you know if we can get the final query working...
normanbobo
Posts: 50
Joined: Tue Jun 15, 2010 3:14 pm
OLAP Product: TM1, PowerPlay, MSAS
Version: TM1 9.4.x 9.5.x 10.1
Excel Version: 2003 2007 2010

Re: Read a TM1 Cube through ODBO

Post by normanbobo »

All...

I am finally reporting back on this issue... we have it working, and I thought I would share a couple more lessons learned... to pay it forward, as the saying goes.

Here is the final, operational MDX query we are using:

Code:

SELECT {[QuestScenario].[Actual-to-date plus Forecast]}         
         ON COLUMNS,         
              FILTER(         
              {TM1FILTERBYLEVEL( { TM1SUBSETALL( [QuestEmployeeOpCenter]  )}, 0)} *         
              {TM1FILTERBYLEVEL( { TM1SUBSETALL( [QuestProjectOpCenter]   )}, 0)} *         
              {TM1FILTERBYLEVEL( { TM1SUBSETALL( [QuestProjectType]       )}, 0)} *         
              {TM1FILTERBYLEVEL( { TM1SUBSETALL( [QuestActivityCenter]    )}, 0)} *         
              {TM1FILTERBYLEVEL( {TM1DRILLDOWNMEMBER( {[QuestAccount].[GRID Accounts]}, ALL, RECURSIVE  )}, 0)}                     
              ,[QuestScenario].[Actual-to-date plus Forecast] <> 0) *         
              {[QuestPeriod].[FY ?pstrFiscalYear?].CHILDREN}         
         ON ROWS         
FROM  [QuestAnalysis]         
WHERE ([QueSTView].[View All Org/SSIn],[QueSTSubmissionVersion].[?pstrSubmission?])        
The syntax {[QuestAccount].[GRID Accounts]} leverages a subset in the source cube.

You will notice that we are using parameters in the MDX. They are supported here just as they are in ODBC sources.

Things to be aware of should you try this on your own:

1) Unfortunately, we were not able to get a query that returned only the leaf-level elements. It may be possible, but if it is, it is beyond our knowledge and capability. The query above returns all of the levels for each hierarchy, resulting in x columns of dimensional elements for each dimension. Something you must be careful about is that the query may return a different number of columns based on its results. For instance, if your filters are such that the query returns leaf elements at the third level down (starting from the top), then you will get three columns back for that dimension. But if your filters are modified and the query starts returning leaf elements at the fourth level down, you will get back four columns for that dimension. This changes the number and ordering of the variables in the TI process, which can obviously play havoc with your code because you are expecting certain data in certain columns. This issue occurs only if your query can return elements at different levels, which is the case in our situation because the source cube's dimensions have leaf elements at different levels (they are "ragged" hierarchies). If your query is known to return elements at only certain levels, this should not be an issue for you.

2) As discussed in a prior post, we are not using NONEMPTY; that particular MDX clause causes the query to break. Instead, we are filtering out zero values. At first, we applied the non-zero filter around the entire crossjoin. The result was like applying suppress zeros in Analysis Studio -- all of the combinations were built first and then the filter was applied. The number of combinations in the source cube we are reading was astronomical, causing the query to run forever (hours). Based on the knowledge we have about our data (sparsity and density), we applied the non-zero filter to a selected set of crossjoined dimensions. This significantly cut down on sparse combinations. We then crossjoined that result with the remaining dimension, Fiscal Weeks. The result was a query that ran in just a few minutes. We are still doing some fine-tuning to the query: it is still returning some zero records, and we think we can get it to run even faster with a little more tweaking.

The process is reading and loading well over 100K records from the MDX. We have many "overhead" tasks that run in the process (zeroing out portions of the target cube, validating the input values, mapping the input dimensional values to the target dimensional values, adjusting target dimensions as needed so that no records are dropped, logging "trace" records to output files, as well as reading and storing control totals), which take up most of the run time. Our overall run time is less than 30 minutes. I estimate the actual load portion to be 5 minutes or less.

Though it took a lot of determination, we are now reading a portion of a source cube directly from another TM1 application and posting the data directly to a target TM1 cube. It works and everything balances. Unless something else unexpected pops up, we will be taking this implementation to production.

We're interested in everyone's feedback on this...
lotsaram
MVP
Posts: 3667
Joined: Fri Mar 13, 2009 11:14 am
OLAP Product: TableManager1
Version: PA 2.0.x
Excel Version: Office 365
Location: Switzerland

Re: Read a TM1 Cube through ODBO

Post by lotsaram »

Thanks for posting back. I'm glad you got it to work as the MDX interface is a vastly underutilized part of TM1.

It is interesting that you can use TM1DRILLDOWNMEMBER on a subset as opposed to a consolidation without any ill effect, but maybe this is not so surprising, as in many ways TM1 does seem to just treat subsets as hidden consolidations.

Shame that SELECT NON EMPTY did not work, although your workaround seems like a good one.

Key issue would seem to be the risk of changes in dimension structure (in particular a change to the number of levels in a dimension) impacting the number of columns in the output, and therefore data being out of sync with variables in the receiving process. It seems you would have a lot more control using flat files or an intermediate RDBMS to keep the 2 TM1 servers in sync.

I would be interested in the development time and effort and the rationale behind using ODBO vs flat file dumps. (Are you doing it because it would otherwise be a waste of time and effort or because it is a better solution in this case?)
normanbobo
Posts: 50
Joined: Tue Jun 15, 2010 3:14 pm
OLAP Product: TM1, PowerPlay, MSAS
Version: TM1 9.4.x 9.5.x 10.1
Excel Version: 2003 2007 2010

Re: Read a TM1 Cube through ODBO

Post by normanbobo »

lotsaram wrote: Thanks for posting back. I'm glad you got it to work as the MDX interface is a vastly underutilized part of TM1.
It is vastly underutilized, as evidenced by the lack of information on the internet and from IBM support. We stumped the IBM support staff immediately; they started submitting information requests to development right off the bat. We still have an ongoing call with some open questions from this effort.
lotsaram wrote: It is interesting that you can use TM1DRILLDOWNMEMBER on a subset as opposed to a consolidation without any ill effect, but maybe this is not so surprising, as in many ways TM1 does seem to just treat subsets as hidden consolidations.
In our situation, the query had to return leaf elements only. There may be other uses / scenarios for this technique in which summary data would be used, but we needed the details to do our mappings between the two systems. Technically, the elements in the subset are already leaf elements, but using TM1DRILLDOWNMEMBER was a guarantee against mistakes, since the subset we are using is built manually in the source system.
lotsaram wrote: Shame that SELECT NON EMPTY did not work, although your workaround seems like a good one.
This was one of the most frustrating parts of the experience for us. It took us a couple of days to figure out that NONEMPTY was not applied to the dimensions but was applied to the fact cells. Because of this, the numbers in the query were being assigned to the wrong dimensional elements. Of course, we later found out that NONEMPTY is not a supported MDX function in TM1 (there is a list of the supported functions in the TM1 Reference Guide), so we can only blame ourselves on this one.
lotsaram wrote: Key issue would seem to be the risk of changes in dimension structure (in particular a change to the number of levels in a dimension) impacting the number of columns in the output, and therefore data being out of sync with variables in the receiving process. It seems you would have a lot more control using flat files or an intermediate RDBMS to keep the 2 TM1 servers in sync.
This is definitely a big concern and was discovered late in the process as we were testing. In our case, the source system's dimensions are well documented and stable, and when we run our query in non-test mode, we will be pulling all of the elements in all but a few of the dimensions. Thus we are secure in knowing that the columns will not vary as we run the process in production mode.

Also, this is really not an issue if you have simple hierarchies. This particular source system uses ragged hierarchies, a design technique which I strongly discourage in all OLAP technologies (and Data Warehouses for that matter). We have actively avoided ragged hierarchies in the TM1 system we are building and hope to one day re-visit the source TM1 system to remove ragged hierarchies.

Also, if the MDX query returns only consolidated elements at a certain level, this is not an issue either.

Our situation, I believe, was unusual. I just don't think this issue will occur frequently as the ODBO technique is re-used in other applications / companies.
lotsaram wrote: I would be interested in the development time and effort and the rationale behind using ODBO vs flat file dumps. (Are you doing it because it would otherwise be a waste of time and effort or because it is a better solution in this case?)
Certainly, there was a huge learning curve as we worked through this process, which dwarfed the effort it would have taken to just implement the flat file technique. Maybe it is this learning curve that has kept others from using this technique. However, now that we know how to do it and know the "gotchas", I think this process will be just as quick as doing flat file dumps, and it has other advantages, which I will discuss below.

It does require at least basic MDX knowledge, but with the sample provided, I think most developers could write another query. And many developers are somewhat familiar with MDX from creating subsets.

There were several motivations for us driving to make this technique work:

1) Our biggest motivation was to be as independent of the source system as possible, leaving the source system "untouched". The other source system is not owned by our group. It is managed and maintained by others. Coordinating those changes would have been difficult, particularly since that group is "thin" on staffing and extremely busy. Their system is also fairly complicated. Adding more objects / processes to it would have just added to the complexity. Also, that source system needs some work and will likely receive a significant re-write in the next year. By keeping the code in our system, we avoided potential harm to our code during that process. In the end, all we needed was an ID in the system and one private subset on one dimension (which we actually could have avoided as well, but chose to leave).

2) Using flat files creates multiple objects spread across at least two systems. You have processes in the source and processes in the target. You also must maintain (and secure) a place to hold the flat files. The applications are currently on the same Windows server (which is virtualized, BTW). However, that will not be true going forward: because of processing loads, we are going to have to split the applications across Windows servers (but still have them on the same admin server). This means that one or the other server would have to have a drive letter mapping or UNC mapping. It is these mappings which we have discovered to be fragile over time. System admins move drives, change network security, or make other changes which unknowingly break these mappings. Flat files are just a fragile process, especially in a multi-server environment.

3) We could have dumped flat files and written the data to a database table (skipping the step of mapping the flat file directories across servers). But that was just more code objects and processes to create, secure and maintain. And we would have then had to coordinate with another group (the DBA's), who are also thin on staffing and involved in a huge database upgrade.

4) One issue we are trying to address is how each TM1 application loads the same actuals data (and other measures) across many applications. Many of them have specialized data marts to prep their data for the TM1 loads. The application we have built loads all of the actuals data needed by all of the other applications, and we load more history. Long term, the other TM1 applications in our environment will be able to use this TM1-to-TM1 ODBO technique to get the actuals data straight from our application very quickly and easily. This will significantly reduce the load on the databases, plus ensure consistency across the TM1 applications. BTW, one of the things we will be working towards is the synchronization of the design of dimensions across TM1 applications to make these connections even easier (not requiring mappings like we had to do). Yes, the other applications could use flat files to get the data, but now that we know how to do this, we believe this technique is more "stable" and reliable than a flat file process.

To summarize, it is the "directness" of the process which is most appealing. You want the data? Just get an ID on the source system and go get it! Simple. No coordination is needed with network admins or DBAs. You also leave no footprint in the source system (other than a user ID). You just get permission from the other group, get an ID, write your query and GO! Of course, you will also want to set up a formal process where the other team notifies you of any significant changes to the source system, but you would have that with any technique. Is this process better than flat files in all cases? Of course not. But it is a very useful "tool in the toolbag".

There was another, unexpected, side benefit of this work ... we learned how to access TM1 cubes through ODBO connections in Excel and get the data into pivot tables quickly and easily. This has been helpful as a development / validation tool and will continue to help us going forward.

This effort has also opened up the possibility of third-party applications reading data captured in TM1 (e.g., budgets/plans) straight into relational databases for use in other applications (e.g., data warehouse, ERP, etc.)
lotsaram
MVP
Posts: 3667
Joined: Fri Mar 13, 2009 11:14 am
OLAP Product: TableManager1
Version: PA 2.0.x
Excel Version: Office 365
Location: Switzerland

Re: Read a TM1 Cube through ODBO

Post by lotsaram »

Thanks for the detailed response. It all makes sense except for your point about ragged hierarchies, which I would have to disagree with quite strongly. Support for ragged hierarchies and multiple hierarchies is one of TM1's strengths (and Palo and InforPM and other similar technologies, for that matter) as it means the database can reflect real-world structures and is much more intuitive for users to navigate. Deep single-child hierarchies are best avoided. Balanced hierarchies might be better for MDX, but the focus should always be on end-user usability.
normanbobo
Posts: 50
Joined: Tue Jun 15, 2010 3:14 pm
OLAP Product: TM1, PowerPlay, MSAS
Version: TM1 9.4.x 9.5.x 10.1
Excel Version: 2003 2007 2010

Re: Read a TM1 Cube through ODBO

Post by normanbobo »

Lotsaram,

I knew the comment regarding ragged hierarchies would be controversial, but it was not the main point of the post, so I did not go into details. But since you bring it back up .....

You are absolutely correct that usability is important. However, there are many use cases that must be considered.

The use case that most people first think of is when a user drills on a single element only to get back another single element with a similar name and the same data values. This can be extremely frustrating and confusing, especially if the user must drill multiple times to finally get multiple children to appear. The worst-case scenario is when the user drills multiple times and never gets anything more than a single element all the way to the bottom of the hierarchy. One method to resolve this issue is to design the hierarchy so that when a user drills, they always get more than one element or they go straight to the leaf elements -- in other words, to implement a ragged hierarchy.

One frequently occurring use of ragged hierarchies is for "orphan" elements. These are dimensional elements which appear in the fact data loads (when you are populating cells in the cube) but which do not already exist in the dimensional data (they are in the transactions, but not in the reference tables used to build the dimensions -- which is an all-too-frequent reality). So that transactions are not dropped in the loads, "orphan" dimensional elements are frequently added as children to a high-level "orphan owning" element in the dimension until they can be moved to the correct part of the dimensional tree at a later time. These orphan structures frequently do not have the same number of levels as the rest of the dimension and thus cause ragged hierarchies. But there are other places where ragged hierarchies can occur as well (e.g., organization structures and parts explosions where the number of levels varies).

Ragged hierarchies can be used to improve the interface in the use case above, but they can work against you in other use cases.

For instance, in Analysis Studio (which we are using for this application until we can move to Cognos 10 Business Insight), our users are frequently taking advantage of the Analysis Studio feature called "down a level". This feature drills all of the elements in the display simultaneously. It allows the user to take a broader view of the data at a lower level. If we were using ragged hierarchies, there is no telling what the results would be when this feature is used. My guess is that the interface would simply ignore the data which did not have more levels and thus would report less than the full dimension back to the user -- which could lead to decisions being made on less than the full set of data -- a perilous issue. What's interesting is that some of our users had been using Analysis Studio before and simply expected this to work even though they were using a different underlying data source in this application. This is because they have not had to deal with ragged hierarchies in the other applications accessed through Analysis Studio.

Another use case is building reports. We are using Report Studio for reporting in this application. Many of our reports call for all of the children at a selected level of a dimension. If the dimensions were ragged, we would have to code exceptions into the report to ensure that all of the data was included across the ragged hierarchy.

Yet another use case is Perspectives in Excel. Issues similar to those mentioned for Analysis Studio and Report Studio can occur in Excel. Of course, you can avoid the issues by taking the time to build subsets to resolve the fact that leaf data (and other elements which would normally be "at the same level") are actually at different levels in the hierarchy. Note, though, that many third-party applications (even Cognos BI) do not understand or use subsets.

Another use case is data extraction. The reality of our current world is that TM1 cubes will more and more be the source of data for other applications. As we encountered in this TM1-to-TM1 ODBO example, ragged hierarchies can wreak havoc on your data extraction logic. Many tools simply do not "understand" ragged hierarchies. And in this particular scenario, we were not able to get TM1 to return a result where only one column was returned for each dimension where that column contained only the leaf level elements.

It is my opinion that the issues encountered in these other use cases "trump" the drilling use case, and that ragged hierarchies thus should not be used.

In the end, I think the issue of avoiding the drill down to a single child element should be resolved in the interface rather than in the cube design. The developers of PowerPlay took this view. They implemented "suppression" in PowerPlay way back in the early 90's. This allowed the cube designer to mark elements (they are called categories in Transformer and PowerPlay) to be "suppressed". When the user drilled to a suppressed element, that element was simply skipped and the next level children under the suppressed element were displayed. This implementation is the best of both worlds. It solved the interface issue, but also kept the cube design clean -- not ragged. This made it much simpler for the loading the data and other functions (such as building alternate hierarchies based on certain levels of categories). What is nice is that the desire for suppression was stored in the metadata of the cube, meaning that other third party interfaces could take advantage of it as well. A sophisticated interface would allow a user to turn on / off this suppression mode, but if I remember correctly, not even PowerPlay Series 7 or PowerPlay Studio do this.

But, alas, we are not the ones developing the software and we are therefore dependent on what the software vendors give us. In some tools and with particular use cases, ragged hierarchies are the only way to solve the issue. But I avoid ragged hierarchies like the plague whenever possible.