Insert Data from TM1 into SQLServer incrementally
-
- Posts: 21
- Joined: Tue Sep 21, 2010 5:43 am
- OLAP Product: TM1
- Version: 9.5
- Excel Version: 2005
Insert Data from TM1 into SQLServer incrementally
Hi All,
I am interested in knowing the options for inserting data incrementally (newly updated daily data) from TM1 to SQL Server instead of truncating and reloading the whole cube into the DB every time. Please let me know your thoughts.
Thanks,
TM1Guy
-
- MVP
- Posts: 2836
- Joined: Tue Feb 16, 2010 2:39 pm
- OLAP Product: TM1, Palo
- Version: Beginning of time thru 10.2
- Excel Version: 2003-2007-2010-2013
- Location: Atlanta, GA
- Contact:
Re: Insert Data from TM1 into SQLServer incrementally
Your options are whatever you want them to be. Sorry to be so obtuse but there is no "interface" between TM1 and any relational database except the ability to pass SQL commands through an ODBC connection. So, basically you can do whatever you want in SQL, as long as you can create the proper SQL command to do it. Whether you issue a TRUNCATE command on your table before you load data from a cube into it is entirely up to you. TM1 doesn't have any skin in the game on that. However, you need to consider the ramifications of just adding new records to a table without making sure they are unique, or have some other key piece of information to allow you to query against them. You mentioned "newly updated data". If you are pulling that data from a cube, how are you going to know which data in the cube was "newly updated" and which data is old? TM1 isn't natively going to know that. You would have to have something in your measure dimension to store when the data was last updated and use that to limit the view to only those cells.
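For what it's worth, a minimal TI sketch of that pass-through approach might look like the one below. Everything in it is illustrative: the DSN ('MyWarehouse'), the credentials, the target table FACT_PL and the view variables vVersion, vAccount, vMonth and vValue are all assumed names, and the process is assumed to use a cube view as its data source.
Code: Select all
# --- Prolog tab ---
# Open the ODBC connection (hypothetical DSN and credentials).
sDSN = 'MyWarehouse';
ODBCOpen(sDSN, 'dbuser', 'dbpassword');
# Full-reload style, only if you really do want to clear the table first:
# ODBCOutput(sDSN, 'TRUNCATE TABLE FACT_PL');

# --- Data tab ---
# One INSERT per exported cell; doubled single quotes produce a literal quote.
sSQL = 'INSERT INTO FACT_PL (VERSION, ACCOUNT, PERIOD, AMOUNT) VALUES ('''
     | vVersion | ''', ''' | vAccount | ''', ''' | vMonth | ''', '
     | NumberToString(vValue) | ')';
ODBCOutput(sDSN, sSQL);

# --- Epilog tab ---
ODBCClose(sDSN);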
-
- MVP
- Posts: 733
- Joined: Wed May 14, 2008 11:06 pm
Re: Insert Data from TM1 into SQLServer incrementally
dwhcognos8 wrote:I am interested in knowing the options for inserting data incrementally (newly updated daily data) from TM1 to SQL Server
There is reasonably good documentation available on the functions to achieve this.
The point made by tomok is absolutely critical - you need some way of identifying the 'fresh' data in the cube. You could achieve this by implementing a 'batch' or 'transaction' dimension in order to track updates. However, in my experience, it seems this scenario is back-to-front in that typically you would query for fresh data in the source system and then incrementally load that into TM1; the point of using an OLAP tool, in fact, would be to remove the transactional nature of the data in order to facilitate its analysis...
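Just to illustrate the tracking idea (using tomok's 'last updated' measure variant rather than a full batch/transaction dimension): whatever process or input form writes into the cube could stamp a string measure at the same time, and the export view can then be filtered to intersections touched since the previous run. The cube name 'PandL', its dimension order and the 'Last Updated' measure are all invented for the example.
Code: Select all
# In the TI that loads data into the cube, stamp a timestamp next to the value.
CellPutN(nValue, 'PandL', vVersion, vAccount, vMonth, 'Amount');
CellPutS(TIMST(NOW, '\Y\m\d\h\i\s'), 'PandL', vVersion, vAccount, vMonth, 'Last Updated');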
I suppose, reasonably, you might do some calculations in a TM1 cube and then need to export for some Reporting Services/ Analysis Services reports... but then you weren't asking for advice on your application design, right?

Robin Mackenzie
-
- Posts: 21
- Joined: Tue Sep 21, 2010 5:43 am
- OLAP Product: TM1
- Version: 9.5
- Excel Version: 2005
Re: Insert Data from TM1 into SQLServer incrementally
Thanks, tomok and Mackenzie, for your valuable comments.
Right now we truncate the existing records daily and reload the whole cube's data from the TM1 cube into Oracle.
The reason for going to an incremental update of daily data from the TM1 cube into Oracle is that the data in the cube is huge, and we don't want to reload that entire volume when relatively few daily updates happen in the cube.
I am exploring options, maybe using the transaction log file to get the new data that was loaded into the existing cube. Correct me if I am wrong!
-
- MVP
- Posts: 2836
- Joined: Tue Feb 16, 2010 2:39 pm
- OLAP Product: TM1, Palo
- Version: Beginning of time thru 10.2
- Excel Version: 2003-2007-2010-2013
- Location: Atlanta, GA
- Contact:
Re: Insert Data from TM1 into SQLServer incrementally
Loading from the transaction log is certainly a possibility, but this can't be done with TM1. You will need to use some other tool like VB, VBA, or something else to parse the files and execute SQL commands against your relational store. Certainly doable, but it will be quite a task.
- qml
- MVP
- Posts: 1098
- Joined: Mon Feb 01, 2010 1:01 pm
- OLAP Product: TM1 / Planning Analytics
- Version: 2.0.9 and all previous
- Excel Version: 2007 - 2016
- Location: London, UK, Europe
Re: Insert Data from TM1 into SQLServer incrementally
tomok wrote:Loading from the transaction log is certainly a possibility but this can't be done with TM1.
Why not? Not that I think it's a great idea, but the TI scripting language is surely capable of scanning through structured text files (in this case timestamped transaction log files) and loading data through an ODBC connection based on specific criteria.
Again, I'm not advocating this as the best approach, but it definitely seems doable to me and I would at least consider it as one possible solution.
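To sketch what that could look like (with the usual caveats): assume a TI process whose data source is one timestamped tm1s*.log file read as a comma-delimited text file, with variables v1 through v14. The variable numbering, the staging table STG_PANDL and its columns are assumptions, and which of the two value fields is the old and which the new should be verified against your own logs.
Code: Select all
# --- Data tab ---
# v1 = change set id ('#...' on comment records), v5 = value type (N/S),
# v6/v7 = the before/after values (check the order against your own log),
# v8 = cube name, v9 onwards = element names.
# (ODBCOpen on 'MyWarehouse' is assumed to have been done in the Prolog.)

# Skip the #LOG_FORMAT / #LOGID / "Change set ... complete" records.
IF(SubSt(v1, 1, 1) @= '#');
  ItemSkip;
ENDIF;

# Only numeric changes for the cube of interest.
IF(v8 @<> 'PandLSum3' % v5 @<> 'N');
  ItemSkip;
ENDIF;

# One INSERT per changed cell (table and column names are hypothetical).
sSQL = 'INSERT INTO STG_PANDL (SCENARIO, ACCOUNT, PERIOD, AMOUNT) VALUES ('''
     | v10 | ''', ''' | v11 | ''', ''' | v12 | ''', ' | v6 | ')';
ODBCOutput('MyWarehouse', sSQL);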
Kamil Arendt
-
- MVP
- Posts: 600
- Joined: Wed Aug 17, 2011 1:19 pm
- OLAP Product: TM1
- Version: 9.5.2 10.1 10.2
- Excel Version: 2003 2007
- Location: York, UK
Re: Insert Data from TM1 into SQLServer incrementally
I have done this to implement differential load into a data warehouse from TM1. If there is interest I can write this up for the hints and tips section.
- qml
- MVP
- Posts: 1098
- Joined: Mon Feb 01, 2010 1:01 pm
- OLAP Product: TM1 / Planning Analytics
- Version: 2.0.9 and all previous
- Excel Version: 2007 - 2016
- Location: London, UK, Europe
Re: Insert Data from TM1 into SQLServer incrementally
Duncan, go for it.
I would also love to hear your comments on the performance and reliability of this solution and how it stacks up against Replication. ODBC connections are generally quite slow, which coupled with the need to scan potentially quite large transaction logs makes me a bit worried about how scalable this whole thing might be.
Kamil Arendt
-
- Posts: 21
- Joined: Tue Sep 21, 2010 5:43 am
- OLAP Product: TM1
- Version: 9.5
- Excel Version: 2005
Re: Insert Data from TM1 into SQLServer incrementally
Thanks for the replies.
Duncan, it would be great if you could help us with tips and techniques.
- jim wood
- Site Admin
- Posts: 3961
- Joined: Wed May 14, 2008 1:51 pm
- OLAP Product: TM1
- Version: PA 2.0.7
- Excel Version: Office 365
- Location: 37 East 18th Street New York
- Contact:
Re: Insert Data from TM1 into SQLServer incrementally
Indeed, Duncan, I would like to see how you did it. Write-back via TI is something I'm asked about on a regular basis. I've not been asked about incremental loads yet. That said, having a decent method to refer people to will help in offering it as a viable option.
Struggling through the quagmire of life to reach the other side of who knows where.
Go Build a PC
Jimbo PC Builds on YouTube
OS: Mac OS 11 PA Version: 2.0.7
-
- Site Admin
- Posts: 1458
- Joined: Wed May 28, 2008 9:09 am
Re: Insert Data from TM1 into SQLServer incrementally
I've been digging further into the idea of reading the transaction logs to generate an incremental load for an RDBMS.
Under 9.5.2 (at least) the format seems to have changed, e.g.:
Code: Select all
#LOG_FORMAT=1
#LOGID=1c
"1d","20120326165545","20120326165545","Admin","N","65636.","0.","PandLSum3","01","Budget","1010 Sales Written","Jan","Year02",""
<etc, etc>
"1d","20120326165545","20120326165545","Admin","N","65646.","0.","PandLSum3","01","Budget","1010 Sales Written","Feb","Year02",""
#"1d","20120326165545","Change set 1d complete : 12"
Have others worked with this? It's still manageable but far less clean than it used to be. Insights welcome.
NB: I didn't have an active sandbox when I spread this data.
-
- MVP
- Posts: 600
- Joined: Wed Aug 17, 2011 1:19 pm
- OLAP Product: TM1
- Version: 9.5.2 10.1 10.2
- Excel Version: 2003 2007
- Location: York, UK
Re: Insert Data from TM1 into SQLServer incrementally
The format changed in 9.5 and although it is less clean, if you ignore all lines beginning with # and if you ignore the first field you can do exactly as you did before.
If you are on 9.4 or before and are planning to upgrade at any time in the future it is probably worth writing a single generic pair of TIs, one to iterate over the files with WildCardFileSearch, and one that rips through a log file using metadata held in dimension attributes to guide how to build a SQL statement from the dimension members and the values on each valid row. This is more complex to write initially but it means that you only have to replace one TI when you upgrade.
I am writing up an example but I have a day job too so it is not going as quickly as I would like.
Briefly, the process is this:
1. Identifying the log files to use and iterating over them. This uses the recorded timestamp from the last log file processed in the previous run (a rough sketch of this step follows the list below).
2. Identifying and interpreting the relevant records of the log file to pinpoint the cells that have changed in the cube of interest.
3. Grouping changes to multiple measures with the same key so that a single SQL statement can represent them all.
4. Mapping the dimension members of the changed cells to the, possibly multiple, key values in the corresponding dimensions of the warehouse.
5. Creating and executing a SQL statement for each row to be inserted in the fact table.
6. Storing information about the last log file that has been processed so that the next run of the process can take up where this one left off.
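A rough sketch of the wrapper for step 1 (and step 6) under the usual caveats: the log directory, the 'Export Control' cube used to remember the last processed file, and the worker process 'Export.ParseLogFile' are all invented names, and real code would also need to cope with file ordering and logs rolled over mid-run.
Code: Select all
# --- Prolog of a hypothetical wrapper process ---
# 'tm1s2*.log' only matches the closed, timestamped logs, not the active tm1s.log.
sLogDir   = '\\tm1server\TM1Data\LoggingDirectory\';
sLastFile = CellGetS('Export Control', 'LastLogFile', 'Value');

sFile = WildcardFileSearch(sLogDir | 'tm1s2*.log', sLastFile);
WHILE(sFile @<> '');
  # Hand each newer log file to the worker that parses it and builds the SQL.
  ExecuteProcess('Export.ParseLogFile', 'pFileName', sLogDir | sFile);
  sLastFile = sFile;
  sFile = WildcardFileSearch(sLogDir | 'tm1s2*.log', sLastFile);
END;

# Step 6: remember how far this run got, ready for the next run.
IF(sLastFile @<> '');
  CellPutS(sLastFile, 'Export Control', 'LastLogFile', 'Value');
ENDIF;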
If the data warehouse is managed then you will be writing to a staging table. Typically a data warehouse staging table for a fact import has multiple columns of data and the insert statement needs values for them all. However each record of the log file only records the change to one measure. So there needs to be a way of grouping them, and also of reading the values for adjacent measures that might not have changed but that still need to be included in the staging table row.
Also there may be multiple key columns in the staging table for each dimension. It is even possible for a key column to be shared between dimensions, in which case it would only appear once in the staging table, creating an interdependence between the dimensions that you need to take account of in the export.
As far as performance goes, if ODBC is slow then writing out to a text file and bulk loading it is just as good an option. The complexity is not in the transport but in the generation of the data for the rows to be exported.
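For the flat-file route, the Data-tab line in a sketch like the one earlier in the thread would simply swap ODBCOutput for something along these lines (file name and layout invented), with bcp/BULK INSERT on SQL Server or SQL*Loader on Oracle picking the file up afterwards:
Code: Select all
# Write the row to a delimited staging file instead of pushing SQL over ODBC.
ASCIIOutput('..\staging\pandl_delta.csv', vVersion, vAccount, vMonth, NumberToString(vValue));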
As I said I will put more detail in a post in hints and tips - hopefully soon.
HTH,
Duncan.
-
- MVP
- Posts: 3706
- Joined: Fri Mar 13, 2009 11:14 am
- OLAP Product: TableManager1
- Version: PA 2.0.x
- Excel Version: Office 365
- Location: Switzerland
Re: Insert Data from TM1 into SQLServer incrementally
David Usherwood wrote:I've been digging further into the idea of reading the transaction logs to generate an incremental load for an RDBMS. Under 9.5.2 (at least) the format seems to have changed... It's still manageable but far less clean than it used to be. Insights welcome.
I think the only difference in format that matters, for the purpose of interoperability with a TI that read logs from previous versions, is the addition of the first field, which I think holds the transaction batch ID. It seems to be blank much of the time, but for the purposes of reading the individual cell-by-cell transactions the ID field can be ignored anyway, as can any other field to the right of the last dimension in the cell address.
-
- Posts: 21
- Joined: Tue Sep 21, 2010 5:43 am
- OLAP Product: TM1
- Version: 9.5
- Excel Version: 2005
Re: Insert Data from TM1 into SQLServer incrementally
Thanks, guys, for your valuable suggestions!
