
Load time - ODBC vs text file

Posted: Sat May 07, 2022 11:49 pm
by JohnO
Just wondering if there is any known performance hit when loading from an ODBC source vs text. Having just implemented HANA, we are now going through the process of moving from relying on text file loads to querying BW on HANA directly from TM1 (yes, we are moving out of the dark ages). I started off with a simple read with a count accumulation in the data tab and an export of the total count in the epilog. That was fast - no issues.

However, having redeveloped the existing load process (no fundamental changes) to use the ODBC data source, the load time is about double what it was from text. The number of rows is approximately 15 million and it is taking 1 hr 10 mins to load. I have looked at the HANA connection property options but I don't see anything obvious which might be causing an issue, and as I say, when I do a simple read and count accumulation it is very fast - just a few minutes. I know most people don't deal with these volumes, so maybe I am experiencing something most wouldn't notice.

As a general rule should we expect ODBC sourced loads to be so much slower? If so, any suggestion as to why?

When I performed the original data read test I also tested a parallel read using RunProcess. I found no material difference in the total read time. I don't know whether that is because TM1 cannot perform concurrent ODBC-sourced loads or whether it's the HANA driver. I have not tried TM1RunTI.

Should I expect to be able to perform parallel ODBC loads with performance gains? If so what about with HANA?

Re: Load time - ODBC vs text file

Posted: Sun May 08, 2022 8:40 am
by declanr
If you ignore TM1 for a minute: querying data in a DB will always take some time - if you take the query that you are using for this source and just execute it in a GUI, I expect it will be far from instant in showing you all of the results.
Add to this the potential network latency between where your DB is housed and the TM1 server, and you can see an impact.
You can do things to make it quicker: if the query is "complex", you could create it in the DB as a materialized view that is pre-calculated, so that no calculations are needed when you run the TI process.
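The materialized-view idea above can be sketched in Python with SQLite standing in for HANA (the table and column names here are invented for illustration; HANA has real calculation/materialized views, whereas `CREATE TABLE AS` is the closest SQLite analogue). The point is that the aggregation happens once, DB-side, so the TI data source query becomes a plain scan:

```python
import sqlite3

# In-memory SQLite as a stand-in for the HANA source; names are hypothetical.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE sales (region TEXT, account TEXT, amount REAL)")
con.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("EU", "4000", 100.0), ("EU", "4000", 50.0), ("US", "4000", 75.0)],
)

# Pre-calculate the "complex" part once, in the DB.
con.execute(
    "CREATE TABLE sales_mv AS "
    "SELECT region, account, SUM(amount) AS amount "
    "FROM sales GROUP BY region, account"
)

# The TI process's source query is now a simple scan - no joins or GROUP BY
# are evaluated while TM1 is pulling rows.
rows = con.execute(
    "SELECT region, account, amount FROM sales_mv ORDER BY region"
).fetchall()
print(rows)  # [('EU', '4000', 150.0), ('US', '4000', 75.0)]
```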

Without knowing what the process is doing, it is hard to say whether it is taking "too long" or not, BUT 15 million records is not really a lot. And if we assume it was taking 30-35 minutes to load that from a text file, I would be inclined to say you could probably also make some savings on the TI side itself.

Re: Load time - ODBC vs text file

Posted: Sun May 08, 2022 10:13 pm
by JohnO
It's a pretty simple load TI. There is some necessary conditional logic, and we write each value to two places in the same cube (in a few cases more than two). Other than that it's not complex, so no obvious gains to be had.

Re: Load time - ODBC vs text file

Posted: Mon May 09, 2022 7:33 am
by lotsaram
JohnO wrote: Sat May 07, 2022 11:49 pm As a general rule should we expect ODBC sourced loads to be so much slower? If so, any suggestion as to why?
Yes, almost always. It is stating the obvious, but when the data source is ODBC rather than a flat file you have two additional potential bottlenecks:
- the database itself
- the ODBC driver

In the database layer: what's going on in the DB at the time the data is queried (users, concurrency, etc.)? The complexity of the query (number and size of tables, joins, pivoting, etc.)? Optimization/indexing of the tables? Query vs materialized view? Security to be applied?

In the ODBC driver: memory buffer, batch size, time taken for authentication, etc.
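The batch-size knob mentioned here can be illustrated with Python's DB-API (sqlite3 as a stand-in; a real ODBC driver exposes the equivalent as a connection property, and pyodbc as `cursor.arraysize`). Fetching rows in large batches amortizes the per-call round trip to the driver, which matters most over a network:

```python
import sqlite3

# Stand-in source table; with a remote ODBC source each fetch call would
# involve driver and network overhead, so batch size matters far more.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (id INTEGER)")
con.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(10_000)])

cur = con.cursor()
cur.arraysize = 1000          # DB-API hint: default batch size for fetchmany()
cur.execute("SELECT id FROM t")

# Batched fetch: each fetchmany() pulls a block of rows into the client
# buffer, instead of one round trip per row with fetchone().
total = 0
while True:
    batch = cur.fetchmany()   # uses cur.arraysize
    if not batch:
        break
    total += len(batch)
print(total)  # 10000
```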

Even with a high-performing ODBC driver and everything optimized in the DB, all things being equal you would still expect the performance to be significantly slower than a local file stored on SSD. Where an ODBC source may be quicker is where you have old and crappy network storage with slow HDs.

SAP is a notoriously slow data provider. HANA is significantly better but still not as fast as your typical DB2 or SQL Server connection.

If you have 15 million records then you could save significant time by breaking the query up into smaller slices and loading them concurrently. How much you save depends on the degree of concurrency that both the DB and your TM1 instance can support, but often you can get close to a linear improvement. This strategy applies to both ODBC and text file sources.