ODBCOutput Vs Google Big Query - One Person's Experience
Posted: Fri Sep 08, 2023 11:47 pm
This is probably somewhat of a niche post since I don't know how widespread the use of Google Big Query (BQ) is in the wider TM1-osphere, much less how prevalent the need to write to it is.
However, I'll put my experience on the searchable record in the hope that it may save some future user from the amount of teeth grinding and hair pulling that mine involved.
We were asked to send our forecast figures (created in TM1) to a Google Big Query table on a regular basis.
This is easy, right? We pull data from BQ all the time so the driver is in place, create a source view, a bit of ODBCOutput, and wham, bam, it's done ma'am.
Eeeeeh... not so much.
So, the ODBCOpen is naturally done in the Prolog, the ODBCOutput is done in the Data, the ODBCClose is done in the Epilog.
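For anyone who hasn't written back through ODBC before, that first version looked roughly like the sketch below, collapsed into a single listing. The DSN name, credentials, table and column names, and the v-prefixed data source variables are all invented for illustration.
Code: Select all
# Prolog: open the ODBC connection (DSN and credentials are placeholders)
ODBCOpen('BQ_DSN', 'svc_user', 'svc_password');

# Data: build and send one INSERT per source row - this turned out to be the slow part.
# (NumberToString uses the user locale's decimal separator, so watch that in non-English locales.)
sSQL = 'INSERT INTO dataset.forecast_table '
   | '(Scenario, Currency, Year, Month, Measure, Amount) VALUES ('''
   | vScenario | ''',''' | vCurrency | ''',''' | vYear | ''','''
   | vMonth | ''',''' | vMeasure | ''',' | NumberToString(vAmount) | ')';
ODBCOutput('BQ_DSN', sSQL);

# Epilog: close the connection
ODBCClose('BQ_DSN');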
Time to test it. Wow, this is taking a while. Time to go downstairs and get a coffee. I get back... sip, OK, it's done, but... looks at the time calc in my control cube... it took how long? I had this in debug mode so I was tracking each write. About 3 seconds per record. That smells like indexing, but BQ isn't my domain, I just feed into and out of it.
For a new year which doesn't have the future months populated yet, that's about 4,000 rows so... not great, but survivable.
The problem is that a full year forecast is several hundred thousand rows. I talk to the BQ guys.
"Maybe it's network traffic; can you shift things like the timestamp and data source name to an Update Where clause after the load has been done?"
OK, I think, not a bad idea. So I do.
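In practice that just means dropping the constant columns from the per-row Insert and stamping them once at the end. Something along these lines, with the same invented table and column names as before (sTimeStamp being a string built earlier in the process):
Code: Select all
# Epilog: stamp the constant columns on everything just loaded, in one statement.
# The WHERE clause relies on the timestamp column starting out NULL for new rows.
sSQL = 'UPDATE dataset.forecast_table '
   | 'SET LoadTimestamp = ''' | sTimeStamp | ''', DataSource = ''TM1 Forecast'' '
   | 'WHERE LoadTimestamp IS NULL';
ODBCOutput('BQ_DSN', sSQL);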
3 seconds per record again, although the Update Where which updated every record that had been uploaded took... 3 seconds.
Then things started getting worse. I played around with this a bit further, and randomly and occasionally would get:
I haven't redacted the error; it was literally this:
"..."."Forecast (Working)", "AUD","2023","M09","Redacted","Redacted","Redacted","Redacted","Redacted","Redacted","Redacted","Redacted","Redacted","Reportable Amount","AmountRedacted",Data Source line (301) Error: Data procedure line (80): Error executing SQL query: "..."
Not always for the same line. Not always for the same elements. And not always.
I talk to the BQ guys again and they suggest that maybe it has something to do with the long run time; "can you do it by month, with each call using a new connection?"
OK, worth a shot. I create a while loop in the master process that calls each month in turn.
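The master process ends up as little more than the loop sketched below. The process and parameter names are invented, and the month-level process does its own ODBCOpen and ODBCClose so that every call gets a fresh connection.
Code: Select all
# Master process Prolog: call the month-level export once per month,
# stopping at the first month that fails.
nMonth = 1;
nFailed = 0;
WHILE (nMonth <= 12 & nFailed = 0);
   IF (nMonth < 10);
      sMonth = 'M0' | NumberToString(nMonth);
   ELSE;
      sMonth = 'M' | NumberToString(nMonth);
   ENDIF;
   nResult = ExecuteProcess('Export Forecast To BQ Month', 'pMonth', sMonth);
   IF (nResult <> ProcessExitNormal());
      nFailed = 1;
   ENDIF;
   nMonth = nMonth + 1;
END;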
It still happens. Sometimes. Not all the time. Never on the same rows. Rarely in the same months. As soon as the error occurs the process drops like a brick back to the master one.
I try something else (a rough sketch of the logic follows this list):
- If the number of completed records (stored in a global variable) does not equal the expected number of records for that month, run a delete query to purge the incomplete month's data. (Incomplete = rows where the Timestamp field is Null, since the Epilog, where the timestamp gets updated, was never reached on the failed call.) Then...
- Have another crack at the month.
- Quit if it still fails after 3 attempts.
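The guts of that retry logic looked roughly like this. Every name here is invented: nRowsWritten is a global variable (declared with NumericGlobalVariable in both processes) that the month-level process increments for each successful ODBCOutput, nRowsExpected comes from the source view, and sMonth comes from the month loop shown earlier. The real thing also wrote a log line per attempt.
Code: Select all
# Master process: after a failed month, purge the half-loaded rows and try again.
# Two retries on top of the original call = 3 attempts in total.
nAttempt = 1;
WHILE (nAttempt <= 2 & nRowsWritten <> nRowsExpected);
   # Incomplete data = rows never timestamped, because the Epilog
   # (where the UPDATE stamps them) was never reached on the failed call.
   ODBCOpen('BQ_DSN', 'svc_user', 'svc_password');
   ODBCOutput('BQ_DSN', 'DELETE FROM dataset.forecast_table '
      | 'WHERE LoadTimestamp IS NULL AND Month = ''' | sMonth | '''');
   ODBCClose('BQ_DSN');

   # Reset the counter and have another crack at the month
   nRowsWritten = 0;
   ExecuteProcess('Export Forecast To BQ Month', 'pMonth', sMonth);
   nAttempt = nAttempt + 1;
END;

IF (nRowsWritten <> nRowsExpected);
   ProcessError;   # still broken after the retries - bail out
ENDIF;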
The result? Duplicated rows turning up at the BQ end. How is this happening? I have no idea. The BQ guys have no idea. Bear in mind that I'm still in debug mode here, with text logs showing EVERY SINGLE RELEVANT LINE, including the ODBCOutput calls, so I can see exactly which lines have been written. On each attempt the duplicated lines were written once, and once only. Some of the duplicated lines had been written on the first attempt, some had not.
Also, I still have the time problem.
It occurs to me that since the Update statement was executing in 3 seconds for {mumble} thousand rows, maybe I should aggregate the data and send it in batches rather than one row at a time (there's a rough sketch after the list below). So...
- In the Metadata tab I work out the number of non-skipped rows. (Skipped = rows going to an invalid combination, or values that round to zero at a couple of decimal places.)
- I find that for this interface I can comfortably fit 300 Values clauses into a string variable with a bit of room to spare, so I work out the number of blocks of 300 rows;
- I work out the remainder from Total - (300 × n), where n is the number of full blocks of 300;
- In the Data tab, if there are > 300 rows I start aggregating the values until I hit 300 rows, then commit that block to the database and repeat until I run out of blocks of 300.
- I then aggregate the remainder, and punch that in as a single block.
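A rough sketch of the batching, reusing the invented names from earlier. sBatch and nBatchRows are assumed to be initialised to '' and 0 in the Prolog, and the skip rules from the Metadata tab are left out:
Code: Select all
# Data tab: accumulate one VALUES tuple per row and flush every 300 rows
sRow = '(''' | vScenario | ''',''' | vCurrency | ''',''' | vYear | ''','''
   | vMonth | ''',''' | vMeasure | ''',' | NumberToString(vAmount) | ')';

IF (sBatch @= '');
   sBatch = sRow;
ELSE;
   sBatch = sBatch | ',' | sRow;
ENDIF;
nBatchRows = nBatchRows + 1;

IF (nBatchRows >= 300);
   ODBCOutput('BQ_DSN', 'INSERT INTO dataset.forecast_table '
      | '(Scenario, Currency, Year, Month, Measure, Amount) VALUES ' | sBatch);
   sBatch = '';
   nBatchRows = 0;
ENDIF;

# Epilog: flush the remainder - whatever is left after the last full block of 300
IF (nBatchRows > 0);
   ODBCOutput('BQ_DSN', 'INSERT INTO dataset.forecast_table '
      | '(Scenario, Currency, Year, Month, Measure, Amount) VALUES ' | sBatch);
ENDIF;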
And, incidentally, I haven't had any of the idiopathic command failures since I made that change either. Why? I can't be certain, but my suspicion falls into two areas:
- The amount of time that the ODBC connection was being held open for. With the "bulked up" Outputs the full year output was taking only about 3 minutes per month. Line by line? Well over a quarter of an hour. That's plenty of time for unexpected timeouts (though the BQ guys checked the specs and said that there shouldn't be any), network gremlins, interference by other processes on the BQ side, whatever, to interfere with the upload. 3 minutes is a much smaller target window.
- The speed of doing line by line. Yes, it was taking 3 seconds for the Output to happen but the connection was being hit with the next Output request pretty much instantaneously. Granted, it takes barely a second for the 300 rows to be compiled on the TM1 side, but at least that is some breather for the connection.
- I reiterate that this is speculation on my part.
So, the takeaways:
- If you need to write to BQ via ODBC, don't do it one row at a time. Aggregate as many rows as you can get away with into a single Values clause, and hit the sucker with them in one go.
- Write common values with an Update Where clause at the end so that you can cram as much variable data as possible into the Insert Into statement.
- Be assiduous about keeping the time that the ODBC connection is open to the absolute minimum that you can.
- If you get an idiopathic error, don't assume that you can work around it by re-running part of the upload since you can't be sure that the "do over" will be "clean". Drop that sucker like it's radioactive, purge the uploaded data, and spit an error log saying "Sorry, it didn't work, try again". (But if you follow the first three points, this should very much be an exception, not a rule.)
Thus, your hair follicles will be safe, and you should save a few bucks on dental work.