Parallel data loads and TM1runTI, Hustle

Wim Gielis
MVP
Posts: 3113
Joined: Mon Dec 29, 2008 6:26 pm
OLAP Product: TM1, Jedox
Version: PAL 2.0.9.18
Excel Version: Microsoft 365
Location: Brussels, Belgium

Parallel data loads and TM1runTI, Hustle

Post by Wim Gielis »

Hello,

When it comes to loading data in parallel, we can speed up the loads by using TM1runti.exe, provided that no metadata is changed and so on. As far as I know, we have the following possibilities:

1) schedule chores whereby TM1runti is called, with the correct process names and parameter names/values.
2) not using chores but executing batch scripts which, in turn, contain the TM1runti calls.
3) have 1 batch file with TM1runti.exe calls and use the Hustle.exe application from Cubewise to load data in a parallel fashion.

Option 1 is not really a viable solution so let's forget about it.
The question is, what will Hustle bring as an advantage in option 3 when compared to option 2? I mean, what would be the benefit of using the .exe file? Does it split loads over different cores in a better/optimised way? What has been programmed inside Hustle that can benefit TM1 users or TM1 developers?

I understand that this will be a question for Cubewise staff but if anyone else knows the answer, feel free to reply.

Thanks,

Wim
Best regards,

Wim Gielis

IBM Champion 2024
Excel Most Valuable Professional, 2011-2014
https://www.wimgielis.com ==> 121 TM1 articles and a lot of custom code
Newest blog article: Deleting elements quickly
lotsaram
MVP
Posts: 3652
Joined: Fri Mar 13, 2009 11:14 am
OLAP Product: TableManager1
Version: PA 2.0.x
Excel Version: Office 365
Location: Switzerland

Re: Parallel data loads and TM1runTI, Hustle

Post by lotsaram »

Have you read the 1 page documentation at https://github.com/cubewise-code/hustle ?

I am a big fan of hustle, but hey what do you expect :)

Hustle just makes it really easy to execute lots of TI threads in parallel in a well-managed way. All you do is pass Hustle 2 parameters: i) the path of a text file which contains the list of command-line calls (usually to tm1runti, but it could be to any command-line executable) and ii) the max number of cores the queue should be managed to.

So say your server has 12 cores: you can have a file which contains 50 tm1runti jobs and pass in 10 as the number of cores to use. Hustle will launch the first 10 threads, watch the queue and, as each thread finishes, start a new one, keeping 10 cores running until all 50 jobs are completed. So kicking off a parallel load is very easy: I just need to do one ExecuteCommand to Hustle.
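To make the queueing behaviour described above concrete, here is a minimal Python sketch of the same idea. It is not Hustle's actual implementation; the function names (`read_commands`, `run_command`, `run_queue`) and the one-command-per-line file layout are assumptions made for illustration only.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def read_commands(path):
    """Return the non-empty lines of a job file (assumed: one command per line)."""
    with open(path, encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]


def run_command(command):
    """Run one command line, block until it ends, and return its exit code."""
    return subprocess.run(command, shell=True).returncode


def run_queue(commands, max_workers, runner=run_command):
    """Run all commands, never more than max_workers at the same time.

    The pool starts the first max_workers commands; each time one of them
    finishes, the freed worker picks up the next command in the list.
    Exit codes are returned in the order of the input list.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(runner, commands))
```

With a job file of 50 lines, `run_queue(read_commands("jobs.txt"), 10)` would keep 10 commands running until the list is used up (the file name is hypothetical). Like Hustle, this only watches for the executables to finish; it knows nothing about locking inside TM1.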

Of course you can get more fancy (and of course I do) by automatically generating the hustle job files via asciioutput depending on the load to perform and looking up a system cube for how many cores to dedicate to the load. I imagine you would also probably do something similar.
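As an illustration of generating such a job file: lotsaram does this inside TI with AsciiOutput, whereas the sketch below uses Python purely to show the shape of the output. The helper names, the process name `Load.Sales`, the parameter `pMonth` and the server name are hypothetical, and the `-process` / `-server` flags follow the commonly documented tm1runti syntax, which should be checked against your own version.

```python
def build_command(process, params, exe="tm1runti.exe", connection=None):
    """Build one tm1runti-style command line.

    connection holds flag/value pairs such as {"-server": "Planning"};
    params are the TI process parameters, written as name=value.
    Values containing spaces are not quoted in this sketch.
    """
    parts = [exe, "-process", process]
    for flag, value in (connection or {}).items():
        parts += [flag, str(value)]
    parts += [f"{name}={value}" for name, value in params.items()]
    return " ".join(parts)


def write_job_file(path, commands):
    """Write one command per line (the layout assumed for the job file)."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(commands) + "\n")


# Hypothetical example: one load per month, 12 command lines in total.
monthly_commands = [
    build_command("Load.Sales", {"pMonth": f"{month:02d}"},
                  connection={"-server": "Planning"})
    for month in range(1, 13)
]
```

Calling `write_job_file("jobs.txt", monthly_commands)` would then produce the text file that is handed to the queue manager, with the number of cores looked up separately (for example from a system cube, as described above).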
Please place all requests for help in a public thread. I will not answer PMs requesting assistance.

Re: Parallel data loads and TM1runTI, Hustle

Post by Wim Gielis »

lotsaram wrote: Fri Jan 25, 2019 3:38 pm Have you read the 1 page documentation at https://github.com/cubewise-code/hustle ?
I did, and I read information I already knew from the Cubewise website:
Hustle is a small utility that can be used to manage threads when executing command line tools. The tool was built to take advantage parallel loading in IBM Cognos TM1, specifically the tm1runti.exe. Hustle enables you to specify the number of concurrent threads you want to be executed at any one time and pass a batch of commands to be executed on these threads.

Although Hustle was designed to be used with TM1 it can be used to execute any command line executable. It is great for run batch processes concurrently allowing you take advantage of all of your CPU cores.
But that did not give me the kind of information you wrote in your previous reply (thanks for that!).
For example, that Hustle would start call number 11 as soon as any of the first 10 calls to TM1runti finishes. Would that not also happen in a situation where I call TM1runTI 50 times in the Prolog of a process?
But I understand that Hustle would make sure that all 10 cores are used efficiently?

I had assumed that there were benefits of using this tool rather than not using it, but I wanted to have more detailed explanations.
lotsaram wrote: Fri Jan 25, 2019 3:38 pm automatically generating the hustle job files via asciioutput depending on the load to perform and looking up a system cube for how many cores to dedicate to the load
CellGetN to a system cube and AsciiOutput to text files are indeed 2 indispensable ingredients of any TM1 model ;-)

Re: Parallel data loads and TM1runTI, Hustle

Post by lotsaram »

Wim Gielis wrote: Fri Jan 25, 2019 4:01 pm For example, that Hustle would start call number 11 as soon as any of the first 10 calls to TM1runti finishes. Would that not also happen in a situation where I call TM1runTI 50 times in the Prolog of a process?
But I understand that Hustle would make sure that all 10 cores are used efficiently?
Well, if you are calling TM1RunTI 50 times in the Prolog, how are you going to ensure that only 10 cores get used and that 10 cores are always kept busy (at least until the job queue is nearly exhausted and there are fewer than 10 jobs left)? If you wait for the ExecuteCommand, then you are only running one tm1runti at a time; if you don't wait, then you need a mechanism to release a controlled number of threads and not all 50. If you write back to a control cube to track the number of threads in use, this is actually devilishly tricky, because a TI commits its changes only on completion of the Epilog, so the other processes will be reading the base model's unchanged value (there are some ways around this, but again you want to avoid serialization and locking). This is the benefit of Hustle: it makes managing the number of threads really easy, and you don't need to come up with a mechanism of your own.

As to whether Hustle can make sure all 10 cores are used "efficiently": no, it can't do that. It can only make sure that the 10 cores are used. You as the developer still need to make sure that there is no locking. Hustle is just watching the command line executables for when they finish (if 1 was working and 9 were waiting, it wouldn't know).
holger_b
Posts: 131
Joined: Tue May 17, 2011 10:04 am
OLAP Product: TM1
Version: Planning Analytics 2.0
Excel Version: 2016
Location: Freiburg, Germany

Re: Parallel data loads and TM1runTI, Hustle

Post by holger_b »

Hi both,

be sure to take a look at RushTI (also from Cubewise). It works just like hustle.exe, but it uses only one connection for all the calls, so there is less overhead, e.g. less traffic on your LDAP connection. It just requires Python installed on the server plus a little library of TM1-specific Python scripts. We have been using it for several months now, and we all love it. Easy, fast and reliable.

Cheers
Holger
tomok
MVP
Posts: 2831
Joined: Tue Feb 16, 2010 2:39 pm
OLAP Product: TM1, Palo
Version: Beginning of time thru 10.2
Excel Version: 2003-2007-2010-2013
Location: Atlanta, GA

Re: Parallel data loads and TM1runTI, Hustle

Post by tomok »

Given that Hustle has to be installed on the server is anyone using it on the IBM Cloud?
Tom O'Kelley - Manager Finance Systems
American Tower
http://www.onlinecourtreservations.com/

Re: Parallel data loads and TM1runTI, Hustle

Post by Wim Gielis »

In addition, can someone compare the different options we have for parallel loading (TM1runTI.exe, Hustle.exe, RushTI.exe, scheduling chores, ...) with the new RunProcess TurboIntegrator function? This could be a useful alternative, given that no other software is needed and the only change required is swapping "ExecuteProcess" for "RunProcess". It is available from PA 2.0.6 onwards.
EvgenyT
Community Contributor
Posts: 324
Joined: Mon Jul 02, 2012 9:39 pm
OLAP Product: TM1
Version: PAL 2.0.8
Excel Version: 2016
Location: Sydney, Australia

Re: Parallel data loads and TM1runTI, Hustle

Post by EvgenyT »

Wim Gielis wrote: Tue Apr 09, 2019 2:16 pm In addition, can someone compare the different options we have for parallel loading (TM1runTI.exe, Hustle.exe, RushTI.exe, scheduling chores, ...) with the new RunProcess TurboIntegrator function? This could be a useful alternative, given that no other software is needed and the only change required is swapping "ExecuteProcess" for "RunProcess". It is available from PA 2.0.6 onwards.
Good to see this topic brought up. RunProcess only eliminates the complexities associated with writing scripts for TM1runTI, i.e. path to the .exe, admin host etc.
However, I’m not sure how you would call a process on another instance with RunProcess. Based on the documentation, the process must reside on the same instance, so my guess is that you would still have to use one of the above-mentioned methods.

My comparison would be:

1. TM1RunTI.exe - Parallel loading without an inbuilt queuing mechanism. In the past we had to rely on lock files and thread control cubes, but as Lotsaram pointed out, that is very tricky and not consistent.
2. Hustle - TM1RunTI with an inbuilt queuing mechanism, but it has a limitation where threads must be run under different users: when a user thread logs out and another thread tries to log back in (under the same user account), it creates a lock. Potentially a lot of admin accounts just to accommodate RunTI.
3. RushTI - TM1RunTI with an inbuilt queuing mechanism and the ability to run many threads under one admin account without locking.
4. Chore - I have used the Synchronized function in the past to mimic queuing behaviour. It is somewhat limited, but can serve a good purpose when your requirement is not very complicated. E.g. to load 12 months' worth of data in parallel on 6 cores, you can have the same TI twice in the chore. The first process executes a loop in the Prolog with an arbitrary string ‘Month’ | month number (1-6) passed to Synchronized(sString). This basically creates 6 locks, which govern the behaviour of the next process. In the Prolog of the second process you do the same thing: execute Synchronized(sString) in a loop. Then, if you look at TM1top, you will see 6 original threads running and 6 additional threads sitting in a wait state (Synchronized). Note: the second bunch of TIs won’t start processing until the first bunch is finished. Hence, it is more of a batch processing technique than queuing.
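The difference between batch processing (option 4) and a true queue (options 2 and 3) can be shown with a small Python simulation. This is only a sketch: the function names and the job durations are made up for illustration, and nothing here talks to TM1.

```python
import heapq


def batch_makespan(durations, slots):
    """Total time when jobs run in waves: the next wave starts only after
    the previous wave has completely finished (Synchronized-style)."""
    total = 0
    for start in range(0, len(durations), slots):
        total += max(durations[start:start + slots])
    return total


def queue_makespan(durations, slots):
    """Total time when a freed slot immediately takes the next job
    (queue-manager style)."""
    finish_times = [0] * slots  # time at which each slot becomes free
    heapq.heapify(finish_times)
    for duration in durations:
        earliest = heapq.heappop(finish_times)
        heapq.heappush(finish_times, earliest + duration)
    return max(finish_times)


# Hypothetical load times in minutes for 12 months; two months are slow.
durations = [10, 2, 2, 2, 2, 2, 10, 2, 2, 2, 2, 2]
print(batch_makespan(durations, 6))  # 20: each wave waits for its slow month
print(queue_makespan(durations, 6))  # 12: idle slots pick up the next months
```

With evenly sized jobs the two approaches finish at about the same time; the queue only pulls ahead when some loads take much longer than others.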

Thanks

Evgeny

Re: Parallel data loads and TM1runTI, Hustle

Post by EvgenyT »

tomok wrote: Tue Apr 09, 2019 10:14 am Given that Hustle has to be installed on the server is anyone using it on the IBM Cloud?
I haven't used Hustle, but I have used TM1RunTI on the cloud. DevOps have to create a special LDAP service account for it to work.

Re: Parallel data loads and TM1runTI, Hustle

Post by holger_b »

Hi Evgeny,

I cannot quite confirm this:
2. Hustle - TM1RunTI with an inbuilt queuing mechanism, but it has a limitation where threads must be run under different users: when a user thread logs out and another thread tries to log back in (under the same user account), it creates a lock. Potentially a lot of admin accounts just to accommodate RunTI.
We used hustle with around 20 parallel threads and up to 60 calls with just one user profile, so there is a lot of logging on and off with no problem at all. Still we switched to RushTI in the end because it is nicer and faster.
Drg
Regular Participant
Posts: 159
Joined: Fri Aug 12, 2016 10:02 am
OLAP Product: tm1
Version: 10.2.0 - 10.3.0
Excel Version: 2010

Re: Parallel data loads and TM1runTI, Hustle

Post by Drg »

Our security service limits the tools we can use, so we have built 2 parallel queue manager processes ourselves: one based on flag files, the other based on a queue cube with priorities for the chot processes. The second one allows building complex orchestrations and looks more reliable, but it is really difficult to understand.

Re: Parallel data loads and TM1runTI, Hustle

Post by EvgenyT »

holger_b wrote: Mon Apr 29, 2019 2:49 pm Hi Evgeny,

I cannot quite confirm this:
2. Hustle - TM1RunTI with an inbuilt queuing mechanism, but it has a limitation where threads must be run under different users: when a user thread logs out and another thread tries to log back in (under the same user account), it creates a lock. Potentially a lot of admin accounts just to accommodate RunTI.
We used hustle with around 20 parallel threads and up to 60 calls with just one user profile, so there is a lot of logging on and off with no problem at all. Still we switched to RushTI in the end because it is nicer and faster.
Hi,

I should have clarified that this applies when using CAM security. Native security / LDAP should not experience this problem.

Regards

Evgeny