RunProcess Again

MarenC
Regular Participant
Posts: 349
Joined: Sat Jun 08, 2019 9:55 am
OLAP Product: Planning Analytics
Version: Planning Analytics 2.0
Excel Version: Excel 2016

RunProcess Again

Post by MarenC »

Hi,

I really want to use RunProcess but still don't have full confidence in it, for a number of reasons.

I have a question about what happens when a master process calls another process using RunProcess, and the called process includes ExecuteProcess calls to other processes.

For example, let us say we have Master process which calls Child process A as follows:

RunProcess( 'Child Process A' );

But Child Process A calls Child Process B as follows:

ExecuteProcess( 'Child Process B' );

Will parallel processing be broken once Child Process A calls Child Process B using ExecuteProcess?

If so, is the solution to make Child Process A also use RunProcess?

I should note that in my real world requirement I may want to call Child Process A at least 300 times and Child Process A calls numerous other processes.

Maren
paulsimon
MVP
Posts: 808
Joined: Sat Sep 03, 2011 11:10 pm
OLAP Product: TM1
Version: PA 2.0.5
Excel Version: 2016

Re: RunProcess Again

Post by paulsimon »

Hi Maren

If you are going to use RunProcess to set off 300 lots of Process A in parallel, then you aren't going to get that degree of parallel processing unless your box has 300 processors. Therefore I would not bother about the fact that an instance of Process A will not complete until Process B finishes if it calls Process B using ExecuteProcess.

Regards

Paul Simon
MarenC
Regular Participant
Posts: 349
Joined: Sat Jun 08, 2019 9:55 am
OLAP Product: Planning Analytics
Version: Planning Analytics 2.0
Excel Version: Excel 2016

Re: RunProcess Again

Post by MarenC »

Hi Paul,

The number of processes could be anywhere between 1 and 300, it depends on what the user does.

I was hoping to control the parallelisation using Synchronized, but I wasn't sure whether the fact that Child Process A uses ExecuteProcess means the whole idea is flawed anyway.

Maren
PavoGa
MVP
Posts: 617
Joined: Thu Apr 18, 2013 6:59 pm
OLAP Product: TM1
Version: 10.2.2 FP7, PA2.0.9.1
Excel Version: 2013 PAW
Location: Charleston, Tennessee

Re: RunProcess Again

Post by PavoGa »

The ExecuteProcess('Process B') becomes part of the "transaction" of the thread running process A. Imagine if you just placed all of the activity in Process B within Process A.
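
As a minimal sketch of that structure (the pSlice parameter and its values are made up for illustration and are not from Maren's model):

```
# Master process, Prolog: each RunProcess call starts Child Process A on its own thread
RunProcess( 'Child Process A', 'pSlice', 'North' );
RunProcess( 'Child Process A', 'pSlice', 'South' );

# Child Process A, Prolog: Child Process B runs inline, on the same thread
# and inside the same transaction as this instance of Child Process A
nRet = ExecuteProcess( 'Child Process B', 'pSlice', pSlice );
If( nRet <> ProcessExitNormal() );
  ProcessError;
EndIf;
```

The two instances of Child Process A still run side by side; each one simply does not finish until its own call to Child Process B has returned.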

SYNCHRONIZED can manage parallel process and flow control, but it is probably easier to use the file semaphore methodology where a master process needs to track how many threads are running simultaneously and spawn another thread when a core is freed up.

I do use SYNCHRONIZED when I'm splitting a task across the available cores and need those to complete before the next task (and/or group of parallel processes) executes.
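
One way this could be arranged, as a rough sketch only (the lock names, the pSlice parameter and the slice count are assumptions, not necessarily the setup described above): each worker takes its own lock name so the workers do not serialise against each other, and the follow-up process asks for every one of those locks.

```
# Worker process, Prolog (pSlice is a hypothetical parameter: '1', '2', ...)
# A lock name per slice, so the workers can still run in parallel
Synchronized( 'ParallelLoad_' | pSlice );

# Follow-up process, Prolog: requests every worker lock,
# so it is held back while any worker still owns one
nSlices = 4;
i = 1;
While( i <= nSlices );
  Synchronized( 'ParallelLoad_' | NumberToString( i ) );
  i = i + 1;
End;
```

This only works if the workers have already taken their locks by the time the follow-up process starts, so the launch order needs some care.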

Keep in mind that if a metadata change is made in a child process called through EXECUTEPROCESS, any locks it generates hang around until the parent process finishes, resulting in lock contention for other processes. It is therefore prudent to make metadata changes through RunProcess (we are also seeing lock contention with MDX subsets).
Ty
Cleveland, TN
paulsimon
MVP
Posts: 808
Joined: Sat Sep 03, 2011 11:10 pm
OLAP Product: TM1
Version: PA 2.0.5
Excel Version: 2016

Re: RunProcess Again

Post by paulsimon »

Hi

Where possible, use temporary subsets and views. This had a dramatic effect on our lock contention, not to mention clutter.
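
For anyone who has not used them, a minimal sketch of temporary objects in a Prolog (the cube, dimension and element names are placeholders):

```
# Hypothetical names, for illustration only
sCube = 'Sales';
sDim = 'Version';
sName = 'tmp_' | GetProcessName();

# A final argument of 1 makes the subset and the view temporary:
# they exist only for this process run and are never saved to disk
SubsetCreate( sDim, sName, 1 );
SubsetElementInsert( sDim, sName, 'Actual', 1 );
ViewCreate( sCube, sName, 1 );
ViewSubsetAssign( sCube, sName, sDim, sName );
```

There is no ViewDestroy or SubsetDestroy to remember in the Epilog, which is where the reduction in clutter comes from.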
lotsaram
MVP
Posts: 3652
Joined: Fri Mar 13, 2009 11:14 am
OLAP Product: TableManager1
Version: PA 2.0.x
Excel Version: Office 365
Location: Switzerland

Re: RunProcess Again

Post by lotsaram »

PavoGa wrote: Tue Feb 09, 2021 8:43 pm SYNCHRONIZED can manage parallel process and flow control, but it is probably easier to use the file semaphore methodology where a master process needs to track how many threads are running simultaneously and spawn another thread when a core is freed up.
It is possible to keep track of the thread flags in a cube by having the watcher process run CubeSaveData on the thread control cube. This sidesteps TM1's commit model and allows the watcher process to read updated values. It is a bit of a hack, although it seems pretty robust.
Please place all requests for help in a public thread. I will not answer PMs requesting assistance.
PavoGa
MVP
Posts: 617
Joined: Thu Apr 18, 2013 6:59 pm
OLAP Product: TM1
Version: 10.2.2 FP7, PA2.0.9.1
Excel Version: 2013 PAW
Location: Charleston, Tennessee

Re: RunProcess Again

Post by PavoGa »

paulsimon wrote: Tue Feb 09, 2021 9:48 pm Hi

Where possible, use temporary subsets and views. This had a dramatic effect on our lock contention, not to mention clutter.
Agreed, but we have had a nasty surprise: some of our temporary dynamic subsets were causing lock contention on the dimension and }DimensionProperties. In some situations we have to create the subsets we need as permanent ones through RunProcess and then delete them in the epilog.
Ty
Cleveland, TN
paulsimon
MVP
Posts: 808
Joined: Sat Sep 03, 2011 11:10 pm
OLAP Product: TM1
Version: PA 2.0.5
Excel Version: 2016

Re: RunProcess Again

Post by paulsimon »

Hi Pavo

Interesting that you have still got locks when using temporary views and subsets.

Surely creating a permanent subset is more likely to cause lock contention, since that is a metadata event? If I understand you correctly, you are relying on the lock being relatively short by creating the subset alone in one process and then using RunProcess to run another process that uses it? That is one approach, but the downside is that it is more complex, and as soon as you use RunProcess you need a way to track whether the spawned process completed successfully.

On a daily basis we scan our TM1 Top log and look for any case where one thread was waiting on another for more than 5 seconds. Before we implemented temporary views and subsets, during the busy month-end period, when everyone leaves it to the last minute to submit and we have over 100 connections to the server, we were getting anything up to 300 waits during the course of the day. After implementing temporary views and subsets this fell to around 3-5 waits on the busiest day. That is what I mean by a dramatic improvement. There are a few batch processes going on in there which give rise to the waits, but by and large it is data loads or user entry, which should be non-blocking with parallel interaction. There are some dim updates, but these only occur on individual Entities' dimensions so they don't lock the rest of the system.

The only way that I can think that your temporary subsets and views could be creating lock contention is if they have MDX that is referencing something external, particularly if rule based. Is that possibly the case? Have you checked your TM1Server log to make sure that no Cube Dependencies are being created on the fly? If not, then this looks like a bug, unless it is something very specific to your model.

Regards

Paul Simon
PavoGa
MVP
Posts: 617
Joined: Thu Apr 18, 2013 6:59 pm
OLAP Product: TM1
Version: 10.2.2 FP7, PA2.0.9.1
Excel Version: 2013 PAW
Location: Charleston, Tennessee

Re: RunProcess Again

Post by PavoGa »

Have not really seen lock contention related to temporary views, it seems to be on just the subsets. The contention occurs with either the SubsetCreateByMDX or SubsetCreate/SubsetMDXSet method. On the latter, the locks appear when the SubsetMDXSet statement is hit. Lock contention does not occur on a temporary static subset, just MDX based ones.

The process structure is that the master process needs a subset, typically on our version dimension, and then calls the subprocesses with RunProcess. Those child processes stop dead in their tracks with lock contention until the master completes. To get around it, we create a permanent subset through RunProcess. That, of course, eliminates the lock contention for the subsequent child threads.

We are not seeing lock contention in the sibling processes and they are creating dynamic subsets on the fly on the same dimensions, for example our time dimension. They run in parallel with no issue. Overall, we rarely have contentions during the day outside of ones on purpose by SYNCHRONIZED.

We do need to do a better job with managing cube dependencies, but the ones being created on the fly are few and far between. We have a process and cube to manage and set the dependencies, just not checking the log file and updating the cube. :?

Most of our data loading procedures are now splitting the load up and running in parallel. Need to do some additional testing and provide some examples of when it happens and when it does not.
Ty
Cleveland, TN
paulsimon
MVP
Posts: 808
Joined: Sat Sep 03, 2011 11:10 pm
OLAP Product: TM1
Version: PA 2.0.5
Excel Version: 2016

Re: RunProcess Again

Post by paulsimon »

Hi Pavo

Thanks for confirming that your issue with locking in temporary subsets is only related to temporary subsets that are MDX based.

We don't use any of those. Typically our MDX subsets are for things like getting all base elements or the full hierarchy. These are created as permanent subsets as part of our dimension setup process, which as the name implies is usually only run once. Therefore we have no need to define these as temporary subsets. We just reference the permanent subset in the process.

You mentioned your version dimension. Typically for these we only select one element, eg actual, forecast, etc. There is no need for MDX to do that. Presumably, your case is more complex so you need MDX.

One possibility is to use a loop to populate the subset instead of running an MDX expression and then copying that to a static subset. You can then do that on a temporary subset without the risk of locking. Even then we don't often have a need to do that. For example, we have permanent static subsets on our months dimension to give the 12 months of the current year, and this gets updated by the year end process.
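
A minimal sketch of the loop approach, assuming you want all leaf elements of a dimension (the dimension and subset names are placeholders, and the level test stands in for whatever your MDX was selecting):

```
# Hypothetical names, for illustration only
sDim = 'Version';
sSub = 'tmp_leaves';

# Temporary static subset: no MDX expression is attached to it
SubsetCreate( sDim, sSub, 1 );

i = 1;
nMax = DimSiz( sDim );
While( i <= nMax );
  sEl = DimNm( sDim, i );
  # Keep leaf-level elements only
  If( ElLev( sDim, sEl ) = 0 );
    SubsetElementInsert( sDim, sSub, sEl, SubsetGetSize( sDim, sSub ) + 1 );
  EndIf;
  i = i + 1;
End;
```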

Regards

Paul
ykud
MVP
Posts: 148
Joined: Sat Jan 10, 2009 10:52 am

Re: RunProcess Again

Post by ykud »

lotsaram wrote: Wed Feb 10, 2021 9:54 am It is possible to keep track of the thread flags in a cube by having the watcher process run CubeSaveData on the thread control cube. This sidesteps TM1's commit model and allows the watcher process to read updated values. It is a bit of a hack, although it seems pretty robust.
Can you elaborate on this a little please?
Are you running CubeSaveData in a 'queue watcher' process? So the worker processes write values to the cube, and the watcher runs CubeSaveData every few seconds before reading the values from this cube to force a commit from a worker thread?
MarenC
Regular Participant
Posts: 349
Joined: Sat Jun 08, 2019 9:55 am
OLAP Product: Planning Analytics
Version: Planning Analytics 2.0
Excel Version: Excel 2016

Re: RunProcess Again

Post by MarenC »

Hi PavoGa,

"but it is probably easier to use the file semaphore methodology where a master process needs to track how many threads are running simultaneously and spawn another thread when a core is freed up"

Could you elaborate on this please? Are you saying you have ASCIIOutputs in the child processes and the master process does some kind of loop until it sees n files, and once it sees n files it exits the loop and carries on?

Would also be interested to see how CubeSaveData can be used to track the processes.

Maren
lotsaram
MVP
Posts: 3652
Joined: Fri Mar 13, 2009 11:14 am
OLAP Product: TableManager1
Version: PA 2.0.x
Excel Version: Office 365
Location: Switzerland

Re: RunProcess Again

Post by lotsaram »

ykud wrote: Thu Feb 11, 2021 12:12 am Are you running CubeSaveData in a 'queue watcher' process? So the worker processes write values to the cube, and the watcher runs CubeSaveData every few seconds before reading the values from this cube to force a commit from a worker thread?
Yes, exactly. The worker thread still needs to have committed to get its data changes from its private shadow copy to the base model. Running CubeSaveData (yes, every few seconds!) in the watcher process then makes the changed values in the tracking cube available to the watcher process; without this, the watcher would only see the state of the data in the thread control cube as of its first CellGet.

As the thread control cube is tiny the CubeSaveData is near instantaneous and we haven't seen any performance or locking issues even in big, busy systems.
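
A rough sketch of what such a watcher loop could look like. The cube, element and parameter names, the thread count and the timeout are all assumptions, and the sketch presumes the Done flags were reset to 0 before the workers were launched:

```
# Watcher (master) process, Prolog. All names here are hypothetical.
sCube = 'Sys Thread Control';
nThreads = 4;
nMaxWait = 3600;
nWaited = 0;
nDone = 0;

While( ( nDone < nThreads ) & ( nWaited < nMaxWait ) );
  # Save the control cube before reading, as described above,
  # so that CellGetN picks up what the workers have committed
  CubeSaveData( sCube );
  nDone = 0;
  i = 1;
  While( i <= nThreads );
    nDone = nDone + CellGetN( sCube, NumberToString( i ), 'Done' );
    i = i + 1;
  End;
  If( nDone < nThreads );
    Sleep( 2000 );
    nWaited = nWaited + 2;
  EndIf;
End;

# Worker process, Epilog (pThread is a hypothetical parameter: '1', '2', ...)
CellPutN( 1, 'Sys Thread Control', pThread, 'Done' );
```

The timeout is there so that a worker that fails before its Epilog does not leave the watcher looping forever.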

My only hesitancy in recommending this over file based semaphores is that although the behaviour from using CubeSaveData is logical, it isn't exactly documented.
Please place all requests for help in a public thread. I will not answer PMs requesting assistance.
PavoGa
MVP
Posts: 617
Joined: Thu Apr 18, 2013 6:59 pm
OLAP Product: TM1
Version: 10.2.2 FP7, PA2.0.9.1
Excel Version: 2013 PAW
Location: Charleston, Tennessee

Re: RunProcess Again

Post by PavoGa »

MarenC wrote: Thu Feb 11, 2021 8:31 am Hi PavoGa,

"but it is probably easier to use the file semaphore methodology where a master process needs to track how many threads are running simultaneously and spawn another thread when a core is freed up"

Could you elaborate on this please? Are you saying you have ASCIIOutputs in the child processes and the master process does some kind of loop until it sees n files, and once it sees n files it exits the loop and carries on?

Would also be interested to see how CubeSaveData can be used to track the processes.

Maren
So in a case where you wish to track how many threads are running, your master process would ASCIIOutput to a file for each process spawned through RunProcess. As those processes complete, the last thing in their EPILOG is to remove their particular file. For the file, I use a template with a random number suffix and same file extension (.lock or .txt or ...). The MASTER process is constantly looping and watching the folder (or you could use a file name template) to count the number of files there. When the number drops below the number of cores allocated to running the subs, spawn another.
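
A minimal sketch of that master loop. The folder, file template, job count, parameter names and the 'Sys.WriteFlagFile' helper are all hypothetical, and the job number is used as the file suffix here instead of a random number. As far as I know, ASCIIOutput keeps its file open until the tab that wrote it finishes, so in this sketch the flag file is written by a small helper process called with ExecuteProcess rather than by the looping Prolog itself:

```
# Master process, Prolog. All names and values here are hypothetical.
sDir = 'D:\TM1\Semaphores\';
sTemplate = sDir | 'thread_*.lock';
nMaxThreads = 4;
nJobs = 300;
nJob = 1;

While( nJob <= nJobs );
  # Count the flag files of the threads that are still running
  nRunning = 0;
  sFile = WildcardFileSearch( sTemplate, '' );
  While( sFile @<> '' );
    nRunning = nRunning + 1;
    sFile = WildcardFileSearch( sTemplate, sFile );
  End;

  If( nRunning < nMaxThreads );
    # A slot is free: write the flag file, then spawn the next thread
    sFlag = sDir | 'thread_' | NumberToString( nJob ) | '.lock';
    ExecuteProcess( 'Sys.WriteFlagFile', 'pFile', sFlag );
    RunProcess( 'Child Process A', 'pJob', NumberToString( nJob ), 'pFlagFile', sFlag );
    nJob = nJob + 1;
  Else;
    Sleep( 1000 );
  EndIf;
End;

# Child Process A, Epilog: remove this thread's flag file as the last step
If( FileExists( pFlagFile ) = 1 );
  ASCIIDelete( pFlagFile );
EndIf;
```

If a child aborts before it reaches its Epilog, its flag file is left behind and that slot never frees up, so some kind of timeout or clean-up of stale files is worth adding.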

Using CubeSaveData would be doing practically the same thing but writing to a cube instead of a file. There are a number of ways to accomplish this, but CubeSaveData is essential as it flushes the transaction cache to the cube for the various threads to pick up.

One example is that the master process increments the counter by 1 for each spawned subprocess, stopping when the number of allocated cores are running. Each spawned process decrements the counter in the cube by 1 as it completes. The master process, in a loop, does a CubeSaveData, checks the counter and spawns more subs when the number of parallel processes drops. I've not done it this way, but lotsaram may be able to elaborate or correct this example.
Ty
Cleveland, TN
ykud
MVP
Posts: 148
Joined: Sat Jan 10, 2009 10:52 am
Contact:

Re: RunProcess Again

Post by ykud »

lotsaram wrote: Thu Feb 11, 2021 2:06 pm My only hesitancy in recommending this over file based semaphores is that although the behaviour from using CubeSaveData is logical, it isn't exactly documented.
Thanks a lot for confirming, this is indeed quite logical and neat, very nice approach.
I'll still stick with file flags until there's a built-in something to check process status. Too afraid of locking something global with CubeSaveData and changes in future PA versions regarding locking or this CubeSaveData behaviour :-)