Hi Guys,
I'm new at this Forum and I don't know whether I need to reply to this thread or create a new one.
I have an issue/question with the Chore Commit as well, if this does not belong here please let me know and I'll move it.
Context:
We need to update the metadata and data for 8 independent cubes on daily basis.
Current approach:
Logically I need to first refresh all the metadata update processes (with rejected logic as well) and then run the data ones. I have one chore than runs all TI's related to metadata refresh and, at the very end of the processes there an 'update flag' TI that insert 1 in an assumptions cube.
I've created 8 independent chores with 3 processes each that run every 5 minutes: the first one checks on that flag (<>1 ChoreQuit, else continue), the second one runs a wrapper for all data refresh (this is one TI with 15 ExecuteProcess lines each) and finally the third one resets the flag to 0. Notice that each Chore takes approx 120/180 minutes each (yes... I know.. 10 GB of data refresh on daily basis sounds unreasonable, but requirements are such at the moment), so there's no way that one chore can start and finish (therefore, reseting the flag to 0) in less than the timeframe of each Chore (5 minutes). Also, the TI's that check for the flag are the same ones for all Chores.
Problem/oddity:
At 3 a.m. the metadata Chore runs, it takes 90 minutes to run, at the end it successfully turns the flag to 1. All of the 8 Chores start (within a 5 minutes range), they check the flag (first process), then it runs the corresponding wrapper (second process) and finally they reset the flag (third one).
The real issue or bizarre behavior, is that while a Chore is running the second process (is not one in particular as it varies every day), this second process gets called again and starts over (I checked it in the log file). This happens even with the flag set to 0 (because other/s Chores finished and reset it). So, to understand the oddity, the process get called twice, but only ends once. All of these 8 processes have Multiple Commit mode. My understanding of this functionality is that objects get unlock after each individual TI of the Chore finishes, as opposed to the Single commit which has to wait until the last TI finishes so that locks from the first one get free.
Question:
Is multiple commit also like multithread? It respects the sequence that is specified in the Chore or it runs them separated? Since the TI is the same and the flag is set up in one same cell. Could it be possible that the flag check gets on hold by other Chore hence making a second Chore start with the second TI instead of the first one?
I've read
http://www.tm1forum.com/viewtopic.php?f=3&t=9631 and at the very end it gives an answer from IBM saying that:
The issue happens in 10.1 RP1 because in multi-commit mode a cache is discarded on one of the 'commits' and then subsequent cell references precipitate a very deep and complex evaluation. This means that it was taking (in 10.1 RP1) about 1 minute to compute the value for one cell. The chore is not "looping" but rather is very busy.
Here's the interesting piece of the log:
Code: Select all
9276 [] INFO 2014-04-22 09:20:03.058 TM1.Chore Chore "DAILY_RELOAD_PNL_STAGING" executed by scheduler
9276 [] INFO 2014-04-22 09:20:03.058 TM1.Process Process HQPS_DDW_UPLOAD_CHECK_FLAG executed by chore DAILY_RELOAD_PNL_STAGING
9276 [] ERROR 2014-04-22 09:20:03.074 TM1.Process Process "HQPS_DDW_UPLOAD_CHECK_FLAG": : Execution was aborted by ChoreQuit() Function.
9276 [] INFO 2014-04-22 09:20:03.074 TM1.Chore Chore "DAILY_RELOAD_PNL_STAGING" time = 0.02 seconds
9276 [] INFO 2014-04-22 09:20:03.074 TM1.Chore Chore "DAILY_RELOAD_PNL_STAGING" finished executing
9284 [] INFO 2014-04-22 09:20:05.102 TM1.Chore Chore "DAILY_RELOAD_PNL_SUMMARY" executed by scheduler
9284 [] INFO 2014-04-22 09:20:05.102 TM1.Process Process HQPS_DDW_UPLOAD_CHECK_FLAG executed by chore DAILY_RELOAD_PNL_SUMMARY
9284 [] ERROR 2014-04-22 09:20:05.102 TM1.Process Process "HQPS_DDW_UPLOAD_CHECK_FLAG": : Execution was aborted by ChoreQuit() Function.
9284 [] INFO 2014-04-22 09:20:05.102 TM1.Chore Chore "DAILY_RELOAD_PNL_SUMMARY" time = 0.00 seconds
9284 [] INFO 2014-04-22 09:20:05.102 TM1.Chore Chore "DAILY_RELOAD_PNL_SUMMARY" finished executing
9300 [] INFO 2014-04-22 09:22:50.522 TM1.Process Process "DDW_UPLOAD_PNL_SUM_ReconcileRGMtoHFM_ACTatBENCH_USD": finished executing normally, elapsed time 1303.32 seconds
9300 [] INFO 2014-04-22 09:22:50.584 TM1.Process Process "DDW_CALC_SLS_SUM_GR_VARIANCES" run from process "DDW_UPLOAD_RGM_SLS_SUM" by user "R*DAILY_RELOAD_SALES_SUMMARY"
9300 [] INFO 2014-04-22 09:23:05.592 TM1.Process Process "DDW_CALC_SLS_SUM_GR_VARIANCES": finished executing normally, elapsed time 15.01 seconds
9300 [] INFO 2014-04-22 09:23:05.592 TM1.Process Process "DDW_CALC_SLS_SUM_ASP_CEU_EQU_UNITS" run from process "DDW_UPLOAD_RGM_SLS_SUM" by user "R*DAILY_RELOAD_SALES_SUMMARY"
9296 [] INFO 2014-04-22 09:23:05.655 TM1.Process Process DDW_UPLOAD_RGM_SLS_STG executed by chore DAILY_RELOAD_SALES_STAGING
9296 [] INFO 2014-04-22 09:23:05.655 TM1.Process Process "DDW_UPLOAD_SLS_STG_ACTatACT_NON_USD" run from process "DDW_UPLOAD_RGM_SLS_STG" by user "R*DAILY_RELOAD_SALES_STAGING"
9300 [] INFO 2014-04-22 09:23:05.670 TM1.Process Process DDW_UPLOAD_RGM_SLS_SUM executed by chore DAILY_RELOAD_SALES_SUMMARY
9300 [] INFO 2014-04-22 09:23:05.670 TM1.Process Process "DDW_UPLOAD_SLS_SUM_EXPORT_USD_ADJ_2" run from process "DDW_UPLOAD_RGM_SLS_SUM" by user "R*DAILY_RELOAD_SALES_SUMMARY"
9288 [] INFO 2014-04-22 09:23:05.670 TM1.Process Process "HQPS_ACTUALS_CUBESAVEDATA": finished executing normally, elapsed time 2416.80 seconds
9288 [] ERROR 2014-04-22 09:23:05.670 TM1.Process Process "DDW_UPLOAD_HFM_PPL_STG": finished executing with errors. Error file: <TM1ProcessError_20140422084248_DDW_UPLOAD_HFM_PPL_STG.log>
9288 [] INFO 2014-04-22 09:23:05.670 TM1.Process Process HQPS_DDW_UPLOAD_RESET_FLAG executed by chore DAILY_RELOAD_PPL_STAGING
9288 [] INFO 2014-04-22 09:23:05.811 TM1.Process Process "HQPS_DDW_UPLOAD_RESET_FLAG": finished executing normally, elapsed time 0.13 seconds
9288 [] INFO 2014-04-22 09:23:05.811 TM1.Chore Chore "DAILY_RELOAD_PPL_STAGING" time = 6059.73 seconds
9288 [] INFO 2014-04-22 09:23:05.811 TM1.Chore Chore "DAILY_RELOAD_PPL_STAGING" finished executing
9300 [] INFO 2014-04-22 09:23:05.826 TM1.Process Process "DDW_UPLOAD_SLS_SUM_EXPORT_USD_ADJ_2": finished executing normally, elapsed time 0.16 seconds
9300 [] INFO 2014-04-22 09:23:05.826 TM1.Process Process "DDW_UPLOAD_SLS_SUM_ACTatACT_NON_USD" run from process "DDW_UPLOAD_RGM_SLS_SUM" by user "R*DAILY_RELOAD_SALES_SUMMARY"
*** MY COMMENTS ***
Notice that the first 3 lines verify the flag, they find it turned off and the Chore ends.
In line number 12 you can see that there's a process called DDW_CALC_SLS_SUM_GR_VARIANCES that gets called from the wrapper 'DDW_UPLOAD_RGM_SLS_SUM' (Second TI in the Chore)
after this aboved mentioned TI there should be 4 more pending to be executed. However on line 17 you will see that the wrapper gets called again without being finished.
It says that it gets called by the Chore, however the Chore should first check the Flag by using the generic process and then move to the DDW_UPLOAD_RGM_SLS_SUM wrapper.
This is why I say that the process start once but end only once.
I hope I'm being clear enough on this as it's a bit tricky to explain.
Thanks in advance to all.
Nicolás