Scheduled Chores That Fail

jim wood
Site Admin
Posts: 3961
Joined: Wed May 14, 2008 1:51 pm
OLAP Product: TM1
Version: PA 2.0.7
Excel Version: Office 365
Location: 37 East 18th Street New York
Contact:

Scheduled Chores That Fail

Post by jim wood »

Guys,

Over the weekend our weekly schedule failed to run due to a problem on the server. The server was restarted and I manually kicked the schedule off later on Sunday. While this worked well, the problem I had was that early on Monday morning the scheduler noticed that the weekend schedule had failed and kicked it off again, even though I had manually run it.

This caused me 2 problems:

1) The old problem of 2 processes running at the same time. (We have a daily schedule that runs every morning.)
2) I had to re-run some elements of the weekly schedule again. I managed to kill it but the first part of the schedule is a dimension rebuild which clears data.

Now for the question. (Finally) Is there a way of switching off retrospective schedule runs so that any schedule that fails to run is not automatically re-run at a later time?

Cheers,

Jim.
Struggling through the quagmire of life to reach the other side of who knows where.
Go Build a PC
Jimbo PC Builds on YouTube
OS: Mac OS 11 PA Version: 2.0.7
John Hammond
Community Contributor
Posts: 300
Joined: Mon Mar 23, 2009 10:50 am
OLAP Product: PAW/PAX 2.0.72 Perspectives
Version: TM1 Server 11.8.003
Excel Version: 365 and 2016
Location: South London

Re: Scheduled Chores That Fail

Post by John Hammond »

Jim

This is based on my understanding and may not be definitive. It seems the scheduling info is actually stored within the chore file (there seems to be no dimension cube structure where run dates are stored) and when a chore runs it updates the chore with the next time to run.

My guess is that the chore did not save to disk, hence the next time to run did not get saved, and hence when TM1 does the comparison "is current time > next time to run" it finds the chore is due.

So to stop the chore running again you would have to update the chore file. This would involve experimenting with a tool like WinMerge to find the point in the file where the next run date is stored.

Code: Select all

534,9
530,20100305042521
531,001000000
532,1
13,16
6,"_All_Cube_Stats"
560,0
533,0

My guess is it would be record type 530. So you would change this in an editor, before you restarted TM1, to the date and time of the next run.
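For anyone who wants to look at these records without editing the file, here is a minimal read-only sketch in Python. It assumes each line of the chore file is a numeric record code, a comma, then the value, as in the excerpt above. The function names are made up for illustration. It compares two snapshots of the records and reports which codes changed, which is the same experiment as a WinMerge diff.

```python
def parse_chore_records(text):
    """Return (code, value) pairs from 'code,value' lines, in file order."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        code, _, value = line.partition(",")
        # Quoted values such as "_All_Cube_Stats" are returned without quotes.
        records.append((int(code), value.strip('"')))
    return records


def changed_codes(before, after):
    """Record codes whose values differ between two snapshots.

    Records are compared position by position, because a code may
    appear more than once in a file.
    """
    changed = set()
    for (old_code, old_value), (new_code, new_value) in zip(
        parse_chore_records(before), parse_chore_records(after)
    ):
        if old_code == new_code and old_value != new_value:
            changed.add(old_code)
    return sorted(changed)


# The excerpt posted above, used as sample input.
SAMPLE = """534,9
530,20100305042521
531,001000000
532,1
13,16
6,"_All_Cube_Stats"
560,0
533,0
"""

records = dict(parse_chore_records(SAMPLE))
print(records[530])  # 20100305042521
print(changed_codes(SAMPLE, SAMPLE))  # []
```

Taking one copy of the file before a chore runs and one after, then passing both to `changed_codes`, would show whether any record is actually rewritten by a run.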

Good luck
Alan Kirk
Site Admin
Posts: 6667
Joined: Sun May 11, 2008 2:30 am
OLAP Product: TM1
Version: PA2.0.9.18 Classic NO PAW!
Excel Version: 2013 and Office 365
Location: Sydney, Australia
Contact:

Re: Scheduled Chores That Fail

Post by Alan Kirk »

John Hammond wrote: My guess is it would be record type 530. So you would change this in an editor, before you restarted TM1, to the date and time of the next run.
Sorry, but that's not correct. 530 is the chore start time and date (expressed as YYYYMMDDHHnnSS in UTC) that you specify in step 2 of the chore setup wizard. 531 is the chore execution frequency. The next run time is never updated anywhere on disk, it's merely calculated in memory presumably by adding the frequency increment on to the start time until you get one which is >= now(). 530 will remain the same, no matter how many times the chore runs or is scheduled, until or unless you change the start time in the wizard again. I'd also strongly advise against manually editing files like those unless you understand (in detail) the internal structure of every line; if you get it wrong then best case the chore / process / whatever doesn't work, worst case the server doesn't.
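
The in-memory calculation described above can be sketched in Python. This is only an illustration of the arithmetic, not how the server actually implements it. The function names are invented, and reading the 531 value as DDDHHMMSS (days, hours, minutes, seconds) is an assumption based on the sample value `001000000`.

```python
from datetime import datetime, timedelta, timezone


def parse_start(value):
    """Parse a 530 value such as '20100305042521' (YYYYMMDDHHnnSS, UTC)."""
    return datetime.strptime(value, "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)


def parse_frequency(value):
    """Parse a 531 value such as '001000000'.

    Assumption: the nine digits are DDDHHMMSS.
    """
    return timedelta(
        days=int(value[0:3]),
        hours=int(value[3:5]),
        minutes=int(value[5:7]),
        seconds=int(value[7:9]),
    )


def next_run(start, frequency, now):
    """First start + k * frequency (k >= 0) that is >= now."""
    if frequency <= timedelta(0):
        raise ValueError("frequency must be positive")
    if start >= now:
        return start
    # Ceiling of (now - start) / frequency, using floor division on the negative.
    steps = -((start - now) // frequency)
    return start + steps * frequency


start = parse_start("20100305042521")
frequency = parse_frequency("001000000")
now = datetime(2010, 6, 14, 12, 0, 0, tzinfo=timezone.utc)
print(next_run(start, frequency, now))  # 2010-06-15 04:25:21+00:00
```

Note that nothing in this calculation looks at whether an earlier slot was missed. Only the stored start time, the frequency and the current time go in, which is consistent with chores not doing "catch ups".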

I didn't reply to Jim's problem because frankly I've never seen that behaviour; as far as I've experienced chores never do "catch ups". If the server was down at the time, it was down and the server's only interest is in when the next future run is. Were it otherwise we'd have a very real danger of uncommanded (and unwanted) chore executions after a server restart. The only thing I can think of is that the chore schedule wasn't what Jim thought it was.
"To them, equipment failure is terrifying. To me, it’s 'Tuesday.' "
-----------
Before posting, please check the documentation, the FAQ, the Search function and FOR THE LOVE OF GLUB the Request Guidelines.
lotsaram
MVP
Posts: 3706
Joined: Fri Mar 13, 2009 11:14 am
OLAP Product: TableManager1
Version: PA 2.0.x
Excel Version: Office 365
Location: Switzerland

Re: Scheduled Chores That Fail

Post by lotsaram »

I second Alan's interpretation on both points. Chores do not do "catch ups": if an execution is missed due to server downtime then the next run time is the next scheduled time. Also, property 530 is the first scheduled run of a chore; it is never updated (unless the chore is rescheduled in the GUI).

Re: Scheduled Chores That Fail

Post by jim wood »

I have examined the log file in a little more detail. It seems I kicked off the weekly schedule at around 11pm (log file time, not local time). It ran through until the early hours of the following morning. While it was running another (Coremetrics) schedule was kicked off by our SQL Server. (We use the SQL Server DTS scheduler for quite a few of our jobs.) With this effectively being a manual execute by the user Admin it did not wait, and it executed at the same time. The Coremetrics schedule completed and then the weekly schedule I kicked off completed. For some reason it then re-kicked off my schedule. I have no idea why. I have attached a copy of my log file to help.

Apologies if my first post was a little misleading. My time scales were tight and I was a little under the weather. I kicked into autopilot mode. The catch up of schedules is a feature that was added while I was at Applix. (A long time ago.) I'm not convinced it is still there now,

Jim.
Attachments
tm1planning.log

Re: Scheduled Chores That Fail

Post by Alan Kirk »

jim wood wrote: Apologies if my first post was a little misleading. My time scales were tight and I was a little under the weather.
How much under the weather, Jim? :D

The thing that's of interest to me in the log file is that the second execution shows the log entry:

Code: Select all

13   INFO   2010-06-14 04:17:56,473   TM1.Chore   Chore weekly_schedule executed by user Lfin337
From your description of the earlier kick-off, Lfin337 would appear to be you, which suggests a manual triggering. Chores which are executed by being scheduled are shown as "executed by scheduler". Is there any possibility that you might have been trying to (say) change the activation state of the chore and accidentally triggered it?
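
A quick way to check this across a whole log is to pull out every chore execution line and see who triggered it. Below is a minimal Python sketch, assuming the line layout shown above. The wording of the scheduler-triggered line is taken from the description here rather than from a log sample, and the second sample line and its chore name are hypothetical, so adjust the pattern to whatever your log actually shows.

```python
import re
from datetime import datetime

# Matches e.g. "... 2010-06-14 04:17:56,473   TM1.Chore   Chore X executed by user Y"
CHORE_LINE = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3}\s+TM1\.Chore\s+"
    r"Chore (?P<chore>.+?) executed by (?:user (?P<user>\S+)|scheduler)\s*$"
)


def chore_executions(lines):
    """Yield (timestamp, chore_name, user) for each chore execution line.

    user is None when the line says the chore was executed by the scheduler.
    """
    for line in lines:
        match = CHORE_LINE.search(line)
        if match is None:
            continue
        timestamp = datetime.strptime(match["ts"], "%Y-%m-%d %H:%M:%S")
        yield timestamp, match["chore"], match["user"]


sample_log = [
    # Line quoted from the attached log.
    "13   INFO   2010-06-14 04:17:56,473   TM1.Chore   Chore weekly_schedule executed by user Lfin337",
    # Hypothetical line: wording assumed, not copied from a real log.
    "13   INFO   2010-06-14 05:00:00,000   TM1.Chore   Chore daily_schedule executed by scheduler",
]

for timestamp, chore, user in chore_executions(sample_log):
    trigger = f"manual ({user})" if user else "scheduler"
    print(timestamp, chore, trigger)
```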
"To them, equipment failure is terrifying. To me, it’s 'Tuesday.' "
-----------
Before posting, please check the documentation, the FAQ, the Search function and FOR THE LOVE OF GLUB the Request Guidelines.
User avatar
jim wood
Site Admin
Posts: 3961
Joined: Wed May 14, 2008 1:51 pm
OLAP Product: TM1
Version: PA 2.0.7
Excel Version: Office 365
Location: 37 East 18th Street New York
Contact:

Re: Scheduled Chores That Fail

Post by jim wood »

Hi Alan,

I didn't do such a thing as I was in bed. When I got up in the morning to check if everything was alright I noticed that the second run of the weekly had indeed kicked off. I went into TM1Top straight away and cancelled the run. I was waiting with bated breath for the server to go down. Thankfully it didn't,

Jim.

PS. Lfin337 is indeed me in disguise!!! 8-)

Re: Scheduled Chores That Fail

Post by Alan Kirk »

jim wood wrote:I didn't do such a thing as I was in Bed.
Any known issues with sleepwalking? :D
jim wood wrote:When I got up in the morning to check if everything was alright I noticed that the second run of the weekly had indeed kicked off.
The only other thing that I can think of off-hand is the rollback feature that was introduced in 9.1 and never really properly explained in the documentation, to my mind.

I can't find the references at the moment but I recall that there were cases where two chores which kicked off at the same time could cause a conflict which would result in one of them triggering its processes more than once (at least until the server often crashed in the earlier 9.1 releases). However from the look of your logs the process that was running at the time that the second chore kicked off was merely a data save. That shouldn't have caused the kind of conflict that I'm describing here. It's been a while since I used 9.1 though (notwithstanding that I haven't updated my signature file/profile) and I can't recall the exact circumstances under which this will occur. However it definitely wasn't a scheduler issue, but a conflict one and I don't recall there being such a substantial lag between executions. I also can't recall whether it was the whole chore, or just individual processes that repeated.

Re: Scheduled Chores That Fail

Post by Alan Kirk »

Alan Kirk wrote: I can't find the references at the moment but I recall that there were cases where two chores which kicked off at the same time could cause a conflict which would result in one of them triggering its processes more than once (at least until the server often crashed in the earlier 9.1 releases).
OK, now I have. Doesn't so much look like what happened in your log, though.

Re: Scheduled Chores That Fail

Post by jim wood »

I think the key point is I need to get away from 9.1. I don't think this is going to happen any time soon. In the meantime the reason for the problem is a mystery. I guess this is one that I will have to live with. Hopefully it won't happen again,

Jim.

PS. I only talk in my sleep. (So I'm told)