No Server Abort on a Disk Failure?

Suggest and discuss enhancements for TM1
Alan Kirk
Site Admin
Posts: 6606
Joined: Sun May 11, 2008 2:30 am
OLAP Product: TM1
Version: PA2.0.9.18 Classic NO PAW!
Excel Version: 2013 and Office 365
Location: Sydney, Australia


Post by Alan Kirk »

This is an infrequent issue, but it bit me this morning.

We have a number of "one off" (unscheduled) chores which populate a scenario for weekly reporting. These chores are not logged because there's far too much data. The intention is that each chore creates a "snapshot" of the numbers at the time it was run, which the business units can then adjust without having to worry about a moving target.

We don't do a data save immediately after the chore is run because with cubes stretching into hundreds of megs it just takes far too long.

The next data save is the next morning, before people get in. (Remember that we're on 8.2.12, which has server hang issues with chore-driven data saves. It therefore has to be done manually.)

This morning I got an "Unable to write to disk" error. I still haven't isolated the cause of it, but have asked the IT department to look at it. Whatever it was, it was transient since I've done data saves since then.

However TM1 doesn't see it that way. As soon as it hits a disk error, it aborts, claiming that it hasn't saved any data. (This isn't necessarily true. If the error is encountered after some cubes have been saved to disk, then when the server restarts, those cubes, but not the others, will have all of their changed values, whether logged or unlogged. The claim will only be true if the error is hit on the first cube being written.)

This meant that although we repopulated our reporting scenario this morning... the numbers WEREN'T the same as the snapshot from the previous day.

I can't think of a single good reason for TM1 aborting on such an error. Yes, a disk error MAY be symptomatic of a serious hardware failure, but equally it may not. The data is still in memory. If the server aborts, you lose unlogged changes, without any way back. If it doesn't, and it gives you an opportunity to cancel the save and/or to try again later, at least you don't lose the unlogged changes.

The only reason I can think of for aborting is that you may end up in a situation where some of the logged changes have been saved to disk in the .cub files and some haven't, but if the error DOES occur part way through a save then that ship has already sailed. The server restart should reload all of the logged values anyway, just as it currently does.

The only other reason is of course that Services are supposed to be invisible, and shouldn't be popping up dialogs.

However I think that this is a serious enough issue that the user should at least be given the OPTION of not losing all of their unlogged entries, especially as there will probably be MANY unlogged interfaces in the user base.

Thoughts?
"To them, equipment failure is terrifying. To me, it’s 'Tuesday.' "
Steve Vincent
Site Admin
Posts: 1054
Joined: Mon May 12, 2008 8:33 am
OLAP Product: TM1
Version: 10.2.2 FP1
Excel Version: 2010
Location: UK

Re: No Server Abort on a Disk Failure?

Post by Steve Vincent »

Tend to agree with you. I have seen it once on one of our servers and likewise couldn't see any legitimate reason for the failure. We too lost the changes due to the way it was handling the error, which is not a great customer experience when we were in the middle of month end. It might also link in with an issue I found yesterday: a TI process was writing an ASCII file and I cancelled it, but it left the file locked until the service was restarted. It might be linked, it might not, but I'll post a note in support to try and find out anyway.
If this were a dictatorship, it would be a heck of a lot easier, just so long as I'm the dictator.
Production: Planning Analytics 64 bit 2.0.5, Windows 2016 Server. Excel 2016, IE11 for t'internet