Crash on a specific server

Catherine
Posts: 110
Joined: Wed May 20, 2009 7:30 am
OLAP Product: TM1
Version: 10.2.2 - PA
Excel Version: 2010
Location: Rennes, France

Crash on a specific server

Post by Catherine »

Hi,

We have here about 20 instances running on 3 "supposed to be identical" servers. The 20 instances are built the same way (same dimensions, cubes, processes, calculation rules), but correspond to different geographical areas.

Our problem is that we have some instance crashes on one of the 3 servers. If we move the instance to another server, it does not crash anymore.
The crash happens when starting the instance, at the moment "Computing feeders". It is random.

Have you ever come across a situation where an instance crashes on one server but not on another? Any ideas?

Thank you

PS: We use TM1 9.5.2 FP1
tomok
MVP
Posts: 2836
Joined: Tue Feb 16, 2010 2:39 pm
OLAP Product: TM1, Palo
Version: Beginning of time thru 10.2
Excel Version: 2003-2007-2010-2013
Location: Atlanta, GA
Contact:

Re: Crash on a specific server

Post by tomok »

While it is technically feasible to run multiple instances of TM1 under the same OS session, you do so at your own risk. This is because TM1 does not play well with other apps, even other instances of TM1, when it comes to memory. Each instance of TM1 is going to grab all the memory it needs (or thinks it needs, when multiple processors are used for loading cubes and processing feeders). If you have set MaximumCubeLoadThreads to more than 1, TM1 is going to grab more memory than it actually needs. The more processors you set in this parameter, the more memory it grabs. You could be crashing due to a lack of memory caused by the initial garbage memory being too high when the instance starts. Either use VMware, or something like it, to separate the OS sessions, or reduce the number of processors in MaximumCubeLoadThreads, if you have it set to more than 1.
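For anyone wanting to check this setting across several instances, here is a minimal Python sketch. It assumes each instance's tm1s.cfg is an INI-style file with a [TM1S] section; the helper name and the sample text (including the ServerName value) are hypothetical and only there for illustration.

```python
import configparser


def max_cube_load_threads(cfg_text: str) -> int:
    """Return MaximumCubeLoadThreads from tm1s.cfg text.

    0 is used here as a "not set" marker when the parameter is absent.
    """
    # interpolation=None so that '%' characters in paths are left alone.
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(cfg_text)
    return parser.getint("TM1S", "MaximumCubeLoadThreads", fallback=0)


# Hypothetical sample, not taken from a real instance.
sample = """
[TM1S]
ServerName=example_instance
MaximumCubeLoadThreads=4
"""

threads = max_cube_load_threads(sample)
if threads > 1:
    print(f"MaximumCubeLoadThreads={threads}: multi-threaded cube loading is enabled")
```

Running the same check against the tm1s.cfg of every instance on all three servers would show quickly whether the setting differs on the server that crashes.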
Tom O'Kelley - Manager Finance Systems
American Tower
http://www.onlinecourtreservations.com/
Catherine
Posts: 110
Joined: Wed May 20, 2009 7:30 am
OLAP Product: TM1
Version: 10.2.2 - PA
Excel Version: 2010
Location: Rennes, France

Re: Crash on a specific server

Post by Catherine »

Thanks for your help but this is not our problem.
First, our server is over-oversized in terms of RAM. Secondly, I can make my instance crash even when the other instances are stopped.
Michel Zijlema
Site Admin
Posts: 712
Joined: Wed May 14, 2008 5:22 am
OLAP Product: TM1, PALO
Version: both 2.5 and higher
Excel Version: 2003-2007-2010
Location: Netherlands
Contact:

Re: Crash on a specific server

Post by Michel Zijlema »

Hi Catherine,

Is there a process (or image) that guarantees that all three machines are configured identically? Or could there still be minor differences in OS/Drivers/TM1 configuration?
Have you checked the possibility of a faulty RAM module or other hardware-related issue in the problem server?

Michel
Catherine
Posts: 110
Joined: Wed May 20, 2009 7:30 am
OLAP Product: TM1
Version: 10.2.2 - PA
Excel Version: 2010
Location: Rennes, France

Re: Crash on a specific server

Post by Catherine »

Hi Michel,

The 3 servers were installed by the same people, at the same time, but nothing guarantees that everything is exactly the same... I have already checked the TM1 version. It's the same. The rest is not under my control.

We have also thought about a problem with RAM, or with the hardware. We are planning to test further in that direction, but not right now because this is a rather critical period for user activity. Pending further investigation, we have configured the service to restart during the night until it starts without crashing, as the crash appears only when starting the instance.

Thanks
Catherine
Michel Zijlema
Site Admin
Posts: 712
Joined: Wed May 14, 2008 5:22 am
OLAP Product: TM1, PALO
Version: both 2.5 and higher
Excel Version: 2003-2007-2010
Location: Netherlands
Contact:

Re: Crash on a specific server

Post by Michel Zijlema »

Hi Catherine,

Just a longshot - and not in line with your remark that the same model works on one server and not on the other, but anyway:
I had a problem once with a server crashing at startup which was due to an ambiguous element in a rule (an element was added to one of the dimensions of the cube concerned, and its name matched an element in another dimension of that cube which was referenced in a rule).
You could start the server without the rule files and add the rules after startup to see whether this leads to error messages. Maybe this will help you locate the source of the issue.

Michel
rmackenzie
MVP
Posts: 733
Joined: Wed May 14, 2008 11:06 pm

Re: Crash on a specific server

Post by rmackenzie »

Catherine wrote:The crash happens when starting the instance, at the moment "Computing feeders".
From this statement I assume you've checked in tm1server.log to see the last entry before the crash - is it always the same cube? A crash on this entry would indicate that memory shortage is the issue as setting feeders can use large amounts of memory, but you also mention that:
Catherine wrote:our server is over-oversized in terms of RAM
But I'm not clear if you are on 32-bit or 64-bit TM1... I am guessing 64-bit as it is pretty hard to oversize the RAM on a 32-bit server for TM1 nowadays. If you are 32-bit then I would check the setting of the 3 GB switch (in the Windows boot.ini file) across all your environments - but this is a real long shot.

My only other suggestion at this point is to check the Windows Event log for the tm1sd process and see if that sheds any light on your issue.
Robin Mackenzie
Catherine
Posts: 110
Joined: Wed May 20, 2009 7:30 am
OLAP Product: TM1
Version: 10.2.2 - PA
Excel Version: 2010
Location: Rennes, France

Re: Crash on a specific server

Post by Catherine »

Michel Zijlema wrote:I had a problem once with a server crashing at startup which was due to an ambiguous element in a rule (an element was added to one of the dimensions of the cube concerned, and its name matched an element in another dimension of that cube which was referenced in a rule).
You could start the server without the rule files and add the rules after startup to see whether this leads to error messages. Maybe this will help you locate the source of the issue.
Hi Michel,
Was your crash systematic or random?
To be clearer: my crash is random (about one crash every 3 or 4 instance starts on average), and only on one server.
The rules are OK; it is when adding feeders that the problem randomly appears. I can reproduce the crash when saving the rules.
It seemed to me that it was linked to feeders referring to attributes. But I tried yesterday to rewrite my feeders without referring to those attributes. I reproduced the crash on the 13th start...!
So it's very difficult to determine the source of the crash.
Thank you for your help.
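
With an intermittent crash, a single clean start says very little. As a rough guide, here is a small Python sketch of how many consecutive clean starts are needed before "no crash" is unlikely to be luck. It assumes each start crashes independently with a fixed probability, which is a simplification and not something established in this thread; the function names are made up for the example.

```python
import math


def clean_starts_needed(crash_rate: float, confidence: float = 0.95) -> int:
    """Consecutive clean starts needed before a clean run is unlikely to be luck."""
    if not 0.0 < crash_rate < 1.0:
        raise ValueError("crash_rate must be strictly between 0 and 1")
    # P(n clean starts in a row, nothing fixed) = (1 - crash_rate) ** n
    # Find the smallest n for which that probability drops below 1 - confidence.
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - crash_rate))


def chance_of_clean_run(crash_rate: float, starts: int) -> float:
    """Probability of `starts` clean starts in a row at the given crash rate."""
    return (1.0 - crash_rate) ** starts


# One crash in every 3 or 4 starts, as described above.
for rate in (1 / 3, 1 / 4):
    print(
        f"crash rate {rate:.2f}: {clean_starts_needed(rate)} clean starts needed, "
        f"P(12 clean starts by luck) = {chance_of_clean_run(rate, 12):.3f}"
    )
```

Under these assumptions, a change has to survive 8 to 11 clean starts before it looks like a fix. A crash that only shows up on the 13th start means 12 clean starts came first, which would happen by luck only about 3% of the time or less at the original rate, so the rewritten feeders may well have lowered the crash rate without removing the cause.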
rmackenzie wrote:From this statement I assume you've checked in tm1server.log to see the last entry before the crash - is it always the same cube? A crash on this entry would indicate that memory shortage is the issue as setting feeders can use large amounts of memory, but you also mention that:
Catherine wrote:our server is over-oversized in terms of RAM
But I'm not clear if you are on 32-bit or 64-bit TM1... I am guessing 64-bit as it is pretty hard to oversize the RAM on a 32-bit server for TM1 nowadays. If you are 32-bit then I would check the setting of the 3 GB switch (in the Windows boot.ini file) across all your environments - but this is a real long shot.
My only other suggestion at this point is to check the Windows Event log for the tm1sd process and see if that sheds any light on your issue.
Hi rmackenzie,
The crash happens when computing feeders for my main cube (only one of the cubes has rules in my instance).
We are running 64-bit TM1, and our server has 196 GB of RAM...
The event viewer does not help me very much:
"Faulting application name: tm1sd.exe, version: 9.5.20100.18046, time stamp: 0x4e80c6bb
Faulting module name: tm1sd.exe, version: 9.5.20100.18046, time stamp: 0x4e80c6bb
Exception code: 0xc0000005
Fault offset: 0x000000000010fab7
Faulting process id: 0xc5c
Faulting application start time: 0x01ccb36a95e94d11
Faulting application path: C:\Program Files\Cognos\TM1\bin\tm1sd.exe
Faulting module path: C:\Program Files\Cognos\TM1\bin\tm1sd.exe
Report Id: 6b6ee3b1-1f61-11e1-a619-e41f13be6ae8"
Thank you for your help
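
Two fields in that event record can be decoded. The sketch below is a small Python helper, offered as an illustration only: it maps a handful of well-known NTSTATUS exception codes to their names, and it assumes that "Faulting application start time" is a Windows FILETIME (100-nanosecond ticks since 1601-01-01 UTC), which is how this field is normally written. The function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# A few well-known NTSTATUS exception codes (not an exhaustive list).
NTSTATUS_NAMES = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC0000017: "STATUS_NO_MEMORY",
    0xC00000FD: "STATUS_STACK_OVERFLOW",
    0xC0000409: "STATUS_STACK_BUFFER_OVERRUN",
}

FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def exception_name(code: str) -> str:
    """Name a hex exception code such as '0xc0000005' if it is in the table."""
    return NTSTATUS_NAMES.get(int(code, 16), "unknown (check the NTSTATUS reference)")


def filetime_to_utc(value: str) -> datetime:
    """Convert a hex FILETIME string to a timezone-aware UTC datetime."""
    ticks = int(value, 16)  # 100 ns units
    return FILETIME_EPOCH + timedelta(microseconds=ticks // 10)


print(exception_name("0xc0000005"))
print(filetime_to_utc("0x01ccb36a95e94d11").isoformat())
```

0xc0000005 is an access violation, meaning the process touched memory it was not allowed to access. That is a different status from running out of memory, which fits the observation that the server has plenty of RAM.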
tomok
MVP
Posts: 2836
Joined: Tue Feb 16, 2010 2:39 pm
OLAP Product: TM1, Palo
Version: Beginning of time thru 10.2
Excel Version: 2003-2007-2010-2013
Location: Atlanta, GA
Contact:

Re: Crash on a specific server

Post by tomok »

It's time to call IBM and get them on the case. They have a special debug version of tm1sd.exe they'll have you run which will capture error information they can use to find out what's happening. I doubt there's anything else that anyone on this forum can do to help you on this one.
Tom O'Kelley - Manager Finance Systems
American Tower
http://www.onlinecourtreservations.com/
Catherine
Posts: 110
Joined: Wed May 20, 2009 7:30 am
OLAP Product: TM1
Version: 10.2.2 - PA
Excel Version: 2010
Location: Rennes, France

Re: Crash on a specific server

Post by Catherine »

tomok wrote:It's time to call IBM and get them on the case.
I've already done that, but so far without success. I gave IBM a dump, but they didn't find the problem...
I posted here just in case somebody had met the same problem!

Thank you everybody
rmackenzie
MVP
Posts: 733
Joined: Wed May 14, 2008 11:06 pm

Re: Crash on a specific server

Post by rmackenzie »

Catherine wrote:The crash happens when computing feeders for my main cube (only one of the cubes has rules in my instance).
One tactic you can use is to remove all the feeders from the rule file and then one by one, add them back to the rule file and restart the instance each time. That way you can identify if one particular statement is causing the server to crash. I've no idea how many feeder statements you have and this could be a very tedious exercise...!
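
The one-by-one approach can be prepared in advance. Below is a rough Python sketch that takes rule text and produces one variant per step, each with one more feeder statement restored. It is deliberately naive: it assumes the rule text is plain text with a single `FEEDERS;` line and no semicolons inside string literals or comments, and it is not meant to be run blindly against rule files on disk. The function names and the sample rule are invented for the example.

```python
import re


def split_rule_text(rule_text: str) -> tuple[str, list[str]]:
    """Split rule text into (text up to and including 'FEEDERS;', feeder statements)."""
    match = re.search(r"^[ \t]*FEEDERS[ \t]*;", rule_text, re.IGNORECASE | re.MULTILINE)
    if match is None:
        return rule_text, []
    head = rule_text[: match.end()]
    tail = rule_text[match.end():]
    # Naive split: assumes no ';' inside string literals or comments.
    feeders = [part.strip() + ";" for part in tail.split(";") if part.strip()]
    return head, feeders


def incremental_variants(rule_text: str):
    """Yield (count, text) where text has the first `count` feeder statements restored."""
    head, feeders = split_rule_text(rule_text)
    for count in range(len(feeders) + 1):
        yield count, "\n".join([head, *feeders[:count]]) + "\n"


# Hypothetical rule text, for illustration only.
sample = """SKIPCHECK;
['Margin'] = N: ['Sales'] - ['Cost'];
FEEDERS;
['Sales'] => ['Margin'];
['Cost'] => ['Margin'];
"""

for count, text in incremental_variants(sample):
    print(f"--- variant with {count} feeder statement(s) ---")
    print(text)
```

Because the crash in this thread is intermittent, each variant would need several restarts before it can be called safe, so the exercise gets long quickly.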
Robin Mackenzie
dkleist
Posts: 56
Joined: Wed May 21, 2008 12:33 pm

Re: Crash on a specific server

Post by dkleist »

Is DEP turned off on the crashing server?
Catherine
Posts: 110
Joined: Wed May 20, 2009 7:30 am
OLAP Product: TM1
Version: 10.2.2 - PA
Excel Version: 2010
Location: Rennes, France

Re: Crash on a specific server

Post by Catherine »

rmackenzie wrote:
Catherine wrote:The crash happens when computing feeders for my main cube (only one of the cubes has rules in my instance).
One tactic you can use is to remove all the feeders from the rule file and then one by one, add them back to the rule file and restart the instance each time. That way you can identify if one particular statement is causing the server to crash. I've no idea how many feeder statements you have and this could be a very tedious exercise...!
There are not that many feeders, so I've already tested what you suggest, but several feeder statements can cause the crash!
What they have in common is that they refer to some attributes. So I tried to rewrite my feeders without referring to any attribute. The crash is then less frequent but is still there!
Once more, we haven't seen any crash on our other servers, so I have given up the idea that the crash comes from the TM1 model itself...
Catherine
Posts: 110
Joined: Wed May 20, 2009 7:30 am
OLAP Product: TM1
Version: 10.2.2 - PA
Excel Version: 2010
Location: Rennes, France

Re: Crash on a specific server

Post by Catherine »

dkleist wrote:Is DEP turned off on the crashing server?
Sorry, but what is DEP? :oops:
Duncan P
MVP
Posts: 600
Joined: Wed Aug 17, 2011 1:19 pm
OLAP Product: TM1
Version: 9.5.2 10.1 10.2
Excel Version: 2003 2007
Location: York, UK

Re: Crash on a specific server

Post by Duncan P »

Catherine
Posts: 110
Joined: Wed May 20, 2009 7:30 am
OLAP Product: TM1
Version: 10.2.2 - PA
Excel Version: 2010
Location: Rennes, France

Re: Crash on a specific server

Post by Catherine »

dkleist wrote:Is DEP turned off on the crashing server?
I don't know. I will ask the server team.
Do you advise turning it on if it is not?

Nevertheless, I'm currently moving TM1 instances from the server with crashes to other stable servers. Then we will be free to investigate and, if necessary, reinstall the server on which we have crashes.

Thanks for your help
dkleist
Posts: 56
Joined: Wed May 21, 2008 12:33 pm

Re: Crash on a specific server

Post by dkleist »

In general, yes. I've experienced enough issues with DEP across different Cognos products that I do that for any of them. The particularly insidious issue with DEP is that it kills processes so symptoms and error messages don't tell you why your application stopped working.
qml
MVP
Posts: 1096
Joined: Mon Feb 01, 2010 1:01 pm
OLAP Product: TM1 / Planning Analytics
Version: 2.0.9 and all previous
Excel Version: 2007 - 2016
Location: London, UK, Europe

Re: Crash on a specific server

Post by qml »

dkleist wrote:In general, yes. I've experienced enough issues with DEP across different Cognos products that I do that for any of them.
I take it that you turn DEP off, not on?
Kamil Arendt
dkleist
Posts: 56
Joined: Wed May 21, 2008 12:33 pm

Re: Crash on a specific server

Post by dkleist »

Yes, turn off DEP ("essential windows services and programs only" option)
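
To compare the DEP setting across the three servers, one option is to read the system-wide policy value, for example with `wmic OS Get DataExecutionPrevention_SupportPolicy`, and translate the number. The Python sketch below only does the translation; the captured output string is hypothetical, the exact output layout is an assumption, and the function names are invented for the example.

```python
# Values of Win32_OperatingSystem.DataExecutionPrevention_SupportPolicy.
DEP_POLICIES = {
    0: "AlwaysOff: DEP disabled for all processes",
    1: "AlwaysOn: DEP enabled for all processes",
    2: "OptIn: DEP for essential Windows programs and services only",
    3: "OptOut: DEP for all programs except those explicitly excluded",
}


def parse_wmic_dep_output(output: str) -> int:
    """Extract the numeric policy from the command output (layout assumed)."""
    for line in output.splitlines():
        if line.strip().isdigit():
            return int(line.strip())
    raise ValueError("no numeric policy value found")


def describe_dep_policy(value: int) -> str:
    """Translate a policy value into a readable description."""
    return DEP_POLICIES.get(value, f"unrecognized policy value {value}")


# Hypothetical captured output, for illustration only.
captured = "DataExecutionPrevention_SupportPolicy\n2\n"
print(describe_dep_policy(parse_wmic_dep_output(captured)))
```

Note that the "essential Windows programs and services only" option corresponds to OptIn, which keeps DEP active for Windows components while leaving other applications such as tm1sd.exe outside it, rather than switching DEP off for the whole machine.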