Two TM1 servers pointing to same Data files, possible?

Posted: Mon Nov 16, 2009 5:29 am
by harrytm1
hi all,

We are planning to cluster 2 physical servers as part of a failover plan for the TM1 service. The plan is:
- If Server 1, which runs the TM1 Admin service, the TM1 server and the web server, fails, Server 2 will take over the Admin service, TM1 server and web server immediately. Client users do not have to change their Admin Host to Server 2, and web users do not have to change their URL.

- hence, both Server 1 and 2 will be running concurrently and the TM1 server will have the same name on both, e.g. "ABCD". Both instances will also point to the same Data folder and Log folder.

- this is to allow failover to take place without human intervention i.e. Server 2 will take over automatically.

Issues:
- I think there is an issue with licensing since this set-up requires two instances of TM1 Admin server running concurrently. Correct?

- Also, both servers will be similarly named "ABCD". Client users will see two ABCD in Server Explorer. I think I must set different port numbers so that there is no conflict.

- Lastly, this effectively means that users can enter TM1 server "ABCD" through either Server 1 or Server 2. A client user can set Server 1 as Admin Host, but it will still pick up two instances of the ABCD server. Will this work? Are there any other data integrity issues, since the data files will be written to by two instances?

many thanks in advance.

Re: Two TM1 servers pointing to same Data files, possible?

Posted: Mon Nov 16, 2009 5:33 am
by Alan Kirk
harrytm1 wrote: - hence, both Server 1 and 2 will be running concurrently and the TM1 server will be similarly named e.g. "ABCD". Both instances will also point to the same Data folder and Log folder.
No they won't. The first server to start will grab a lock on the transaction and server logs. The second one will fail to start saying that it can't access them.
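The behaviour described above (first starter holds the logs, second starter is refused) can be pictured with a generic exclusive-lock sketch. This is only an analogy: it does not reproduce how TM1 itself locks its log files, and the lock file name and the `try_acquire` helper are made up for the illustration.

```python
import os
import tempfile


def try_acquire(lock_path):
    """Return a file descriptor if we created the lock file, else None."""
    try:
        # O_EXCL makes the call fail if the file already exists,
        # so only one caller can ever "win" the creation.
        return os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return None


log_dir = tempfile.mkdtemp()
lock_path = os.path.join(log_dir, "abcd.lock")  # hypothetical lock file name

first = try_acquire(lock_path)    # "Server 1" starts first and holds the lock
second = try_acquire(lock_path)   # "Server 2" starts while the lock is held

# "Server 1" goes away and the lock is released.
os.close(first)
os.remove(lock_path)

third = try_acquire(lock_path)    # "Server 2" retries and now succeeds

print("first start succeeded:", first is not None)
print("second start succeeded:", second is not None)
print("retry after release succeeded:", third is not None)

os.close(third)
os.remove(lock_path)
```

The point of the sketch is only that two instances sharing one logging location cannot both be active at once; at best the second one waits for the first to let go.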

Re: Two TM1 servers pointing to same Data files, possible?

Posted: Mon Nov 16, 2009 6:16 am
by Martin Erlmoser
don't think that it will work.

i think there is a really old document, something like "MS Failover Cluster & my tm1server".
or maybe not and i'm dreaming.

but who cares, this won't make tm1 more stable.

virtualization makes "HA" easier..

Re: Two TM1 servers pointing to same Data files, possible?

Posted: Mon Nov 16, 2009 8:56 am
by Martin Ryan
I think it might be possible if you didn't have to share the logging directory. You could set up two cfg files in two different directories, each pointing to the same data directory but both with different server names and different logging directories.

I'm still not sure it's a particularly good idea because you would need to restart the backup server in order to get the latest cubes, and I would have thought that the most likely DR situation is the server blowing up, so going to the same data directory is unlikely to be possible.

Martin

Re: Two TM1 servers pointing to same Data files, possible?

Posted: Mon Nov 16, 2009 11:08 am
by David Usherwood
I had a look earlier this year into using rsync-based techniques to align two TM1 server directories, to avoid using (shudder) TM1's horrid rep and sync. It worked; but as you observe, a server (re)start was required to get the changes to show. Not, therefore, a failover solution.

Re: Two TM1 servers pointing to same Data files, possible?

Posted: Mon Nov 16, 2009 1:41 pm
by harrytm1
thanks to everyone for your reply.

I gave it a go today and it seems to be working. The TM1 Admin service and TM1 server are set to restart after 1 minute. So now Server 1 is running both the Admin service and the TM1 service. Server 2 attempts to start the TM1 service but fails, as it is unable to locate the data files.

The data files are located on another box, which is set up so that they are only accessible if a server is running fine. So now Server 1 is locking the data files while Server 2 keeps attempting to start the TM1 service. FYI, the TM1 Admin service is running on Server 2 (I'm not sure whether this will cause any conflict with the Admin service on Server 1).

When Server 1 fails, the data files are freed up and the next time Server 2 attempts to start the TM1 service, it will locate the files and start up successfully. So it seems to work...

Re: Two TM1 servers pointing to same Data files, possible?

Posted: Mon Nov 16, 2009 2:01 pm
by Paul Segal
harrytm1 wrote: When Server 1 fails, the data files are freed up and the next time Server 2 attempts to start the TM1 service, it will locate the files and start up successfully. So it seems to work...
Interesting. So, how about the clients? If Server 1 fails and Server 2 takes over, do the clients seamlessly take data from Server 2, or would they need to re-connect (or re-start Excel)? Not to be a glass-half-full man, but my money would be on the latter.

Re: Two TM1 servers pointing to same Data files, possible?

Posted: Mon Nov 16, 2009 2:13 pm
by Marcus Scherer
Did you test it? Are changes not saved to disk retrieved by Server 2?

Re: Two TM1 servers pointing to same Data files, possible?

Posted: Mon Nov 16, 2009 4:35 pm
by jim wood
The way we have got around this issue is as follows:

We have 2 servers: Nevada and Utah. Both boxes are running the tm1admin service and both of our tm1server instances. On Sunday night Utah is brought down and the cube, dimension and rules files are copied across from Nevada to keep them in line. Then all of our load processes, which are controlled by SQL Server (DTS), load data into production on a daily basis (our weekly data load completes before the copy to Utah). The backup server processes (again controlled by SQL DTS jobs) first copy the files (generated by SQL before loading into production) from the production server to the backup server and then load them.

While this approach keeps all automated data in line, it does not keep any TM1-only data (planning data) in line. Normally I copy this across as and when it is completed in production.
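For anyone wanting to script the weekly copy step, a minimal sketch might look like the following. It assumes the cube, dimension and rules files carry the usual `.cub`, `.dim` and `.rux` extensions and that the backup TM1 server has already been brought down, as described above. The function name and the directory arguments are hypothetical.

```python
import shutil
from pathlib import Path

# Assumed extensions for cube, dimension and rules files.
MODEL_SUFFIXES = {".cub", ".dim", ".rux"}


def copy_model_files(src_dir, dst_dir):
    """Copy model files from the production data directory to the backup one.

    Only call this while the backup TM1 server is stopped; otherwise the
    copied files would not be picked up until its next restart anyway.
    Returns the names of the files that were copied.
    """
    src, dst = Path(src_dir), Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(src.iterdir()):
        if path.is_file() and path.suffix.lower() in MODEL_SUFFIXES:
            # copy2 also preserves timestamps, which helps when comparing boxes.
            shutil.copy2(path, dst / path.name)
            copied.append(path.name)
    return copied
```

Note that this only covers the file copy. Anything entered directly in TM1 on the production box after the copy still has to be moved across separately, as mentioned above.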

I hope this gives you another way of possibly doing what you are after,

Jim.

Re: Two TM1 servers pointing to same Data files, possible?

Posted: Mon Nov 16, 2009 5:12 pm
by Steve Vincent
Hi Harry, personally I think you are going a bit over the top with your solution and certainly adding complexity where it possibly isn’t warranted. A few reasons I can think of:
• What size are the server audit logs after server 2 fails to start a service every minute? By that I mean the windows logs, not the TM1 ones – it won’t take long before that becomes massive and useless to any IT support that might need to find something else in it.
• What are the usual reasons for a server to crash? In my experience it’s model related rather than hardware: either logs get so massive that it runs out of disk space, or a dodgy rule / TI consumes all the resources or gets into a loop. Your solution can only deal with hardware issues, and if you had problems big enough to warrant this setup you should be looking to your IT provider for more reliable hardware.
• With the data elsewhere you are adding an extra “hop” into the server data journey. You didn’t specify how that was being stored (SAN or 3rd server for instance) but does it deal with a failure on the data storage device? Again, my experience is that the storage is more likely to break than the CPU / RAM side of things.
• Our servers have been down for a grand total of 10 minutes over the last 2 ½ years. A couple of times that was my fault (oops!) and the rest was undetermined user actions, but it is such a small percentage it doesn’t seem worth going to all that effort. If we do have a hardware failure our option is to swap the SAN to a spare box and run from that. It is not an immediate fix but in those extreme circumstances we are not expected to provide an immediate fix.
• What happens when server 1 gets fixed and you need to swap back again?
• Is it worth the running / support costs of 2 servers if downtime is so low? Our option allows us to have a development server, and we “lose” that functionality if we have to run it using the SAN with our production data. From our point of view that is a much better use of the resource.
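To put rough numbers on two of the points above, here is a back-of-the-envelope sketch. It takes the figures quoted in this post (10 minutes of downtime over 2 ½ years, a restart attempt every minute) and assumes a 365-day year and one failed start per retry interval; the function names are just for illustration.

```python
MINUTES_PER_DAY = 24 * 60
MINUTES_PER_YEAR = 365 * MINUTES_PER_DAY  # assumes a 365-day year


def availability(downtime_minutes, period_years):
    """Fraction of the period during which the server was up."""
    return 1 - downtime_minutes / (period_years * MINUTES_PER_YEAR)


def failed_starts_per_day(retry_interval_minutes):
    """How many failed service starts a standby would log per day."""
    return MINUTES_PER_DAY // retry_interval_minutes


print(f"availability: {availability(10, 2.5):.5%}")
print("failed starts per day:", failed_starts_per_day(1))
```

So the standby would write well over a thousand failed-start events a day into the Windows logs, in order to improve on a figure that is already within a whisker of 100%.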

Just my 2p :)

Re: Two TM1 servers pointing to same Data files, possible?

Posted: Tue Nov 17, 2009 1:59 am
by harrytm1
Paul Segal :
Interesting. So, how about the clients? If Server 1 fails and Server 2 takes over, do the clients seamlessly take data from Server 2, or would they need to re-connect (or re-start Excel)? Not to be a glass-half-full man, but my money would be on the latter.
IT is trying to get a common name so that it will be seamless for client users and web users.

Marcus Scherer :
Did you test it? Are changes not saved to disk retrieved by Server 2?
I have not tested fully yet, but since changes not saved to the data files are recorded in the log file, I would think that, when Server 2 starts up, it will also load the unsaved changes. Please correct me if I'm wrong here.
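The assumption in that answer is the usual "last saved state plus replayed log" recovery idea, which can be sketched generically as below. This is not TM1's actual log format or recovery code: the tuple layout, the cube name and the `recover` function are invented for the illustration, and it only works if the log survives the failure and is readable by the server that takes over.

```python
def recover(saved_cells, transaction_log):
    """Rebuild in-memory data from the last save plus the logged changes.

    saved_cells:     {(cube, coordinates): value} as of the last save to disk
    transaction_log: ordered (cube, coordinates, value) entries written since
    """
    cells = dict(saved_cells)          # start from what is on disk
    for cube, coordinates, value in transaction_log:
        cells[(cube, coordinates)] = value   # later entries overwrite earlier ones
    return cells


saved = {("Sales", ("2009", "Nov")): 100.0}
log = [
    ("Sales", ("2009", "Nov"), 120.0),   # change made after the last save
    ("Sales", ("2009", "Dec"), 80.0),    # new cell entered after the last save
]
recovered = recover(saved, log)
print(recovered)
```

Whether Server 2 really picks up the unsaved changes therefore comes down to whether it can read the same log directory after Server 1 has gone, which is worth testing explicitly.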

Many thanks to Jim Wood for your alternative solution!

To Steve Vincent:
I share your views totally. The data files are on a SAN. I totally feel that this is overkill, but there are some firm believers who feel that it is mission-critical for the system to be available 24x7 so that users around the world can use it. Well...

When Server 1 is fixed, I suppose Server 2 will be shut down and the TM1 service will be restarted on Server 1 to take over the data files on the SAN. I wasn't aware of the growing Windows log files when a service constantly fails to start, but this is a real concern. Thanks for highlighting it!

Re: Two TM1 servers pointing to same Data files, possible?

Posted: Tue Nov 17, 2009 2:23 am
by Alan Kirk
harrytm1 wrote:Paul Segal :
Interesting. So, how about the clients? If Server 1 fails and Server 2 takes over, do the clients seamlessly take data from Server 2, or would they need to re-connect (or re-start Excel)? Not to be a glass-half-full man, but my money would be on the latter.
IT is trying to get a common name so that it will be seamless for client users and web users.
I wouldn't count on it being seamless. What we've found over the years is that if the users don't restart their Excel sessions when a server shuts down and restarts they'll often, not always, but often, get screwy numbers, random key errors and what have you returned. I imagine that this would be potentially an order of magnitude worse with two completely different servers of the same name.

I agree with Steve... this is overkill, and it's likely to cause you more problems than it solves. If TM1 really is that critical, it would be far better (and safer) to have two completely independent sessions running on two boxes, with one as a backup. Yes, the users may have to point to the backup server if the primary goes down (which, as Steve rightly points out, shouldn't be often), but at least if they have to point to a differently named server there's less chance of them getting screwy numbers from confused, cached data.

Re: Two TM1 servers pointing to same Data files, possible?

Posted: Tue Nov 17, 2009 2:30 am
by lotsaram
Edit: looks like Alan beat me to it but I'll post my 2c anyway

This is an interesting thread. Indeed your fail over solution does seem like excessive overkill.

Your system seems set up to account for a freak hardware failure event on the primary server and have the secondary server up as quickly as possible without human intervention should this occur. However 99% + of the time the TM1 server will be offline due to a software glitch (crash in plain English) and your system will also kick in in this instance as well ...

Since ...
1/ all server names registered on a TM1 admin service must be unique
2/ you are sharing a logging directory so tm1s.log will be locked by the first TM1 server to register itself
... then the secondary server has to wait for the primary to fall over before it starts loading. The TM1 service should be set to auto-recover anyway so for the 99% of the time when you have a software crash not hardware failure all you are saving in downtime is the 30 or 60 sec delay before tm1sd automatically restarts. And should you have superior hardware in the primary environment then it might catch up this difference during the server load anyway. If the server load time on restart is long (say > 15 min) then the additional (sub 60 sec) uptime is neither here nor there.
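The arithmetic behind that argument, as a small sketch. It uses the 60 second restart delay and 15 minute load time mentioned above and assumes, in the standby's favour, that it starts loading the instant the primary dies and loads at the same speed; the function name is only for illustration.

```python
def downtime_seconds(restart_delay_s, load_time_s):
    """Time from crash until the model is loaded and usable again."""
    return restart_delay_s + load_time_s


load_time = 15 * 60  # 15 minute server load, in seconds

auto_restart = downtime_seconds(60, load_time)  # primary restarts itself after 60 s
standby = downtime_seconds(0, load_time)        # best case for the standby box

saving = auto_restart - standby
share = saving / auto_restart

print(f"auto-restart downtime: {auto_restart} s")
print(f"standby downtime:      {standby} s")
print(f"saving: {saving} s ({share:.1%} of the outage)")
```

In other words the second box shortens a software-crash outage by about a minute out of sixteen, and by less if its own retry timer happens to fire late.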

It seems like a very large cost to bear for 30 - 60 seconds of additional uptime in the event of a crash, plus the additional complexity of toggling environments.

What about an alternative with the secondary environment replicated (probably not with TM1 rep-sync but by FTP of the data directory) once every x hours and always up at the ready, running on a separate admin host and separate log directory with the same TM1 server name. Monitor the primary environment and if it goes down simply do an IP switch of the admin host. For good measure you could externally fire a process to load the tm1s.log file from the primary log directory to keep the environment in sync. I think this would be more seamless than your current proposal and give better total system uptime (since it seems that is what you are after).
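The "monitor the primary" part could be as simple as a TCP reachability check, sketched below. The host names and port are placeholders, and the actual IP switch of the admin host is left out because it depends entirely on your network setup; the sketch only decides which box should be active.

```python
import socket


def is_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def choose_active(primary, standby, port):
    """Pick the host clients should be pointed at."""
    return primary if is_reachable(primary, port) else standby
```

A scheduled job could call `choose_active("primary-host", "standby-host", port)` every few minutes, with whatever port your TM1 server listens on, and trigger the switch when the answer changes. A port answering is of course a weaker test than the model actually being loaded and usable.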

Regardless of which way you go, Paul and Alan are correct that any users with current sessions, whether Excel or web, would need to close and log on again. There is no way to avoid this whatever option you go with.

Re: Two TM1 servers pointing to same Data files, possible?

Posted: Tue Nov 17, 2009 6:44 am
by Martin Erlmoser
sorry, i miss the point here.

in my experience 90% of all crashes are tm1 model related -> the server works fine, os works fine but the tm1server started a recycle maintenance process ;)

the rest of these crashes are maybe "whatever problems", e.g. the connection to the san is down (happens more often than i thought a year ago) or any os problem where you have to restart the server.

and some hardware problems.

if you need HA with tm1 you should think about vmware, but also here you won't get rid of all tm1 related problems..

i would say maybe configure server2 but switch the service to manual, and start the service from the clients with a batch file if you really need it.
then you have no service on your server which is crashing every second

Re: Two TM1 servers pointing to same Data files, possible?

Posted: Tue Nov 17, 2009 7:57 am
by jim wood
Hi Martin,

I think you are partially correct, I would say about 60% of crashes are model related. There are quite a few bugs in TM1. I don't think having a viable backup plan is overkill at all,

Jim.

Re: Two TM1 servers pointing to same Data files, possible?

Posted: Tue Nov 17, 2009 8:58 am
by Steve Vincent
True (depending on your version) but that's irrelevant considering that both servers will be using the same version of TM1 anyway, so they are both open to the same problems :lol: In fact TM1 bugs are the only thing you cannot mitigate against, at least not as easily as the rest.

Re: Two TM1 servers pointing to same Data files, possible?

Posted: Tue Nov 17, 2009 9:15 am
by Martin Erlmoser
i think you can have something nice in vmware vsphere

2 vm's
mirrored with n minutes delay.


scenario:

tm1server crashes on server1 but immediately vmware switches to the other vm with the same name, as it was n minutes before the crash occurred. (okay, really funny would be, 1 minute before the crash and then the mirror server also crashes.. :lol: )

maybe i saw it in a dream, maybe it's real. who knows..

would be interesting to test such an environment.

Re: Two TM1 servers pointing to same Data files, possible?

Posted: Tue Nov 17, 2009 10:55 am
by jim wood
Steve Vincent wrote: True (depending on your version) but that's irrelevant considering that both servers will be using the same version of TM1 anyway, so they are both open to the same problems :lol: In fact TM1 bugs are the only thing you cannot mitigate against, at least not as easily as the rest.
Very true; a backup, however, may give continuous usage while the production server comes back up. (Ours takes 2 hours.)