We have a central server (Z) and currently 3 servers (A, B and C) are set up to replicate some cubes from it. C is the latest server to be added to the replication and A & B have been running fine for over a year now.
C was set up to replicate this week, and the first time I came to replicate to all machines I got a "systemserverconnectionfailed" error on B. No configuration changes have been made on A or B, but the extra server was added to the config of Z and C, as instructed in the guides. A and C both replicate fine with no issues, but no matter what I have tried, B refuses to play. All services have unique ports and the configs are correct. I can see the admin user in Z (which is used to connect during replication) in TM1top when I try to replicate anything to B, but it tries 3 times (according to the error messages) to connect and fails every time. I have deleted and recreated the replication and I still get the same issue. Has anyone got any ideas why this has suddenly gone south?
TIA
If this were a dictatorship, it would be a heck of a lot easier, just so long as I'm the dictator. Production: Planning Analytics 64 bit 2.0.5, Windows 2016 Server. Excel 2016, IE11 for t'internet
Ugh, replication issues are the worst to troubleshoot. And, if your servers are separated by a WAN, it takes FOREVER....
All four servers are running exactly the same version of TM1, right?
Have you tried changing the password for the account on Z that B uses to replicate? Could have been an issue where the account was locked and changing it would reset that.
Have you tried restarting the admin server?
When you delete and re-create the replication, does it work and only fail on the first synchronization after initial rep?
Obvious question, but each A/B/C server has its own account on Z right? I think that's how I read your comment, but it's a little ambiguous so I thought I'd check.
Do any of the replication times overlap? I've had much more success when I stagger replications. I believe that it's actually critical if the replications have overlapping write permission. This is a long shot since your Server B can't even log in, but it's worth mentioning.
I haven't used replication in quite some time, but years ago I had a similar problem which had to do with the internal time on the servers. I don't know if subsequent versions of TM1 resolved the problem, but back then the servers' GMT times had to match (or be reasonably close) or the replication would fail.
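If clock drift is a suspect, the comparison can be sketched in a few lines of Python. This is only an illustration: how you collect each server's UTC time is left to you, the readings are assumed to be taken at roughly the same moment, and the 60-second tolerance is a guess, since "reasonably close" was never quantified. The function name clock_skew_report and the sample values are made up for the example.

```python
from datetime import datetime, timedelta, timezone


def clock_skew_report(readings, tolerance=timedelta(seconds=60)):
    """Compare each server's clock with the central server Z.

    readings maps a server name to the UTC time it reported and must
    contain the key "Z". Returns {server: (skew, within_tolerance)}
    for every server other than Z.
    """
    reference = readings["Z"]
    report = {}
    for name, reading in readings.items():
        if name == "Z":
            continue
        skew = abs(reading - reference)  # absolute difference from Z
        report[name] = (skew, skew <= tolerance)
    return report


if __name__ == "__main__":
    # Made-up sample readings, all expressed in UTC.
    base = datetime(2020, 1, 10, 9, 0, 0, tzinfo=timezone.utc)
    sample = {
        "Z": base,
        "A": base + timedelta(seconds=2),
        "B": base - timedelta(minutes=7),
        "C": base + timedelta(seconds=1),
    }
    for server, (skew, ok) in clock_skew_report(sample).items():
        status = "ok" if ok else "OUT OF TOLERANCE"
        print(f"{server}: skew {skew}, {status}")
```

The readings need timezone-aware UTC values so that servers in different time zones compare correctly.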
Cheers guys, it has been resolved, but disturbingly I do not know why it happened in the first place. I wasted most of Friday chasing TM1 causes while our IT department was meddling with the server without our knowledge. When they REBOOTED it, all the issues went away.
To answer the various questions, to aid others who may stumble across this when they get issues:
1> oh yes, been caught out by that one before.
2> no but will keep that in mind in the future.
3> yes to no avail.
4> fails to even set up first time around.
5> no, they all use the same account on Z to replicate with (see 6).
6> no because all replication is done by our team manually after certain structural or monthly data changes.
7> I cannot control the internal time of the servers, but they are all centrally synced and appear to be in line with each other.
I have yet to receive an explanation from IT as to what they were playing at. Judging by the event logs, I think they were fixing some network drivers, but why that was done on a production machine without our knowledge or permission is beyond me...
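For anyone hitting the same error, a quick TCP check run from Z against each replica can help separate a network-level fault from a TM1 configuration problem before a day is lost on the latter. The sketch below is a generic Python socket test, not anything TM1-specific; the function names are made up, and the host names and ports you pass in are whatever your own servers use. A successful connection only shows the port is reachable, not that the TM1 login itself would succeed.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port opens within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


def check_targets(targets, timeout=3.0):
    """Check several servers in one go.

    targets maps a label (e.g. "B") to a (host, port) tuple.
    Returns {label: True/False}.
    """
    return {
        label: can_connect(host, port, timeout)
        for label, (host, port) in targets.items()
    }
```

Called as check_targets({"A": (host_a, port_a), "B": (host_b, port_b)}), a result where A is True and B is False would point at the machine or the network path rather than at the replication setup.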