We are trying to change our setup so that we can cope with a catastrophic failure of one of our servers, and we are running into a lot of unexpected difficulties working something out with the storage and infrastructure team.
The basic goal is simply to ensure that if there is a catastrophic failure on the production server, we can bring up a backup server running the database in a reasonably short time (a few hours at most) and with absolutely no data loss. Losing no data is much more important than whether it takes us 5 minutes or a few hours to recover everything.
The first thing we thought of was storing the data on the SAN instead of on the local hard drives, since in theory at least the SAN is secure. The big issue there is that the SAN storage associated with this server is apparently locked exclusively to it. So yes, if that server goes down we would not lose any data, but it seems we would have to wait until whatever went wrong with the server itself is fixed, which could easily take a few days if we are really unlucky.
So now we have an experimental setup in which a second server can see the SAN drives associated with the first one. The idea is to keep these drives disabled on the second server until there is an actual need to access the data on them, then reboot the second server, at which point they would magically appear. Keeping them enabled on both servers is a no-no, as apparently some kind of exclusive signature is involved. Anyway, we noticed that it does not work: depending on which server I look from, I see different contents for the same SAN storage. (IT explained this to me as the SAN being locked by server 1, so server 2 currently does not see the contents.)
The idea is that if server 1 goes down, server 2 can be rebooted and then we would see the contents.
Now I have several issues:
* First off, I do not feel particularly confident in this setup, as I am not sure whether any locks that server 1 has put on the SAN would be properly released if that server crashes. I can imagine that we would reboot server 2 and still see an empty disk, for instance.
* I cannot imagine there is no better solution than this. How is this done at other sites? Just keep in mind that (a) we want to deal with catastrophic failures, (b) we must be absolutely sure not to lose any data (so the daily backups are not good enough), and (c) a schedule where we run a batch job every few minutes to copy and zip the directory may be unworkable as well, mostly because I think the savedata operations might cause too many performance hiccups.
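
To make option (c) concrete, the kind of periodic copy-and-zip job I have in mind would look roughly like this minimal Python sketch. The function name `snapshot_directory` and the paths in the usage comment are made up for illustration. This is not our actual job, and it makes no attempt to pause or flush the database before copying.

```python
import shutil
import time
from pathlib import Path


def snapshot_directory(data_dir: str, backup_dir: str) -> Path:
    """Zip everything under data_dir into a timestamped archive in backup_dir.

    backup_dir should live outside data_dir, otherwise each archive would
    end up containing the earlier archives.
    """
    target = Path(backup_dir)
    target.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    # make_archive appends ".zip" to the base name and returns the archive path.
    archive = shutil.make_archive(
        str(target / f"data-{stamp}"), "zip", root_dir=data_dir
    )
    return Path(archive)


# Hypothetical usage, run by a scheduler every few minutes:
#   snapshot_directory("/data/db", "/backup/db")
```

Even if this ran every 5 minutes without hurting performance, a crash could still lose up to 5 minutes of data, so on its own it would not satisfy (b).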
Does anyone have any suggestions? Keep in mind that I am close to illiterate when it comes to hardware and storage concepts, so please use basic words.
