Monday, August 8, 2016

backup - What is the industry standard solution for linux web server and mac file server failover systems?




I am a web developer at my company and am therefore, apparently, qualified to be the IT guy. I currently have a single Ubuntu box acting as both web server and file server, but I want to split those roles onto two separate systems, both of which I want to be highly available. We have no backup system in place, so if this box goes down, we are done.



All of the computers that use the file server are Macs, so I was considering getting two Mac Pros to use for IP failover. I already have two PCs I can put a Linux distro on for the web server. I want the two Linux boxes to mirror each other's data, and likewise the two Mac Pros, though I haven't found a solution for that yet.



Am I approaching this right? Is my thinking in line with industry standards? I realize there are probably numerous ways of attacking this; I am trying to prepare for growth while fixing the backup issue.


Answer



For the highly available web servers, look into OpenAIS, Pacemaker, and DRBD to build an HA cluster. OpenAIS is the cluster messaging layer, Pacemaker is the cluster resource manager, and DRBD (Distributed Replicated Block Device) is essentially network RAID 1. Combining these, you can build a cluster out of two or more nodes.



There are basically two ways to do it: active/passive and active/active. Active/passive is going to be the easiest to set up (and maintain). In active/passive, one machine provides services while the other sits idle, waiting for the active machine to fail. In active/active, both machines provide services at the same time.
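
To give a concrete feel for the active/passive case, here is a minimal sketch using the crm shell that ships with Pacemaker. It assumes a hypothetical two-node cluster and a made-up floating IP of 192.168.1.100; the floating IP lives on whichever node is active and moves to the survivor on failure.

    # Two-node demo settings: ignore quorum (there are only two nodes) and,
    # for a first test only, disable fencing. Use real fencing in production.
    crm configure property no-quorum-policy=ignore
    crm configure property stonith-enabled=false

    # Floating (virtual) IP that clients connect to; Pacemaker keeps it on
    # exactly one node at a time and moves it if that node fails.
    crm configure primitive ClusterIP ocf:heartbeat:IPaddr2 \
        params ip=192.168.1.100 cidr_netmask=24 \
        op monitor interval=30s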




OpenAIS will handle passing messages between cluster nodes to make sure they are available and responding.
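
As a rough sketch, the messaging layer is configured with a totem section that tells it which network to bind to. The example below uses the classic openais.conf/corosync.conf syntax (Corosync is OpenAIS's successor and is what current Pacemaker stacks ship with); the 192.168.1.0/24 cluster network and file paths are assumptions.

    # /etc/ais/openais.conf (or /etc/corosync/corosync.conf) -- assumed paths
    totem {
        version: 2
        secauth: off
        interface {
            # the network the cluster heartbeats on, plus the multicast
            # group and port used for cluster messages
            ringnumber: 0
            bindnetaddr: 192.168.1.0
            mcastaddr: 226.94.1.1
            mcastport: 5405
        }
    }

    # on older stacks, Pacemaker is started as a plugin of the messaging layer
    service {
        name: pacemaker
        ver: 0
    }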



Pacemaker will handle running your resources, e.g. Apache, DRBD, FTP, whatever. It also moves resources between nodes (say, in the event of a node failure) and takes care of starting, stopping, and monitoring them.
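
For example, a hypothetical Apache resource tied to the floating IP from the earlier sketch might look like this in the crm shell (the resource names, config path, and monitor interval are all assumptions):

    # Apache as a cluster resource; Pacemaker starts/stops it and checks it
    # every minute via the monitor operation
    # (the agent's monitor expects Apache's server-status page to be reachable)
    crm configure primitive WebSite ocf:heartbeat:apache \
        params configfile=/etc/apache2/apache2.conf \
        op monitor interval=1min

    # keep Apache on the same node as the floating IP, and start it after the IP
    crm configure colocation website-with-ip inf: WebSite ClusterIP
    crm configure order apache-after-ip inf: ClusterIP WebSite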



DRBD is pretty cool. It sits underneath the filesystem as a virtual block device, and (in Protocol C) when a write is issued, DRBD sends the write to the other cluster node as well; only once both nodes have confirmed the write to disk is it considered committed. So basically the write has to be on disk on both nodes before it completes. This is how you make sure that whatever you are serving up with Apache is exactly the same on both machines, so should a failover occur you're serving up the same thing.
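
A minimal sketch of a DRBD resource using Protocol C is below. The resource name, hostnames, disks, and addresses are all placeholders; the device you actually format and mount is the resulting /dev/drbd0.

    # /etc/drbd.d/r0.res (or inside drbd.conf) -- hypothetical resource
    resource r0 {
        # synchronous replication: a write completes only once it is
        # on disk on both nodes
        protocol C;
        on web1 {
            device    /dev/drbd0;
            disk      /dev/sdb1;
            address   10.0.0.1:7788;
            meta-disk internal;
        }
        on web2 {
            device    /dev/drbd0;
            disk      /dev/sdb1;
            address   10.0.0.2:7788;
            meta-disk internal;
        }
    }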



If you had shared storage (e.g., an iSCSI SAN), then you could remove DRBD from the mix.
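
In that case, a hypothetical Filesystem resource could simply mount the shared LUN on whichever node is active (the device path, mount point, and fstype below are made up):

    # mount the shared LUN on the active node instead of replicating with DRBD
    crm configure primitive WebFS ocf:heartbeat:Filesystem \
        params device=/dev/sdc1 directory=/var/www/html fstype=ext4 \
        op monitor interval=20s

    # keep the mount on the same node as the web server / floating IP
    crm configure colocation fs-with-website inf: WebFS WebSite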



You can google "Clusters From Scratch" (it's on ClusterLabs.org) for an essentially step-by-step guide to doing all of this.


