Two-Node GNBD Redux

Posted by Miguel Sarmiento Sun, 04 Jan 2009 05:37:00 GMT

In a previous post I built a two-node cluster using gnbd on CentOS. While that setup worked fine, it had a major drawback: it used manual fencing, so it was not a practical real-life example. I decided to test a better setup. Since I am using virtual machines, the way to do this was to use gnbd fencing and gfs.

In this scenario I will need 3 nodes for the cluster. Refer to the figure in my two-node cluster post for a diagram of the setup. I will use:

  • node0 gnbd server
  • node1 first node of cluster
  • node2 second node of cluster

1) Prep Work

Follow the same steps as in the Two-Node gnbd post. Make sure the /etc/hosts file on each machine has 3 entries pointing to the IPs you will be using for each member of the cluster:

  • 10.0.0.2 node0
  • 10.0.0.3 node1
  • 10.0.0.4 node2

Do not use 10.0.0.1; it will be the IP for HA. You also need a free partition on node0 for the shared file system; in our case it is /dev/sdb1.
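The hosts entries above can be added by hand, but a small helper keeps the three files identical. This is only a sketch: the function name add_host is made up for illustration, and it assumes a POSIX shell with awk available.

```shell
#!/bin/sh
# add_host IP NAME [FILE]
# Append "IP NAME" to a hosts file unless NAME already appears on a
# non-comment line. FILE defaults to /etc/hosts; pass a scratch copy to
# try it out safely. (add_host is a hypothetical helper, not a system tool.)
add_host() {
    ip=$1
    name=$2
    file=${3:-/etc/hosts}
    # Look for NAME as a whole field after the address column.
    if awk -v n="$name" '
        !/^[ \t]*#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
        END { exit !found }' "$file" 2>/dev/null; then
        return 0
    fi
    printf '%s %s\n' "$ip" "$name" >> "$file"
}

# Usage, run as root on each of the 3 machines:
#   add_host 10.0.0.2 node0
#   add_host 10.0.0.3 node1
#   add_host 10.0.0.4 node2
```

Running it twice does not duplicate entries, which makes it safe to keep in a setup script.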

2) Configure Nodes

On node0:

  • Install gnbd and kmod-gnbd and gfs
  • Create a gfs file system on /dev/sdb1: gfs_mkfs -p lock_dlm -t cluster:export1 -j 3 /dev/sdb1
  • The important parameter here is -t clustername:fsname. The clustername must match the one used in cluster.conf (in our case, cluster).
  • -j 3 is the number of journals to create. You need at least one journal per machine that will mount the file system.
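To make the relationship between the parameters explicit, here is a sketch that only builds and prints the gfs_mkfs command line from its parts. The function name gfs_mkfs_cmd is hypothetical; the flags are the ones used above.

```shell
#!/bin/sh
# gfs_mkfs_cmd CLUSTER FSNAME JOURNALS DEVICE
# Print the gfs_mkfs command line without running it, so it can be
# reviewed first. (gfs_mkfs_cmd is an illustrative helper.)
gfs_mkfs_cmd() {
    cluster=$1
    fsname=$2
    journals=$3
    device=$4
    # One journal is needed per machine that will mount the file system,
    # so reject anything that is not a positive integer.
    case $journals in
        ''|*[!0-9]*|0)
            echo "journal count must be a positive integer" >&2
            return 1
            ;;
    esac
    printf 'gfs_mkfs -p lock_dlm -t %s:%s -j %s %s\n' \
        "$cluster" "$fsname" "$journals" "$device"
}

# Usage: prints the exact command from this post.
#   gfs_mkfs_cmd cluster export1 3 /dev/sdb1
```

The first argument is the name that has to match cluster.conf, so keeping it as a separate parameter makes a mismatch easier to spot.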

Configure both node1 and node2 as indicated in the Two-Node gnbd post and also install gfs. Finally create a mount point (mkdir /ha) on all 3 nodes.

3) Mount File System

At this point you need to copy the necessary files to the new root of the Apache server, but in order to do this you need to mount the file system you created in step 2. Unfortunately, since the cluster is not running, the mount command will fail. To get around this we fool the mount command by creating a 2-node cluster with no resources and manual fencing. This lets you start the cluster and mount the file system. Here we go:

  • Copy the attached file fool-cluster.conf to both node0 and node2 as cluster.conf in /etc/cluster/
  • On node0 issue: service cman start
  • On node2 issue: service cman start

At this point you should be able to mount /dev/sdb1 on the mount point /ha on node0 and copy the files for Apache onto it. After doing this, unmount /ha and stop the cman service on both nodes. Finally, do not forget to edit the Apache configuration accordingly, as stated in the previous gnbd post, and propagate it to node1 and node2.
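The node0 side of this step can be sketched as a short script. It assumes cman is already running on both nodes of the temporary cluster, and the source directory passed in is a placeholder for wherever your web site files live. The run wrapper and the populate_shared_fs name are made up for illustration; with DRY_RUN=1 the commands are only printed.

```shell
#!/bin/sh
# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# populate_shared_fs SRC
# Mount the gfs partition on /ha, copy the site files from SRC, then
# unmount before stopping cman. SRC is a hypothetical directory.
populate_shared_fs() {
    src=$1
    run mount -t gfs /dev/sdb1 /ha &&
    run cp -a "$src/." /ha/ &&
    run umount /ha &&
    run service cman stop
}

# Usage on node0 (preview first, then run for real):
#   DRY_RUN=1 populate_shared_fs /path/to/site
#   populate_shared_fs /path/to/site
```

The file system is unmounted before cman is stopped on node0; remember to stop cman on the other node as well.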

4) Configure Startup Scripts

On node0 add the following to /etc/rc.local

  • service cman start
  • gnbd_serv
  • gnbd_export -c -d /dev/sdb1 -e ha
  • service rgmanager start

On nodes 1 and 2 add the following to /etc/rc.local

  • service cman start
  • modprobe gnbd
  • gnbd_import -i node0
  • service gfs start
  • service rgmanager start
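Because each of these steps depends on the one before it, it can help to stop at the first failure instead of letting rc.local carry on. The following is a sketch of that idea, not what the original setup used; start_in_order is a hypothetical helper that reads one command per line.

```shell
#!/bin/sh
# start_in_order
# Read one command per line from stdin and run them in sequence,
# stopping at the first one that fails. (Illustrative helper.)
start_in_order() {
    while IFS= read -r step; do
        [ -n "$step" ] || continue
        echo "starting: $step"
        # </dev/null keeps the command from eating the remaining lines.
        sh -c "$step" </dev/null || {
            echo "failed: $step" >&2
            return 1
        }
    done
}

# Usage in /etc/rc.local on node1 and node2:
#   start_in_order <<'EOF'
#   service cman start
#   modprobe gnbd
#   gnbd_import -i node0
#   service gfs start
#   service rgmanager start
#   EOF
```

If cman fails to start, for example, the gnbd import and the gfs mount are never attempted, which makes the error in /var/log/messages easier to find.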

5) Start Cluster

Copy the attached file cluster.conf into /etc/cluster on all 3 nodes. At this point you can either reboot all nodes in the cluster (start with node0, then the other two) or start the services by hand in the order they appear in /etc/rc.local. If everything goes OK (hint: run tail -f /var/log/messages in a separate terminal window to watch for errors), the cluster should be up, the Apache service should have started, and /dev/gnbd/ha should be mounted on /ha on node1, the primary. Now you can browse to the virtual IP you assigned in cluster.conf and the web site should respond.
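Besides watching the log, you can check directly whether the imported device ended up on the mount point. This sketch reads /proc/mounts; the function name mounted_from is made up, and the optional third argument exists only so it can be tried against a sample file.

```shell
#!/bin/sh
# mounted_from DEVICE MOUNTPOINT [MOUNTS_FILE]
# Succeed if DEVICE is mounted on MOUNTPOINT according to MOUNTS_FILE
# (default /proc/mounts). (Illustrative helper.)
mounted_from() {
    device=$1
    mountpoint=$2
    mounts=${3:-/proc/mounts}
    awk -v d="$device" -v m="$mountpoint" '
        $1 == d && $2 == m { found = 1 }
        END { exit !found }' "$mounts"
}

# Usage on node1 once the cluster is up:
#   mounted_from /dev/gnbd/ha /ha && echo "/ha is mounted"
```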

6) Caveats

Please note:

  • I do not have a logical volume created on /dev/sdb1 and do not run clvmd either.
  • I want Apache to fail over only between node1 and node2 thus node0 is not part of the fail over scheme (see cluster.conf for details).
  • This also means that node0 will not mount the partition you created on it if both node1 and node2 fail, nor will the cluster attempt to fail over other resources to it.
  • I use the setup above because this strictly simulates a SAN (in this case node0 is the SAN) and as such the SAN should not host any resources at all other than the raw file system.
  • Of course nothing prevents you from making a logical volume on /dev/sdb1 if you wish.

7) Testing Fail Over

At this point you can simulate a failure: disconnect or disable the ethernet interface on node1. The cluster will move the resources to node2 and fence node1 on node0, preventing node1 from mounting the exported gnbd device.

After a few moments (you will know if you tail the log file) node2 will have:

  • The virtual IP added to eth0.
  • /ha mounted.
  • The Apache service started on node2.

Go to a browser and browse to the virtual ip of the cluster and voila you should get a response.
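To confirm from the command line that node2 picked up the virtual IP, you can filter the output of ip addr. This is a sketch: has_ip is a hypothetical helper, and 10.0.0.1 is the HA address from the prep work. I believe the cluster adds the address as a secondary one, so ifconfig may not show it, but treat that as an assumption.

```shell
#!/bin/sh
# has_ip ADDRESS
# Read "ip -4 addr show" output on stdin and succeed if ADDRESS is
# assigned, ignoring the /prefix length. (Illustrative helper.)
has_ip() {
    awk -v a="$1" '
        $1 == "inet" { split($2, p, "/"); if (p[1] == a) found = 1 }
        END { exit !found }'
}

# Usage on node2 after the fail over:
#   ip -4 addr show eth0 | has_ip 10.0.0.1 && echo "virtual IP is here"
```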

At this point you are done. To recover the primary node, just reboot it; when it comes back it will take over the resources from node2, as it should.

This setup is of course more robust than the previous one I wrote about, because the cluster moves its resources automatically, without human intervention, as you would expect.

Cheers.

Attached Files:
