Wednesday, August 4, 2010

Setting up a highly available NFS cluster

The following is a proof-of-concept HA NFS cluster running on RHEL 5, using DRBD for block-level replication and heartbeat for failover.

In this example, I'll be setting up two nodes: nfs01 and nfs02. Run ALL commands and setup steps on both nodes EXCEPT where stated otherwise.

The partition table for the test install follows. The metadata partition /dev/sda7 and the data partition /dev/sda8 are unformatted and unmounted.
[root@nfs01 ~]# fdisk -l /dev/sda

Disk /dev/sda: 21.4 GB, 21474836480 bytes
255 heads, 63 sectors/track, 2610 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1          13      104391   83  Linux
/dev/sda2              14         274     2096482+  82  Linux swap / Solaris
/dev/sda3             275         535     2096482+  83  Linux
/dev/sda4             536        2610    16667437+   5  Extended
/dev/sda5             536         796     2096451   83  Linux
/dev/sda6             797        1057     2096451   83  Linux
/dev/sda7            1058        1074      136521   83  Linux
/dev/sda8            1075        2610    12337888+  83  Linux
Instructions for building the DRBD RPMs, including the kernel module, can be found on the DRBD website.

The kernel in use is the latest update for RHEL 5.5 at the time of writing, version kernel-2.6.18-194.3.1.el5.x86_64.rpm. The DRBD kernel module must be compiled against the kernel-devel tree that matches your running kernel exactly.
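Before building, it can help to confirm that the matching kernel-devel tree is actually present. The following is a minimal sketch, not part of the original setup: the function names kernel_src_dir and check_kernel_devel are made up for illustration, the path layout is assumed from the ./configure example in this post, and it assumes that `uname -r` on RHEL 5 reports the release without the architecture suffix.

```shell
#!/bin/sh
# Hypothetical helper: build the path where the kernel-devel tree is
# assumed to live for a given kernel release and architecture.
# $1 = kernel release without arch (e.g. 2.6.18-194.3.1.el5), $2 = arch
kernel_src_dir() {
    printf '/usr/src/kernels/%s-%s\n' "$1" "$2"
}

# Hypothetical check against the running kernel.
# Assumes `uname -r` does not include the arch, as on RHEL 5.
check_kernel_devel() {
    dir=$(kernel_src_dir "$(uname -r)" "$(uname -m)")
    if [ -d "$dir" ]; then
        echo "kernel-devel tree found: $dir"
    else
        echo "missing $dir - install the matching kernel-devel RPM" >&2
        return 1
    fi
}
```

Running check_kernel_devel on each node before ./configure should report the directory to pass to --with-km, or flag a mismatch between the running kernel and the installed kernel-devel package.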

Ensure at least the following RPMs are installed.
rpm-build-4.4.2.3-18.el5    
elfutils-0.137-3.el5        
elfutils-libs-0.137-3.el5   
flex-2.5.4a-41.fc6          
glibc-devel-2.5-49.el5_5.2  
glibc-headers-2.5-49.el5_5.2
kernel-2.6.18-194.3.1.el5   
kernel-devel-2.6.18-194.3.1.el5  
kernel-headers-2.6.18-194.3.1.el5
gcc-c++-4.1.2-48.el5
gcc-4.1.2-48.el5
make-3.81-3.el5
Use DRBD's built-in functionality to build both the userland and kernel module RPMs.
[root@nfs01 build]# tar zxf drbd-8.3.8.tar.gz 
[root@nfs01 build]# cd drbd-8.3.8
[root@nfs01 drbd-8.3.8]# ./configure --with-km=/usr/src/kernels/2.6.18-194.3.1.el5-x86_64/
[root@nfs01 drbd-8.3.8]# make km-rpm rpm
...
You now have:
/usr/src/redhat/RPMS/x86_64/drbd-bash-completion-8.3.8-1.x86_64.rpm
/usr/src/redhat/RPMS/x86_64/drbd-pacemaker-8.3.8-1.x86_64.rpm
/usr/src/redhat/RPMS/x86_64/drbd-km-2.6.18_194.3.1.el5-8.3.8-12.x86_64.rpm
/usr/src/redhat/RPMS/x86_64/drbd-udev-8.3.8-1.x86_64.rpm
/usr/src/redhat/RPMS/x86_64/drbd-utils-8.3.8-1.x86_64.rpm
/usr/src/redhat/RPMS/x86_64/drbd-8.3.8-1.x86_64.rpm
/usr/src/redhat/RPMS/x86_64/drbd-xen-8.3.8-1.x86_64.rpm
/usr/src/redhat/RPMS/x86_64/drbd-heartbeat-8.3.8-1.x86_64.rpm
Install DRBD and load the new module with modprobe. The example also shows a reboot, to check that the module is still available after a system restart.
[root@nfs01 drbd-8.3.8]# rpm -ivh /usr/src/redhat/RPMS/x86_64/drbd-*
[root@nfs01 drbd-8.3.8]# modprobe drbd # or...
[root@nfs01 drbd-8.3.8]# reboot
...
[root@nfs01 drbd-8.3.8]# lsmod | grep drbd
drbd                  277784  0 
Install the sample /etc/drbd.conf file.
# A comprehensively commented example of this file exists at:
# /usr/share/doc/drbd-utils-8.3.8/drbd.conf.example

resource r0 {

    protocol C;
    
    # Parse error here:
    # incon-degr-cmd "halt -f";
    
    startup {
        degr-wfc-timeout 120;    # 2 minutes.
    }

    disk {
        on-io-error   detach;
    }

    net {
    }

    syncer {
        rate 10M;
        # group 1;
        al-extents 257;
    }

    on nfs01 {
        device    /dev/drbd0;
        disk      /dev/sda8;           # NFS data partition
        meta-disk /dev/sda7[0];        # ~133MB partition for DRBD metadata
        address   192.168.1.101:7788;
    }

    on nfs02 {
        device    /dev/drbd0;
        disk      /dev/sda8;           # NFS data partition
        meta-disk /dev/sda7[0];        # ~133MB partition for DRBD metadata
        address   192.168.1.102:7788;
    }

}
Install the sample /etc/exports file, and ensure the NFS service is stopped and is not set to launch on system startup.
/export/ 192.168.1.0/255.255.255.0(rw,no_root_squash,no_all_squash,sync)
[root@nfs01 drbd-8.3.8]# /etc/init.d/nfs stop
[root@nfs01 drbd-8.3.8]# chkconfig nfs off
Configure the NFS statd daemon to use the hostname for the floating IP.
[root@nfs01 ~]# echo >> /etc/sysconfig/nfs 'STATDARG="-n nfs.localdomain"'
Ensure there's no existing filesystem on your metadata partition, /dev/sda7 in this case.
[root@nfs01 ~]# dd if=/dev/zero of=/dev/sda7 bs=1M count=256
[root@nfs01 ~]# drbdadm create-md r0
Writing meta data...
initializing activity log
NOT initialized bitmap
New drbd meta data block successfully created.

[root@nfs01 ~]# drbdadm up all

[root@nfs01 ~]# cat /proc/drbd 
version: 8.3.8 (api:88/proto:86-94)
GIT-hash: d78846e52224fd00562f7c225bcc25b2d422321d build by root@nfs01, 2010-06-18 16:52:07
 0: cs:Connected ro:Secondary/Secondary ds:Inconsistent/Inconsistent C r----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:12217404
Do this only on node 1. Once the node is primary, create the filesystem on the DRBD device /dev/drbd0 rather than directly on the backing partition:
[root@nfs01 ~]# drbdadm adjust all 
[root@nfs01 ~]# drbdadm -- --force primary r0
[root@nfs01 ~]# mkfs.ext3 /dev/drbd0

[root@nfs01 ~]# cat /proc/drbd
version: 8.3.8 (api:88/proto:86-94)
GIT-hash: d78846e52224fd00562f7c225bcc25b2d422321d build by root@nfs01, 2010-06-18 16:52:07
 0: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r----
    ns:537600 nr:0 dw:0 dr:537600 al:0 bm:32 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:11679804
 [>....................] sync'ed:  4.5% (11404/11928)M delay_probe: 102
 finish: 0:15:41 speed: 12,272 (10,336) K/sec

# Completed now:
[root@nfs01 ~]# cat /proc/drbd
version: 8.3.8 (api:88/proto:86-94)
GIT-hash: d78846e52224fd00562f7c225bcc25b2d422321d build by root@nfs01, 2010-06-18 16:52:07
 0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r----
    ns:12217401 nr:0 dw:0 dr:12217401 al:0 bm:746 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
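Rather than re-running cat /proc/drbd by hand, the wait for the initial sync can be scripted. This is only a sketch: the function names drbd_field and drbd_in_sync are made up for illustration, and the parsing assumes the single-resource, minor 0 layout of the /proc/drbd output shown above.

```shell
#!/bin/sh
# Hypothetical helper: print one status field (cs, ro or ds) for minor
# device 0 from /proc/drbd-style text.
# $1 = field name, $2 = file to read (defaults to /proc/drbd)
drbd_field() {
    sed -n "s/^ *0: .*$1:\([^ ]*\).*/\1/p" "${2:-/proc/drbd}"
}

# Succeeds only when both the local and the peer disk report UpToDate.
# $1 = file to read (defaults to /proc/drbd)
drbd_in_sync() {
    [ "$(drbd_field ds "$1")" = "UpToDate/UpToDate" ]
}
```

A loop such as `until drbd_in_sync; do sleep 10; done` on node 1 would then block until the disk state reaches UpToDate/UpToDate.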
Do this only on node 1:
[root@nfs01 ~]# mount /dev/drbd0 /export
[root@nfs01 ~]# mv /var/lib/nfs /export/
[root@nfs01 ~]# cd /var/lib/
[root@nfs01 lib]# ln -s /export/nfs
Do this only on node 2:
[root@nfs02 ~]# cd /var/lib/
[root@nfs02 lib]# mv nfs nfs.old
[root@nfs02 lib]# ln -s /export/nfs
Ensure the EPEL repository is enabled and install heartbeat.
[root@nfs01 ~]# yum install heartbeat.x86_64
Edit the /etc/ha.d/ha.cf file.
logfacility local0
keepalive 2
deadtime 10
bcast eth0
node nfs01.localdomain nfs02.localdomain
auto_failback on
Edit /etc/ha.d/haresources - the IP address is the floating one.
nfs01.localdomain IPaddr::192.168.1.103/24/eth0 \
    drbddisk::r0 Filesystem::/dev/drbd0::/export::ext3 nfs
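Add a secret to the /etc/ha.d/authkeys file.
auth 3
3 md5 pa55word
Heartbeat expects this file to be readable only by root, so restrict its permissions.
[root@nfs01 ~]# chmod 600 /etc/ha.d/authkeys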
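The pa55word secret above is only a placeholder. One way to produce a random key instead is sketched below; the function name gen_authkeys is made up for illustration, and the key id 3 and md5 method simply mirror the example above.

```shell
#!/bin/sh
# Hypothetical helper: print authkeys contents with a random key.
# The key is the md5 digest of 512 bytes read from /dev/urandom.
gen_authkeys() {
    key=$(dd if=/dev/urandom bs=512 count=1 2>/dev/null | md5sum | cut -d' ' -f1)
    printf 'auth 3\n3 md5 %s\n' "$key"
}
```

For example, `(umask 077; gen_authkeys > /etc/ha.d/authkeys)` writes the file with restrictive permissions on one node. Both nodes need the same key, so copy the resulting file to the other node rather than generating it twice.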
Make sure the drbd and heartbeat services are set to start on boot, and start them up:
/etc/init.d/drbd start
/etc/init.d/heartbeat start

chkconfig drbd on
chkconfig heartbeat on
That's a very brief and tersely documented example - you can now test failover by pulling a network cable :-)
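When testing failover, it is handy to see which node currently holds the floating IP. The following is a rough sketch, assuming the iproute `ip` command is available on the nodes; the function name holds_ip is made up for illustration, and 192.168.1.103 is the floating address from the haresources example.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given IPv4 address appears in
# `ip -o -4 addr show`-style output read from stdin.
# $1 = IPv4 address to look for
holds_ip() {
    # Escape the dots so they match literally, and anchor on the
    # trailing "/" so that 192.168.1.10 does not match 192.168.1.103.
    grep -Eq "inet $(printf '%s' "$1" | sed 's/\./\\./g')/"
}
```

Running `ip -o -4 addr show | holds_ip 192.168.1.103 && echo active` on each node before and after pulling the cable should show the address moving from one node to the other.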

1 comments:

  1. I think where you have:

    [root@nfs01 ~]# mkfs.ext3 /dev/sda8


    it should read:

    [root@nfs01 ~]# mkfs.ext3 /dev/drbd0
