Blog O' Matty


Removing a gluster volume doesn't remove the volume's contents

This article was posted by Matty on 2011-11-27 11:31:00 -0400

I made another interesting discovery this weekend while playing around with the gluster volume deletion option. Prior to creating a volume with a new layout, I went through the documented process to remove my volume:

$ gluster volume stop glustervol01

Stopping volume will make its data inaccessible. Do you want to
continue? (y/n) y
Stopping volume glustervol01 has been successful

$ gluster volume delete glustervol01

Deleting volume will erase all information about the volume. Do you
want to continue? (y/n) y
Deleting volume glustervol01 has been successful

I then re-created it using the documented process:

$ gluster volume create glustervol01 replica 2 transport tcp \
fedora-cluster01.homefetch.net:/gluster/vol01 \
fedora-cluster02.homefetch.net:/gluster/vol01
Creation of volume glustervol01 has been successful. Please start the
volume to access data.

$ gluster volume start glustervol01

Starting volume glustervol01 has been successful
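For reference, mounting the volume from a client with the native glusterfs (FUSE) client looks something like this (the /mnt/gluster mount point is just a placeholder I picked for illustration):

$ mkdir -p /mnt/gluster

$ mount -t glusterfs fedora-cluster01.homefetch.net:/glustervol01 /mnt/gluster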

Once the new volume was created and started, I mounted it on my clients. When I went to access the volume I was quite intrigued to find that the data that was written to the previous gluster volume was still present:

$ ls -l

total 24
drwxr-xr-x 126 root root 12288 Nov 26 2011 etc
drwxr-xr-x 126 root root 4096 Nov 26 13:07 etc2
drwxr-xr-x 126 root root 4096 Nov 26 13:07 etc3
drwx------ 2 root root 4096 Nov 26 2011 lost+found

Ey? Since ‘gluster volume delete’ spat out the message “Deleting volume will erase all information about the volume”, I figured the contents of the volume would be nuked (never assume, always confirm!). That doesn’t appear to be the case here. When you delete a volume, the only thing that is removed is the metadata that describes the volume; the data sitting on the bricks is left untouched. It would be helpful if the developers noted this in the output above. I can see this causing headaches for folks down the road.
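If you actually want the old contents gone before reusing a brick, you have to clean up the brick yourself on each server that held a copy of the data. A quick sketch, assuming /gluster/vol01 is the brick directory (double check the path first, this is obviously destructive):

# wipe the old volume's files from the brick
$ rm -rf /gluster/vol01/*

# show any gluster extended attributes still attached to the brick directory
$ getfattr -d -m . /gluster/vol01

The getfattr command (from the attr package, run as root) is just there to show any gluster extended attributes that may still be hanging off the brick directory itself if you plan to reuse it for a new volume.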

Some interesting insights on the gluster replicated volume replica value

This article was posted by Matty on 2011-11-27 10:05:00 -0400

While playing around with gluster, I had an interesting finding about the way gluster handles replicated volumes. The gluster volume I am using for testing is a replicated volume with a replica factor of 2 (the replica factor determines how many copies of your data will be made). I wanted to add a third replica to my volume, and thought it would be as simple as using the “add-brick” option:

$ gluster volume add-brick glustervol01 \
centos-cluster03.prefetch.net:/gluster/vol01
Incorrect number of bricks supplied 1 for type REPLICATE with count 2

Hmmmm – no go. At first I thought this was no big deal; I figured there was an option or setting I needed to change to increase my replica count. I couldn’t find an option in the official documentation, and after reading through a number of mailing list postings I came across a horrific finding: from what I have been able to gather so far, you cannot add a third replica to a volume that was created with a replica count of 2. Erf!

Being somewhat curious, I was wondering if I could work around this limitation by creating a volume with a replica value higher than the number of bricks that were specified on the command line. This would allow me to grow the number of replicated bricks as needed, giving me some buffer room down the road. Well – this doesn’t work either:

$ gluster volume create glustervol01 replica 3 transport tcp \
fedora-cluster01.homefetch.net:/gluster/vol01 \
fedora-cluster02.homefetch.net:/gluster/vol01
number of bricks is not a multiple of replica count
Usage: volume create <new-volname> [stripe <count>] [replica <count>]
[transport <tcp|rdma|tcp,rdma>] <new-brick> ...

So this leaves me with the following options to change my volume layout:

  1. Replace a brick with the “replace-brick” option (sketched after this list).
  2. Remove a brick with the “remove-brick” option and then add a new brick with the “add-brick” option.
  3. Destroy my volume and re-create it with a replica factor of 3.
  4. Add replica count bricks with the “add-brick” option.
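For reference, option #1 boils down to the replace-brick workflow, which migrates data from the old brick to the new one. Here is a rough sketch using my lab hostnames as placeholders (you would normally run “replace-brick ... status” and wait for the migration to finish before committing):

$ gluster volume replace-brick glustervol01 \
fedora-cluster02.homefetch.net:/gluster/vol01 \
centos-cluster03.prefetch.net:/gluster/vol01 start

$ gluster volume replace-brick glustervol01 \
fedora-cluster02.homefetch.net:/gluster/vol01 \
centos-cluster03.prefetch.net:/gluster/vol01 commit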

So you may be asking: why not just do #4 and add two more bricks to the volume to make gluster happy? There are two reasons:

  1. I only have one node to add at this time (hardware doesn’t grow on trees).
  2. Adding two more bricks with “add-brick” would create a distributed replicated volume. This doesn’t increase the replica factor for my data; it adds two more replicated bricks to the volume (see this post for additional detail, and the sketch below).
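To make the second point concrete, here is roughly what gluster would accept for this replica 2 volume (the fourth hostname is hypothetical, I only have three nodes). The two new bricks would form a second replica pair, and files would then be distributed across the two pairs rather than a third copy of everything being made:

$ gluster volume add-brick glustervol01 \
centos-cluster03.prefetch.net:/gluster/vol01 \
centos-cluster04.prefetch.net:/gluster/vol01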

As with ALL storage-related solutions, you need to do your homework prior to deploying gluster. Make sure you take into account how things will need to look down the road, and design your gluster solution around this vision (and make sure you have a good backup and recovery solution in place in case you need to make drastic changes). Also make sure to test out your vision to ensure it works as you expect it to. I’m a huge fan of beating the heck out of technology in a lab and learning as much as I can in a non-production environment. I don’t like getting bitten once something goes live and my users depend on it.

If someone is aware of a way to add a third replica to a volume, please leave me a comment (as well as a link to documentation that talks about it) and I’ll update the blog entry. I’ve searched and searched and searched and have yet to come up with anything. If there truly is no way to expand the number of replicas in a volume, I would consider this a serious limitation of gluster. With disk sizes growing like mad, I could definitely see it being useful to expand the replica factor for an existing volume. It won’t be too long before you can install a 10TB disk drive in your PC, and when those are $80 a pop a replica value of three doesn’t seem that unrealistic (just ask Adam Leventhal).

CentOS 6 Linux VMs running inside vSphere 4.1 appear to dynamically discover new LUNs

This article was posted by Matty on 2011-11-27 08:50:00 -0400

I came across an interesting discovery yesterday while working on a CentOS 6 gluster node. The node was virtualized inside vSphere 4.1 and needed some additional storage added to it. I went into the VI client and added a new disk while the server was running, expecting to have to reboot or rescan the storage devices in the server. Well, I was pleasantly surprised when the following messages popped up on the console:

[Image: VMware resizing console messages]

Nice, it looks like the device was added to the system dynamically! I ran dmesg to confirm:

$ dmesg | tail -14

mptsas: ioc0: attaching ssp device: fw_channel 0, fw_id 1, phy 1, sas_addr 0x5000c295575f0957
scsi 2:0:1:0: Direct-Access VMware Virtual disk 1.0 PQ: 0 ANSI: 2
sd 2:0:1:0: [sdb] 75497472 512-byte logical blocks: (38.6 GB/36.0 GiB)
sd 2:0:1:0: [sdb] Write Protect is off
sd 2:0:1:0: [sdb] Mode Sense: 03 00 00 00
sd 2:0:1:0: [sdb] Cache data unavailable
sd 2:0:1:0: [sdb] Assuming drive cache: write through
sd 2:0:1:0: Attached scsi generic sg2 type 0
sd 2:0:1:0: [sdb] Cache data unavailable
sd 2:0:1:0: [sdb] Assuming drive cache: write through
sdb: unknown partition table
sd 2:0:1:0: [sdb] Cache data unavailable
sd 2:0:1:0: [sdb] Assuming drive cache: write through
sd 2:0:1:0: [sdb] Attached SCSI disk

Rock on! In the past I’ve had to reboot virtual machines or rescan the storage devices to find new LUNs. This VM was configured with an LSI Logic SAS controller and is running CentOS 6. I’m not sure if something changed in the storage stack in CentOS 6, or if the SAS controller is the one to thank for this nicety. Either way I’m a happy camper, and I love it when things just work! :)
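If you ever run into a configuration where the hot add isn’t picked up automatically, you can usually poke the kernel into rescanning a SCSI host adapter by hand. A quick sketch (the host number will vary; host2 just matches the “scsi 2:0:1:0” lines in the dmesg output above):

# list the SCSI host adapters known to the kernel
$ ls /sys/class/scsi_host/

# ask a specific adapter to rescan all channels, targets and LUNs
$ echo "- - -" > /sys/class/scsi_host/host2/scan

The "- - -" wildcards tell the adapter to scan every channel, target and LUN, and any new devices should show up in dmesg just like the output above.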

Installing gluster on a CentOS machine via rpmbuild

This article was posted by Matty on 2011-11-27 08:28:00 -0400

I talked previously about my experience getting gluster up and running on Fedora and CentOS Linux servers. The installation process as it currently stands is different between Fedora and CentOS servers. The Fedora package maintainers have built RPMs for gluster, so you can use yum to install everything needed to run gluster:

$ yum install glusterfs glusterfs-fuse glusterfs-server glusterfs-vim glusterfs-devel

Gluster packages aren’t currently available for CentOS 6 (at least they aren’t in extras or centosplus as of this morning), so you are required to build from source if you want to use CentOS as your base operating system. The build process is pretty straightforward, and I’ll share my notes and gotchas with you.
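If you are reading this down the road, it’s worth a quick check to see if packages have landed in the repositories since this was written. Something like this should tell you:

$ yum list available 'glusterfs*'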

Before you compile a single piece of source code, you will need to make sure the development tools group, as well as the rpcbind, readline, fuse, libibverbs and rpm development packages, are installed. If these aren’t installed you can run yum to add them to your system:

$ yum -y groupinstall "Development tools"

$ yum -y install fuse fuse-devel rpcbind readline-devel libibverbs-devel rpm-devel

Once the pre-requisites are installed you can download and build gluster:

$ wget http://download.gluster.com/pub/gluster/glusterfs/LATEST/glusterfs-3.2.5.tar.gz

$ rpmbuild -ta glusterfs-3.2.5.tar.gz

The rpmbuild utility’s “-ta” option (build source and binary packages from an archive) will build RPMs from a tar archive, and the packages that are produced will be placed in the rpmbuild directory in your home directory once rpmbuild does its magic:

$ cd /home/matty/rpmbuild/RPMS/x86_64

$ ls -la

total 5896
drwxr-xr-x. 2 root root 4096 Nov 26 17:32 .
drwxr-xr-x. 3 root root 4096 Nov 26 17:32 ..
-rw-r--r--. 1 root root 1895624 Nov 26 17:32 glusterfs-core-3.2.5-1.el6.x86_64.rpm
-rw-r--r--. 1 root root 3988860 Nov 26 17:32 glusterfs-debuginfo-3.2.5-1.el6.x86_64.rpm
-rw-r--r--. 1 root root 49260 Nov 26 17:32 glusterfs-fuse-3.2.5-1.el6.x86_64.rpm
-rw-r--r--. 1 root root 50732 Nov 26 17:32 glusterfs-geo-replication-3.2.5-1.el6.x86_64.rpm
-rw-r--r--. 1 root root 35032 Nov 26 17:32 glusterfs-rdma-3.2.5-1.el6.x86_64.rpm

The glusterfs.spec file in the tar archive you downloaded includes the RPM specification, so you can extract the archive and review this file if you are curious what rpmbuild is being instructed to do. To install the packages above we can use our good buddy rpm:

$ rpm -ivh *

Preparing... ########################################### [100%]
1:glusterfs-core ########################################### [ 20%]
2:glusterfs-fuse ########################################### [ 40%]
3:glusterfs-rdma ########################################### [ 60%]
4:glusterfs-geo-replicati########################################### [ 80%]
5:glusterfs-debuginfo ########################################### [100%]

If the packages installed successfully you can configure your gluster node and start the glusterd service:

$ chkconfig glusterd on

$ service glusterd start
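To double check that the packages landed and the daemon actually came up, something along these lines should do the trick:

$ rpm -qa 'glusterfs*'

$ service glusterd status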

While not quite as easy as ‘yum install gluster*', it’s still pretty darn simple to get gluster installed and operational on a CentOS Linux server.

Wiping disk drive data with Darik's boot and nuke

This article was posted by Matty on 2011-11-26 17:00:00 -0400

Over the years I have accumulated dozens of disk drives. As I upgrade drives and donate my older hardware to friends and charities, I like to make sure the data on those drives is wiped. I’ve been using Darik’s boot and nuke (DBAN) to wipe my drives for the past year or two, and the entire process couldn’t be easier.

DBAN is a bootable Linux CDROM image that wipes a hard drive using one of several strong data destruction algorithms (Gutmann wipe, DoD short, DoD long, etc.). To wipe a drive you boot from a CDROM with the DBAN ISO image, select the drives you want to wipe, hit “M” to choose the algorithm to wipe the drives with, and then sit back and watch as the program wipes your drives clean:

[Image: DBAN wipe in progress]

If you decide to use this technique, make sure to unplug all disk drives with valid data. If you forget and leave them plugged in and accidentally select one of them (or use the autonuke option), your data WILL BE DESTROYED! Use this information at your own risk. The author will not be held liable for any data loss that results from the information in this blog entry.
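One thing that makes the “unplug everything you care about” step easier is jotting down the model and serial number of each attached drive before you boot DBAN, so you can match them up against the physical labels. A rough sketch (the device name is just an example):

# list every disk the kernel currently sees
$ fdisk -l

# print the model and serial number for a specific drive
$ hdparm -I /dev/sda | egrep 'Model|Serial'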