The Linux Logical Volume Manager (LVM) provides a relatively easy way to combine block devices into a pool of storage from which you can allocate space. In LVM terminology, there are three main concepts:
Physical volumes (PVs): the block devices (whole disks or partitions) that are handed over to LVM.
Volume groups (VGs): pools of storage built from one or more physical volumes.
Logical volumes (LVs): the chunks of space carved out of a volume group that you put file systems (or swap, raw database storage, etc.) on.
When you use LVM to manage your storage, you will typically do something similar to this when new storage requests are made:
Initialize the new block devices as physical volumes with pvcreate.
Add the physical volumes to a new or existing volume group with vgcreate or vgextend.
Carve a logical volume out of the volume group with lvcreate (or grow an existing one with lvextend).
Create a file system on the logical volume and mount it.
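Here is a minimal sketch of that workflow; the device names, the volume group name (DataVG) and the logical volume name (data01) are just examples, as is the choice of file system:
$ pvcreate /dev/sdb /dev/sdc                                # initialize the block devices as physical volumes
$ vgcreate DataVG /dev/sdb /dev/sdc                         # build a volume group out of them
$ lvcreate -n data01 -L 10G DataVG                          # carve out a 10GB logical volume
$ mkfs -t ext4 /dev/DataVG/data01                           # create a file system on it
$ mkdir -p /data01 && mount /dev/DataVG/data01 /data01      # and mount it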
With this approach you can end up with free space in one or more physical volumes or volume groups, depending on how you provisioned the storage. To see how much free space your physical volumes have you can run the pvs utility without any arguments:
$ pvs
PV VG Fmt Attr PSize PFree
/dev/sda2 VolGroup lvm2 a-- 8.51g 0
/dev/sdb DataVG lvm2 a-- 18.00g 18.00g
/dev/sdc DataVG lvm2 a-- 18.00g 184.00m
The “PFree” column shows the free space for each physical volume in the system. To see how much free space your volume groups have you can run the vgs utility without any arguments:
$ vgs
VG #PV #LV #SN Attr VSize VFree
DataVG 2 1 0 wz--n- 35.99g 18.18g
VolGroup 1 2 0 wz--n- 8.51g 0
In the vgs output the “VFree” column shows the amount of free space in each volume group. LVM is nice, but I’m definitely a ZFS fan when it comes to storage management. I’m hopeful that Oracle will come around and port ZFS to Linux, since it would benefit a lot of users and might help repair some of the strained relations between Oracle and the opensource community. I may be too much of an optimist though.
I’ve been a long-time follower of the OpenBSD project and their amazing work on detecting stack and heap overflows and protecting the kernel and applications against them. Several of the concepts developed by the OpenBSD team made their way into Linux by way of the exec-shield project. Of the many useful security features that are part of exec-shield, the two that a SysAdmin can control directly are address space layout randomization and the exec-shield operating mode.
Address space randomization is controlled through the kernel.randomize_va_space sysctl tunable, which defaults to 1 on my CentOS systems:
$ sysctl kernel.randomize_va_space
kernel.randomize_va_space = 1
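If you want a different setting (a value of 2 additionally randomizes the heap), it can be changed on the fly with sysctl -w and made persistent by adding it to /etc/sysctl.conf; the value of 2 here is just an example:
$ sysctl -w kernel.randomize_va_space=2
$ echo "kernel.randomize_va_space = 2" >> /etc/sysctl.conf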
The exec-shield operating mode is controlled through the kernel.exec-shield sysctl value, and can be set to one of the following four modes (the descriptions below came from Steve Grubb’s excellent post on exec-shield operating modes):
A value of 0 completely disables ExecShield and Address Space Layout Randomization
A value of 1 enables them ONLY if the application bits for these protections are set to “enable”
A value of 2 enables them by default, except if the application bits are set to “disable”
A value of 3 enables them always, whatever the application bits
The default exec-shield value on my CentOS servers is 1, which enables exec-shield for applications that have been compiled to support it:
$ sysctl kernel.exec-shield
kernel.exec-shield = 1
To view the list of running processes that have exec-shield enabled, you can run Ingo Molnar and Ulrich Drepper’s lsexec utility:
$ lsexec --all |more
init, PID 1, UID root: no PIE, no RELRO, execshield enabled
httpd, PID 11689, UID apache: DSO, no RELRO, execshield enabled
httpd, PID 11691, UID apache: DSO, no RELRO, execshield enabled
httpd, PID 11692, UID apache: DSO, no RELRO, execshield enabled
httpd, PID 11693, UID apache: DSO, no RELRO, execshield enabled
httpd, PID 12224, UID apache: DSO, no RELRO, execshield enabled
httpd, PID 12236, UID apache: DSO, no RELRO, execshield enabled
pickup, PID 16181, UID postfix: DSO, partial RELRO, execshield enabled
appLoader, PID 2347, UID root: no PIE, no RELRO, execshield enabled
auditd, PID 2606, UID root: DSO, partial RELRO, execshield enabled
audispd, PID 2608, UID root: DSO, partial RELRO, execshield enabled
restorecond, PID 2629, UID root: DSO, partial RELRO, execshield enabled
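If you don’t have the lsexec utility handy, you can approximate the PIE and RELRO columns with readelf (the httpd path below is just an example). A “Type: DYN” in the ELF header indicates a PIE or DSO, a GNU_RELRO program header indicates RELRO, and a BIND_NOW entry in the dynamic section is what upgrades partial RELRO to full RELRO:
$ readelf -h /usr/sbin/httpd | grep 'Type:'
$ readelf -l /usr/sbin/httpd | grep GNU_RELRO
$ readelf -d /usr/sbin/httpd | grep BIND_NOW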
In this day and age of continuous security threats there is little to no reason not to be using these amazing technologies. When you combine exec-shield, SELinux, proper patching and security best practices, you can really limit the attack vectors that can be used to break into your systems.
I’ve been looking at some opensource scheduling packages, and while doing my research I came across the fcron package. Fcron is a replacement for vixie cron and anacron, and provides a number of super useful features, including anacron-style execution of jobs that were missed while a machine was down, scheduling based on system load, and per-job control over things like nice values and mail delivery.
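As a quick sketch of how the entries look: fcrontab is fcron’s equivalent of crontab (fcrontab -e to edit, fcrontab -l to list), and an @-style line is scheduled on elapsed run time instead of wall-clock time, which is what lets jobs catch up on machines that aren’t powered on around the clock. The script path below is made up:
@ 1d /usr/local/bin/cleanup.sh
That entry asks fcron to run the script roughly once per day of run time.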
My initial testing has been positive, and I definitely plan to keep this package in my back pocket. I’m still looking at various opensource schedulers, and if you have any experience in this area please leave me a comment. I’m curious which solutions worked well for my readers. :)
I talked about the ZFS scrub feature a few months back. In the latest Solaris 10 update the developers added additional scrub statistics, which are quite handy for figuring out throughput and estimated completion times:
$ zpool scrub rpool
$ zpool status -v
pool: rpool
state: ONLINE
scan: scrub in progress since Tue Dec 6 07:45:31 2011
1005M scanned out of 81.0G at 29.5M/s, 0h46m to go
0 repaired, 1.21% done
config:
NAME STATE READ WRITE CKSUM
rpool ONLINE 0 0 0
c1t0d0s0 ONLINE 0 0 0
errors: No known data errors
This sure beats the previous output! Nice job team Solaris.
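One related tidbit: if a scrub gets kicked off at an inconvenient time, it can be stopped with the -s flag:
$ zpool scrub -s rpool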
In a previous post I talked about my problems getting gluster to expand the number of replicas in a volume. While experimenting with the gluster utility’s “add-brick” option, I wanted to see if adding two more bricks would replicate the existing data across all four bricks (two old, two new), or if the two new bricks would become one replica pair and the two previous bricks would remain the other. To see what would happen I added two more bricks:
$ gluster volume add-brick glustervol01 centos-cluster01.homefetch.net:/gluster/vol01 centos-cluster02.homefetch.net:/gluster/vol01
Add Brick successful
And then checked out the status of the volume:
$ gluster volume info glustervol01
Volume Name: glustervol01
Type: Distributed-Replicate
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: fedora-cluster01.homefetch.net:/gluster/vol01
Brick2: fedora-cluster02.homefetch.net:/gluster/vol01
Brick3: centos-cluster01.homefetch.net:/gluster/vol01
Brick4: centos-cluster02.homefetch.net:/gluster/vol01
Interesting. The volume is now a distributed-replicate volume with a two by two configuration, giving four bricks in total. This configuration is similar to RAID 10, where you stripe across mirrors. The two previous bricks form one mirror (replica pair), and the two new bricks form the second. I confirmed this by copying files to my gluster file system and then checking the bricks to see where the files landed:
$ cd /gluster
$ cp /etc/services file1
$ cp /etc/services file2
$ cp /etc/services file3
$ cp /etc/services file4
$ ls -la
total 2648
drwxr-xr-x 4 root root 8192 Nov 27 2011 .
dr-xr-xr-x. 23 root root 4096 Nov 12 15:44 ..
drwxr-xr-x 2 root root 16384 Nov 27 2011 etc1
-rw-r--r-- 1 root root 656517 Nov 27 2011 file1
-rw-r--r-- 1 root root 656517 Nov 27 2011 file2
-rw-r--r-- 1 root root 656517 Nov 27 2011 file3
-rw-r--r-- 1 root root 656517 Nov 27 2011 file4
drwx------ 2 root root 20480 Nov 26 21:11 lost+found
Four files were copied to the gluster file system, and it looks like two landed on each replicated pair of bricks. Here is the ls listing from the first pair (I pulled this from one of the two nodes):
$ ls -la
total 1328
drwxr-xr-x. 4 root root 4096 Nov 27 10:00 .
drwxr-xr-x. 3 root root 4096 Nov 26 17:53 ..
drwxr-xr-x. 2 root root 4096 Nov 27 10:00 etc1
-rw-r--r--. 1 root root 656517 Nov 27 10:00 file1
-rw-r--r--. 1 root root 656517 Nov 27 10:01 file2
drwx------. 2 root root 16384 Nov 26 21:11 lost+found
And here is the listing from the second replicated pair of bricks:
$ ls -la
total 1324
drwxr-xr-x 4 root root 4096 Nov 27 10:00 .
drwxr-xr-x 3 root root 4096 Nov 12 20:05 ..
drwxr-xr-x 126 root root 12288 Nov 27 10:00 etc1
-rw-r--r-- 1 root root 656517 Nov 27 10:00 file3
-rw-r--r-- 1 root root 656517 Nov 27 10:00 file4
drwx------ 2 root root 4096 Nov 26 21:11 lost+found
So there you have it. Adding two more bricks with “add-brick” adds a new pair of replicated bricks; it doesn’t mirror the data between the old bricks and the new ones. Given the description of a distributed replicated volume in the official documentation, this makes total sense. Now to play around with some of the other redundancy types.
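As a side note, newer gluster releases also let you query file placement from the client mount through a virtual extended attribute, which saves logging into each brick. A quick sketch against one of the files above (run on the machine where /gluster is mounted):
$ getfattr -n trusted.glusterfs.pathinfo /gluster/file1
The output should list the backend brick paths that hold the file, lining up with the ls listings shown earlier.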