I mentioned previously that I built out some new hardware. When I was spec’ing out the hardware, I made sure to get “green” components that supported advanced power management features. Solaris is able to take advantage of the CPU power states, and can lower the processor operating frequency when a server is idle. Power management is handled by the Solaris kernel, and configured through the /etc/power.conf configuration file. By default, the file will contain a CPU power management entry similar to the following:
cpupm enable cpu-threshold 1s
The cpupm directive enables CPU power management, and the cpu-threshold directive indicates how long the processor needs to be idle before the CPU frequency is lowered. To check the current operating frequency of a CPU, you can check the current_clock_Hz value for each CPU:
$ kstat -m cpu_info -i 0 -s current_clock_Hz
module: cpu_info instance: 0
name: cpu_info0 class: misc
current_clock_Hz 1100000000
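If you want the frequency for every CPU in a friendlier unit, a small filter can help. The following is a minimal sketch, assuming the parseable output of kstat's -p option (module:instance:name:statistic, a tab, then the value); the function name clock_hz_to_mhz is made up for illustration:

```shell
# Convert "kstat -p" lines (module:instance:name:statistic<TAB>value)
# into a per-CPU frequency in MHz. Reads from stdin, so it can also be
# tried against saved output.
clock_hz_to_mhz() {
    awk '
        $1 ~ /:current_clock_Hz$/ {
            split($1, key, ":")        # key[2] is the CPU instance number
            printf "cpu%s %d MHz\n", key[2], $2 / 1000000
        }'
}

# Usage on a Solaris host (assumed invocation, not run here):
#   kstat -p -m cpu_info -s current_clock_Hz | clock_hz_to_mhz
```

Fed the value shown above for instance 0, this prints "cpu0 1100 MHz".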
This is awesome stuff, and when the disk power management project is integrated, servers will hopefully be able to reduce their power consumption dramatically. Viva la Solaris!
I just posted an article on how to install, configure and debug the ISC DHCP server. The example configuration referenced in the article supports Windows, OS X, Linux and Solaris hosts, and provides the directives needed to PXE boot physical and virtual machines (I have tested it with Xen, KVM, Linux and Solaris). If you have any comments or suggestions, please let me know.
I recently re-designed my main website, and learned quite a bit about CSS and XHTML in the process. I also learned that there are a number of things you should do before re-designing a site, and thought I would list them here for folks who are looking to change their site layout:
Keeping these things in mind will make re-designing your site quite a bit easier, and may even turn up some surprises that you weren’t aware of.
One of the things I love about Solaris is its ability to generate a core file when a system panics. The core files are an invaluable resource for figuring out what caused a host to panic, and are often the first thing OS vendor support organizations will request when you open a support case. Linux provides the kdump, diskdump and netdump tools to collect a core file when a system panics, and although not quite as seamless as their Solaris counterpart, they work relatively well.
I’m not a huge fan of diskdump and netdump, since they have special pre-requisites (e.g., operational networking or a supported storage controller) that need to be met to ensure a core file is captured. Kdump does not. Kdump works by reserving a chunk of memory for a crash kernel, and then booting into this kernel when a box panics. Since the crash kernel uses a chunk of memory that is unused and reserved for this specific purpose, it can be sure that the memory it is using won’t taint the previous kernel. This approach also provides full access to the previous kernel’s memory, which is read and written off to disk or a network accessible location.
To configure kdump to write core files to the /var/crash directory on local disk, you will first need to install the kexec-tools package:
$ yum install kexec-tools
Once the package is installed, you will need to add a crashkernel line to the kernel boot arguments. This line contains the amount of memory to reserve for the crashkernel, and should look similar to the following (you may need to increase the amount of memory depending on the platform you are using):
title CentOS (2.6.18-128.el5)
root (hd0,0)
kernel /boot/vmlinuz-2.6.18-128.el5 ro root=LABEL=/ console=ttyS0 crashkernel=128M@16M
initrd /boot/initrd-2.6.18-128.el5.img
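After rebooting with the new entry, it is worth confirming that the running kernel actually received the argument. Here is a minimal sketch that pulls the crashkernel setting out of the kernel command line; the function name crashkernel_arg is hypothetical, and the optional file argument exists only so the function can be tried against a sample file:

```shell
# Print the crashkernel= argument from a kernel command line file.
# The path defaults to /proc/cmdline; passing another file is only
# meant for experimenting. Returns non-zero if the argument is absent.
crashkernel_arg() {
    cmdline_file=${1:-/proc/cmdline}
    # Put each boot argument on its own line, then keep the one we want.
    tr ' ' '\n' < "$cmdline_file" | grep '^crashkernel='
}

# Usage:
#   crashkernel_arg || echo "no crashkernel reservation on this boot"
```

If nothing is printed, the kernel was booted without a memory reservation and kdump will not be able to load the crash kernel.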
To allow you to get a core file if a box hangs, you can enable sysrq magic key sequences by setting “kernel.sysrq” to 1 in /etc/sysctl.conf (you can also use the sysctl “-w” option to enable this feature on an active host):
kernel.sysrq = 1
Once these settings are in place, you can enable the kdump service with the chkconfig and service commands:
$ chkconfig kdump on
$ service kdump start
If you want to verify that kdump is working, you can type “alt + sysrq + c” on the console, or echo a “c” character to the sysrq-trigger proc entry:
$ echo "c" > /proc/sysrq-trigger
SysRq : Trigger a crashdump
Linux version 2.6.18-128.el5 (mockbuild@builder10.centos.org)
....
This will force a panic, which should result in a core file being generated in the /var/crash directory:
$ pwd
/var/crash/2009-07-05-18:31
$ ls -la
total 317240
drwxr-xr-x 2 root root 4096 Jul 5 18:32 .
drwxr-xr-x 3 root root 4096 Jul 5 18:31 ..
-r-------- 1 root root 944203448 Jul 5 18:32 vmcore
If you are like me and prefer to be notified when a box panics, you can configure your log monitoring solution to look for the string “kdump: saved a vmcore” in /var/log/messages:
Jul 5 18:32:08 kvmnode1 kdump: saved a vmcore to /var/crash/2009-07-05-18:31
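If you don't have a log monitoring tool handy, the same check can be approximated by hand. This is a rough sketch, assuming the message format shown above where the dump directory is the last field on the line; the function name saved_vmcores is made up for illustration:

```shell
# List the vmcore locations that kdump has logged.
# The log path defaults to /var/log/messages; pass another path if
# your syslog configuration writes elsewhere.
saved_vmcores() {
    log_file=${1:-/var/log/messages}
    # Print the last field (the dump directory) of every matching line.
    awk '/kdump: saved a vmcore/ { print $NF }' "$log_file"
}

# Usage:
#   saved_vmcores                      # scan /var/log/messages
#   saved_vmcores /var/log/messages.1  # scan a rotated log
```

An empty result means no saved dumps were logged in that file.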
Kdump is pretty sweet, and it’s definitely one of those technologies that every RAS-savvy engineer should be configuring on each server he or she deploys.
Sparse files have become pretty common in the virtualization arena, since they allow you to present large chunks of disk space to guests without having to reserve the space in an actual backing store. This has a couple of benefits:
To create a sparse file on a Linux host, you can run dd with a count size of zero (this tells dd not to write any data to the file), and then use the seek option to extend the file to the desired size:
$ dd if=/dev/zero of=xen-guest.img bs=1 count=0 seek=8G
Once the sparse file is created, you can use du to verify how much space is allocated to it:
$ du -sh xen-guest.img
0 xen-guest.img
$ du -sh --apparent-size xen-guest.img
8.0G xen-guest.img
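The same comparison can be scripted when you want exact byte counts rather than rounded figures. The following is a minimal sketch, assuming GNU stat (the -c format option); the function name sparse_report is hypothetical:

```shell
# Print the apparent size and the allocated size of a file, in bytes.
# Relies on GNU stat: %s is the apparent size, %b the number of
# allocated blocks, and %B the size in bytes of each of those blocks.
sparse_report() {
    file=$1
    apparent=$(stat -c %s "$file")
    blocks=$(stat -c %b "$file")
    block_size=$(stat -c %B "$file")
    echo "apparent=$apparent allocated=$((blocks * block_size))"
}

# Usage:
#   sparse_report xen-guest.img
```

An allocated figure well below the apparent one indicates that the file is still sparse. As a guest writes data into the image, the allocated figure will grow toward the apparent size.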
Sparse files are extremely handy, though it’s important to know when and when not to use them.