Setting up the OpenBSD watchdog daemon (watchdogd)


The watchdog daemon (watchdogd) was introduced in OpenBSD 3.8, and can be used to help machines automatically recover from system hangs. If the OpenBSD hardware watchdog daemon is enabled, it will periodically update the hardware watchdog timer built into the system. If this timer is not reset for a period of time, the hardware will reset itself. The watchdog daemon is not enabled by default, and can be enabled (assuming OpenBSD can find a watchdog timer in your system) by adding a pair of empty quotes to the watchdog_flags variables in /etc/rc.conf:

$ grep watchdog /etc/rc.conf
watchdogd_flags=“” # for normal use: “”

The update interval is controlled through the kern.watchdog.period variable, which can be set in /etc/sysctl.conf, and viewed with the sysctl(8) command:

$ sysctl -a | grep watchdog
kern.watchdog.period=30 kern.watchdog.auto=0

Using the hardware watchdog can be useful when you are running routers and access points in remote locations, and don’t want to spend time driving to a remote location to reboot a hung system. I always add an rc script to the servers I support to E-mail me when the system boots. If I get an E-mail while I am performing planned maintenance, I can toss it in the trash can. If I get an E-mail because the machine reboots due to faulty hardware or a kernel bug, I will know that the system reset, and can begin investigating the the source of the problem. There are definitely times (e.g., clustered nodes) when it’s better to leave the hardware watchdog disabled, and have a monitoring station alert you to a hung system.

This article was posted by Matty on 2006-09-03 16:59:00 -0400 EDT