Debugging the ipfilter SMF service

I logged into one of my Solaris 10 hosts today to add some additional firewall rules, and noticed that the ipfilter service was in the maintenance state:

$ svcs -x ipfilter

svc:/network/ipfilter:default (IP Filter)
 State: maintenance since Sat Oct 28 15:56:30 2006
Reason: Start method failed repeatedly, last exited with status 2.
   See: http://sun.com/msg/SMF-8000-KS
   See: ipfilter(5)
   See: /etc/svc/volatile/network-ipfilter:default.log
Impact: This service is not running.

This was odd, considering the service was working the last time I checked on the server. When I dumped the log file referenced in the svcs output above, I noticed that the shell script that starts ipfilter was failing with a syntax error at line 180:

$ cat /etc/svc/volatile/network-ipfilter:default.log

[ Oct 28 15:56:16 Enabled. ]
[ Oct 28 15:56:27 Executing start method ("/lib/svc/method/ipfilter start") ]
/lib/svc/method/ipfilter: syntax error at line 180: `end of file' unexpected
[ Oct 28 15:56:27 Method "start" exited with status 2 ]
[ Oct 28 15:56:27 Executing start method ("/lib/svc/method/ipfilter start") ]
/lib/svc/method/ipfilter: syntax error at line 180: `end of file' unexpected
[ Oct 28 15:56:28 Method "start" exited with status 2 ]
[ Oct 28 15:56:28 Executing start method ("/lib/svc/method/ipfilter start") ]
/lib/svc/method/ipfilter: syntax error at line 180: `end of file' unexpected

Since I hadn’t modified /lib/svc/method/ipfilter, I started to wonder why ipfilter had suddenly quit working. The error message above indicated that there was an error in the script at line 180, which is a bit misleading considering the script only has 179 lines:

$ cat /lib/svc/method/ipfilter | wc -l
179

To find the actual line that was causing the issue, I decided to change the shell in /lib/svc/method/ipfilter from /sbin/sh to /bin/bash. (As a side note: I still don’t quite understand why anyone would use /sbin/sh on Solaris hosts, considering zsh, tcsh and bash are available. If the reason is dependencies, Sun should consider moving the shells folks actually use into one of the core packages!) Once I made this change and invoked the script with the start option, bash notified me that line 123 was actually to blame:

$ /lib/svc/method/ipfilter start

/lib/svc/method/ipfilter: line 123: unexpected EOF while looking for matching ``'
/lib/svc/method/ipfilter: line 180: syntax error: unexpected end of file
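
A less invasive way to get a similar diagnostic, assuming bash is installed, is to have bash parse the script with its -n option, which reads the commands without executing them, so the interpreter line in the method script never needs to be edited. The sketch below is only an illustration: the throwaway script and the variable names are made up, and the script simply contains the same kind of unmatched backtick.

```shell
#!/bin/bash
# Build a small throwaway script with an unmatched backtick
# (hypothetical content, standing in for the damaged method script).
tmp_script=$(mktemp)
cat > "$tmp_script" <<'EOF'
#!/bin/sh
case "$1" in
start)
        echo starting`
        ;;
esac
EOF

# -n makes bash parse the file without running any of its commands.
check_output=$(bash -n "$tmp_script" 2>&1)
check_status=$?

echo "exit status: $check_status"
echo "$check_output"

rm -f "$tmp_script"
```

Against the real file this would be run as bash -n /lib/svc/method/ipfilter, and a non-zero exit status means the script did not parse cleanly.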

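Upon inspecting the ipfilter script in more detail, I noticed a stray "`" character at the end of line 123. The shell treated it as the start of a command substitution and read all the way to the end of the file looking for the closing "`", which is why /sbin/sh blamed line 180: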

case "$1" in
        start)
                [ ! -f ${IPFILCONF} ] && exit 0
                [ -n "$pfildpid" ] && kill -TERM $pfildpid 2>/dev/null
                [ -n "$pid" ] && kill -TERM $pid 2>/dev/null
                /usr/sbin/pfild >/dev/null
                if load_ippool && load_ipf && load_ipnat ; then
                        ipmon -Dsv`  <------- ** PROBLEM **
                else

Once I removed the "`" from line 123, everything worked as expected. I am still not certain what caused this to happen in the first place, and the SunSolve and OpenSolaris bug databases were not much help. If anyone else happens to experience this issue, please let me know!
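
One step worth spelling out: a service that has landed in the maintenance state stays there until it is cleared, even after the method script is repaired. A minimal sketch of that step follows, assuming a Solaris 10 host and sufficient privileges. The clear_maintenance function name is hypothetical; svcadm clear and svcs are the standard SMF commands.

```shell
#!/bin/bash
# Sketch: clear a repaired service out of maintenance, then show its state.
fmri="svc:/network/ipfilter:default"

# Hypothetical helper. On a host without SMF it only reports that svcadm is missing.
clear_maintenance() {
    if ! command -v svcadm >/dev/null 2>&1; then
        echo "svcadm not found; skipping $1"
        return 0
    fi
    # svcadm clear tells the restarter the fault is fixed; svcs prints the new state.
    svcadm clear "$1" && svcs "$1"
}

clear_maintenance "$fmri"
```

If the start method still fails, the service drops back into maintenance and the log file named by svcs -x is the place to look again.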

Converting an rc script to an SMF manifest

I use the ORCA utility to graph various system and application metrics, and have recently run into a few problems. The application periodically crashes for no apparent reason, and I haven’t had time to debug the issue (once I get a core file I will figure this out). Since I rely on the graphs to trend server and application capacity, I want to ensure that the application gets restarted each time a failure occurs. Since ORCA is running on a Solaris 10 server, I decided to convert the existing start/stop scripts to a Solaris 10 SMF manifest. To begin the conversion process, I first created a shell script that starts ORCA and clears any lockfiles that are present:

$ cat /usr/local/bin/orca.start

#!/bin/sh

if [ -d /var/orca/configs/orcallator.cfg.lock ]
then
        logger -p daemon.notice "Removing orca lockfile"
        rm -rf /var/orca/configs/orcallator.cfg.lock
fi

logger -p daemon.notice "Starting orca in daemon mode"

/usr/local/bin/orca -logfile /var/logs/orcallator.log \
                   -daemon /var/orca/configs/orcallator.cfg

Once I verified that the script worked correctly, I created an SMF manifest with a stop and start method and no dependencies:

$ cat orca.xml

<?xml version="1.0"?>
<!DOCTYPE service_bundle SYSTEM "/usr/share/lib/xml/dtd/service_bundle.dtd.1">

<service_bundle type='manifest' name='orca'>

<service
    name="application/orca"
    type="service"
    version="1">

    <create_default_instance enabled="true"/>

    <exec_method
          type='method'
          name='start'
          exec='/usr/local/bin/orca.start'
          timeout_seconds='0'>
    </exec_method>

    <exec_method
           type='method'
           name='stop'
           exec=':kill -15'
           timeout_seconds='3'>
    </exec_method>
</service>
</service_bundle>

After the manifest was created, I used the svccfg ‘validate’ option to verify the structure of the XML document:

$ svccfg validate orca.xml

$ echo $?
0

If svccfg encounters an error, it prints a message describing the problem and exits with a non-zero status. If the XML document validates, the svccfg ‘import’ option can be used to import the manifest into the SMF repository:

$ svccfg import orca.xml
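
The validate and import steps can be chained so that a manifest is only imported when it validates. This is a rough sketch rather than anything shipped with SMF: the import_manifest function name is hypothetical, and it relies only on the exit status of svccfg validate described above.

```shell
#!/bin/bash
# Sketch: import an SMF manifest only if "svccfg validate" accepts it.
# import_manifest is a hypothetical helper name, not part of SMF.
import_manifest() {
    local manifest=$1

    if ! command -v svccfg >/dev/null 2>&1; then
        echo "svccfg not found; this does not look like an SMF host" >&2
        return 127
    fi

    if svccfg validate "$manifest"; then
        svccfg import "$manifest"
    else
        echo "validation failed for $manifest; not importing" >&2
        return 1
    fi
}
```

Calling import_manifest orca.xml from the directory holding the manifest would then run both steps and stop at the first failure.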

Once the manifest has been imported into the SMF repository, the svccfg ‘listprop’ option can be used to display the service’s properties:

$ svccfg -s application/orca listprop
start method
start/exec astring /usr/local/bin/orca.start
start/timeout_seconds count 0
start/type astring method
stop method
stop/exec astring ":kill -15"
stop/timeout_seconds count 3
stop/type astring method

All of this took me about 15 minutes, and now when ORCA crashes SMF restarts the process, which generates the following messages in the system logfile:

Nov 9 11:56:03 winnie root: [ID 702911 daemon.notice] Removing orca lockfile
Nov 9 11:56:03 winnie root: [ID 702911 daemon.notice] Starting orca in daemon mode

The SMF team did an awesome job with this!!