Debugging DTrace

While catching up on the DTrace discussion list tonight, I came across a cool DTrace debugging recommendation from Adam Leventhal. Adam mentioned using the DTRACE_DEBUG flag, which will cause libdtrace to spew tons of debugging data each time dtrace is invoked:

$ export DTRACE_DEBUG=1

$ dtrace -n ‘syscall:::entry { @calls[execname] = count(); }’

libdtrace DEBUG: opened provider fasttrap
libdtrace DEBUG: opened provider fbt
libdtrace DEBUG: opened provider lockstat
libdtrace DEBUG: opened provider profile
libdtrace DEBUG: opened provider sdt
libdtrace DEBUG: opened provider systrace
libdtrace DEBUG: opened 64-bit module unix (/platform/sun4u/kernel/sparcv9/unix) [0]
libdtrace DEBUG: opened 64-bit module krtld (misc/sparcv9/krtld) [1]
libdtrace DEBUG: opened 64-bit module genunix (/platform/sun4u/kernel/sparcv9/genunix) [2]
libdtrace DEBUG: opened 64-bit module platmod (/platform/SUNW,Ultra-5_10/kernel/misc/sparcv9/platmod) [3]
libdtrace DEBUG: opened 64-bit module SUNW,UltraSPARC-IIi (/platform/sun4u/kernel/cpu/sparcv9/SUNW,UltraSPARC-IIi) [4]
libdtrace DEBUG: opened 64-bit module specfs (/kernel/fs/sparcv9/specfs) [6]
libdtrace DEBUG: opened 64-bit module dtrace (/kernel/drv/sparcv9/dtrace) [7]

[ ...... ]

dtrace: description 'syscall:::entry ' matched 225 probes
^C

  utmpd                                                             2
  fmd                                                               4
  inetd                                                             4
  svc.configd                                                       5
  svc.startd                                                        5
  sshd                                                             16
  nscd                                                             77
  ipmon                                                            94
  mysqld                                                          149
  dtrace                                                         1784
libdtrace DEBUG: dt_buf_destroy(section headers): size=1024 resizes=1
libdtrace DEBUG: dt_buf_destroy(string table): size=512 resizes=0
libdtrace DEBUG: dt_buf_destroy(loadable data): size=512 resizes=0
libdtrace DEBUG: dt_buf_destroy(unloadable data): size=512 resizes=0
libdtrace DEBUG: dt_buf_destroy(probe data): size=512 resizes=0
libdtrace DEBUG: dt_buf_destroy(probe args): size=512 resizes=0
libdtrace DEBUG: dt_buf_destroy(probe offs): size=512 resizes=0
libdtrace DEBUG: dt_buf_destroy(probe rels): size=512 resizes=0

This is super cool, and useful for tracking down problems, or learning more about what DTrace does behind the scenes.

Maximizing IP Filter performance with group statements

Solaris 10 ships with the IP filter firewall package, which is a fast, flexible and verstile firewall engine. IP filter by default will read /etc/ipf/ipf.conf to get the list of rules (e.g., pass in all) to apply to traffic as it traverses the interfaces in a system, and these rules will be applied in a top down fashion. If rules are not optimally ordered and grouped, CPU cycles will be wasted, since IP filter will have to check the packet against a set of rules that don’t apply to the type of traffic that is passing through the interface. The following rule set is a good example of a poorly written IP filter configuration file:

# Allow all packets to flow on the loopback interface (wish we had skip on here)
pass in quick on lo0 all
pass out quick on lo0 all

# Set up default block all policy for eri0
block in log on eri0 all

# Set up default block all policy for eri1
block in log on eri1 all

# Apply rules
pass in quick on eri0 proto tcp from any to 192.168.1.3/32 port = 22 keep state
pass in quick on eri0 proto udp from any to 192.168.1.3/32 port = 53 keep state
pass in quick on eri0 proto tcp from any to 192.168.1.3/32 port = 80 keep state
pass in quick on eri0 proto tcp from any to 192.168.1.3/32 port = 110 keep state
pass in quick on eri0 proto tcp from any to 192.168.1.3/32 port = 143 keep state
pass in quick on eri0 proto tcp from any to 192.168.1.3/32 port = 443 keep state
pass in quick on eri1 proto tcp from any to 192.168.1.3/32 port = 993 keep state
pass in quick on eri1 proto tcp from any to 192.168.1.3/32 port = 995 keep state

When a sytem using this rule set receives a new connection on TCP port 995 (secure POP), the IP Filter firewall engine will need to evaluate 8 rules prior to allowing the connection through. As the number of rules increases, the CPU overhead required to enforce policy can quickly exceed the resources in a system. IP filter solves this problem by allowing rules to be grouped and assigned to an object (e.g, network interface) with the “group” statement. The following example shows how to arrange the above rules with better performing group statements:

# Allow all packets to flow on the loopback interface (wish we had skip on here)
pass in quick on lo0 all
pass out quick on lo0 all

# Set up default block all policy for eri0, and associate rule group 100 to interface
block in log on eri0 all head 100

# Set up default block all policy for eri1, and associate rule group 200 to interface
block in log on eri1 all head 200

# Associate the following rules with group 100, and order by usage
pass in quick proto udp from any to 192.168.1.3/32 port = 53 keep state group 100
pass in quick proto tcp from any to 192.168.1.3/32 port = 80 keep state group 100
pass in quick proto tcp from any to 192.168.1.3/32 port = 443 keep state group 100
pass in quick proto tcp from any to 192.168.1.3/32 port = 110 keep state group 100
pass in quick proto tcp from any to 192.168.1.3/32 port = 143 keep state group 100
pass in quick proto tcp from any to 192.168.1.3/32 port = 22 keep state group 100

Associate the following rules with group 200, and order by usage
pass in quick proto tcp from any to 192.168.1.3/32 port = 993 keep state group 200
pass in quick proto tcp from any to 192.168.1.3/32 port = 995 keep state group 200

When this rule set is used, a connection to TCP port 995 will be matched by the second rule in group 200. Since we avoided six rule checks in this example, IP filter had to perform less work to get the same results (an allowed connection). This might not seem like a big deal on a workstation or lightly loaded server, but firewalls that handle thousands of connections each minute can definitely benefit from group statements and rule placement optimizations.

Jarod Jenson coming to Atlanta to talk about DTrace

If you live in the Atlanta, GA area, you might be interested to know that Jarod Jenson will be talking about DTrace April 19th at the Marriott Perimeter Center. The event is being put on by Sun FREE of charge, and will be way better than any training course you could take on the subject ( there is one exception; a class by Brendan Gregg would be just as awesome )! Sun is requiring pre-registration to ensure that enough seats are available, and the meeting will be held at a swank location! I plan to wander to the event with Clay, and I am hoping to learn more about DTrace, the DVM provider, debugging and DTrace implementation details. Jarod is one of the smartest people at Sun, and the Atlanta SysAdmin community is truly fortunate to get him for a day. If you can make time, I HIGHLY recommend wandering to the Marriott to hear Jarod talk! You WILL NOT be disappointed!

Getting around smartmontools linker errors

I posted a blog entry a while back that showed how to setup smartd on a Solaris system. Two individuals left comments indicating that smartmontools wouldn’t build, and one individual (Peter) received the following errors during the build process:

gcc  -g -O2 -Wall -W -c `test -f 'os_solaris_ata.s' || echo './'`os_solaris_ata.s
gcc  -g -O2 -Wall -W   -o smartd  smartd.o atacmdnames.o  atacmds.o ataprint.o knowndrives.o scsicmds.o scsiprint.o utility.o o
s_solaris.o os_solaris_ata.o -lnsl
ld: fatal: relocation error: R_SPARC_32: file os_solaris_ata.o: symbol : offset 0xfea5feef is non-aligned
ld: fatal: relocation error: R_SPARC_32: file os_solaris_ata.o: symbol : offset 0xfea5fef5 is non-aligned
ld: fatal: relocation error: R_SPARC_32: file os_solaris_ata.o: symbol : offset 0xfea5fef9 is non-aligned
ld: fatal: relocation error: R_SPARC_32: file os_solaris_ata.o: symbol : offset 0xfea5fefd is non-aligned
ld: fatal: relocation error: R_SPARC_32: file os_solaris_ata.o: symbol : offset 0xfea72806 is non-aligned
ld: fatal: relocation error: R_SPARC_32: file os_solaris_ata.o: symbol : offset 0xfea77436 is non-aligned
collect2: ld returned 1 exit status

I was able to recreate the problem on my Sun Ultra 10 running Solaris 10. To get smartmontools to compile, I ran “configure,” “make,” and then waited for the build process to fail. Once the build errored out, I removed the “-g” option from the gcc options and executed the following statements by hand:

$ gcc -O2 -Wall -W -c `test -f ‘os_solaris_ata.s’ || echo ‘./’`os_solaris_ata.s

$ make

Since os_solaris_ata.s is a hand coded SPARC assembly file, I am not 100% certain why the smartmontools authors are trying to add debugging symbols to the object file. I will need to do some additional digging to find the answer.

Viewing dynamic executable dependencies

Large dynamically linked executables can have a LOT of dependencies, which are resolved by the runtime linker when a program is executed. To see which libraries an executable and the executable’s shared dependencies depend on, the ldd utility can be used:

$ ldd /usr/sbin/metastat

        libmeta.so.1 =>  /lib/libmeta.so.1
        libc.so.1 =>     /lib/libc.so.1
        libnsl.so.1 =>   /lib/libnsl.so.1
        libadm.so.1 =>   /lib/libadm.so.1
        libdevid.so.1 =>         /lib/libdevid.so.1
        libgen.so.1 =>   /lib/libgen.so.1
        libefi.so.1 =>   /lib/libefi.so.1
        libdevinfo.so.1 =>       /lib/libdevinfo.so.1
        libscf.so.1 =>   /lib/libscf.so.1
        libmp.so.2 =>    /lib/libmp.so.2
        libmd5.so.1 =>   /lib/libmd5.so.1
        libuuid.so.1 =>  /lib/libuuid.so.1
        libnvpair.so.1 =>        /lib/libnvpair.so.1
        libdoor.so.1 =>  /lib/libdoor.so.1
        libuutil.so.1 =>         /lib/libuutil.so.1
        libsocket.so.1 =>        /lib/libsocket.so.1
        libm.so.2 =>     /lib/libm.so.2
        /platform/SUNW,Ultra-5_10/lib/libc_psr.so.1
        /platform/SUNW,Ultra-5_10/lib/libmd5_psr.so.1

To view just the shared libraries that the executable depends on, the elfdump “-d” (dump the contents of the dynamic symbol table) option can be used:

$ /usr/ccs/bin/elfdump -d /usr/sbin/metastat | grep NEEDED

       [0]  NEEDED           0x867             libmeta.so.1
       [1]  NEEDED           0x884             libc.so.1

In the examples above, metastat only requires two libraries, but those two libraries have numerous dependencies.

Debugging problems with Solaris in.dhcpd vendor options

While attempting to jumpstart my Sun Ultra 10 this week, I encountered the following error:

ok boot net:dhcp – install
Boot device: /pci@1f,0/pci@1,1/network@1,1:dhcp File and args: – install
38800 panic – boot: Could not mount filesystem.
Program terminated

The machine was getting a kernel, but for some reason was unable to mount the Solaris miniroot. Since DHCP was used to jumpstart the client, the location of the miniroot should have been provided as a DHCP option. To begin debugging the problem, I fired up snoop on the jumpstart server to see which options (options are used to pass custom items such as the boot server to the client) were sent to the client:

$ snoop -vv -d hme0 ether 0:3:ba:9:62:f8 | grep DHCP


DHCP: Message type = DHCPDISCOVER
DHCP: Client Class Identifier = "SUNW.Ultra-5_10"
DHCP: Requested Options:
DHCP:    1 (Subnet Mask)
DHCP:    3 (Router)
DHCP:   12 (Client Hostname)
DHCP:   43 (Vendor Specific Options)
DHCP: Maximum DHCP Message Size = 1472 bytes

[ ..... ]

DHCP: Message type = DHCPOFFER
DHCP: DHCP Server Identifier = 192.168.1.3
DHCP: IP Address Lease Time = -1 seconds
DHCP: Subnet Mask = 255.255.255.0
DHCP: Boot File Name = SUNW.sun4u

Well this is odd, the server is returning just the basic options (i.e. boot file name, IP address, netmask, etc.). Since the jumpstart-specific options weren’t returned, I ran the dhtadm utility with the “-P” (print the contents of DHCP tables) option to print the symbols and macros in the DHCP tables:

$ dhtadm -P

Name                    Type            Value
==================================================
Solaris11SPARC          Macro           :SinstNM="tigger":SinstIP4=192.168.1.3:SinstPTH="/media/solaris/11_sparc/current":SrootNM="tigger":SrootIP4=192.168.1.3:SrootPTH="/media/solaris/11_sparc/current/Solaris_11/Tools/Boot":BootFile="SUNW.sun4u":SsysidCF="192.168.1.3:/media/solaris/11_sparc/":SjumpsCF="192.168.1.3:/media/solaris/":BootSrvA=192.168.1.3:
Solaris10x86            Macro           :SinstNM="tigger":SinstIP4=192.168.1.3:SinstPTH="/media/solaris/10_x86/current":SrootNM="tigger":SrootIP4=192.168.1.3:SrootPTH="/media/solaris/10_x86/current/Solaris_10/Tools/Boot":BootFile="SUNW.sun4u":SsysidCF="192.168.1.3:/media/solaris/10_x86/":SjumpsCF="192.168.1.3:/media/solaris/":BootSrvA=192.168.1.3:
Solaris10SPARC          Macro           :SinstNM="tigger":SinstIP4=192.168.1.3:SinstPTH="/media/solaris/10_sparc/current":SrootNM="tigger":SrootIP4=192.168.1.3:SrootPTH="/media/solaris/10_sparc/current/Solaris_10/Tools/Boot":BootFile="SUNW.sun4u":SsysidCF="192.168.1.3:/media/solaris/10_sparc/":SjumpsCF="192.168.1.3:/media/solaris/":BootSrvA=192.168.1.3:
FedoraCore5             Macro           :BootFile="fedora5/pxelinux.0":BootSrvA=192.168.1.3:
PXEClient:Arch:00000:UNDI:002001        Macro           :BootSrvA=192.168.1.3:
tigger                  Macro           :Include=Locale:Timeserv=192.168.1.3:LeaseTim=86400:LeaseNeg:DNSdmain="foo.com":DNSserv=192.168.1.1:
Locale                  Macro           :UTCoffst=-18000:
Sterm                   Symbol          Vendor=SUNW.Ultra-10 SUNW.i86pc,15,ASCII,1,0
SjumpsCF                Symbol          Vendor=SUNW.Ultra-10 SUNW.i86pc,14,ASCII,1,0
SsysidCF                Symbol          Vendor=SUNW.Ultra-10 SUNW.i86pc,13,ASCII,1,0
SinstPTH                Symbol          Vendor=SUNW.Ultra-10 SUNW.i86pc,12,ASCII,1,0
SinstNM                 Symbol          Vendor=SUNW.Ultra-10 SUNW.i86pc,11,ASCII,1,0
SinstIP4                Symbol          Vendor=SUNW.Ultra-10 SUNW.i86pc,10,IP,1,1
SbootRS                 Symbol          Vendor=SUNW.Ultra-10 SUNW.i86pc,9,NUMBER,2,1
Stz                     Symbol          Vendor=SUNW.Ultra-10 SUNW.i86pc,8,ASCII,1,0   
SbootFIL                Symbol          Vendor=SUNW.Ultra-10 SUNW.i86pc,7,ASCII,1,0
SswapPTH                Symbol          Vendor=SUNW.Ultra-10 SUNW.i86pc,6,ASCII,1,0
SswapIP4                Symbol          Vendor=SUNW.Ultra-10 SUNW.i86pc,5,IP,1,0
SrootPTH                Symbol          Vendor=SUNW.Ultra-10 SUNW.i86pc,4,ASCII,1,0
SrootNM                 Symbol          Vendor=SUNW.Ultra-10 SUNW.i86pc,3,ASCII,1,0
SrootIP4                Symbol          Vendor=SUNW.Ultra-10 SUNW.i86pc,2,IP,1,1
SrootOpt                Symbol          Vendor=SUNW.Ultra-10 SUNW.i86pc,1,ASCII,1,0

At first glance, the DHCP vendor options looked fine, so I started to think through how a DHCP jumpstart installation works. After pondering this for a few minutes, it dawned on me that in.dhcpd uses the Client Class Identifier (e.g., SUNW.Ultra-30) to identify the client and return the options specific to that client. To see what my Ultra 10 was reporting itself as, I fired up snoop:

$ snoop -vv -d hme0 ether 0:3:ba:9:62:f8 | grep DHCP

[ ….. ]

DHCP: Client Class Identifier = “SUNW.Ultra-5_10”

Bingo! The client reported itself as a SUNW.Ultra-5_10, but I had specified SUNW.Ultra-10 when creating the symbols. To fix the issue, I updated the pertinent symbols, restarted in.dhcpd, and low and behold the jumpstart worked flawlessly! The moral of this story: there is none. :)