Blog O' Matty


Monitoring the ZFS ARC cache

This article was posted by Matty on 2007-10-21 23:48:00 -0400

The ZFS file system uses the adaptive replacement cache (ARC) to cache data in the kernel. Measuring ARC utilization is pretty straightforward, since ZFS populates a number of kstat values with usage data. Neelakanth Nadgir wrote a cool Perl script to summarize the ARC kstats, and a sample run is included below:

$ arcstat.pl

    Time  read  miss  miss%  dmis  dm%  pmis  pm%  mmis  mm%  arcsz     c
12:31:28     0     0      0     0    0     0    0     0    0   283M  283M
12:31:30     0     0      0     0    0     0    0     0    0   283M  283M
12:31:32     3     1     42     1   42     0    0     1   42   283M  283M
12:31:34     0     0      0     0    0     0    0     0    0   283M  283M
12:31:36     4     1     25     1   25     0    0     1   33   283M  283M
12:31:38     0     0      0     0    0     0    0     0    0   283M  283M
12:31:40    24    23     95    23   95     0    0    23   95   284M  283M
12:31:42    35    33     95    33   95     0    0    33   95   202K  283M
12:31:44     0     0      0     0    0     0    0     0    0   202K  283M
12:31:46     0     0      0     0    0     0    0     0    0   202K  283M
12:31:48     0     0      0     0    0     0    0     0    0   202K  283M
12:31:50     0     0      0     0    0     0    0     0    0   202K  283M

Since numerous discussions have come up on zfs-discuss regarding ARC sizing (the size of the ARC is controlled by the zfs:zfs_arc_max tunable), folks will find the “arcsz” column extremely useful. Nice!
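If you don't have arcstat.pl handy, the same numbers can be pulled straight from the arcstats kstats. Here is a minimal sketch that converts the "size" statistic to megabytes, the way arcstat.pl's "arcsz" column does. The sample data is hard-coded so the pipeline can be shown anywhere; on a Solaris host you would generate it with `kstat -p zfs:0:arcstats`:

```shell
# Sample kstat -p output (tab-separated name/value pairs); the values
# here are made up for illustration.
sample='zfs:0:arcstats:size	296747008
zfs:0:arcstats:c	296747008
zfs:0:arcstats:c_max	1073741824'

# Convert the current ARC size (bytes) to megabytes.
arc_mb=$(printf '%s\n' "$sample" |
    awk -F'	' '$1 == "zfs:0:arcstats:size" { printf "%d", $2 / 1048576 }')
echo "arcsz: ${arc_mb}M"
```

The "c" statistic is the ARC's current target size, and "c_max" is the ceiling set by zfs:zfs_arc_max, so the same awk pattern works for those as well.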

Enabling the DTrace hotspot provider after the JVM starts

This article was posted by Matty on 2007-10-18 00:40:00 -0400

While debugging a JVM performance issue a while back, I encountered the following error when I enabled the DTrace hotspot provider:

$ jinfo -flag +ExtendedDTraceProbes `pgrep java`

590: Unable to open door: target process not responding or HotSpot VM not loaded

After a bit of debugging, I figured out that the jinfo command needs to be run by the user the JVM runs as. Hopefully this will help others who encounter this annoying problem.
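The fix boils down to an ownership check: jinfo attaches through a door that only the JVM's own user can open. The sketch below demonstrates the check, using this shell's own PID as a stand-in for the JVM pid so it can run anywhere; on a real host you would substitute the output of `pgrep java`:

```shell
# Stand-in for the JVM pid; replace with `pgrep java` on a real host.
pid=$$

# Who owns the target process, and who are we?
owner=$(ps -o user= -p "$pid" | awk '{print $1}')
me=$(id -un)

if [ "$owner" = "$me" ]; then
    echo "ok: jinfo -flag +ExtendedDTraceProbes $pid can open the door"
else
    echo "run as $owner (e.g. via su) or jinfo will fail to open the door"
fi
```

When the users differ, wrapping the call in `su` as the JVM's user (or using pfexec/sudo to become that user) avoids the "Unable to open door" error.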

Stopping nfsmapid from querying DNS TXT records

This article was posted by Matty on 2007-10-18 00:23:00 -0400

With the introduction of NFSv4, user and group identifiers were changed to use the username@domain format. On Solaris hosts, the domain is determined using the following methods:

  1. The NFSMAPID_DOMAIN variable is checked in /etc/default/nfs
  2. DNS is queried for the _nfsv4idmapdomain TXT record
  3. The configured DNS domain is used
  4. The file /etc/defaultdomain is consulted

If a site doesn’t update the NFSMAPID_DOMAIN variable when deploying NFSv4, DNS will be queried for the domain to use. If the DNS server doesn’t contain a _nfsv4idmapdomain TXT record, you will see failed queries similar to the following:

host1 -> host2 ETHER Type=0800 (IP), size = 77 bytes
host1 -> host2 IP D=1.2.3.4 S=1.2.3.5 LEN=63, ID=19779, TOS=0x0,
TTL=255
host1 -> host2 UDP D=53 S=52032 LEN=43
host1 -> host2 DNS C _nfsv4idmapdomain. Internet TXT ?
host2 -> host1 ETHER Type=0800 (IP), size = 77 bytes
host2 -> host1 IP D=1.2.3.5 S=1.2.3.4 LEN=63, ID=26996, TOS=0x0,
TTL=254
host2 -> host1 UDP D=52032 S=53 LEN=43
host2 -> host1 DNS R Error: 3(Name Error)

This can of course pose a problem for large sites, since the DNS server will be inundated with queries for records that don’t exist. If you want to stop these DNS queries from happening, you can add the domain to the NFSMAPID_DOMAIN variable in /etc/default/nfs. Shibby!
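Making the change is a one-line edit to /etc/default/nfs. The sketch below applies it to a throwaway copy of the file so it can run anywhere, and "example.com" is a placeholder for your actual NFSv4 domain:

```shell
# Work on a temp copy standing in for /etc/default/nfs.
conf=$(mktemp)
printf '#NFSMAPID_DOMAIN=domain\n' > "$conf"

# Uncomment the variable and point it at the local domain.
sed 's/^#*NFSMAPID_DOMAIN=.*/NFSMAPID_DOMAIN=example.com/' "$conf" > "$conf.new"

domain_line=$(grep '^NFSMAPID_DOMAIN' "$conf.new")
echo "$domain_line"
rm -f "$conf" "$conf.new"
```

After editing the real /etc/default/nfs, restart the mapping daemon (`svcadm restart svc:/network/nfs/mapid`) so it picks up the new domain and stops issuing the TXT queries.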

Isolating network traffic with IP instances

This article was posted by Matty on 2007-10-09 21:31:00 -0400

With the introduction of Nevada build 57, the Solaris IP stack was enhanced to support IP instances. IP instances allow you to create one or more unique TCP/IP stacks on a server, and each stack can be managed independently. What makes these extremely powerful is the ability to assign an IP instance to a zone or Xen instance, and then configure the IP stack attributes (e.g., IP filter policies, DHCP settings, etc.) from inside the zone or Xen guest domain.

To create an IP instance and assign it to a Solaris zone, you will first need to identify a spare physical NIC to dedicate to the zone (when Crossbow comes around, you will be able to allocate virtual NICs to zones, and these virtual NICs can reside on a physical NIC). Once a NIC is identified, you can use the zonecfg “ip-type” directive and the “exclusive” keyword to allocate an IP instance to a zone:

zonecfg:apache> create
zonecfg:apache> set zonepath=/zones/apache
zonecfg:apache> set ip-type=exclusive
zonecfg:apache> add net
zonecfg:apache:net> set physical=e1000g1
zonecfg:apache:net> end
zonecfg:apache> verify
zonecfg:apache> commit
zonecfg:apache> exit
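The interactive session above can also be scripted, since zonecfg accepts a command file via its "-f" option. This sketch just builds the command file; on a Solaris host you would follow it with `zonecfg -z apache -f "$cmdfile"`:

```shell
# Write the same directives used interactively above into a command file.
cmdfile=$(mktemp)
cat > "$cmdfile" <<'EOF'
create
set zonepath=/zones/apache
set ip-type=exclusive
add net
set physical=e1000g1
end
verify
commit
EOF

# Sanity-check the file before handing it to zonecfg -z apache -f.
lines=$(wc -l < "$cmdfile" | tr -d ' ')
echo "zonecfg command file ready ($lines directives)"
rm -f "$cmdfile"
```

Driving zonecfg from a file makes the configuration repeatable, which is handy when you are stamping out several zones with exclusive IP instances.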

Once a zone that uses an IP instance is created, the NIC can be configured just like any other interface on a Solaris server. Here is an example of how to plumb an interface in a zone, and apply a basic IP filter policy to that zone:

$ zlogin -C apache

$ ifconfig e1000g1 plumb

$ ifconfig e1000g1 inet 192.168.1.2 netmask 255.255.255.0 broadcast + up

$ route add default 192.168.1.1

$ cat /etc/ipf/ipf.conf

### Block all inbound and outbound traffic by default
block in log on e1000g1 all head 100
block out log on e1000g1 all head 150

### Allow inbound SSH connections
pass in quick proto tcp from any to any port = 22 keep state group 100

### Allow my box to utilize all UDP, TCP and ICMP services
pass out quick proto tcp all flags S/SA keep state group 150
pass out quick proto udp all keep state group 150
pass out quick proto icmp all keep state group 150

$ svcadm enable ipfilter

$ ipf -f /etc/ipf/ipf.conf

As you can see, this is no different than configuring a physical IP interface from the global zone! IP instances are amazingly cool, and sites that need to isolate traffic between zones will definitely be happy (I am sure they will be even happier once Crossbow is available)!

Getting core files when a Solaris host gets confused

This article was posted by Matty on 2007-10-07 12:21:00 -0400

In the past few months, I have had a couple of Solaris hosts go haywire (e.g., zones hanging, network interfaces no longer responding, etc.). When problems similar to these occur, I like to generate a core file from the running kernel to help the Sun support organization isolate the problem. There are two ways that I am aware of to grab a core file from a borked system. The first method utilizes the reboot utility's "-d" option:

$ reboot -d

This will reboot the host, and will generate a core file as part of the reboot. The second method uses the savecore utility to generate a core file from a running system. To use savecore, you will need to first configure a dedicated dump device (since swap is most likely in use on a production host, you can’t use it). Once the dedicated dump device is configured, the savecore utility can be run with the “-L” option:

$ savecore -Lv
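Putting the dump device setup and the live capture together looks roughly like the sketch below. The disk slice name is an assumption (substitute a slice you can dedicate), and the snippet is guarded so it is a no-op on systems without dumpadm:

```shell
if command -v dumpadm >/dev/null 2>&1; then
    dumpadm -d /dev/dsk/c0t1d0s1        # dedicate a slice to crash dumps
    dumpadm -s "/var/crash/$(hostname)" # directory savecore writes into
    savecore -Lv                        # capture a dump from the live kernel
    status="configured"
else
    status="skipped (no dumpadm; not a Solaris host)"
fi
echo "dump setup: $status"
```

Running `dumpadm` with no arguments afterwards is a quick way to confirm that the dedicated device and savecore directory took effect.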

It’s all about a bug free Solaris / Nevada, and these are two methods to help us get there.