The subtleties between the NXDOMAIN, NOERROR and NODATA DNS response codes

This past weekend I spent some time troubleshooting a DNS timeout issue. During my debugging session I noticed some subtle differences when querying an A and AAAA record with dig:

$ dig  +answer @216.92.61.30 prefetch.net a
.......
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 9624
;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 2, ADDITIONAL: 1

$ dig  +answer @216.92.61.30 prefetch.net aaaa
.......
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 16528
;; flags: qr aa rd; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1

When I was interpreting the results I was expecting dig to provide a response code of NODATA when I asked the DNS server for a resource record that didn’t exist. Obviously that didn’t happen. This led me to ask myself what is the technical difference between NODATA, NOERROR and NXDOMAIN? Let’s take a peak at the descriptions of each in RFC 1035:

RCODE           Response code - this 4 bit field is set as part of
                responses.  The values have the following
                interpretation:

                0               No error condition

                1               Format error - The name server was
                                unable to interpret the query.

                2               Server failure - The name server was
                                unable to process this query due to a
                                problem with the name server.

                3               Name Error - Meaningful only for
                                responses from an authoritative name
                                server, this code signifies that the
                                domain name referenced in the query does
                                not exist.

NXDOMAIN (RCODE 3 above) is pretty straight forward. When this code is present the record being requested doesn’t exist in any shape or form:

$ dig +answer @216.92.61.30 gak.prefetch.net a

;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 8782
;; flags: qr aa rd; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1

Now what about NODATA and NOERROR? Well I found out there isn’t an RCODE associated with NODATA. This value is computed based on the answers to one or more questions and dig represents NODATA by displaying NOERROR with an ANSWER of zero. So what does NOERROR with an ANSWER of 0 actually represent? It means one or more resource records exist for this domain but there isn’t a record matching the resource record type (A, AAAA, MX, etc.). This was a useful clarification for me and helped me isolate and fix the issue I was debugging. Sometimes the devil is in the details.

Locating WWPNs on Linux servers

I do a lot of storage-related work, and often times need to grab WWPNs to zone hosts and to mask storage. To gather the WWPNs I would often times use the following script on my RHEL and CentOS servers:

#!/bin/sh

FC_PATH="/sys/class/fc_host"
 
for fc_adapter in `ls ${FC_PATH}`
do  
    echo "${FC_PATH}/${fc_adapter}:"
    NAME=$(awk '{print $1}' ${FC_PATH}/${fc_adapter}/symbolic_name )
    echo "  $NAME `cat ${FC_PATH}/${fc_adapter}/port_name`"
done

This produced consolidated output similar to:

$ print_wwpns

/sys/class/fc_host/host5:
  QLE2562 0x21000024ff45afc2
/sys/class/fc_host/host6:
  QLE2562 0x21000024ff45afc3

While doing some research I came across sysfsutils, which contains the incredibly useful systool utility. This nifty little tool allows you to query sysfs values, and can be used to display all of the sysfs attributes for your FC adapters:

$ systool -c fc_host -v

Class = "fc_host"

  Class Device = "host5"
  Class Device path = "/sys/class/fc_host/host5"
    fabric_name         = "0x200a547fee9e5e01"
    issue_lip           = 
    node_name           = "0x20000024ff45afc2"
    port_id             = "0x0a00c0"
    port_name           = "0x21000024ff45afc2"
    port_state          = "Online"
    port_type           = "NPort (fabric via point-to-point)"
    speed               = "8 Gbit"
    supported_classes   = "Class 3"
    supported_speeds    = "1 Gbit, 2 Gbit, 4 Gbit, 8 Gbit"
    symbolic_name       = "QLE2562 FW:v5.06.03 DVR:v8.03.07.09.05.08-k"
    system_hostname     = ""
    tgtid_bind_type     = "wwpn (World Wide Port Name)"
    uevent              = 

  Class Device = "host6"
  Class Device path = "/sys/class/fc_host/host6"
    fabric_name         = "0x2014547feeba9381"
    issue_lip           = 
    node_name           = "0x20000024ff45afc3"
    port_id             = "0x9700a0"
    port_name           = "0x21000024ff45afc3"
    port_state          = "Online"
    port_type           = "NPort (fabric via point-to-point)"
    speed               = "8 Gbit"
    supported_classes   = "Class 3"
    supported_speeds    = "1 Gbit, 2 Gbit, 4 Gbit, 8 Gbit"
    symbolic_name       = "QLE2562 FW:v5.06.03 DVR:v8.03.07.09.05.08-k"
    system_hostname     = ""
    tgtid_bind_type     = "wwpn (World Wide Port Name)"
    uevent              = 

This output is extremely useful for storage administrators, and provides everything you need in a nice consolidated form. +1 for systool!

Configuring yum to keep more than three kernels

When you run ‘yum update’ on your Fedora system, the default yum configuration will keep the last 3 kernels. This allows you to fail back to a previous working kernel if you encounter an error or a bug. The number of kernels to keep is controlled by the installonly_limit option, which is thoroughly described in the yum.conf(8) manual page:

installonly_limit Number of packages listed in installonlypkgs to keep installed at the same time. Setting to 0 disables this feature. Default is ‘0’. Note that this functionality used to be in the “installonlyn” plugin, where this option was altered via. tokeep. Note that as of version 3.2.24, yum will now look in the yumdb for a installonly attribute on installed packages. If that attribute is “keep”, then they will never be removed.

If you need to keep more than 3 kernels, you can increase the value of installonly_limit in /etc/yum.conf.

Purging the yum header and package cache

Most of the Linux distributions that utilize the yum package manager cache headers and packages by default. These files are cached in the directory identified by the cachedir option, which defaults to /var/cache/yum on all of the hosts I checked. On my Fedora 16 desktop this directory has grown to 167MB in size:

$ du -sh /var/cache/yum
167M /var/cache/yum

You can clean out the cached directory with the yum “clean” option:

$ yum clean all

If disk space is an issue on your systems, you can also set the “keepcache” option to 0. This will remove cached files after they are installed, as noted in yum.conf(8)the manual page:

keepcache Either `1' or `0'. Determines whether or not yum keeps
          the cache of headers and packages after successful installation.
          Default is '1' (keep files)

This is a useful option for hosts that have limited disk space. Nice!

Sudo insults — what a fun feature!

I think humor plays a big role in life, especially the life of a SysAdmin. This weekend I was cleaning up some sudoers files and came across a reference to the “insult” option in the documentation. Here is what the manual says:

insults If set, sudo will insult users when they enter an incorrect password. This flag is off by default.”

This of course peaked my curiosity, and the description in the online documentation got me wondering what kind of insults sudo would spit out. To test this feature out I compiled sudo with the complete set of insults:

$ ./configure –prefix=/usr/local –with-insults –with-all-insults

$ gmake

$ gmake install

To enable insults I added “Defaults insults” to my sudoers file. This resulted in me laughing myself silly:

$ sudo /bin/false

Password: 
Take a stress pill and think things over.
Password: 
This mission is too important for me to allow you to jeopardize it.
Password: 
I feel much better now.
sudo: 3 incorrect password attempts

$ sudo /bin/false

Password: 
Have you considered trying to match wits with a rutabaga?
Password: 
You speak an infinite deal of nothing
Password: 
You speak an infinite deal of nothing
sudo: 3 incorrect password attempts

Life without laughter is pretty much a useless life. You can quote me on that one! ;)

Summarizing system call activity on Solaris hosts

I previously described how to use strace to to summarize system call activity on Linux hosts. Solaris provides similar data with the truss “-c” option:

$ truss -c -p 26396
syscall               seconds   calls  errors
read                     .000       3
write                   7.671   25038
time                     .000      21
stat                     .000      15
lseek                    .460   24944
getpid                   .000      15
kill                     .162    7664
sigaction                .004     237
writev                   .000       3
lwp_sigmask              .897   49887
pollsys                  .476    7667
                     --------  ------   ----
sys totals:             9.674  115494      0
usr time:               2.250
elapsed:              180.940

The output contains the total elapsed time, a breakdown of user and system time, the number of errors that occurred, the number of times each system call was invoked and the total accrued time for each system call. This has numerous uses, and allows you to easily see how a process is intereacting with the kernel. Sweet!

« Older Entries