Solaris ‘w’ command bug?

While reviewing some system information last week , I came across a discrepancy (which I would think is a bug — at least according to the idle time description in the w manual pages) in the Solaris ‘w’ output:

$ date
Wed Dec 21 10:19:59 EST 2005

$ w

10:19am  up 5 day(s), 19:52,  2 users,  load average: 0.02, 0.03, 0.04
User     tty           login@  idle   JCPU   PCPU  what
user1234 pts/1        10:19am    51                myapp

$ who -u

user1234   pts/1        Dec 21 10:19   .    10503      (1.2.3.4)

The user “user1234” logged in at 10:19, and according to w has been idle for 51 minutes (since they have been logged in for one minute or less, they obviously cannot be idle for 51 minutes). A quick check of w.c reveals that idle time is calculated using the atime from a stat of /dev/pts/[1-9]+:

if (stat(ttyname, &stbuf) != -1) {
    lastaction = stbuf.st_atime;
    diff = now - lastaction;
    diff = DIV60(diff);

While who.c uses the mtime from a stat of /dev/pts/[1-9]+:

if ((rc = stat(path, stbufp)) != -1) {
    idle = timnow - stbufp->st_mtime;

To understand when atime and mtime are updated, I read through the Solaris stat(2) manual page:

st_atime        Time  when  file  data  was  last  accessed.
                Changed by the following functions: creat(),
                mknod(), pipe(), utime(2), and read(2).

st_mtime        Time when data was last modified. Changed by
                the  following  functions: creat(), mknod(),
                pipe(), utime(), and write(2).

Since there seems to be some inconsistency (atime in w and mtime in who) in the source code, I thought I would look at my favorite OS (FreeBSD / Darwin) to see how they implemented w:

# From FreeBSD source code
touched = stp->st_atime;
if (touched < ep->utmp.ut_time) {
   /* tty untouched since before login */
   touched = ep->utmp.ut_time;
}
if ((ep->idle = now - touched) < 0)
   ep->idle = 0;

So this explains it. Since the user logged in and didn’t do anything, the pseudo-terminal would have been written to but not necessarily read from (I assume since the pseudo-terminal is setup by the terminal driver after the user is authenticated). FreeBSD fixes this problem by adding an additional check to compare idle time to the time the user logged in, which seems to address this issue. I posted this issue to the Solaris discussion forums, but unfortunately no one replied. Since the putback model with opensolaris is still somewhat new and requires an excessive amount of work on the contributor’s behalf, I think I will just chock this up to fun and move my code analysis efforts back to Apache and FreeBSD (which is a super neat OS).

StartTLS support for ssl-cert-check

Marco Amrein sent me a StartTLS patch for ssl-cert-check this week. This nifty patch allows ssl-cert-check to query servers that setup the TLS connection after the initial protocol handshake, which can be helpful for checking the certificate expiration on SMTP and POP3 servers:

$ ssl-cert-check -s mail.prefetch.net -p 25

Host                                            Status       Expires      Days Left
----------------------------------------------- ------------ ------------ ----------
mail.prefetch.net:25                             Valid        Jun 20 2006  179   

Thanks Marco!

Saving bandwidth with mod_deflate

While playing around with mod_deflate this week, I began to wonder how much bandwidth would be saved by enabling mod_deflate on a busy production site. To get a better idea of the bandwidth savings, I developed deflate-stats.pl:

$ deflate-stats.pl /var/tmp/apache/logs/deflate_log

Processed logfile /var/tmp/apache/logs/deflate_log

Total number of HTTP requests          : 177934
Total number of HTTP requests deflated : 150511
Total output bytes by mod_deflate      : 327.16 Megabytes
Total input bytes to mod_deflate       : 14.15 Gigabytes
Total bandwidth saved by mod_deflate   : 13.83 Gigabytes (97%)

Average request size                   : 98.55 Kilobytes
Average # bytes saved per request      : 96.32 Kilobytes

Requests  Total Bytes Requested  Bytes Actually Sent   Ratio  URI                                     
94506     13.90 Gigabytes        298.05 Megabytes      97     /nada.html                              
28007     41.24 Megabytes        13.46 Megabytes       67     /doc.txt                                
27998     206.13 Megabytes       15.65 Megabytes       92     /feedback.txt          

If you are paying for each byte of transit, I highly recommend testing out the awesome mod_deflate module. mod_deflate should be able to speed up rendering time for dial-up users, and could lead to a reduction in monthly circuit costs.

Using StartTLS with HTTP connections

While catching up with some news groups today, I came across RFC 2817. This RFC describes HTTP protocol extensions to allow a client and server to initiate a TLS session over an existing connection. This has numerous benefits, and could definitely speed up web-based commerce (e.g., a dedicated secure connection is not required, slow start is avoided, etc.) . Now if only the browser developers would implement this! :)

Bizarre SVM Issue

I had a disk drive fail in one of my machines this week, and used the typical drive replacement procedure (cfgadm / metadevadm / devfsadm) to replace the physical disk. Once the drive was replaced, I attempted to run metareplace to re-synchronize the two sub-mirrors:

$ metareplace -e d0 c0t1d0s0
d0: device c0t1d0s0 is enabled

$ metastat d0

d0: Mirror
    Submirror 0: d10
      State: Okay
    Submirror 1: d20
      State: Unavailable
    Pass: 1
    Read option: roundrobin (default)
    Write option: parallel (default)
    Size: 8391880 blocks (4.0 GB)

d10: Submirror of d0
    State: Okay
    Size: 8391880 blocks (4.0 GB)
    Stripe 0:
        Device     Start Block  Dbase        State Reloc Hot Spare
        c0t0d0s0          0     No            Okay   Yes


d20: Submirror of d0
    State: Unavailable
    Size: 8391880 blocks (4.0 GB)
    Stripe 0:
        Device     Start Block  Dbase        State Reloc Hot Spare
        c0t1d0s0          0     No               -   Yes

Eh? For some reason d20 refused to re-synchronize and enter the Okay state, and repeated attempts to use metareplace led to the same behavior. This seemed odd, so I decided to detach d20 and re-attach it with metadetach and metattach:

$ metadetach d0 d20
d0: submirror d20 is detached

$ metattach d0 d20
d0: submirror d20 is attached

These operations completed successfully, and once the re-synchronization completed, the sub-mirror entered the “Okay” state:

$ metastat d0

d0: Mirror
    Submirror 0: d10
      State: Okay
    Submirror 1: d20
      State: Okay
    Pass: 1
    Read option: roundrobin (default)
    Write option: parallel (default)
    Size: 8391880 blocks (4.0 GB)

d10: Submirror of d0
    State: Okay
    Size: 8391880 blocks (4.0 GB)
    Stripe 0:
        Device     Start Block  Dbase        State Reloc Hot Spare
        c0t0d0s0          0     No            Okay   Yes


d20: Submirror of d0
    State: Okay
    Size: 8391880 blocks (4.0 GB)
    Stripe 0:
        Device     Start Block  Dbase        State Reloc Hot Spare
        c0t1d0s0          0     No            Okay   Yes

I am starting to speculate that this is a bug in metareplace, but wasn’t able to pinpoint anything specific on sunsolve. The moral of the story is use metattach/metadetach if you don’t want to waste lots of time. :)