The netkit-ftp client that ships with Red Hat Enterprise Linux comes with a verbose option which, among other things, instructs the client to print the number of bytes transferred after each file is successfully sent. These messages look similar to the following:

85811076 bytes sent in 1.3e+02 seconds (6.7e+02 Kbytes/s)

I had several enormous files (each > 2 GB) I needed to move to another server, and noticed that the netkit-ftp client wasn’t printing status messages after the files were transferred. To see what was causing the issue, I started reading through the netkit-ftp source code. After a few minutes of poking around ftp.c, I came across this gem:

void
sendrequest(const char *cmd, char *local, char *remote, int printnames)
{
    volatile long bytes = 0;    /* running byte count; long is 32 bits on 32-bit platforms */

    while ((c = read(fileno(fin), buf, sizeof (buf))) > 0) {
        printf("Bytes (%ld) is incremented by %d\n", bytes, c);
        bytes += c;
        for (bufp = buf; c > 0; c -= d, bufp += d)
            if ((d = write(fileno(dout), bufp, c)) <= 0)
                break;
    ......
}

I reckon the folks who developed this code never transferred files larger than 2^31 bytes on 32-bit platforms, where a signed long tops out at 2^31 - 1. After changing bytes (and the code that uses bytes) to use the unsigned long long data type, everything worked as expected. I digs me some open source!

Posted by matty, filed under Linux Debugging. Date: April 27, 2007, 7:06 pm

While testing out LDAP authentication on a CentOS 4.4 Linux host this week, I noticed that the “password” statements I added to /etc/pam.d/sshd weren’t taking effect:

password    requisite     /lib/security/$ISA/pam_cracklib.so retry=3
password    sufficient    /lib/security/$ISA/pam_unix.so nullok use_authtok md5 shadow
password    sufficient    /lib/security/$ISA/pam_ldap.so use_authtok
password    required      /lib/security/$ISA/pam_deny.so

After pondering the issue for a while, I eventually started to wonder if the “passwd” utility was called by sshd to change user passwords. To see if this was the case, I decided to expire a user’s password, and then strace sshd while I logged in as that user:

$ strace -f -e trace=execve -p 2616

Process 2616 attached - interrupt to quit
--- SIGCHLD (Child exited) @ 0 (0) ---
Process 26638 attached
[pid 26638] execve("/usr/sbin/sshd", ["/usr/sbin/sshd", "-R"], [/* 14 vars */]) = 0
Process 26639 attached
Process 26639 detached
[pid 26638] --- SIGCHLD (Child exited) @ 0 (0) ---
Process 26640 attached
Process 26641 attached
[pid 26641] execve("/usr/bin/passwd", ["passwd"], [/* 14 vars */]) = 0
Process 26641 detached
[pid 26640] --- SIGCHLD (Child exited) @ 0 (0) ---
Process 26640 detached
[pid 26638] --- SIGCHLD (Child exited) @ 0 (0) ---
Process 26638 detached
--- SIGCHLD (Child exited) @ 0 (0) ---
Process 2616 detached

Sure enough, /usr/bin/passwd is called to change an expired password. To verify that sshd itself was the entity invoking /usr/bin/passwd, I used the strings utility to see if the string “/usr/bin/passwd” was embedded in the sshd executable:

$ strings sshd | grep passwd

kerberosorlocalpasswd
/usr/bin/passwd
auth2-passwd.c
%s: struct passwd size mismatch
sshpam_passwd_conv
sshpam_auth_passwd

Once I knew that sshd called /usr/bin/passwd, I added my changes to /etc/pam.d/system-auth (which is “stacked” by pam_stack.so in /etc/pam.d/passwd), and everything worked as expected. I kinda dig the stacking capabilities that come out of the box with CentOS 4.4, since you can make a change in one location (/etc/pam.d/system-auth), and its effects are propagated to every service definition in /etc/pam.d that stacks it.

Posted by matty, filed under Linux Debugging. Date: January 20, 2007, 4:00 pm

While debugging a problem a few weeks back, I needed to generate a core file from a hung process. I typically use the gcore utility to generate core files from running processes, but in this case I was already attached to the process with gdb, so gcore failed:

$ gcore 2575
ptrace: Operation not permitted.
You can’t do that without a process to debug.

Gak! I remembered reading about a gdb option that would dump core, so I wandered off to read through my gdb notes. Sure enough, gdb has a “generate-core-file” command to create a core file:

$ gdb -q - 2575

Attaching to process 2575
Reading symbols from /usr/sbin/gpm...(no debugging symbols found)...done.
Using host libthread_db library "/lib/tls/libthread_db.so.1".
Reading symbols from /lib/tls/libm.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib/tls/libm.so.6
Reading symbols from /lib/tls/libc.so.6...
(no debugging symbols found)...done.
Loaded symbols for /lib/tls/libc.so.6
Reading symbols from /lib/ld-linux.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib/ld-linux.so.2

0x0046e7a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2

(gdb) generate-core-file
Saved corefile core.2575

(gdb) detach
Detaching from program: /usr/sbin/gpm, process 2575

(gdb) quit

$ ls -al core.*
-rw-r--r--  1 root root 2468288 Dec 11 13:49 core.2575

Nifty! I am starting to wonder if there is anything gdb can’t do. :)

Posted by matty, filed under Linux Debugging. Date: December 21, 2006, 11:19 am