Archive for 'Linux Debugging'

Summarizing system call activity on Linux systems

Linux has a gazillion debugging utilities available. One of my favorite tools for debugging problems is strace, which lets you observe the system calls a process is making in real time. Strace also has a “-c” option to summarize system call activity:

    $ strace -c -p 28009
    Process 28009 attached
    Process 28009 detached
    % time […]

Using the automated bug-reporting tool (abrt) to generate core dumps when a Linux process fails

Software fails, and it often fails at the worst possible time. When failures occur I want to understand why, and will usually start piecing together the events that led up to the issue. Some application issues can be root-caused by reviewing logs, but catastrophic crashes will often require the admin to sit down with gdb […]

Dealing with xauth “error in locking authority file” errors

I recently logged into one of my servers and received the following error:

    $ ssh foo
    matty@foo’s password:
    Last login: Tue Nov 1 13:42:52 2011 from
    /usr/bin/xauth: error in locking authority file /home/matty/.Xauthority

I hadn’t seen this one before, but based on “locking issues” I’ve encountered in the past, I ran strace against […]

One way to avoid tcpdump “packets dropped by kernel” messages

I have been knee-deep this week in debugging a rather complex DNS issue. I’ll do a full write-up on that next week. While I was debugging the issue I needed to fire up tcpdump to watch the DNS queries from one of my authoritative servers to various servers on the Internet. What I noticed […]

Why isn’t Oracle using huge pages on my Red Hat Linux server?

I am currently working on upgrading a number of Oracle RAC nodes from RHEL4 to RHEL5. After I upgraded the first node in the cluster, my DBA contacted me because the RHEL5 node was extremely sluggish. When I looked at top, I saw that a number of kswapd processes were consuming CPU: $ top top […]

Using the Linux parted utility to re-create a lost partition table

I ran into an issue last week where two nodes using shared storage lost the partition table on one of the storage devices they were accessing. This was extremely evident in the output from fdisk:

    $ fdisk -l /dev/sdb
    Disk /dev/sdb: 107.3 GB, 107374182400 bytes
    255 heads, 63 sectors/track, 13054 cylinders
    Units = cylinders of […]

