Archive for 'Linux Debugging'
Software fails, and it often fails at the worst possible time. When failures occur, I want to understand why, and I usually start by piecing together the events that led up to the issue. Some application issues can be root-caused by reviewing logs, but catastrophic crashes will often require the admin to sit down with gdb [...]
I recently logged into one of my servers and received the following error:
    $ ssh foo
    matty@foo’s password:
    Last login: Tue Nov 1 13:42:52 2011 from 10.10.56.100
    /usr/bin/xauth: error in locking authority file /home/matty/.Xauthority
I hadn’t seen this one before, but based on “locking issues” I’ve encountered in the past, I ran strace against [...]
I have been knee-deep this week in debugging a rather complex DNS issue. I’ll do a full write-up on that next week. While I was debugging the issue, I needed to fire up tcpdump to watch the DNS queries from one of my authoritative servers to various servers on the Internet. What I noticed [...]
I am currently working on upgrading a number of Oracle RAC nodes from RHEL4 to RHEL5. After I upgraded the first node in the cluster, my DBA contacted me because the RHEL5 node was extremely sluggish. When I looked at top, I saw that a number of kswapd processes were consuming CPU:
    $ top
    top [...]
I ran into an issue last week where two nodes using shared storage lost the partition table on one of the storage devices they were accessing. This was clear from the fdisk output:
    $ fdisk -l /dev/sdb
    Disk /dev/sdb: 107.3 GB, 107374182400 bytes
    255 heads, 63 sectors/track, 13054 cylinders
    Units = cylinders of [...]
I have been debugging a problem with Red Hat Cluster, and was curious whether a specific process was getting executed. On my Solaris 10 hosts I can run execsnoop to observe system-wide process creation, but there isn’t anything comparable on my Linux hosts. The best I’ve found is SystemTap, which provides the kprocess.exec probe to monitor [...]