Archive for 'Linux Debugging'
I am currently working on upgrading a number of Oracle RAC nodes from RHEL4 to RHEL5. After I upgraded the first node in the cluster, my DBA contacted me because the RHEL5 node was extremely sluggish. When I looked at top, I saw that a number of kswapd processes were consuming CPU: $ top top [...]
I ran into an issue last week where two nodes using shared storage lost the partition table on one of the storage devices they were accessing. This was evident in the output from fdisk: $ fdisk -l /dev/sdb Disk /dev/sdb: 107.3 GB, 107374182400 bytes 255 heads, 63 sectors/track, 13054 cylinders Units = cylinders of [...]
I have been debugging a problem with Red Hat Cluster, and wanted to know whether a specific process was getting executed. On my Solaris 10 hosts I can run execsnoop to observe system-wide process creation, but there isn’t anything comparable on my Linux hosts. The best I’ve found is systemtap, which provides the kprocess.exec probe to monitor [...]
I support a couple of yum repositories, and use the yum repository build instructions documented in my previous post to create my repositories. When I tried to apply the latest CentOS 5.3 updates to one of my servers last week, I noticed that I was getting a number of “Error performing checksum” errors: $ yum [...]
Most Linux distributions ship with the netconsole service, which allows kernel printk() messages to be sent to a remote destination. This feature can be useful for debugging system hangs and panics, and is handy for archiving console messages to a central location. To configure netconsole, you will need to add the IP address of a [...]
While debugging an application a few weeks back, I noticed the following error in the application log: Cannot open file : Too many open files The application runs as an unprivileged user, and upon closer inspection I noticed that the maximum number of file descriptors available to the process was 1024: $ ulimit -n 1024 [...]
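As a rough illustration of the check described in the teaser above, the following sketch prints the soft and hard open-file limits for the current shell and counts the descriptors it has open. It assumes bash and a Linux /proc filesystem; the variable names (soft, hard, open_fds) are my own and do not come from the original post.

```shell
#!/bin/bash
# Soft limit: the value enforced right now (what a plain "ulimit -n" reports).
soft=$(ulimit -Sn)
# Hard limit: the ceiling an unprivileged user may raise the soft limit to.
hard=$(ulimit -Hn)
echo "soft=$soft hard=$hard"

# Count the descriptors currently open by this shell (Linux /proc only).
open_fds=$(ls /proc/$$/fd | wc -l)
echo "open=$open_fds"
```

Comparing the open count against the soft limit gives a quick sense of how close a process is to hitting "Too many open files"; to inspect a different process, substitute its PID for $$ (reading another user's /proc entry typically requires root).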