I was recently debugging an issue with a shell script, and noticed that the shell was exiting with an exit code greater than 100 when it received a SIGTSTP signal:

$ cat test

#!/bin/bash
sleep 60

# Window one
$ ./test
[1]+ Stopped ./test
Home:~ matty$ echo $?
146

# Window two
$ kill -18 4667

I was curious where the exit value of 146 came from, so I did a bit of digging. It turns out that when a process is stopped or killed by an uncaught signal, the shell adds the signal number to 128 and reports that as the exit status. In the case above the signal was number 18, and 128 + 18 = 146. I digs me some random shell knowledge.
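
The 128 + N rule is easy to check from a script. The sketch below is not from the original session: it assumes bash, and it uses SIGTERM (signal 15 on both Linux and Mac OS X) rather than SIGTSTP, because a terminated child is simpler to wait on than a stopped one. The variable names are made up for illustration.

```shell
#!/bin/bash
# Start a long-running child in the background and terminate it with SIGTERM.
sleep 60 &
pid=$!
kill -TERM "$pid"

# wait hands back the child's exit status; for a death by signal N
# bash reports 128 + N.
status=0
wait "$pid" || status=$?

# Recover the signal number and ask the kill builtin for its name.
signum=$((status - 128))
echo "exit status: $status"
echo "signal number: $signum ($(kill -l "$signum"))"
```

Running it should report an exit status of 143, which is 128 + 15.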

Posted by matty, filed under UNIX Shell. Date: November 20, 2008, 12:19 am | No Comments »

I recently gave a presentation at the local UNIX users group titled Debugging Java performance problems. The presentation describes various open source tools and how they can be used to understand what is causing CPU, memory and lock contention issues inside a Java virtual machine. If there are additional tools not discussed in the presentation that you find useful for debugging Java performance problems, please let me know through the comment feature.

Posted by matty, filed under Java. Date: October 27, 2008, 10:12 pm | 4 Comments »

Slashdot and other sites mentioned the release of the 2.6.27 Linux kernel today. Here is the description of one of the new features, improved SMP support for the page cache:

The page cache is the place where the kernel keeps in RAM a copy of a file to improve performance by avoiding disk I/O when the data that needs to be read is already on RAM. Each “mapping”, which is the data structure that keeps track of the correspondence between a file and the page cache, is SMP-safe thanks to its own lock. So when different processes in different CPUs access different files, there’s no lock contention, but if they access the same file (shared libraries or shared data files for example), they can hit some contention on that lock. In 2.6.27, thanks to some rules on how the page cache can be used and the usage of RCU, the page cache will be able to do lookups (ie., “read” the page cache) without needing to take the mapping lock, and hence improving scalability. But it will only be noticeable on systems with lots of cpus (page fault speedup of 250x on a 64 way system have been measured).

Hasn’t Solaris been able to scale vertically on 64+ CPU systems without this type of contention since the Solaris 2.5.1 days, back on the E10k in 1996? It also seems like this kernel version brings new enhancements to direct I/O. Wasn’t that also implemented back in Solaris 6?

It’s great that work is being done in the Linux kernel now to allow for vertical scalability, but it seems to me that these are mature kernel features that have been around in Solaris for years.

Posted by mike, filed under Uncategorized. Date: October 10, 2008, 9:44 am | 4 Comments »

I have been a longtime Cowboy Mouth fan, and still remember the first time I saw them play a concert in Delaware. They had an incredible amount of energy, and were one of the few bands I had seen that wanted the crowd involved in every aspect of the show. When I heard they were playing one of the local festivals here in town, I decided to venture out with a few friends to see them.

The show started off like all Cowboy Mouth shows with a ton of energy, and the lead singer (who also plays drums) jamming on drums to get the crowd into the show! I don’t recall the full set list, but the show peaked when they jammed out their classic hit “Jenny Says”! These guys are totally rad, and I can’t wait to see them again in the near future!

Posted by matty, filed under Music. Date: October 2, 2008, 9:01 pm | 1 Comment »

One really cool feature of Ubuntu is the command-not-found script. If you try to execute a program, say nmap, and that package hasn’t been installed, the command-not-found script parses a locally installed database and suggests, “hey, you should try executing # apt-get install nmap” or something similar.

This really makes Ubuntu more user friendly. Fedora has “# yum whatprovides <>”, but yum is going to bomb out if you don’t have a working network connection. Yum’s first steps are to connect to online repositories to download filelists.sqlite.bz2, primary.sqlite.bz2, etc.

So what are these files? They are sqlite databases that describe which packages are available, along with information about those packages… So let’s say we’re looking for some binary called kde4-config, but we don’t have an active network connection…. What package contains that binary?

First, let’s find all available instances of filelists.sqlite. Each Fedora RPM repository is going to contain its own copy.

[cylon:/]
<svoboda>locate filelists.sqlite
/var/cache/yum/fedora/filelists.sqlite
/var/cache/yum/livna/filelists.sqlite
/var/cache/yum/updates/filelists.sqlite
/var/cache/yum/updates-newkey/filelists.sqlite

Cool, so there are 4 network repositories here. Let’s mess around with the fedora repository first…

[cylon:/var/cache/yum/fedora]
<svoboda>ls -lh
total 89M
-rw-r--r-- 1 root root    0 2008-09-30 10:11 cachecookie
-rw-r--r-- 1 root root  58M 2008-09-17 13:46 filelists.sqlite
-rw-r--r-- 1 root root 2.0K 2008-09-30 07:19 mirrorlist.txt
drwxr-xr-x 2 root root 4.0K 2008-09-30 10:12 packages
-rw-r--r-- 1 root root  32M 2008-09-17 13:35 primary.sqlite
-rw-r--r-- 1 root root 2.4K 2008-09-30 10:11 repomd.xml

[cylon:/var/cache/yum/fedora]
<svoboda>file filelists.sqlite primary.sqlite
filelists.sqlite: SQLite 3.x database
primary.sqlite:   SQLite 3.x database

Sure enough, they’re sqlite databases. Let’s issue a select statement on filenames within the filelists.sqlite database and look for the kde4-config binary. First, let’s take a peek at what we have available….

[cylon:/var/cache/yum/fedora]
<svoboda>sqlite3 filelists.sqlite
SQLite version 3.5.9
Enter ".help" for instructions

Before we get too deep, let’s turn headers on and set the output to column mode so things are a bit more readable…

sqlite> .headers on
sqlite> .mode column

Next, let’s see what tables we have available…

sqlite> .tables
db_info   filelist  packages

Filelist looks interesting.  What does it contain?

sqlite> .schema filelist
CREATE TABLE filelist (  pkgKey INTEGER,  dirname TEXT,  filenames TEXT,  filetypes TEXT);
CREATE INDEX dirnames ON filelist (dirname);
CREATE INDEX keyfile ON filelist (pkgKey);

sqlite> pragma table_info (filelist);
0|pkgKey|INTEGER|0||0
1|dirname|TEXT|0||0
2|filenames|TEXT|0||0
3|filetypes|TEXT|0||0
sqlite>

Cool. So let’s extract the pkgKey and filenames columns for any entries that contain the phrase “kde4-config” anywhere in the filenames list…

sqlite> SELECT pkgKey,filenames FROM filelist
   ...> WHERE filenames LIKE '%kde4-config%';
pkgKey      filenames
----------  ---------------------------------------------------------------------------------------------
3703        preparetips/nepomuk-rcgen/meinproc4/kwrapper4/kunittestmodrunner/kshell4/kross/kjsconsole/kjscmd/kjs/kdeinit4_wrapper/kdeinit4_shutdown/kdeinit4/kded4/kde4automoc/kde4-doxygen.sh/kde4-config/kcookiejar4/kbuildsycoca4/checkXML
3703        kde4-config.1.gz/checkXML.1.gz
10746       preparetips/nepomuk-rcgen/meinproc4/kwrapper4/kunittestmodrunner/kshell4/kross/kjsconsole/kjscmd/kjs/kdeinit4_wrapper/kdeinit4_shutdown/kdeinit4/kded4/kde4automoc/kde4-doxygen.sh/kde4-config/kcookiejar4/kbuildsycoca4/checkXML
10746       kde4-config.1.gz/checkXML.1.gz

Awesome! We found something. It looks like we matched two packages: package 3703 and package 10746. Now we just need to figure out what those packages are… Let’s go peek in the second sqlite database file, primary.sqlite.

[cylon:/var/cache/yum/fedora]
<svoboda>sqlite3 primary.sqlite
SQLite version 3.5.9
Enter ".help" for instructions
sqlite> .header on
sqlite> .mode column
sqlite> .tables
conflicts  db_info    files      obsoletes  packages   provides   requires
sqlite>

What does the packages table look like?

sqlite> pragma table_info (packages);
cid         name        type        notnull     dflt_value  pk
----------  ----------  ----------  ----------  ----------  ----------
0           pkgKey      INTEGER     0                       1
1           pkgId       TEXT        0                       0
2           name        TEXT        0                       0
3           arch        TEXT        0                       0
4           version     TEXT        0                       0
5           epoch       TEXT        0                       0
6           release     TEXT        0                       0
7           summary     TEXT        0                       0
8           descriptio  TEXT        0                       0
9           url         TEXT        0                       0
10          time_file   INTEGER     0                       0
11          time_build  INTEGER     0                       0
12          rpm_licens  TEXT        0                       0
13          rpm_vendor  TEXT        0                       0
14          rpm_group   TEXT        0                       0
15          rpm_buildh  TEXT        0                       0
16          rpm_source  TEXT        0                       0
17          rpm_header  INTEGER     0                       0
18          rpm_header  INTEGER     0                       0
19          rpm_packag  TEXT        0                       0
20          size_packa  INTEGER     0                       0
21          size_insta  INTEGER     0                       0
22          size_archi  INTEGER     0                       0
23          location_h  TEXT        0                       0
24          location_b  TEXT        0                       0
25          checksum_t  TEXT        0                       0

Cool. We want the names of the packages with pkgKey 3703 and 10746. Let’s adjust our column widths so we don’t get truncated values displayed to the screen…

sqlite> .width 10 50

And query against each pkgKey value.

sqlite> SELECT name,location_href FROM packages
   ...> WHERE pkgKey = 3703;
name        location_href
----------  --------------------------------------------------
kdelibs     Packages/kdelibs-4.0.3-7.fc9.i386.rpm

sqlite> SELECT name,location_href FROM packages
   ...> WHERE pkgKey = 10746;
name        location_href
----------  --------------------------------------------------
kdelibs     Packages/kdelibs-4.0.3-7.fc9.x86_64.rpm
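
The two lookups above can also be collapsed into one query by attaching primary.sqlite to the filelists.sqlite session. The script below is only a sketch. It assumes the sqlite3 command line tool is installed and that pkgKey values line up between the two files (they did for kdelibs above, but nothing here guarantees it in general). So that it can be tried without a yum cache, it builds two tiny stand-in databases in a scratch directory, each holding a single row copied from the output above and only the columns the query needs. The alias prim and the shell variable names are made up.

```shell
#!/bin/bash
set -e

# Scratch directory for the stand-in databases; removed when the script exits.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Stand-in filelists.sqlite: same filelist schema as shown above, one sample row.
sqlite3 "$work/filelists.sqlite" "
CREATE TABLE filelist (pkgKey INTEGER, dirname TEXT, filenames TEXT, filetypes TEXT);
INSERT INTO filelist (pkgKey, filenames)
     VALUES (3703, 'kde4-config.1.gz/checkXML.1.gz');
"

# Stand-in primary.sqlite: only the three packages columns the query uses.
sqlite3 "$work/primary.sqlite" "
CREATE TABLE packages (pkgKey INTEGER PRIMARY KEY, name TEXT, location_href TEXT);
INSERT INTO packages
     VALUES (3703, 'kdelibs', 'Packages/kdelibs-4.0.3-7.fc9.i386.rpm');
"

# Attach the second database and join on pkgKey, so the file name search
# and the package name lookup happen in a single statement.
result=$(sqlite3 "$work/filelists.sqlite" "
ATTACH DATABASE '$work/primary.sqlite' AS prim;
SELECT DISTINCT p.name, p.location_href
  FROM filelist AS f
  JOIN prim.packages AS p ON p.pkgKey = f.pkgKey
 WHERE f.filenames LIKE '%kde4-config%';
")
echo "$result"
```

With the stand-in row this should print kdelibs followed by the i386 RPM path. Against a real cache, skip the setup, cd into /var/cache/yum/fedora, and run the same ATTACH and SELECT from the sqlite3 prompt opened on filelists.sqlite.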

Posted by mike, filed under Linux Package Management, Linux Utilities. Date: September 30, 2008, 11:31 am | No Comments »

Oh my gosh, this is the funniest (and coolest) site I have seen in a LONG LONG time!

Posted by matty, filed under Uncategorized. Date: September 10, 2008, 4:31 pm | No Comments »

You can disable hardware directly from the OBP with “asr” commands.  If it’s a production critical machine, and it won’t boot because of a failed component, you can disable the hardware from the OBP and get the machine back up (although crippled) to minimize your production downtime impact.

Rebooting with command: boot
Boot device: /pci@1e,600000/pci@0/pci@2/scsi@0/disk@0,0  File and args: -rsv
Loading ufs-file-system package 1.4 04 Aug 1995 13:02:54.
FCode UFS Reader 1.12 00/07/17 15:48:16.
Loading: /platform/SUNW,Sun-Fire-V445/ufsboot
Loading: /platform/sun4u/ufsboot
ERROR: Last Trap: Corrected ECC Error

{3} ok

YIKES!@#$! We have a memory failure.

The OBP keyword “sifting” will search through all of the commands the OBP knows for a particular string.  So to search for all of the commands that contain asr:

{3} ok sifting asr
In vocabulary  srassembler
(f001d858) rdasr        (f001d550) wrasr        (f001d53c) rdasr
In vocabulary  forth
(f008ee08) asr-list-keys        (f008ed2c) asr-enable
(f008ebd8) asr-disable          (f008d22c) .asr         (f008cb50) asr-clear
(f0052240) asr-policies

So the main commands here are asr-list-keys (show what we can disable), .asr (show what we already have disabled), asr-enable, asr-disable, and asr-clear.

{3} ok asr-list-keys

key = net2&3                /pci@1f,700000/pci@0/pci@2/pci@0/@4
key = net0&1                /pci@1e,600000/pci@0/pci@1/pci@0/@4
key = ide                   /pci@1f,700000/pci@0/pci@1/pci@0/@1f
key = usb                   /pci@1f,700000/pci@0/pci@1/pci@0/@1c
key = pci7                  /pci@1f,700000/pci@0/@9
key = pci6                  /pci@1e,600000/pci@0/@9
key = pci5                  /pci@1f,700000/pci@0/pci@2/pci@0/@8
key = pci4                  /pci@1f,700000/pci@0/pci@2/pci@0/@8
key = pci3                  /pci@1e,600000/pci@0/pci@1/pci@0/@8
key = pci2                  /pci@1e,600000/pci@0/pci@1/pci@0/@8
key = pci1                  /pci@1f,700000/pci@0/@8
key = pci0                  /pci@1e,600000/pci@0/@8
key = cpu3-bank3
key = cpu3-bank2
key = cpu3-bank1
key = cpu3-bank0
key = cpu2-bank3
key = cpu2-bank2
key = cpu2-bank1
key = cpu2-bank0
key = cpu1-bank3
key = cpu1-bank2
key = cpu1-bank1
key = cpu1-bank0
key = cpu0-bank3
key = cpu0-bank2
key = cpu0-bank1
key = cpu0-bank0

Since we have an ECC memory error, we know the fault is in one of the memory banks listed above. By disabling the banks one CPU at a time, we can find the failed memory by trial and error.

{3} ok .asr
There are no devices disabled by ASR.

With the banks on CPUs 0 through 2 disabled in turn, the boot kept hitting the ECC memory error. Let’s disable the banks on CPU 3.

{3} ok asr-disable cpu3-bank0
{3} ok asr-disable cpu3-bank1
{3} ok asr-disable cpu3-bank2
{3} ok asr-disable cpu3-bank3

{3} ok .asr
cpu3-bank3              Disabled by USER
No reason given
cpu3-bank2              Disabled by USER
No reason given
cpu3-bank1              Disabled by USER
No reason given
cpu3-bank0              Disabled by USER
No reason given

And let’s boot the machine.

Sun Fire V445, No Keyboard
Copyright 2006 Sun Microsystems, Inc.  All rights reserved.
OpenBoot 4.22.19, 24576 MB memory installed, Serial xxxxxxxxx
Ethernet address 0:14:4f:xx:xx:xx, Host ID: xxxxxxx

NOTICE: CPU 3 has 8192/8192 MB of memory disabled

ERROR: The following devices are disabled:
cpu3-bank3
cpu3-bank2
cpu3-bank1
cpu3-bank0

Thanks for telling me!

Rebooting with command: boot -rsv
Boot device: /pci@1e,600000/pci@0/pci@2/scsi@0/disk@0,0  File and args: -rsv
Loading ufs-file-system package 1.4 04 Aug 1995 13:02:54.
FCode UFS Reader 1.12 00/07/17 15:48:16.
Loading: /platform/SUNW,Sun-Fire-V445/ufsboot
Loading: /platform/sun4u/ufsboot
module /platform/sun4u/kernel/sparcv9/unix: text at [0x1000000, 0x107a767] data at 0x1800000
module misc/sparcv9/krtld: text at [0x107a768, 0x10933af] data at 0x184c760
module /platform/sun4u/kernel/sparcv9/genunix: text at [0x10933b0, 0x11f0f17] data at 0x1852040
module /platform/SUNW,Sun-Fire-V445/kernel/misc/sparcv9/platmod: text at [0x11f0f18, 0x11f1817] data at 0x18a45e0
module /platform/sun4u/kernel/cpu/sparcv9/SUNW,UltraSPARC-IIIi: text at [0x11f1880, 0x120278f] data at 0x18a4e80
SunOS Release 5.10 Version Generic_118833-33 64-bit
Copyright 1983-2006 Sun Microsystems, Inc.  All rights reserved.
Use is subject to license terms.
Ethernet address = 0:14:4f:2b:ea:aa
mem = 25165824K (0x600000000)
avail mem = 25226371072
root nexus = Sun Fire V445

YAY! Our gimpy machine is going back into production minus 8 GB of memory. There will be a performance impact from running on fewer system resources, but something is better than nothing.

Posted by mike, filed under Uncategorized. Date: July 25, 2008, 9:12 am | 2 Comments »

There is quite a bit of documentation around the internet on the Linux boot process, but I think Gustavo Duarte did an excellent job describing it in a clear and concise way. He also links to the Linux kernel source code and describes what is occurring step by step, from the bootstrap phase all the way to the execution of /sbin/init.

His first entry lays the foundation by covering the x86 Intel chipset, memory map, and logical motherboard layout. This provides a basic understanding of traditional motherboard hardware.

Next, he describes BIOS initialization, and loading of the MBR.  This briefly touches on the boot loader which starts the Linux bootstrap phase.

Finally, the kernel boot process is detailed with links to C and Assembly source code, with a brief narrative of exactly what is happening.

This was an awesome description of the early start-up and initialization phases of the hardware and the bootstrapping of the O/S. Gustavo provides a great description of real-mode and protected-mode CPU states.

Thanks Gustavo!

Posted by mike, filed under Linux Kernel, Linux Misc. Date: July 16, 2008, 9:58 am | No Comments »
