Blog O' Matty


Configuring wget to use a proxy server

This article was posted by Matty on 2011-11-09 08:22:00 -0400

Periodically I need to download files on servers that aren’t directly connected to the Internet. If the server has wget installed, I will usually run it with the URL of the resource I want to retrieve:

$ wget prefetch.net/iso.dvd

If the system resides behind a proxy server, the http_proxy environment variable needs to be set to the proxy’s URL (server name and port):

$ export http_proxy=http://proxy.prefetch.net:3128

If your proxy requires a username and password you can pass those on the command line:

$ wget --proxy-user=foo --proxy-password=bar prefetch.net

Or you can set the proxy-user and proxy-password variables in your ~/.wgetrc file.
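For reference, the equivalent ~/.wgetrc entries look something like this (the host, port, and credentials are placeholders for your own proxy):

```
use_proxy = on
http_proxy = http://proxy.prefetch.net:3128/
proxy_user = foo
proxy_password = bar
```

The wgetrc parser treats dashes and underscores in command names interchangeably, so proxy-user and proxy_user should both be accepted.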

Four super cool utilities that are part of the psmisc package

This article was posted by Matty on 2011-11-07 13:14:00 -0400

There are a ton of packages available for the various Linux distributions. Some of these packages aren’t as well known as others, though they contain some crazy awesome utilities. One package that fits into this category is psmisc. Psmisc contains several tools that can be used to print process statistics, look at file descriptor activity, see which process IDs have a file or directory open, kill all processes that match a pattern, and print the process table as a tree. Here is the list of utilities provided by psmisc:

$ rpm -q -l psmisc | grep bin

/sbin/fuser
/usr/bin/killall
/usr/bin/peekfd
/usr/bin/prtstat
/usr/bin/pstree
/usr/bin/pstree.x11

Starting at the top, the fuser tool will allow you to view the processes that have a file or directory open:

$ fuser -c -v /home/matty

/home/matty: root kernel mount /home
matty 1652 F.c.. sh
matty 1723 ....m imsettings-daem
matty 1810 F.c.m gnome-screensav
matty 1812 F.c.. xfce4-session
matty 1818 F.c.. xfsettingsd
.....
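Under the hood, fuser gets this information by walking /proc. Here is a rough shell sketch of the same idea (the background tail process and temp file are just scaffolding so there is something holding a file open):

```shell
# Rough /proc-based equivalent of `fuser <file>`: print the PIDs that
# currently have the file open by resolving each /proc/<pid>/fd symlink.
tmp=$(mktemp)
tail -f "$tmp" &            # background process to hold the file open
tailpid=$!
sleep 1
for fd in /proc/[0-9]*/fd/*; do
    [ "$(readlink "$fd" 2>/dev/null)" = "$tmp" ] && echo "$fd" | cut -d/ -f3
done | sort -u
kill "$tailpid"
rm -f "$tmp"
```

Note that without root you can only resolve the fd links of your own processes, which is also why fuser shows more when run as root.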

Peekfd will display reads and writes to a set of file descriptors passed on the command line, or all of the file descriptors opened by a process:

$ peekfd 24983

reading fd 0:
c
writing fd 2:
c
reading fd 0:

writing fd 2:
[08] [1b] [K

Next up we have prtstat. This super handy tool will display the contents of /proc/<pid>/stat in a nicely formatted ASCII view:

$ prtstat 16860

Process: bash State: S (sleeping)
CPU#: 0 TTY: 136:1 Threads: 1
Process, Group and Session IDs
Process ID: 16860 Parent ID: 16855
Group ID: 16860 Session ID: 16860
T Group ID: 16860

Page Faults
This Process (minor major): 996 1
Child Processes (minor major): 6651 0
CPU Times
This Process (user system guest blkio): 0.00 0.01 0.00 0.00
Child processes (user system guest): 14.30 21.06 0.00
Memory
Vsize: 119 MB
RSS: 716 kB RSS Limit: 18446744073709 MB
Code Start: 0x400000 Code Stop: 0x4d8528
Stack Start: 0x7fff4b027150
Stack Pointer (ESP): 0x7fff4b025dd8 Inst Pointer (EIP): 0x3d0c0d3090
Scheduling
Policy: normal
Nice: 0 RT Priority: 0 (non RT)
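Prtstat is essentially a pretty-printer for /proc/<pid>/stat, and the raw fields are easy to grab yourself. A quick sketch (caveat: if the comm field in parentheses contains spaces the field numbering shifts, so this is only safe for simple process names):

```shell
# Field 3 of /proc/<pid>/stat is the process state, field 22 the start
# time in clock ticks since boot (see proc(5) for the full field list).
awk '{print "state:", $3, "starttime:", $22}' /proc/self/stat
```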

And the last tool I love from this package is pstree. Pstree will print a tree-like structure of the process table, allowing you to easily see what started a given process:

$ pstree --ascii |more

systemd-+-NetworkManager-+-dhclient
| `-2*[{NetworkManager}]
|-Terminal-+-bash-+-more
| | `-pstree
| |-bash
| |-bash---ssh
| |-gnome-pty-helpe
| `-{Terminal}
|-Thunar---2*[{Thunar}]
|-VBoxSVC-+-VBoxNetDHCP
| |-VirtualBox---22*[{VirtualBox}]
| |-2*[VirtualBox---21*[{VirtualBox}]]
| `-10*[{VBoxSVC}]
.....
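If you only need the ancestry of a single process rather than the whole table, you can walk the PPid chain in /proc/<pid>/status by hand; this is roughly the path pstree draws for that process:

```shell
# Print a process's ancestry by following the PPid links in
# /proc/<pid>/status until we reach PID 0.
pid=$$
while [ "$pid" -gt 0 ]; do
    [ -r "/proc/$pid/status" ] || break
    name=$(awk '/^Name:/{print $2}' "/proc/$pid/status")
    printf '%s(%s)\n' "$name" "$pid"
    pid=$(awk '/^PPid:/{print $2}' "/proc/$pid/status")
done
```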

So there you go my friends, four amazing tools that don’t get the recognition they deserve. Which packages do you use that don’t get the street cred they deserve?

Figuring out how long a Linux process has been alive

This article was posted by Matty on 2011-11-06 08:10:00 -0400

I’ve bumped into a few problems in the past where processes that were supposed to be short-lived encountered an issue and never died. Over time these processes would build up, and if it wasn’t for a cleanup task I developed, the process table would eventually have filled up (the bug that caused this was eventually fixed).

Now how would you go about checking to see how long a process has been alive? There are actually several ways to get the time a process started on a Linux host. You can look at the fifth field (STIME) in the SYSV-style ps output:

$ ps -ef | tail -5

matty 29501 28486 0 Nov02 pts/6 00:00:00 ssh 192.168.56.101
matty 29666 28085 0 Nov02 pts/7 00:00:00 bash
matty 29680 29666 0 Nov02 pts/7 00:00:00 vim
root 29854 2 0 Oct31 ? 00:00:00 [kdmflush]
matty 29986 20521 0 Nov02 ? 00:00:07 java

You can get an abbreviated start date with the “bsdstart” output option:

$ ps ax -o pid,command,bsdstart,bsdtime | tail -5

29501 ssh 192.168.56.101 Nov 2 0:00
29666 bash Nov 2 0:00
29680 vim Nov 2 0:00
29854 [kdmflush] Oct 31 0:00
29986 java Nov 2 0:07

Or you can get the full date a process started with the “lstart” option:

$ ps ax -o pid,command,lstart | tail -5

29501 ssh 192.168.56.101 Wed Nov 2 14:16:23 2011
29666 bash Wed Nov 2 14:56:48 2011
29680 vim Wed Nov 2 14:56:49 2011
29854 [kdmflush] Mon Oct 31 10:56:54 2011
29986 java Wed Nov 2 15:54:05 2011
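If you want something a script can consume directly, newer procps builds also understand the “etime” and “etimes” output keywords, which print the elapsed time since the process started (etimes as plain seconds); this assumes your ps is recent enough to know them:

```shell
# Elapsed time since the process started, in [[DD-]hh:]mm:ss and in
# plain seconds (handy for "kill anything older than N seconds" scripts).
ps -o pid=,etime=,etimes= -p $$
```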

Now you may be asking yourself: where does ps get the time from? It opens /proc/<pid>/stat and reads the starttime value, which gives the time the process was started, in jiffies after boot (see proc(5) for further detail). I’m sure there are numerous other ways to visualize how long a process has been running, but the ones listed above have been sufficient to deal with most aging issues (a.k.a. bugs) I’ve encountered. :)
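That conversion is easy to reproduce by hand: take btime (the boot time as a Unix timestamp) from /proc/stat, and add starttime divided by the clock tick rate. A sketch (the date -d "@…" form assumes GNU date, and the awk field grab assumes a process name without spaces):

```shell
# Compute a process's start time from /proc/<pid>/stat by hand.
pid=$$
ticks=$(awk '{print $22}' "/proc/$pid/stat")   # clock ticks since boot
hz=$(getconf CLK_TCK)                          # ticks per second, usually 100
btime=$(awk '/^btime/{print $2}' /proc/stat)   # boot time, Unix seconds
date -d "@$((btime + ticks / hz))"
```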

Viewing resource limits for Linux processes

This article was posted by Matty on 2011-11-05 09:08:00 -0400

Most Linux distributions ship with the pam_limits module to limit the resources that can be used by a process. Limits can be enforced based on the user or group a process runs as, and they are configured in /etc/security/limits.conf. To see the limits for your running shell you can run ulimit with the “-a” option:

$ ulimit -a

core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 63354
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 1024
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited

All of the limits above are the system defaults, since I haven’t made any changes to /etc/security/limits.conf. To view the resource limits assigned to an arbitrary process on the system you can page through the /proc/<pid>/limits file. Here are the values assigned to the limits file for my active shell:

$ echo $$

24566

$ cd /proc/24566

$ cat limits

Limit Soft Limit Hard Limit Units
Max cpu time unlimited unlimited seconds
Max file size unlimited unlimited bytes
Max data size unlimited unlimited bytes
Max stack size 8388608 unlimited bytes
Max core file size 0 unlimited bytes
Max resident set unlimited unlimited bytes
Max processes 1024 63354 processes
Max open files 1024 4096 files
Max locked memory 65536 65536 bytes
Max address space unlimited unlimited bytes
Max file locks unlimited unlimited locks
Max pending signals 63354 63354 signals
Max msgqueue size 819200 819200 bytes
Max nice priority 0 0
Max realtime priority 0 0
Max realtime timeout unlimited unlimited us
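The two views line up: the soft limit ulimit reports for open files should match the soft-limit column of the “Max open files” row in the limits file. A quick sanity check:

```shell
# Both of these should print the same soft limit for open files --
# $4 is the soft-limit column of the "Max open files" row.
ulimit -n
awk '/^Max open files/{print $4}' /proc/self/limits
```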

There are a number of cool things you can do with resource limits, and I’ll type up my resource limits notes and post them in the near future. Rock & roll!

Defragmenting EXT4 file systems with e4defrag (coming soon to a distribution near you)

This article was posted by Matty on 2011-11-04 10:13:00 -0400

If you have been around the systems engineering field you have probably read about file system fragmentation at some point. This typically occurs when files are randomly updated over time, and the blocks that comprise a file get scattered over different areas of the disk. This makes the drive work harder, since the heads have to perform additional seeks instead of sequentially reading data off of a given platter. Any time you can do sequential I/O instead of random I/O, you are better off.

Several file systems provide tools to defragment file systems, and it looks like the EXT4 engineers are planning to come out with a similar tool for EXT4. The tool will be called e4defrag, and a test version of the tool is available in the e2fsprogs git repository. This utility will be super handy when it’s stable, and should assist with getting every last bit of performance out of your EXT4 file system.
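Until e4defrag lands in distributions, the filefrag utility (already shipped with e2fsprogs) gives a similar per-file view by reporting how many extents a file occupies; note that it needs FIEMAP/FIBMAP support from the underlying file system:

```shell
# Check how many extents a file occupies (fewer extents = less fragmented).
dd if=/dev/zero of=fragtest.dat bs=1M count=4 2>/dev/null
filefrag fragtest.dat || echo "filefrag: FIEMAP/FIBMAP not supported here"
rm -f fragtest.dat
```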

Before I provide an overview of this tool I need to state that this tool is currently being released for testing, and there is the possibility of data corruption if you use it!! Do not use this utility on anything you need to keep around! This isn’t my opinion, this is the opinion of the developers themselves:

$ e4defrag -c /mnt

This is a release only for testing, and bugs may exist
which could corrupt your data. Please invoke
with "-test" if you wish to use it at this time.
Usage : e4defrag [-v] file...| directory...| device...
: e4defrag -c file...| directory...| device...

With that said, e4defrag can be used to defragment a file, a directory or a file system that has been placed on a device. To see how fragmented your file system is you can run e4defrag with the “-c” option:

$ e4defrag -c -test /mnt

now/best ratio
1. /mnt/mail/domaintable.db 2/1 50.00%
2. /mnt/mail/virtusertable.db 2/1 50.00%
3. /mnt/mail/mailertable.db 2/1 50.00%
4. /mnt/file 72148/1 0.77%
5. /mnt/elinks.conf 1/1 0.00%

Total/best extents 74010/1860
Fragmentation ratio 3.84%
Fragmentation score 30.72
[0-30 no problem: 31-55 a little bit fragmented: 55- needs defrag]
This directory(/mnt) does not need defragmentation.

Nifty! The utility will print the files that are fragmented, the number of extents associated with each file, and the fragmentation ratio. Here is the full description of the fields from the source code:

struct frag_statistic_ino {
        int now_count;          /* the file's extents count of before defrag */
        int best_count;         /* the best file's extents count */
        __u64 size_per_ext;     /* size(KB) per extent */
        float ratio;            /* the ratio of fragmentation */
        char msg_buffer[PATH_MAX + 1];  /* pathname of the file */
};

To defragment a fragmented file, directory or device you can run e4defrag without the “-c” option:

$ e4defrag -test /mnt

This will take some time, and the result will hopefully be files that aren’t scattered all through out the hard drive. While this tool isn’t ready for primetime, it’s nice to know that it will be available down the road. You are on your own if you decide to use e4defrag. I provide zero warranties or assurances on the information provided. You are seriously putting your data at risk if you choose to ignore the various warnings provided here.