Using xargs and lscpu to spawn one process per CPU core

One of my friends reached out to me earlier this week to ask if there was an easy way to run multiple Linux processes in parallel. There are
several ways to approach this problem but most of them don’t take into account hardware cores and threads. My preferred solution for CPU intensive operations is to use the xargs parallel option (“-P”) along with the CPU cores listed in lscpu. This allows me to run one process per core which is ideal for CPU intensive applications. But enough talk, let’s see an example.

Let’s say you need to compress a directory full of log files and want to run one compression job on each CPU core. To locate the number of cores you can combine lscpu and grep:

$ CPU_CORES=`lscpu -p=CORE,ONLINE | grep -c ‘Y’`

To generate a list of files we can run find and pass the output of that to xargs:

$ find . -type f -name \*.log | xargs -n1 -P${CPU_CORES} bzip2

The xargs command listed above will create one bzip2 process per core and pass it a log file to process. To monitor the pipeline to make sure it is working as intended we can run a simple while loop:

$ while :; do ps auxwww | grep [b]zip; sleep 1; done

matty    14322  0.0  0.0 113968  1228 pts/0    S+   07:24   0:00 xargs -n1 -P4 bzip2
matty    14323 95.0  0.0  13748  7624 pts/0    R+   07:24   0:11 bzip2 ./log10.txt.log
matty    14324 95.9  0.0  13748  7616 pts/0    R+   07:24   0:11 bzip2 ./log2.txt.log
matty    14325 96.0  0.0  13748  7664 pts/0    R+   07:24   0:11 bzip2 ./log3.txt.log
matty    14326 94.9  0.0  13748  7632 pts/0    R+   07:24   0:11 bzip2 ./log4.txt.log

There are a number of other useful things you can do with the items listed above but I will leave that to your imagination. Viva la xargs!

Leave a Reply

Your email address will not be published. Required fields are marked *