Bash Parallel Execution


If you have ever wanted an easy way to execute multiple jobs in parallel in bash, then this is the snippet for you. This was originally posted on Stack Exchange. It has been modified a bit.

#!/bin/bash

#how many jobs to run at one time
JOBS_AT_ONCE=20

# The bgxupdate and bgxlimit functions below allow for
# running X jobs in parallel in bash.  They are taken from:
# http://stackoverflow.com/questions/1537956/bash-limit-the-number-of-concurrent-jobs/1685440#1685440

# bgxupdate - update active processes in a group.
#   Works by transferring each process to new group
#   if it is still active.
# in:  bgxgrp - current group of processes.
# out: bgxgrp - new group of processes.
# out: bgxcount - number of processes in new group.

bgxupdate() {
    bgxoldgrp=${bgxgrp}
    bgxgrp=""
    ((bgxcount = 0))
    bgxjobs=" $(jobs -pr | tr '\n' ' ')"
    for bgxpid in ${bgxoldgrp} ; do
        echo "${bgxjobs}" | grep " ${bgxpid} " >/dev/null 2>&1
        if [[ $? -eq 0 ]] ; then
            bgxgrp="${bgxgrp} ${bgxpid}"
            ((bgxcount = bgxcount + 1))
        fi
    done
}

# bgxlimit - start a sub-process with a limit.

#   Loops, calling bgxupdate until there is a free
#   slot to run another sub-process. Then runs it
#   an updates the process group.
# in:  $1     - the limit on processes.
# in:  $2+    - the command to run for new process.
# in:  bgxgrp - the current group of processes.
# out: bgxgrp - new group of processes

bgxlimit() {
    bgxmax=$1 ; shift
    bgxupdate
    while [[ ${bgxcount} -ge ${bgxmax} ]] ; do
        sleep 1
        bgxupdate
    done
    if [[ "$1" != "-" ]] ; then
        $* &
        bgxgrp="${bgxgrp} $!"
    fi
}

bgxgrp="process_group_1"
for LINE in `cat hosts`
do
    CHECK_SCRIPT='echo $(hostname),$(cat /etc/debian_version)'
    bgxlimit $JOBS_AT_ONCE ssh ${LINE} "${CHECK_SCRIPT}"
done
# Wait until all queued processes are done.

bgxupdate
while [[ ${bgxcount} -ne 0 ]] ; do
    oldcount=${bgxcount}
    while [[ ${oldcount} -eq ${bgxcount} ]] ; do
        sleep 1
        bgxupdate
    done
done

In this script the primary changes are defining the max number of simultaneous jobs, as well as doing somewhat useful work in returning the hostname and the debian version.

About these ads

3 thoughts on “Bash Parallel Execution

  1. I am trying to figure out what bgx can do that you cannot do by using GNU Parallel (10 seconds installation: wget -O – pi.dk/3 | sh).

    It seems your example could be written:

    cat hosts | parallel -q -j 20 ssh {} echo ‘$(hostname),$(cat /etc/debian_version)’

    or:

    parallel –slf hosts -j 20 –nonall echo ‘$(hostname),$(cat /etc/debian_version)’

      • But that is exactly my point: If you are able to write in your homedir then ‘wget -O – pi.dk/3 | sh’ will install it in 10 seconds. You do not need write permission anywhere else.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s