When strace isn’t enough Part 1

An important tool in any Linux admin’s toolkit is the venerable strace command. It gives us insight into what a program is actually doing. As awesome as strace can be, it doesn’t tell us everything. This series of articles will familiarize you with some other commands and approaches for gaining insight into program execution.

Bash Nagios plugin

Today let’s look at one way to construct a Nagios plugin in bash. I would usually write these in Perl, but sometimes that is not possible. This plugin is written to be executed via NRPE.

#!/bin/bash
# bash nagios plugin

###
# Variables
###
OK=0
WARNING=1
CRITICAL=2
UNKNOWN=3
TO_RETURN=${OK}
TO_OUTPUT=''

# Print usage information and exit
print_usage(){
    echo -e "\n" \
    "usage: ./check_uptime -w 20 -c 30 \n" \
    "\n" \
    "-w <days>    warning value\n" \
    "-c <days>    critical value\n" \
    "-h           this help\n" \
    "\n" && exit 3   # 3 = UNKNOWN, so a usage error is not reported as WARNING
}

###
# Options
###

# Loop through $@ to find flags
while getopts ":hw:c:" FLAG; do
    case "${FLAG}" in
        w) # Warning value
            WARNING_VALUE="${OPTARG}" ;;
        c) # Critical value
            CRITICAL_VALUE="${OPTARG}" ;;
        h) # Print usage information
            HELP=1;;
        [:?]) # Print usage information
            print_usage;;
    esac
done

###
# Functions
###

log_date(){
    date +"%b %e %T"
}

error() {
    NOW=$(log_date)
    echo "${NOW}: ERROR: $1"
    exit 1
}

warning() {
    NOW=$(log_date)
    echo "${NOW}: WARNING: $1"
}

info() {
    NOW=$(log_date)
    echo "${NOW}: INFO: $1"
}

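# Extract the number of whole days from the uptime output
get_cmd_output(){
    # -n with the p flag prints only on a match, so an uptime of
    # less than one day yields an empty string instead of the raw line
    uptime | sed -n 's/.*up \([0-9]*\) day.*/\1/p' || error "failed to run command"
}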

###
# Program execution
###
[ "${HELP}" ] && print_usage

if [ "${WARNING_VALUE}" ] && [ "${CRITICAL_VALUE}" ]
then
    CMD_OUTPUT=$(get_cmd_output)
else
    print_usage
fi

if [ "${CMD_OUTPUT}" ] && [ "${CMD_OUTPUT}" -gt "${CRITICAL_VALUE}" ]
then
    TO_RETURN=${CRITICAL}
elif [ "${CMD_OUTPUT}" ] && [ "${CMD_OUTPUT}" -gt "${WARNING_VALUE}" ]
then
    TO_RETURN=${WARNING}
elif [ "${CMD_OUTPUT}" ] && [ "${CMD_OUTPUT}" -gt 0 ]
then
    TO_RETURN=${OK}
else
    TO_RETURN=${UNKNOWN}
fi

if [ $TO_RETURN == ${CRITICAL} ]
then
    TO_OUTPUT="CRITICAL "
elif [ $TO_RETURN == ${WARNING} ]
then
    TO_OUTPUT="WARNING "
elif [ ${TO_RETURN} == ${OK} ]
then
    TO_OUTPUT="OK "
else
    TO_OUTPUT="UNKNOWN "
fi

TO_OUTPUT="${TO_OUTPUT}| uptime=${CMD_OUTPUT};$WARNING_VALUE;$CRITICAL_VALUE"

echo "$TO_OUTPUT";
exit $TO_RETURN;

Let’s break it down.

OK=0
WARNING=1
CRITICAL=2
UNKNOWN=3

We define some readable names for the return codes.

TO_RETURN=${OK}

Set the initial return value to OK.

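# Extract the number of whole days from the uptime output
get_cmd_output(){
    # -n with the p flag prints only on a match, so an uptime of
    # less than one day yields an empty string instead of the raw line
    uptime | sed -n 's/.*up \([0-9]*\) day.*/\1/p' || error "failed to run command"
}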

This function obtains the value we want to check, in this case the system uptime in days.
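To see what the sed expression does, it can be run against canned uptime lines instead of the live command. This is only a minimal sketch: the function name days_from_uptime and the sample lines are made up for illustration, and it uses sed -n with the p flag so that a line without a day count prints nothing.

```shell
#!/bin/bash
# Illustrative helper (not part of the plugin): extract the day count
# from a line of uptime output passed as the first argument.
days_from_uptime() {
    # -n plus the trailing p prints only when the substitution matched
    printf '%s\n' "$1" | sed -n 's/.*up \([0-9]*\) day.*/\1/p'
}

# Sample lines are invented for the demonstration
days_from_uptime " 10:15:01 up 42 days,  3:07,  2 users,  load average: 0.00, 0.01, 0.05"
days_from_uptime " 10:15:01 up  3:07,  2 users,  load average: 0.00, 0.01, 0.05"
```

The first call prints 42. The second prints nothing, because a machine that has been up for less than a day has no day count in its uptime line.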

if [ "${CMD_OUTPUT}" ] && [ "${CMD_OUTPUT}" -gt "${CRITICAL_VALUE}" ]
then
    TO_RETURN=${CRITICAL}
elif [ "${CMD_OUTPUT}" ] && [ "${CMD_OUTPUT}" -gt "${WARNING_VALUE}" ]
then
    TO_RETURN=${WARNING}
elif [ "${CMD_OUTPUT}" ] && [ "${CMD_OUTPUT}" -gt 0 ]
then
    TO_RETURN=${OK}
else
    TO_RETURN=${UNKNOWN}
fi

Check the value of uptime against our warning and critical values.
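The comparison chain can also be wrapped in a function, which makes it easy to try with different values. The sketch below is one possible arrangement and not the plugin itself: check_threshold is a hypothetical name, and it uses 3 for UNKNOWN, the value the Nagios plugin guidelines assign to that state.

```shell
#!/bin/bash
# Sketch of the threshold check as a function. The name check_threshold
# is illustrative; exit codes follow the Nagios plugin guidelines.
OK=0; WARNING=1; CRITICAL=2; UNKNOWN=3

check_threshold() {
    local value=$1 warn=$2 crit=$3
    # Anything that is not a plain non-negative integer is UNKNOWN
    case "${value}" in
        ''|*[!0-9]*) echo "${UNKNOWN}"; return ;;
    esac
    if   [ "${value}" -gt "${crit}" ]; then echo "${CRITICAL}"
    elif [ "${value}" -gt "${warn}" ]; then echo "${WARNING}"
    elif [ "${value}" -gt 0 ];         then echo "${OK}"
    else                                    echo "${UNKNOWN}"
    fi
}

check_threshold 5 20 30    # prints 0
check_threshold 25 20 30   # prints 1
check_threshold 35 20 30   # prints 2
check_threshold "" 20 30   # prints 3
```

Validating the value up front keeps the numeric tests from failing with an "integer expression expected" error when the input is empty or is not a number.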

if [ $TO_RETURN == ${CRITICAL} ]
then
    TO_OUTPUT="CRITICAL "
elif [ $TO_RETURN == ${WARNING} ]
then
    TO_OUTPUT="WARNING "
elif [ ${TO_RETURN} == ${OK} ]
then
    TO_OUTPUT="OK "
else
    TO_OUTPUT="UNKNOWN "
fi

Set the human-readable status text of the plugin. Nagios displays this text, but it determines the service state from the exit code, not from this string.

TO_OUTPUT="${TO_OUTPUT}| uptime=${CMD_OUTPUT};$WARNING_VALUE;$CRITICAL_VALUE"

Construct the final output string according to the Nagios plugin developer guidelines. Everything after the pipe character is performance data in the form label=value;warn;crit.
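As a small illustration of that layout, the string can be assembled with printf. This is a sketch only; format_output and its argument order are invented for the example.

```shell
#!/bin/bash
# Illustrative helper: build "STATUS | label=value;warn;crit".
format_output() {
    local status=$1 label=$2 value=$3 warn=$4 crit=$5
    printf '%s | %s=%s;%s;%s\n' "${status}" "${label}" "${value}" "${warn}" "${crit}"
}

format_output OK uptime 5 20 30   # prints: OK | uptime=5;20;30
```

The text before the pipe is what a person sees in the Nagios interface; the part after it is what graphing tools pick up.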

Stay tuned. The Perl version will be out soon.

For more information see:
http://nagiosplug.sourceforge.net/developer-guidelines.html#AEN201

ssh-copy-id missing in OS X

Not sure if anyone else has noticed, but OS X is missing ssh-copy-id. This utility is included with the ssh client in most major Linux distros. As it turns out, it is just a shell script.

#!/bin/sh

# Shell script to install your public key on a remote machine
# Takes the remote machine name as an argument.
# Obviously, the remote machine must accept password authentication,
# or one of the other keys in your ssh-agent, for this to work.

ID_FILE="${HOME}/.ssh/id_rsa.pub"

if [ "-i" = "$1" ]; then
  shift
  # check if we have 2 parameters left, if so the first is the new ID file
  if [ -n "$2" ]; then
    if expr "$1" : ".*\.pub" > /dev/null ; then
      ID_FILE="$1"
    else
      ID_FILE="$1.pub"
    fi
    shift         # and this should leave $1 as the target name
  fi
else
  if [ x$SSH_AUTH_SOCK != x ] && ssh-add -L >/dev/null 2>&1; then
    GET_ID="$GET_ID ssh-add -L"
  fi
fi

if [ -z "`eval $GET_ID`" ] && [ -r "${ID_FILE}" ] ; then
  GET_ID="cat ${ID_FILE}"
fi

if [ -z "`eval $GET_ID`" ]; then
  echo "$0: ERROR: No identities found" >&2
  exit 1
fi

if [ "$#" -lt 1 ] || [ "$1" = "-h" ] || [ "$1" = "--help" ]; then
  echo "Usage: $0 [-i [identity_file]] [user@]machine" >&2
  exit 1
fi

{ eval "$GET_ID" ; } | ssh ${1%:} "umask 077; test -d .ssh || mkdir .ssh ; cat >> .ssh/authorized_keys" || exit 1

cat <<EOF
Now try logging into the machine, with "ssh '${1%:}'", and check in:

  .ssh/authorized_keys

to make sure we haven't added extra keys that you weren't expecting.

EOF

Go forth, copy, paste, chmod and happily deploy your ssh keys with ease.

P.S. For those who don’t know what I mean by chmod, see the following.

chmod +x ./ssh-copy-id
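
The part of the script that runs on the remote side (umask, mkdir, append) can be tried safely against a scratch directory before pointing it at a real host. The following is a sketch under that assumption: install_key, the scratch directory, and the sample key string are all made up for illustration.

```shell
#!/bin/bash
# Sketch: run the remote-side commands from ssh-copy-id against a local
# scratch directory. install_key and the sample key are made up.
install_key() {
    # $1 is the directory standing in for the remote home directory
    (
        cd "$1" || exit 1
        umask 077
        test -d .ssh || mkdir .ssh
        cat >> .ssh/authorized_keys
    )
}

SCRATCH=$(mktemp -d)
echo "ssh-rsa AAAAexample user@example" | install_key "${SCRATCH}"
ls -ld "${SCRATCH}/.ssh" "${SCRATCH}/.ssh/authorized_keys"
```

Because of the umask 077, the directory and the file are created readable by the owner only, which is what sshd expects by default.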

Bash Parallel Execution

If you have ever wanted an easy way to execute multiple jobs in parallel in bash, then this is the snippet for you. It was originally posted on Stack Overflow and has been modified a bit.

#!/bin/bash

#how many jobs to run at one time
JOBS_AT_ONCE=20

# The bgxupdate and bgxlimit functions below allow for
# running X jobs in parallel in bash.  They are taken from:
# http://stackoverflow.com/questions/1537956/bash-limit-the-number-of-concurrent-jobs/1685440#1685440

# bgxupdate - update active processes in a group.
#   Works by transferring each process to new group
#   if it is still active.
# in:  bgxgrp - current group of processes.
# out: bgxgrp - new group of processes.
# out: bgxcount - number of processes in new group.

bgxupdate() {
    bgxoldgrp=${bgxgrp}
    bgxgrp=""
    ((bgxcount = 0))
    bgxjobs=" $(jobs -pr | tr '\n' ' ')"
    for bgxpid in ${bgxoldgrp} ; do
        echo "${bgxjobs}" | grep " ${bgxpid} " >/dev/null 2>&1
        if [[ $? -eq 0 ]] ; then
            bgxgrp="${bgxgrp} ${bgxpid}"
            ((bgxcount = bgxcount + 1))
        fi
    done
}

# bgxlimit - start a sub-process with a limit.

#   Loops, calling bgxupdate until there is a free
#   slot to run another sub-process. Then runs it
#   and updates the process group.
# in:  $1     - the limit on processes.
# in:  $2+    - the command to run for new process.
# in:  bgxgrp - the current group of processes.
# out: bgxgrp - new group of processes

bgxlimit() {
    bgxmax=$1 ; shift
    bgxupdate
    while [[ ${bgxcount} -ge ${bgxmax} ]] ; do
        sleep 1
        bgxupdate
    done
    if [[ "$1" != "-" ]] ; then
        "$@" &
        bgxgrp="${bgxgrp} $!"
    fi
}

bgxgrp="process_group_1"
for LINE in `cat hosts`
do
    CHECK_SCRIPT='echo $(hostname),$(cat /etc/debian_version)'
    bgxlimit $JOBS_AT_ONCE ssh ${LINE} "${CHECK_SCRIPT}"
done
# Wait until all queued processes are done.

bgxupdate
while [[ ${bgxcount} -ne 0 ]] ; do
    oldcount=${bgxcount}
    while [[ ${oldcount} -eq ${bgxcount} ]] ; do
        sleep 1
        bgxupdate
    done
done

In this script, the main changes are a configurable maximum number of simultaneous jobs and a loop that does somewhat useful work: it returns the hostname and Debian version of each host listed in the hosts file.
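
The same idea, capping the number of running background jobs, can be sketched more compactly by counting the output of jobs -pr directly. This is not the Stack Overflow code, just a minimal variant for comparison; run_limited, worker, MAX_JOBS and OUT are illustrative names, and worker stands in for real work such as an ssh call.

```shell
#!/bin/bash
# Sketch (not the bgx functions above): limit concurrency by counting
# running background jobs. All names here are illustrative.
MAX_JOBS=3
OUT=$(mktemp)

run_limited() {
    # Wait for a free slot, then start the command in the background
    while [ "$(jobs -pr | wc -l)" -ge "${MAX_JOBS}" ]; do
        sleep 1
    done
    "$@" &
}

# Stand-in for real work such as an ssh call
worker() {
    sleep 1
    echo "finished $1" >> "${OUT}"
}

for i in 1 2 3 4 5 6; do
    run_limited worker "${i}"
done
wait    # block until every remaining job has exited
wc -l < "${OUT}"
```

The trade-off is that this version counts every background job of the shell, while the bgx functions track a specific group of process IDs.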

Turboprop

As an extension of my previous post on parallel execution, I present turboprop. The initial version of this script optimizes a MySQL database, processing multiple tables at the same time. In the future it may be extended to allow more operations from the command line, such as MySQL dumps.

#!/bin/bash
# turboprop

# how many jobs to run at one time
JOBS_AT_ONCE=20
# Command to run in parallel; in this case mysqlcheck -o
COMMAND="mysqlcheck -o"

# Print usage information and exit
print_usage(){
    echo -e "\n" \
    "usage: ./turboprop -d databasename \n" \
    "Optimizes mysql tables in parallel\n" \
    "-d <databasename>      Database to optimize\n" \
    "-h                     this help\n" \
    "\n" && exit 1
}

###
# Options
###

# Loop through $@ to find flags
while getopts ":hd:" FLAG; do
    case "${FLAG}" in
        d) # Database name
            DB=${OPTARG} ;;
        h) # Print usage
            print_usage;;
        [:?]) print_usage;;
    esac
done

[ -z "${DB}" ] && print_usage

###
# Functions
###

# The bgxupdate and bgxlimit functions below allow for
# running X jobs in parallel in bash.  They are taken from:
# http://stackoverflow.com/questions/1537956/bash-limit-the-number-of-concurrent-jobs/1685440#1685440

# bgxupdate - update active processes in a group.
#   Works by transferring each process to new group
#   if it is still active.
# in:  bgxgrp - current group of processes.
# out: bgxgrp - new group of processes.
# out: bgxcount - number of processes in new group.

bgxupdate() {
    bgxoldgrp=${bgxgrp}
    bgxgrp=""
    ((bgxcount = 0))
    bgxjobs=" $(jobs -pr | tr '\n' ' ')"
    for bgxpid in ${bgxoldgrp} ; do
        echo "${bgxjobs}" | grep " ${bgxpid} " >/dev/null 2>&1
        if [[ $? -eq 0 ]] ; then
            bgxgrp="${bgxgrp} ${bgxpid}"
            ((bgxcount = bgxcount + 1))
        fi
    done
}

# bgxlimit - start a sub-process with a limit.

#   Loops, calling bgxupdate until there is a free
#   slot to run another sub-process. Then runs it
#   and updates the process group.
# in:  $1     - the limit on processes.
# in:  $2+    - the command to run for new process.
# in:  bgxgrp - the current group of processes.
# out: bgxgrp - new group of processes

bgxlimit() {
    bgxmax=$1 ; shift
    bgxupdate
    while [[ ${bgxcount} -ge ${bgxmax} ]] ; do
        sleep 1
        bgxupdate
    done
    if [[ "$1" != "-" ]] ; then
        "$@" &
        bgxgrp="${bgxgrp} $!"
    fi
}

###
# Program Execution
###

bgxgrp="process_group_1"
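# -N suppresses the column header so only table names are returned
for TABLE in `mysql -N ${DB} -e 'show tables'`
do
    # mysqlcheck expects the database name before the table name
    bgxlimit ${JOBS_AT_ONCE} ${COMMAND} ${DB} ${TABLE}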
done

# Wait until all queued processes are done.

bgxupdate
while [[ ${bgxcount} -ne 0 ]] ; do
    oldcount=${bgxcount}
    while [[ ${oldcount} -eq ${bgxcount} ]] ; do
        sleep 1
        bgxupdate
    done
done
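
Before letting turboprop loose on a real database, it can help to preview the commands it would launch. The sketch below prints them instead of running them; plan_commands, the database name shop, and the table names are made up for illustration. It relies on the fact that mysqlcheck takes the database name before any table names.

```shell
#!/bin/bash
# Dry-run sketch: print the mysqlcheck commands instead of running them.
# plan_commands and the sample names are made up for illustration.
plan_commands() {
    local db=$1 table
    # Read one table name per line from stdin
    while IFS= read -r table; do
        if [ -n "${table}" ]; then
            echo "mysqlcheck -o ${db} ${table}"
        fi
    done
}

printf 'users\norders\n' | plan_commands shop
```

This prints one mysqlcheck line per table, which is the list of jobs that bgxlimit would then run in parallel.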