Unixy goodness: use awk to group and average data

As a software tester, you are sometimes called upon to performance test a web service and present the results in a nice chart to impress your manager. JMeter is commonly used to thrash your server, and it produces insane amounts of throughput data. If you're running 1000 tpm, that is rather a lot of data (180,000 transactions for a 3-hour test run). This is beyond the capability of JMeter's inbuilt graphics package and is too much to import into Excel.

My solution is to group throughput per minute and average transaction time for each minute. Below is a script for processing a JTL log file from JMeter. It reduces a 3-hour test run to 180 data points, which is much easier to represent with a chart program such as Excel.

The script uses a few neat awk tricks, such as:

  • Rounding unix timestamps to the nearest minute
  • Collecting timestamps grouped by minute
  • Converting a unix timestamp to YYYY-MM-dd and similar formats
  • Printing throughput for each one-minute increment
  • Printing average response time for each one-minute increment
  • Doing all of the above in an efficient single pass through awk (this was the hardest bit!)
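The timestamp-rounding trick can be tried in isolation (the sample timestamp below is made up; `strftime()` is a GNU awk extension, so its output is not shown since it depends on your timezone):

```shell
# Round a JMeter timestamp (milliseconds since the epoch) down to the
# nearest minute boundary -- the key to grouping transactions by minute
echo "1229444531000" | awk '{ print int($1/1000/60)*60 }'
# -> 1229444520

# gawk's strftime() then renders that boundary as a readable date/time
echo "1229444520" | awk '{ print strftime("%Y.%b.%d %H:%M", $1) }'
```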

Hat tip: Jadu Saikia for excellent awk tips.

Recommended link: Improve the quality of your JMeter scripts

::Code is below the fold::

#!/bin/sh
# jtlmin.sh :
#   JMeter log processing script
#   Collects & Averages throughput data using 1-minute increments
#   Requires CSV-formatted log from JMeter "Simple Data Writer".
#
#   Version   Date          Author      Comment
#       2.0   2009-02-17    R. Papesch  Refined awk procedure, renamed variables
#       1.0   2006-11-28    R. Papesch

#set -x  #debug

USAGE="Usage: jtlmin.sh <jtl-file>\nSummarizes JMeter JTL output into 1-minute blocks"
[ -n "$1" ] || { echo -e "$USAGE"; exit 1 ; }
echo -e "Processing \c"
ls "$1" || exit 1

main()
{
  WORKFILE="$1.jtlmin.$$.WORKING"
  OUTFILE="$1.jtlmin.$$.OUT"
  STEP=60       # grouping interval in seconds

  # Round each timestamp (ms since epoch, field 1) down to the nearest
  # $STEP-second boundary; keep the elapsed time (ms, field 2)
  awk -F, -v step=$STEP '{print int($1/1000/step)*step, $2}' "$1" > "$WORKFILE"

  echo "Outputting data to $OUTFILE .."
  echo "$PWD/$1" >> "$OUTFILE"
  echo -e "unixtime \tdate \ttime \tthruput(tpm) \tresponse(ms) " >> "$OUTFILE"
  awk_routine | sort >> "$OUTFILE"

  rm "$WORKFILE"
}

awk_routine()
{
  awk '
    NR!=1 {minute[$1]++; rsum[$1] += $2}   # skip header; count hits and sum response times per minute
    END {
      for (i in minute) {
        printf("%d\t", i);                        # unix timestamp (minute boundary)
        printf("%s\t", strftime("%Y.%b.%d", i));  # date (gawk strftime)
        printf("%s\t", strftime("%H:%M", i));     # time of day
        printf("%d\t", minute[i]);                # throughput: transactions this minute
        printf("%d\n", rsum[i]/minute[i])         # average response time (ms)
      }
    }' "$WORKFILE"
}

main "$@"
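The core group-and-average pass can be tried on its own with made-up data, where each input line is a minute-boundary timestamp and a response time in ms:

```shell
# Single-pass grouping: one associative array counts hits per key,
# another sums response times; END emits one averaged row per key
printf '60 100\n60 200\n120 300\n' |
awk '
  { count[$1]++; rsum[$1] += $2 }
  END {
    for (i in count)
      printf("%d\t%d\t%d\n", i, count[i], rsum[i]/count[i])
  }' | sort -n
# -> 60      2       150
#    120     1       300
```

`sort -n` is needed here because the toy keys differ in length; the script's equal-length 10-digit timestamps sort correctly with plain `sort` as well.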
