Quantcast
Channel: Jared Still – Official Pythian Blog
Viewing all articles
Browse latest Browse all 34

Chart Your Course With sar

$
0
0

The sar (system activity reporter) is a Linux utility that actually started life long ago in System V Unix.

In spite of its ancient origins, sar is still quite useful, and has even been enhanced in recent versions of Linux.

The use of sar is not limited to root – any user can run sar and its utilities.  Most Linux systems have sar installed and enabled by default to collect sar data at 10 minute intervals and retain that data for 10 days.

The reports output by sar will look familiar:

 

jkstill@poirot ~/tmp $ sar -d | head -20
Linux 4.4.0-53-generic (poirot.jks.com)         08/07/17        _x86_64_        (1 CPU)

00:00:01          DEV       tps  rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz     await     svctm     %util
00:02:01       dev7-0      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
00:02:01       dev7-1      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
00:02:01       dev7-2      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
00:02:01       dev7-3      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
00:02:01       dev7-4      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
00:02:01       dev7-5      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
00:02:01       dev7-6      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
00:02:01       dev8-0      0.40      0.00      4.97     12.43      0.00      0.26      0.26      0.01
00:02:01     dev252-0      0.63      0.00      4.90      7.78      0.00      0.16      0.16      0.01
00:02:01     dev252-1      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
00:04:01       dev7-0      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
00:04:01       dev7-1      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
00:04:01       dev7-2      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
00:04:01       dev7-3      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
00:04:01       dev7-4      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
00:04:01       dev7-5      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
00:04:01       dev7-6      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00

This report format is good for a quick look at different metrics to get an overall idea of system perform, disk metrics in the previously shown example.

However what if you would like to collate this data into something more readily consumed for charting an analysis, such as a CSV file?  The standard output of sar does not readily lend itself to such use without quite a bit of work.

The sar utility is part of the Linux  sysstat package and contains not only sar, but sadf.  If sar is available then so is sadf. What is sadf you say?  sadf is a the sar data formatting utility, and can output data in CSV, XML and other formats.

Using sadf a bash script can be created that will extract sar data for many Linux subsystems and write that data in the format of your choice, CSV in this case.

The asp.sh script does just that, creating CSV files that can then be used for charting and analysis.

 

#!/bin/bash

# Jared Still - Pythian
# still@pythian.com jkstill@gmail.com
# 2017-07-20

# asp - another sar processor

# tested on Red Hat Enterprise Linux Server release 6.6 (Santiago)
# also tested on Linux Mint

help() {
	echo
	echo $0 -s source-dir -d dest-dir
	echo
}



# variables can be set to identify multiple sets of copied sar files
sarSrcDir='/var/log/sa' # RedHat, CentOS ...
#sarSrcDir='/var/log/sysstat' # Debian, Ubuntu ...

sarDstDir="sar-csv"

csvConvertCmd=" sed -e 's/;/,/g' "


while getopts s:d:h arg
do
	case $arg in
		d) sarDstDir=$OPTARG;;
		s) sarSrcDir=$OPTARG;;
		h) help; exit 0;;
		*) help; exit 1;;
	esac
done


cat << EOF

Source: $sarSrcDir
  Dest: $sarDstDir

EOF

#exit


mkdir -p $sarDstDir || {

	echo
	echo Failed to create $sarDstDir
	echo
	exit 1

}

echo "lastSaDirEl: $lastSaDirEl"


# sar options
# -d activity per block device
  # -j LABEL: use label for device if possible. eg. sentryoraredo01 rather than /dev/dm-3
# -b IO and transfer rates
# -q load
# -u cpu
# -r memory utilization
# -R memory
# -B paging
# -S swap space utilization
# -W swap stats
# -n network
# -v kernel filesystem stats
# -w  context switches and task creation
#sarDestOptions=( '-d -j LABEL -p' '-b' '-q' '-u ALL' '-r' '-R' '-B' '-S' '-W' '-n DEV,EDEV,NFS,NFSD,SOCK,IP,EIP,ICMP,EICMP,TCP,ETCP,UDP' '-v' '-w')
# break up network into a separate file for each option
# not all options available depending on sar version
# for disk "-d" you may want one of ID, LABEL, PATH or UUID - check the output and the sar docs
sarDestOptions=( '-d -j ID -p' '-b' '-q' '-u ALL' '-r' '-R' '-B' '-S' '-W' '-n DEV' '-n EDEV' '-n NFS' '-n NFSD' '-n SOCK' '-n IP' '-n EIP' '-n ICMP' '-n EICMP' '-n TCP' '-n ETCP' '-n UDP' '-v' '-w')

#sarDestFiles=( sar-disk.csv sar-io.csv sar-load.csv sar-cpu.csv sar-mem-utilization.csv sar-mem.csv sar-paging.csv sar-swap-utilization.csv sar-swap-stats.csv sar-network.csv sar-kernel-fs.csv sar-context.csv)
sarDestFiles=( sar-disk.csv sar-io.csv sar-load.csv sar-cpu.csv sar-mem-utilization.csv sar-mem.csv sar-paging.csv sar-swap-utilization.csv sar-swap-stats.csv sar-net-dev.csv sar-net-ede.csv sar-net-nfs.csv sar-net-nfsd.csv sar-net-sock.csv sar-net-ip.csv sar-net-eip.csv sar-net-icmp.csv sar-net-eicmp.csv sar-net-tcp.csv sar-net-etcp.csv sar-net-udp.csv sar-kernel-fs.csv sar-context.csv)

lastSarOptEl=${#sarDestOptions[@]}
echo "lastSarOptEl: $lastSarOptEl"

#while [[ $i -lt ${#x[@]} ]]; do echo ${x[$i]}; (( i++ )); done;

# initialize files with header row
i=0
while [[ $i -lt $lastSarOptEl ]]
do
	CMD="sadf -d -- ${sarDestOptions[$i]}  | head -1 | $csvConvertCmd > ${sarDstDir}/${sarDestFiles[$i]} "
	echo CMD: $CMD
	eval $CMD
	#sadf -d -- ${sarDestOptions[$i]}  | head -1 | $csvConvertCmd > ${sarDstDir}/${sarDestFiles[$i]}
	echo "################"
	(( i++ ))
done

#exit

#for sarFiles in ${sarSrcDirs[$currentEl]}/sa??
for sarFiles in $(ls -1dtar ${sarSrcDir}/sa??)
do
	for sadfFile in $sarFiles
	do

		#echo CurrentEl: $currentEl
		# sadf options
		# -t is for local timestamp
		# -d : database semi-colon delimited output

		echo Processing File: $sadfFile

		i=0
		while [[ $i -lt $lastSarOptEl ]]
		do
			CMD="sadf -d -- ${sarDestOptions[$i]} $sadfFile | tail -n +2 | $csvConvertCmd  >> ${sarDstDir}/${sarDestFiles[$i]}"
			echo CMD: $CMD
			eval $CMD
			if [[ $? -ne 0 ]]; then
				echo "#############################################
				echo "## CMD Failed"
				echo "## $CMD"
				echo "#############################################

			fi
			(( i++ ))
		done

	done
done


echo
echo Processing complete
echo
echo files located in $sarDstDir
echo


# show the files created
i=0
while [[ $i -lt $lastSarOptEl ]]
do
	ls -ld ${sarDstDir}/${sarDestFiles[$i]}
	(( i++ ))
done

 

Following is an example execution of the script:

 

jkstill@poirot ~/pythian/blog/sar-tools-blog $ ./asp.sh -s /var/log/sysstat

Source: /var/log/sysstat
  Dest: sar-csv

lastSaDirEl:
lastSarOptEl: 23
CMD: sadf -d -- -d -j ID -p | head -1 | sed -e 's/;/,/g' > sar-csv/sar-disk.csv
################
CMD: sadf -d -- -b | head -1 | sed -e 's/;/,/g' > sar-csv/sar-io.csv
################
CMD: sadf -d -- -q | head -1 | sed -e 's/;/,/g' > sar-csv/sar-load.csv
################
CMD: sadf -d -- -u ALL | head -1 | sed -e 's/;/,/g' > sar-csv/sar-cpu.csv
################
CMD: sadf -d -- -r | head -1 | sed -e 's/;/,/g' > sar-csv/sar-mem-utilization.csv
################
CMD: sadf -d -- -R | head -1 | sed -e 's/;/,/g' > sar-csv/sar-mem.csv
################
CMD: sadf -d -- -B | head -1 | sed -e 's/;/,/g' > sar-csv/sar-paging.csv
################
CMD: sadf -d -- -S | head -1 | sed -e 's/;/,/g' > sar-csv/sar-swap-utilization.csv
################
CMD: sadf -d -- -W | head -1 | sed -e 's/;/,/g' > sar-csv/sar-swap-stats.csv
################
CMD: sadf -d -- -n DEV | head -1 | sed -e 's/;/,/g' > sar-csv/sar-net-dev.csv
################
CMD: sadf -d -- -n EDEV | head -1 | sed -e 's/;/,/g' > sar-csv/sar-net-ede.csv
################
CMD: sadf -d -- -n NFS | head -1 | sed -e 's/;/,/g' > sar-csv/sar-net-nfs.csv
################
CMD: sadf -d -- -n NFSD | head -1 | sed -e 's/;/,/g' > sar-csv/sar-net-nfsd.csv
################
CMD: sadf -d -- -n SOCK | head -1 | sed -e 's/;/,/g' > sar-csv/sar-net-sock.csv
################
CMD: sadf -d -- -n IP | head -1 | sed -e 's/;/,/g' > sar-csv/sar-net-ip.csv
Requested activities not available in file /var/log/sysstat/sa07
################
CMD: sadf -d -- -n EIP | head -1 | sed -e 's/;/,/g' > sar-csv/sar-net-eip.csv
Requested activities not available in file /var/log/sysstat/sa07
################
CMD: sadf -d -- -n ICMP | head -1 | sed -e 's/;/,/g' > sar-csv/sar-net-icmp.csv
Requested activities not available in file /var/log/sysstat/sa07
################
CMD: sadf -d -- -n EICMP | head -1 | sed -e 's/;/,/g' > sar-csv/sar-net-eicmp.csv
Requested activities not available in file /var/log/sysstat/sa07
################
CMD: sadf -d -- -n TCP | head -1 | sed -e 's/;/,/g' > sar-csv/sar-net-tcp.csv
Requested activities not available in file /var/log/sysstat/sa07
################
CMD: sadf -d -- -n ETCP | head -1 | sed -e 's/;/,/g' > sar-csv/sar-net-etcp.csv
Requested activities not available in file /var/log/sysstat/sa07
################
CMD: sadf -d -- -n UDP | head -1 | sed -e 's/;/,/g' > sar-csv/sar-net-udp.csv
Requested activities not available in file /var/log/sysstat/sa07
################
CMD: sadf -d -- -v | head -1 | sed -e 's/;/,/g' > sar-csv/sar-kernel-fs.csv
################
CMD: sadf -d -- -w | head -1 | sed -e 's/;/,/g' > sar-csv/sar-context.csv
################
Processing File: /var/log/sysstat/sa30
CMD: sadf -d -- -d -j ID -p /var/log/sysstat/sa30 | tail -n +2 | sed -e 's/;/,/g' >> sar-csv/sar-disk.csv
CMD: sadf -d -- -b /var/log/sysstat/sa30 | tail -n +2 | sed -e 's/;/,/g' >> sar-csv/sar-io.csv
CMD: sadf -d -- -q /var/log/sysstat/sa30 | tail -n +2 | sed -e 's/;/,/g' >> sar-csv/sar-load.csv
CMD: sadf -d -- -u ALL /var/log/sysstat/sa30 | tail -n +2 | sed -e 's/;/,/g' >> sar-csv/sar-cpu.csv
CMD: sadf -d -- -r /var/log/sysstat/sa30 | tail -n +2 | sed -e 's/;/,/g' >> sar-csv/sar-mem-utilization.csv
CMD: sadf -d -- -R /var/log/sysstat/sa30 | tail -n +2 | sed -e 's/;/,/g' >> sar-csv/sar-mem.csv
CMD: sadf -d -- -B /var/log/sysstat/sa30 | tail -n +2 | sed -e 's/;/,/g' >> sar-csv/sar-paging.csv
CMD: sadf -d -- -S /var/log/sysstat/sa30 | tail -n +2 | sed -e 's/;/,/g' >> sar-csv/sar-swap-utilization.csv
CMD: sadf -d -- -W /var/log/sysstat/sa30 | tail -n +2 | sed -e 's/;/,/g' >> sar-csv/sar-swap-stats.csv
CMD: sadf -d -- -n DEV /var/log/sysstat/sa30 | tail -n +2 | sed -e 's/;/,/g' >> sar-csv/sar-net-dev.csv
CMD: sadf -d -- -n EDEV /var/log/sysstat/sa30 | tail -n +2 | sed -e 's/;/,/g' >> sar-csv/sar-net-ede.csv
CMD: sadf -d -- -n NFS /var/log/sysstat/sa30 | tail -n +2 | sed -e 's/;/,/g' >> sar-csv/sar-net-nfs.csv
CMD: sadf -d -- -n NFSD /var/log/sysstat/sa30 | tail -n +2 | sed -e 's/;/,/g' >> sar-csv/sar-net-nfsd.csv
CMD: sadf -d -- -n SOCK /var/log/sysstat/sa30 | tail -n +2 | sed -e 's/;/,/g' >> sar-csv/sar-net-sock.csv
CMD: sadf -d -- -n IP /var/log/sysstat/sa30 | tail -n +2 | sed -e 's/;/,/g' >> sar-csv/sar-net-ip.csv
Requested activities not available in file /var/log/sysstat/sa30
CMD: sadf -d -- -n EIP /var/log/sysstat/sa30 | tail -n +2 | sed -e 's/;/,/g' >> sar-csv/sar-net-eip.csv
Requested activities not available in file /var/log/sysstat/sa30
CMD: sadf -d -- -n ICMP /var/log/sysstat/sa30 | tail -n +2 | sed -e 's/;/,/g' >> sar-csv/sar-net-icmp.csv
Requested activities not available in file /var/log/sysstat/sa30
...

CMD: sadf -d -- -v /var/log/sysstat/sa07 | tail -n +2 | sed -e 's/;/,/g' >> sar-csv/sar-kernel-fs.csv
CMD: sadf -d -- -w /var/log/sysstat/sa07 | tail -n +2 | sed -e 's/;/,/g' >> sar-csv/sar-context.csv

Processing complete

files located in sar-csv

Note that for some commands the result is Requested activities not available.  What this means is that we have asked for something sar may be capable of collecting, but in this case there is no data. It may be that sar has not been configured to collect that particular metric. These messages can be ignored unless, of course, it is for a metric you would like to see. (sar configuration is outside the scope of this article)

Let’s break down the following sadf command line

sadf -d -- -d -j ID -p /var/log/sysstat/sa05 | tail -n +2 | sed -e 's/;/,/g' >> sar-csv/sar-disk.csv

The sadf ‘-d’ option

This option tells sadf to output data in  CSV format.  As CSV is an acronym for Comma-Separated-Values, the use of a semi-colon as a delimiter may at first seem a little puzzling.  However it makes sense to use something other than a comma, as it may be that a comma could appear in the output.  If that were the case sar would need to then surround the data with quotes.  While many programs can deal with data that may or may not be quoted, it is just simpler to use a semi-colon, which is unlikely to appear in data seen in sar.

The ‘–‘ option

This option doesn’t appear to be too well known. When used on the shell command line, — signifies the end of arguments for the program. A typical use might be if you were using grep to search for ‘-j’ in some text files.  This following command fails due to grep interpreting the text we want to search as an argument. Since there is no ‘-j’ argument for grep, the command fails.

 

jkstill@poirot ~/pythian/blog/sar-tools-blog $ grep '-j' *.sh
grep: invalid option -- 'j'
Usage: grep [OPTION]... PATTERN [FILE]...
Try 'grep --help' for more information.

 

Using ‘–‘ tells the shell to stop processing arguments for the current command, which is grep in this case, and so grep searches for ‘-j’ as intended.

jkstill@poirot ~/pythian/blog/sar-tools-blog $ grep -- '-j' *.sh
  # -j LABEL: use label for device if possible. eg. sentryoraredo01 rather than /dev/dm-3
#sarDestOptions=( '-d -j LABEL -p' '-b' '-q' '-u ALL' '-r' '-R' '-B' '-S' '-W' '-n DEV,EDEV,NFS,NFSD,SOCK,IP,EIP,ICMP,EICMP,TCP,ETCP,UDP' '-v' '-w')
sarDestOptions=( '-d -j ID -p' '-b' '-q' '-u ALL' '-r' '-R' '-B' '-S' '-W' '-n DEV' '-n EDEV' '-n NFS' '-n NFSD' '-n SOCK' '-n IP' '-n EIP' '-n ICMP' '-n EICMP' '-n TCP' '-n ETCP' '-n UDP' '-v' '-w')

Why in this case are we using ‘–‘?  It is because the other options following -d are not for sadf, but are sar options that sadf will use when calling sar.

The ‘-d’, ‘-j’ and ‘-p’ options

Be reading the sar man page you can learn what each of these are:

  • -d block device activity
  • -j display persistent device names – options are ID, LABEL, PATH and UUID
  • -p pretty print device names

tail -n +2

Just get the rows after the header row.  asp.sh calls sadf for each day’s sar data.  The first step of creating the header has already been done.

 

sed

sed is being used to change semi-colons to commas.  So far I have not seen any commas in sar data, and using a comma delimiter is just more convenient.

 

Charting in Excel with Perl

 

Having the sar data in the CSV format is convenient, as it will load directly into Excel or Google Sheets. What isn’t convenient is all the work required to load the data, format it, and then create multiple charts from the data.  There could be a number of CSV files this needs to be done with.

Perl to the rescue!  The Excel::Writer::XLSX Perl module can directly create formatted Excel files, complete with charts. I am going to make some basic use of this quite capable module via the dynachart.pl Perl script. This script reads CSV data and creates Excel files, optionally with line charts of selected columns.

If you would like to see some simple examples of using Perl to create Excel files, the documentation for the Perl Module Writer::Excel::XLSX contains several good examples, such as this one:  Example-1

There are several others to be found in the documentation for  Excel::Writer::XLSX

Following is the entire script.  This script and others can be found at the csv-tools github repo.

#!/usr/bin/env perl

# dynachart.pl
# Jared Still 2017-07-23
# still@pythian.com jkstill@gmail.com

# data is from STDIN

use warnings;
use strict;
use Data::Dumper;
use Pod::Usage;
use Getopt::Long;
use Excel::Writer::XLSX;

my $debug = 0;
my $combinedChart = 0;
my %optctl = ();
my ($help,$man);
my @chartCols;
my $categoryColName='';
my $categoryColNum=0;

Getopt::Long::GetOptions(
	\%optctl,
	'spreadsheet-file=s',
	'debug!' => \$debug,
	'chart-cols=s{1,10}' => \@chartCols,
	'combined-chart!' => \$combinedChart,
	'worksheet-col=s',  # creates a separate worksheet per value of this column
	'category-col=s' => \$categoryColName,
	'h|help|?' => \$help, man => \$man
) or pod2usage(2) ;

pod2usage(1) if $help;
pod2usage(-verbose => 2) if $man;

my $xlFile = defined($optctl{'spreadsheet-file'}) ? $optctl{'spreadsheet-file'} : 'asm-metrics.xlsx';
my $workSheetCol = defined($optctl{'worksheet-col'}) ? $optctl{'worksheet-col'} : 0;

my %fonts = (
	fixed			=> 'Courier New',
	fixed_bold	=> 'Courier New',
	text			=> 'Arial',
	text_bold	=> 'Arial',
);

my %fontSizes = (
	fixed			=> 10,
	fixed_bold	=> 10,
	text			=> 10,
	text_bold	=> 10,
);

my $maxColWidth = 50;
my $counter = 0;
my $interval = 100;

# create workbook
my $workBook = Excel::Writer::XLSX->new($xlFile);
die "Problems creating new Excel file $xlFile: $!\n" unless defined $workBook;

# create formats
my $stdFormat = $workBook->add_format(bold => 0,font => $fonts{fixed}, size => $fontSizes{fixed}, color => 'black');
my $boldFormat = $workBook->add_format(bold => 1,font => $fonts{fixed_bold}, size => $fontSizes{fixed_bold}, color => 'black');
my $wrapFormat = $workBook->add_format(bold => 0,font => $fonts{text}, size => $fontSizes{text}, color => 'black');
$wrapFormat->set_align('vjustify');


my $labels=<>;
chomp $labels;
# sadf starts header lines with '# ' - remove that
$labels =~ s/^#\s+//;
my @labels = split(/,\s*/,$labels);

if ($debug) {

print qq{LABELS:\n};

print join("\n",@labels);

print "\n";

}

# get the X series category
if ( $categoryColName ) {

	my $want = $categoryColName;
	my $index = 0;
	++$index until ($labels[$index] eq $want) or ($index > $#labels);
	$categoryColNum = $index;

}

#print join("\n",@labels);

# get the element number of the column used to segregate into worksheets
my $workSheetColPos;
if ($workSheetCol) {
	my $i=0;
	foreach my $label ( @labels)  {
		if ($label eq $workSheetCol) {
			$workSheetColPos = $i;
			last;
		}
		$i++;
	}
}

print "\nworkSheetColPos: $workSheetColPos\n" if $debug;

# validate the columns to be charted
# use as an index into the labels
my @chartColPos=();
{
	my $i=0;
	foreach my $label ( @labels)  {
		foreach my $chartCol ( @chartCols ) {
			if ($label eq $chartCol) {
				push @chartColPos, $i;
				last;
			}
		}
		$i++;
	}
}

if ($debug) {
	print "\nChartCols:\n", Dumper(\@chartCols);
	print "\nChartColPos:\n", Dumper(\@chartColPos);
	print "\nLabels:\n", Dumper(\@labels);
}

my %lineCount=();
my %workSheets=();


# the first worksheet is a directory
my $directoryName='Directory';
my $directory;
my $noDirectory=0;
my $directoryLineCount=0;

unless ($noDirectory) {
	$directory = $workBook->add_worksheet($directoryName)	;
	$directory->set_column(0,0,30);
	$directory->write_row($directoryLineCount++,0,['Directory'],$boldFormat);
}

while (<>) {

	chomp;
	my @data=split(/,/);

	my $currWorkSheetName;
	if ($workSheetCol) {
		$currWorkSheetName=$data[$workSheetColPos];
		if (length($currWorkSheetName) > 31 ) {
			# cut some out of the middle of the name as the Excel worksheet name has max size of 31
			$currWorkSheetName = substr($currWorkSheetName,0,14) . '...' . substr($currWorkSheetName,-14);
		}
	} else {
		$currWorkSheetName='DynaChart';
	}

	print "Worksheet Name: $currWorkSheetName\n" if $debug;

	unless (defined $workSheets{$currWorkSheetName}) {
		$workSheets{$currWorkSheetName} = $workBook->add_worksheet($currWorkSheetName);
		$workSheets{$currWorkSheetName}->write_row($lineCount{$currWorkSheetName}++,0,\@labels,$boldFormat);
		# freeze pane at header
		$workSheets{$currWorkSheetName}->freeze_panes($lineCount{$currWorkSheetName},0);
	}

	# setup column widths
	#$workSheet->set_column($el,$el,$colWidth);
	$workSheets{$currWorkSheetName}->write_row($lineCount{$currWorkSheetName}++,0,\@data, $stdFormat);

}

if ($debug) {
	print "Worksheets:\n";
	print "$_\n" foreach keys %workSheets;
	print Dumper(\%lineCount);
}

# each row consumes about 18 pixels
my $rowHeight=18; # pixels
my $chartHeight=23; # rows

# triple from default width of 480
my $chartWidth = 480 * 3;

=head1 Write the Charts

 The default mode is to create a separate chart for each metric
 By specifying the command line option --combined-chart the values will be charted in a single chart
 Doing so is probably useful only for a limited number of sets of values

 Some may question the apparent duplication of code in the sections to combine or not combine the charts
 Doing so would be a bit convoluted - this is easier to read and modify

=cut

foreach my $workSheet ( keys %workSheets ) {
	print "Charting worksheet: $workSheet\n" if $debug;

	my $chartNum = 0;

	if ($combinedChart) {
		my $chart = $workBook->add_chart( type => 'line', name => "Combined" . '-' . $workSheets{$workSheet}->get_name(), embedded => 1 );
		$chart->set_size( width => $chartWidth, height => $chartHeight * $rowHeight);

		# each chart consumes about 16 rows
		$workSheets{$workSheet}->insert_chart((($chartNum * $chartHeight) + 2),3, $chart);

		foreach my $colPos ( @chartColPos ) {
			my $col2Chart=$labels[$colPos];
			# [ sheet, row_start, row_end, col_start, col_end]
			$chart->add_series(
				name => $col2Chart,
				#categories => [$workSheet, 1,$lineCount{$workSheet},2,2],
				categories => [$workSheet, 1,$lineCount{$workSheet},$categoryColNum,$categoryColNum],
				values => [$workSheet, 1,$lineCount{$workSheet},$colPos,$colPos]
			);
		}

	} else {
		foreach my $colPos ( @chartColPos ) {
			my $col2Chart=$labels[$colPos];
			print "\tCharting column: $col2Chart\n" if $debug;
			my $chart = $workBook->add_chart( type => 'line', name => "$col2Chart" . '-' . $workSheets{$workSheet}->get_name(), embedded => 1 );
			$chart->set_size( width => $chartWidth, height => $chartHeight * $rowHeight);

			# each chart consumes about 16 rows
			$workSheets{$workSheet}->insert_chart((($chartNum * $chartHeight) + 2),3, $chart);


			# [ sheet, row_start, row_end, col_start, col_end]
			$chart->add_series(
				name => $col2Chart,
				#categories => [$workSheet, 1,$lineCount{$workSheet},2,2],
				categories => [$workSheet, 1,$lineCount{$workSheet},$categoryColNum,$categoryColNum],
				values => [$workSheet, 1,$lineCount{$workSheet},$colPos,$colPos]
			);

			$chartNum++;
		}
	}
}


# write the directory page
my $urlFormat = $workBook->add_format( color => 'blue', underline => 1 );
my %sheetNames=();

foreach my $worksheet ( $workBook->sheets() ) {
	my $sheetName = $worksheet->get_name();
	next if $sheetName eq $directoryName;
	$sheetNames{$sheetName} = $worksheet;
}

foreach my $sheetName ( sort keys %sheetNames ) {
	$directory->write_url($directoryLineCount++,0, qq{internal:'$sheetName'!A1} ,$urlFormat, $sheetName);
}

__END__

=head1 NAME

dynachart.pl

  --help brief help message
  --man  full documentation
  --spreadsheet-file output file name - defaults to asm-metrics.xlsx
  --worksheet-col name of column used to segragate data into worksheets
    defaults to a single worksheet if not supplied
  --chart-cols list of columns to chart

 dynachart.pl accepts input from STDIN

 This script will read CSV data created by asm-metrics-collector.pl or asm-metrics-aggregator.pl


=head1 SYNOPSIS

dynachart.pl [options] [file ...]

 Options:
   --help brief help message
   --man  full documentation
   --spreadsheet-file output file name - defaults to asm-metrics.xlsx
   --worksheet-col name of column used to segragate data into worksheets
     defaults to a single worksheet if not supplied
  --chart-cols list of columns to chart
  --category-col specify the column for the X vector - a timestamp is typically used
    the name must exactly match that in the header
  --combined-chart create a single chart rather than a chart for each value specified in --chart-cols

 dynachart.pl accepts input from STDIN

 dynachart.pl --worksheet-col DISKGROUP_NAME < my_input_file.csv


 dynachart.pl --spreadsheet-file sar-disk-test.xlsx --combined-chart --worksheet-col DEV --category-col 'timestamp' --chart-cols 'rd_sec/s' --chart-cols 'wr_sec/s' < sar-disk-test.csv


=head1 OPTIONS

=over 8

=item B<-help>

Print a brief help message and exits.

=item B<-man>

Prints the manual page and exits.

=item B<--spreadsheet-file>

 The name of the Excel file to create.
 The default name is asm-metrics.xlsx

=item B<--worksheet-col>

 By default a single worksheet is created.
 When this option is used the column supplied as an argument will be used to segragate data into separate worksheets.

=item B<--chart-cols>

 List of columns to chart
 This should be the last option on the command line if used.

 It may be necessary to tell Getopt to stop processing arguments with '--' in some cases.

 eg.

 dynachart.pl dynachart.pl --worksheet-col DISKGROUP_NAME --chart-cols READS WRITES -- logs/asm-oravm-20150512_01-agg-dg.csv

=item B<--category-col>

 Column to use as the category for the X line in the chart - default to the first column
 The name must exactly match a column from the CSV file
 Typically this line is a timestamp

=back

=head1 DESCRIPTION

B<dynachart.pl> creates an excel file with charts for selected columns>

=head1 EXAMPLES

 dynachart.pl accepts data from STDIN

 dynachart.pl --worksheet-col DISKGROUP_NAME --spreadsheet-file mywork.xlsx

=cut


 

Creating Multiple Charts with Bash

 

A number of invocations of dynachart.pl can be found in sar-chart.sh as seen in the following example. This allows creating a number of Excel files from sar data for later analysis, all with one command.

 

 

#!/bin/bash


usage() {

	cat <<-EOF

	usage: $0 destination-directory

	example script that charts CSV data that has been generated from sar
	Excel XLXS files with charts are produced

	applicable to any CSV data


	example: sar-chart.sh data-dir

	This script is using data generated by asp.sh


	EOF

}


while getopts h arg
do
	case $arg in
		h) usage;exit 0;;
		*) usage;exit 1;;
	esac
done

destDir=$1

[ -d "$destDir" -a -w "$destDir" ] || {
	echo
	echo cannot read and/or write directory destDir: $destDir
	echo
	usage
	echo
	exit 1
}

### Template
#echo working on sar-
#dynachart.pl --spreadsheet-file ${destDir}/ --worksheet-col hostname --category-col 'timestamp'
################


# if dynachart.pl is in the path then use it.
# if not then check for local directory copy or link
# otherwise error exit

dynaChart=$(which dynachart.pl)

if [[ -z $dynaChart ]]; then
	if [[ -f ./dynachart.pl ]]; then
		dynaChart='./dynachart.pl'
		[ -x "$dynaChart" ] || {
			echo
			echo $dynaChart is not executable
			echo
			exit 2
		}
	else
		echo
		echo "dynachart.pl not found"
		echo
		exit 1
	fi
fi


# default of 1 chart per metric
echo working on sar-disk-default.xlsx
dynachart.pl --spreadsheet-file ${destDir}/sar-disk-default.xlsx --worksheet-col DEV --category-col 'timestamp' --chart-cols 'rd_sec/s' --chart-cols 'wr_sec/s' < sar-disk.csv

# combine metrics into one chart
echo working on sar-disk-combined.xlsx
dynachart.pl --spreadsheet-file ${destDir}/sar-disk-combined.xlsx --combined-chart --worksheet-col DEV --category-col 'timestamp' --chart-cols 'rd_sec/s' --chart-cols 'wr_sec/s' < sar-disk.csv


echo working on sar-network-device.xlsx
dynachart.pl --spreadsheet-file ${destDir}/sar-network-device.xlsx --combined-chart --worksheet-col IFACE --category-col 'timestamp' --chart-cols 'rxkB/s' --chart-cols 'txkB/s' < sar-net-dev.csv

echo working on sar-network-error-device.xlsx
dynachart.pl --spreadsheet-file ${destDir}/sar-network-error-device.xlsx --combined-chart --worksheet-col IFACE --category-col 'timestamp' --chart-cols 'rxerr/s' --chart-cols 'txerr/s' < sar-net-ede.csv

echo working on sar-network-nfs.xlsx
dynachart.pl --spreadsheet-file ${destDir}/sar-network-nfs.xlsx --worksheet-col hostname --category-col 'timestamp' --chart-cols 'call/s' --chart-cols 'retrans/' --chart-cols 'read/s' --chart-cols 'write/s' --chart-cols 'access/s' --chart-cols 'getatt/s'  < sar-net-nfs.csv

echo working on sar-network-nfsd.xlsx
dynachart.pl --spreadsheet-file ${destDir}/sar-network-nfsd.xlsx --worksheet-col hostname --category-col 'timestamp' --chart-cols 'scall/s' --chart-cols 'badcall/s' --chart-cols 'packet/s' --chart-cols 'udp/s' --chart-cols 'tcp/s' --chart-cols 'hit/s' --chart-cols 'miss/s' --chart-cols 'sread/s' --chart-cols 'swrite/s' --chart-cols 'saccess/s' --chart-cols 'sgetatt/s' < sar-net-nfsd.csv

echo working on sar-network-socket.xlsx
dynachart.pl --spreadsheet-file ${destDir}/sar-network-socket.xlsx --worksheet-col hostname --category-col 'timestamp' --chart-cols 'totsck' --chart-cols 'tcpsck' --chart-cols 'udpsck' --chart-cols 'rawsck' --chart-cols 'ip-frag' --chart-cols 'tcp-tw' < sar-net-sock.csv

echo working on sar-context.xlsx
dynachart.pl --spreadsheet-file ${destDir}/sar-context.xlsx --worksheet-col hostname --category-col 'timestamp' --chart-cols 'proc/s' --chart-cols 'cswch/s' < sar-context.csv

echo working on sar-cpu.xlsx
# extracted with -u ALL, so all CPU on one line
dynachart.pl --spreadsheet-file ${destDir}/sar-cpu.xlsx --worksheet-col hostname --category-col 'timestamp' --chart-cols '%usr' --chart-cols '%nice' --chart-cols '%sys' --chart-cols '%iowait' --chart-cols '%steal' --chart-cols '%irq' --chart-cols '%soft' --chart-cols '%guest' --chart-cols '%idle' < sar-cpu.csv


echo working on sar-io-default.xlsx
dynachart.pl --spreadsheet-file ${destDir}/sar-io-default.xlsx --worksheet-col hostname --category-col 'timestamp' --chart-cols 'tps' --chart-cols 'rtps' --chart-cols 'wtps' --chart-cols 'bread/s' --chart-cols 'bwrtn/s' < sar-io.csv

echo working on sar-io-tps-combined.xlsx
dynachart.pl --spreadsheet-file ${destDir}/sar-io-tps-combined.xlsx --combined-chart --worksheet-col hostname --category-col 'timestamp' --chart-cols 'tps' --chart-cols 'rtps' --chart-cols 'wtps' < sar-io.csv

echo working on sar-io-blks-per-second-combined.xlsx
dynachart.pl --spreadsheet-file ${destDir}/sar-io-blks-per-second-combined.xlsx --combined-chart --worksheet-col hostname --category-col 'timestamp' --chart-cols 'bread/s' --chart-cols 'bwrtn/s' < sar-io.csv


echo working on sar-load-runq-threads.xlsx
dynachart.pl --spreadsheet-file ${destDir}/sar-load-runq-threads.xlsx --combined-chart --worksheet-col hostname --category-col 'timestamp' --chart-cols 'runq-sz' --chart-cols 'plist-sz' --chart-cols 'ldavg-1' --chart-cols 'ldavg-5' --chart-cols 'ldavg-15' < sar-load.csv

echo working on sar-load-runq.xlsx
dynachart.pl --spreadsheet-file ${destDir}/sar-load-runq.xlsx --combined-chart --worksheet-col hostname --category-col 'timestamp' --chart-cols 'runq-sz' --chart-cols 'ldavg-1' --chart-cols 'ldavg-5' --chart-cols 'ldavg-15' < sar-load.csv

echo working on sar-memory.xlsx
dynachart.pl --spreadsheet-file ${destDir}/sar-memory.xlsx --combined-chart --worksheet-col hostname --category-col 'timestamp' --chart-cols 'frmpg/s' --chart-cols  'bufpg/s' < sar-mem.csv


echo working on sar-paging-rate.xlsx
dynachart.pl --spreadsheet-file ${destDir}/sar-paging-rate.xlsx --combined-chart --worksheet-col hostname --category-col 'timestamp'  --chart-cols 'pgpgin/s' --chart-cols  'pgpgout/s' < sar-paging.csv

echo working on sar-swap-rate.xlsx
dynachart.pl --spreadsheet-file ${destDir}/sar-swap-rate.xlsx --combined-chart --worksheet-col hostname --category-col 'timestamp'  --chart-cols 'pswpin/s' --chart-cols 'pswpout/s' < sar-swap-stats.csv

Filter, rename and aggregate the data

Perhaps you want to see a subset of some data. Let’s say you wish too see the data for only a select group of devices; those making up the DATA diskgroup in an Oracle database.  In addition you would like the sar report to use names that match those seen by ASM. Rather than /dev/mapper/mp1, mp2,…, you would like to see DATA00, DATA01,…

In addition you would like to see the IOPS, read rate and write rate data for a set of disks aggregated per timestamp.

We can do that. The script remap.sh has an example (real life BTW) of how to map the ASM disk names to the linux device names, and create a new CSV file.

This query was used to map device names to disk names:

 

select
	'dm-name-' || substr(d.path,instr(d.path,'/',-1)+1,length(d.path) - instr(d.path,'/',-1)) path
	, g.name || lpad(d.disk_number,2,'0') name
from v$asm_disk d
join v$asm_diskgroup g on g.group_number = d.group_number
order by 2

15:10:17 SYSDBA> /

PATH			                   NAME
------------------------------ --------------------
dm-name-mpathmp1	             DATA00
dm-name-mpathnp1	             DATA01
dm-name-mpathop1	             DATA02
dm-name-mpathqp1	             DATA03
dm-name-mpathrp1	             DATA04
dm-name-mpathsp1	             DATA05
dm-name-mpathtp1	             DATA06
dm-name-mpathvp1	             DATA07
dm-name-mpathwp1	             DATA08
dm-name-mpathcp1	             FRA00

10 rows selected.

 

This information was used to then create a sed command to rewrite the CSV data as needed.

 


originalFile=sar-disk-original.csv
newFile=sar-disk.csv

sed \
	-e 's/dm-name-mpathmp1/DATA00/g' \
	-e 's/dm-name-mpathnp1/DATA01/g' \
	-e 's/dm-name-mpathop1/DATA02/g' \
	-e 's/dm-name-mpathqp1/DATA03/g' \
	-e 's/dm-name-mpathrp1/DATA04/g' \
	-e 's/dm-name-mpathsp1/DATA05/g' \
	-e 's/dm-name-mpathtp1/DATA06/g' \
	-e 's/dm-name-mpathvp1/DATA07/g' \
	-e 's/dm-name-mpathwp1/DATA08/g' \
	-e 's/dm-name-mpathcp1/FRA00/g' \
< $originalFile > $newFile

Following that csv-aggregator.sh and csv-aggregator.pl were used to filter and aggregate the data.  The result is a much smaller CSV file with data only for the DATA diskgroup, with the disk data rolled up to the disk group level.

Rather than post several hundred more lines of Perl here, I will recommend you to follow the links if you want to see how this aggregation and filtering is done. The command line though is shown here:

 


csvFile=sar-csv/sar-disk-test.csv

./csv-aggregator.pl --filter-cols DEV  --filter-vals 'DATA..'  --key-cols hostname --key-cols timestamp  --grouping-cols DEV  --agg-cols tps --agg-cols 'rd_sec/s' --agg-cols 'wr_sec/s'  --agg-cols 'avgrq-sz' --agg-cols 'avgqu-sz' < $csvFile

 

This is a chart taken directly from the Excel file created from this CSV data via dynachart.pl.

 

image

 

 

 

Installing Excel::Writer::XLSX

So that the dynachart.pl script will work, the Excel::Writer::XLSX Perl module must be installed.

There are several possible methods by which this may be done, but if you are not already fairly familiar with the Perl environment some of the methods are a little daunting.  As Oracle by default always installs Perl in ORACLE_HOME, you can make use of that existing installation.  These instructions will install the Excel::Writer::XLSX module and it’s dependencies directly into the Oracle Perl installation. Because of that, you would want to do this in a test environment, such as Virtual Machine running on your laptop.  It isn’t too likely that this would break something in the Oracle home, but then again, why take the chance?

An even better option would be to install your own Perl, and not worry about affecting any other software. This is not too difficult to do via Perlbrew. Check out https://perlbrew.pl/ for the simple instructions.

The following example installs Excel::Writer::XLSX in Oracle’s copy of Perl on a test server.  For my own use, I do use Perlbrew, but am just including the following as an example

 

[oracle@oravm01 ~]$ $ORACLE_HOME/perl/bin/perl -MCPAN -e shell

...

Would you like me to configure as much as possible automatically? [yes] yes

...

These next commands tell CPAN to automatically answer appropriately rather than prompt you to continually type ‘yes’.
This lasts the duration of this CPAN session.  Following that is the command to install Excel::Writer::XLSX.

 

cpan> o conf prerequisites_policy 'follow'
cpan> o conf build_requires_install_policy yes


cpan> install Excel::Writer::XLSX

...


cpan> exit
Terminal does not support GetHistory.
Lockfile removed.

 

Now test the installation.  This command simply tells Perl the we require version 99 of the module. The failure is normal, as only version 0.95 is installed.  A different message would appear if no version of the module was installed.

[oracle@oravm01 ~]$ $ORACLE_HOME/perl/bin/perl -MExcel::Writer::XLSX=99
Excel::Writer::XLSX version 99 required--this is only version 0.95 at /u01/app/oracle/product/11.2.0/db_1/perl/lib/5.10.0/Exporter/Heavy.pm line 122.
BEGIN failed--compilation aborted.

Viewing all articles
Browse latest Browse all 34

Trending Articles