The sar (system activity reporter) is a Linux utility that actually started life long ago in System V Unix.
In spite of its ancient origins, sar is still quite useful, and has even been enhanced in recent versions of Linux.
The use of sar is not limited to root – any user can run sar and its utilities. Most Linux systems have sar installed and enabled by default to collect sar data at 10 minute intervals and retain that data for 10 days.
The reports output by sar will look familiar:
jkstill@poirot ~/tmp $ sar -d | head -20 Linux 4.4.0-53-generic (poirot.jks.com) 08/07/17 _x86_64_ (1 CPU) 00:00:01 DEV tps rd_sec/s wr_sec/s avgrq-sz avgqu-sz await svctm %util 00:02:01 dev7-0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 00:02:01 dev7-1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 00:02:01 dev7-2 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 00:02:01 dev7-3 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 00:02:01 dev7-4 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 00:02:01 dev7-5 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 00:02:01 dev7-6 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 00:02:01 dev8-0 0.40 0.00 4.97 12.43 0.00 0.26 0.26 0.01 00:02:01 dev252-0 0.63 0.00 4.90 7.78 0.00 0.16 0.16 0.01 00:02:01 dev252-1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 00:04:01 dev7-0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 00:04:01 dev7-1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 00:04:01 dev7-2 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 00:04:01 dev7-3 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 00:04:01 dev7-4 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 00:04:01 dev7-5 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 00:04:01 dev7-6 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
This report format is good for a quick look at different metrics to get an overall idea of system perform, disk metrics in the previously shown example.
However what if you would like to collate this data into something more readily consumed for charting an analysis, such as a CSV file? The standard output of sar does not readily lend itself to such use without quite a bit of work.
The sar utility is part of the Linux sysstat package and contains not only sar, but sadf. If sar is available then so is sadf. What is sadf you say? sadf is a the sar data formatting utility, and can output data in CSV, XML and other formats.
Using sadf a bash script can be created that will extract sar data for many Linux subsystems and write that data in the format of your choice, CSV in this case.
The asp.sh script does just that, creating CSV files that can then be used for charting and analysis.
#!/bin/bash # Jared Still - Pythian # still@pythian.com jkstill@gmail.com # 2017-07-20 # asp - another sar processor # tested on Red Hat Enterprise Linux Server release 6.6 (Santiago) # also tested on Linux Mint help() { echo echo $0 -s source-dir -d dest-dir echo } # variables can be set to identify multiple sets of copied sar files sarSrcDir='/var/log/sa' # RedHat, CentOS ... #sarSrcDir='/var/log/sysstat' # Debian, Ubuntu ... sarDstDir="sar-csv" csvConvertCmd=" sed -e 's/;/,/g' " while getopts s:d:h arg do case $arg in d) sarDstDir=$OPTARG;; s) sarSrcDir=$OPTARG;; h) help; exit 0;; *) help; exit 1;; esac done cat << EOF Source: $sarSrcDir Dest: $sarDstDir EOF #exit mkdir -p $sarDstDir || { echo echo Failed to create $sarDstDir echo exit 1 } echo "lastSaDirEl: $lastSaDirEl" # sar options # -d activity per block device # -j LABEL: use label for device if possible. eg. sentryoraredo01 rather than /dev/dm-3 # -b IO and transfer rates # -q load # -u cpu # -r memory utilization # -R memory # -B paging # -S swap space utilization # -W swap stats # -n network # -v kernel filesystem stats # -w context switches and task creation #sarDestOptions=( '-d -j LABEL -p' '-b' '-q' '-u ALL' '-r' '-R' '-B' '-S' '-W' '-n DEV,EDEV,NFS,NFSD,SOCK,IP,EIP,ICMP,EICMP,TCP,ETCP,UDP' '-v' '-w') # break up network into a separate file for each option # not all options available depending on sar version # for disk "-d" you may want one of ID, LABEL, PATH or UUID - check the output and the sar docs sarDestOptions=( '-d -j ID -p' '-b' '-q' '-u ALL' '-r' '-R' '-B' '-S' '-W' '-n DEV' '-n EDEV' '-n NFS' '-n NFSD' '-n SOCK' '-n IP' '-n EIP' '-n ICMP' '-n EICMP' '-n TCP' '-n ETCP' '-n UDP' '-v' '-w') #sarDestFiles=( sar-disk.csv sar-io.csv sar-load.csv sar-cpu.csv sar-mem-utilization.csv sar-mem.csv sar-paging.csv sar-swap-utilization.csv sar-swap-stats.csv sar-network.csv sar-kernel-fs.csv sar-context.csv) sarDestFiles=( sar-disk.csv sar-io.csv sar-load.csv sar-cpu.csv sar-mem-utilization.csv sar-mem.csv sar-paging.csv sar-swap-utilization.csv sar-swap-stats.csv sar-net-dev.csv sar-net-ede.csv sar-net-nfs.csv sar-net-nfsd.csv sar-net-sock.csv sar-net-ip.csv sar-net-eip.csv sar-net-icmp.csv sar-net-eicmp.csv sar-net-tcp.csv sar-net-etcp.csv sar-net-udp.csv sar-kernel-fs.csv sar-context.csv) lastSarOptEl=${#sarDestOptions[@]} echo "lastSarOptEl: $lastSarOptEl" #while [[ $i -lt ${#x[@]} ]]; do echo ${x[$i]}; (( i++ )); done; # initialize files with header row i=0 while [[ $i -lt $lastSarOptEl ]] do CMD="sadf -d -- ${sarDestOptions[$i]} | head -1 | $csvConvertCmd > ${sarDstDir}/${sarDestFiles[$i]} " echo CMD: $CMD eval $CMD #sadf -d -- ${sarDestOptions[$i]} | head -1 | $csvConvertCmd > ${sarDstDir}/${sarDestFiles[$i]} echo "################" (( i++ )) done #exit #for sarFiles in ${sarSrcDirs[$currentEl]}/sa?? for sarFiles in $(ls -1dtar ${sarSrcDir}/sa??) do for sadfFile in $sarFiles do #echo CurrentEl: $currentEl # sadf options # -t is for local timestamp # -d : database semi-colon delimited output echo Processing File: $sadfFile i=0 while [[ $i -lt $lastSarOptEl ]] do CMD="sadf -d -- ${sarDestOptions[$i]} $sadfFile | tail -n +2 | $csvConvertCmd >> ${sarDstDir}/${sarDestFiles[$i]}" echo CMD: $CMD eval $CMD if [[ $? -ne 0 ]]; then echo "############################################# echo "## CMD Failed" echo "## $CMD" echo "############################################# fi (( i++ )) done done done echo echo Processing complete echo echo files located in $sarDstDir echo # show the files created i=0 while [[ $i -lt $lastSarOptEl ]] do ls -ld ${sarDstDir}/${sarDestFiles[$i]} (( i++ )) done
Following is an example execution of the script:
jkstill@poirot ~/pythian/blog/sar-tools-blog $ ./asp.sh -s /var/log/sysstat Source: /var/log/sysstat Dest: sar-csv lastSaDirEl: lastSarOptEl: 23 CMD: sadf -d -- -d -j ID -p | head -1 | sed -e 's/;/,/g' > sar-csv/sar-disk.csv ################ CMD: sadf -d -- -b | head -1 | sed -e 's/;/,/g' > sar-csv/sar-io.csv ################ CMD: sadf -d -- -q | head -1 | sed -e 's/;/,/g' > sar-csv/sar-load.csv ################ CMD: sadf -d -- -u ALL | head -1 | sed -e 's/;/,/g' > sar-csv/sar-cpu.csv ################ CMD: sadf -d -- -r | head -1 | sed -e 's/;/,/g' > sar-csv/sar-mem-utilization.csv ################ CMD: sadf -d -- -R | head -1 | sed -e 's/;/,/g' > sar-csv/sar-mem.csv ################ CMD: sadf -d -- -B | head -1 | sed -e 's/;/,/g' > sar-csv/sar-paging.csv ################ CMD: sadf -d -- -S | head -1 | sed -e 's/;/,/g' > sar-csv/sar-swap-utilization.csv ################ CMD: sadf -d -- -W | head -1 | sed -e 's/;/,/g' > sar-csv/sar-swap-stats.csv ################ CMD: sadf -d -- -n DEV | head -1 | sed -e 's/;/,/g' > sar-csv/sar-net-dev.csv ################ CMD: sadf -d -- -n EDEV | head -1 | sed -e 's/;/,/g' > sar-csv/sar-net-ede.csv ################ CMD: sadf -d -- -n NFS | head -1 | sed -e 's/;/,/g' > sar-csv/sar-net-nfs.csv ################ CMD: sadf -d -- -n NFSD | head -1 | sed -e 's/;/,/g' > sar-csv/sar-net-nfsd.csv ################ CMD: sadf -d -- -n SOCK | head -1 | sed -e 's/;/,/g' > sar-csv/sar-net-sock.csv ################ CMD: sadf -d -- -n IP | head -1 | sed -e 's/;/,/g' > sar-csv/sar-net-ip.csv Requested activities not available in file /var/log/sysstat/sa07 ################ CMD: sadf -d -- -n EIP | head -1 | sed -e 's/;/,/g' > sar-csv/sar-net-eip.csv Requested activities not available in file /var/log/sysstat/sa07 ################ CMD: sadf -d -- -n ICMP | head -1 | sed -e 's/;/,/g' > sar-csv/sar-net-icmp.csv Requested activities not available in file /var/log/sysstat/sa07 ################ CMD: sadf -d -- -n EICMP | head -1 | sed -e 's/;/,/g' > sar-csv/sar-net-eicmp.csv Requested activities not available in file /var/log/sysstat/sa07 ################ CMD: sadf -d -- -n TCP | head -1 | sed -e 's/;/,/g' > sar-csv/sar-net-tcp.csv Requested activities not available in file /var/log/sysstat/sa07 ################ CMD: sadf -d -- -n ETCP | head -1 | sed -e 's/;/,/g' > sar-csv/sar-net-etcp.csv Requested activities not available in file /var/log/sysstat/sa07 ################ CMD: sadf -d -- -n UDP | head -1 | sed -e 's/;/,/g' > sar-csv/sar-net-udp.csv Requested activities not available in file /var/log/sysstat/sa07 ################ CMD: sadf -d -- -v | head -1 | sed -e 's/;/,/g' > sar-csv/sar-kernel-fs.csv ################ CMD: sadf -d -- -w | head -1 | sed -e 's/;/,/g' > sar-csv/sar-context.csv ################ Processing File: /var/log/sysstat/sa30 CMD: sadf -d -- -d -j ID -p /var/log/sysstat/sa30 | tail -n +2 | sed -e 's/;/,/g' >> sar-csv/sar-disk.csv CMD: sadf -d -- -b /var/log/sysstat/sa30 | tail -n +2 | sed -e 's/;/,/g' >> sar-csv/sar-io.csv CMD: sadf -d -- -q /var/log/sysstat/sa30 | tail -n +2 | sed -e 's/;/,/g' >> sar-csv/sar-load.csv CMD: sadf -d -- -u ALL /var/log/sysstat/sa30 | tail -n +2 | sed -e 's/;/,/g' >> sar-csv/sar-cpu.csv CMD: sadf -d -- -r /var/log/sysstat/sa30 | tail -n +2 | sed -e 's/;/,/g' >> sar-csv/sar-mem-utilization.csv CMD: sadf -d -- -R /var/log/sysstat/sa30 | tail -n +2 | sed -e 's/;/,/g' >> sar-csv/sar-mem.csv CMD: sadf -d -- -B /var/log/sysstat/sa30 | tail -n +2 | sed -e 's/;/,/g' >> sar-csv/sar-paging.csv CMD: sadf -d -- -S /var/log/sysstat/sa30 | tail -n +2 | sed -e 's/;/,/g' >> sar-csv/sar-swap-utilization.csv CMD: sadf -d -- -W /var/log/sysstat/sa30 | tail -n +2 | sed -e 's/;/,/g' >> sar-csv/sar-swap-stats.csv CMD: sadf -d -- -n DEV /var/log/sysstat/sa30 | tail -n +2 | sed -e 's/;/,/g' >> sar-csv/sar-net-dev.csv CMD: sadf -d -- -n EDEV /var/log/sysstat/sa30 | tail -n +2 | sed -e 's/;/,/g' >> sar-csv/sar-net-ede.csv CMD: sadf -d -- -n NFS /var/log/sysstat/sa30 | tail -n +2 | sed -e 's/;/,/g' >> sar-csv/sar-net-nfs.csv CMD: sadf -d -- -n NFSD /var/log/sysstat/sa30 | tail -n +2 | sed -e 's/;/,/g' >> sar-csv/sar-net-nfsd.csv CMD: sadf -d -- -n SOCK /var/log/sysstat/sa30 | tail -n +2 | sed -e 's/;/,/g' >> sar-csv/sar-net-sock.csv CMD: sadf -d -- -n IP /var/log/sysstat/sa30 | tail -n +2 | sed -e 's/;/,/g' >> sar-csv/sar-net-ip.csv Requested activities not available in file /var/log/sysstat/sa30 CMD: sadf -d -- -n EIP /var/log/sysstat/sa30 | tail -n +2 | sed -e 's/;/,/g' >> sar-csv/sar-net-eip.csv Requested activities not available in file /var/log/sysstat/sa30 CMD: sadf -d -- -n ICMP /var/log/sysstat/sa30 | tail -n +2 | sed -e 's/;/,/g' >> sar-csv/sar-net-icmp.csv Requested activities not available in file /var/log/sysstat/sa30 ... CMD: sadf -d -- -v /var/log/sysstat/sa07 | tail -n +2 | sed -e 's/;/,/g' >> sar-csv/sar-kernel-fs.csv CMD: sadf -d -- -w /var/log/sysstat/sa07 | tail -n +2 | sed -e 's/;/,/g' >> sar-csv/sar-context.csv Processing complete files located in sar-csv
Note that for some commands the result is Requested activities not available. What this means is that we have asked for something sar may be capable of collecting, but in this case there is no data. It may be that sar has not been configured to collect that particular metric. These messages can be ignored unless, of course, it is for a metric you would like to see. (sar configuration is outside the scope of this article)
Let’s break down the following sadf command line
sadf -d -- -d -j ID -p /var/log/sysstat/sa05 | tail -n +2 | sed -e 's/;/,/g' >> sar-csv/sar-disk.csv
The sadf ‘-d’ option
This option tells sadf to output data in CSV format. As CSV is an acronym for Comma-Separated-Values, the use of a semi-colon as a delimiter may at first seem a little puzzling. However it makes sense to use something other than a comma, as it may be that a comma could appear in the output. If that were the case sar would need to then surround the data with quotes. While many programs can deal with data that may or may not be quoted, it is just simpler to use a semi-colon, which is unlikely to appear in data seen in sar.
The ‘–‘ option
This option doesn’t appear to be too well known. When used on the shell command line, — signifies the end of arguments for the program. A typical use might be if you were using grep to search for ‘-j’ in some text files. This following command fails due to grep interpreting the text we want to search as an argument. Since there is no ‘-j’ argument for grep, the command fails.
jkstill@poirot ~/pythian/blog/sar-tools-blog $ grep '-j' *.sh grep: invalid option -- 'j' Usage: grep [OPTION]... PATTERN [FILE]... Try 'grep --help' for more information.
Using ‘–‘ tells the shell to stop processing arguments for the current command, which is grep in this case, and so grep searches for ‘-j’ as intended.
jkstill@poirot ~/pythian/blog/sar-tools-blog $ grep -- '-j' *.sh # -j LABEL: use label for device if possible. eg. sentryoraredo01 rather than /dev/dm-3 #sarDestOptions=( '-d -j LABEL -p' '-b' '-q' '-u ALL' '-r' '-R' '-B' '-S' '-W' '-n DEV,EDEV,NFS,NFSD,SOCK,IP,EIP,ICMP,EICMP,TCP,ETCP,UDP' '-v' '-w') sarDestOptions=( '-d -j ID -p' '-b' '-q' '-u ALL' '-r' '-R' '-B' '-S' '-W' '-n DEV' '-n EDEV' '-n NFS' '-n NFSD' '-n SOCK' '-n IP' '-n EIP' '-n ICMP' '-n EICMP' '-n TCP' '-n ETCP' '-n UDP' '-v' '-w')
Why in this case are we using ‘–‘? It is because the other options following -d are not for sadf, but are sar options that sadf will use when calling sar.
The ‘-d’, ‘-j’ and ‘-p’ options
Be reading the sar man page you can learn what each of these are:
- -d block device activity
- -j display persistent device names – options are ID, LABEL, PATH and UUID
- -p pretty print device names
tail -n +2
Just get the rows after the header row. asp.sh calls sadf for each day’s sar data. The first step of creating the header has already been done.
sed
sed is being used to change semi-colons to commas. So far I have not seen any commas in sar data, and using a comma delimiter is just more convenient.
Charting in Excel with Perl
Having the sar data in the CSV format is convenient, as it will load directly into Excel or Google Sheets. What isn’t convenient is all the work required to load the data, format it, and then create multiple charts from the data. There could be a number of CSV files this needs to be done with.
Perl to the rescue! The Excel::Writer::XLSX Perl module can directly create formatted Excel files, complete with charts. I am going to make some basic use of this quite capable module via the dynachart.pl Perl script. This script reads CSV data and creates Excel files, optionally with line charts of selected columns.
If you would like to see some simple examples of using Perl to create Excel files, the documentation for the Perl Module Writer::Excel::XLSX contains several good examples, such as this one: Example-1
There are several others to be found in the documentation for Excel::Writer::XLSX
Following is the entire script. This script and others can be found at the csv-tools github repo.
#!/usr/bin/env perl # dynachart.pl # Jared Still 2017-07-23 # still@pythian.com jkstill@gmail.com # data is from STDIN use warnings; use strict; use Data::Dumper; use Pod::Usage; use Getopt::Long; use Excel::Writer::XLSX; my $debug = 0; my $combinedChart = 0; my %optctl = (); my ($help,$man); my @chartCols; my $categoryColName=''; my $categoryColNum=0; Getopt::Long::GetOptions( \%optctl, 'spreadsheet-file=s', 'debug!' => \$debug, 'chart-cols=s{1,10}' => \@chartCols, 'combined-chart!' => \$combinedChart, 'worksheet-col=s', # creates a separate worksheet per value of this column 'category-col=s' => \$categoryColName, 'h|help|?' => \$help, man => \$man ) or pod2usage(2) ; pod2usage(1) if $help; pod2usage(-verbose => 2) if $man; my $xlFile = defined($optctl{'spreadsheet-file'}) ? $optctl{'spreadsheet-file'} : 'asm-metrics.xlsx'; my $workSheetCol = defined($optctl{'worksheet-col'}) ? $optctl{'worksheet-col'} : 0; my %fonts = ( fixed => 'Courier New', fixed_bold => 'Courier New', text => 'Arial', text_bold => 'Arial', ); my %fontSizes = ( fixed => 10, fixed_bold => 10, text => 10, text_bold => 10, ); my $maxColWidth = 50; my $counter = 0; my $interval = 100; # create workbook my $workBook = Excel::Writer::XLSX->new($xlFile); die "Problems creating new Excel file $xlFile: $!\n" unless defined $workBook; # create formats my $stdFormat = $workBook->add_format(bold => 0,font => $fonts{fixed}, size => $fontSizes{fixed}, color => 'black'); my $boldFormat = $workBook->add_format(bold => 1,font => $fonts{fixed_bold}, size => $fontSizes{fixed_bold}, color => 'black'); my $wrapFormat = $workBook->add_format(bold => 0,font => $fonts{text}, size => $fontSizes{text}, color => 'black'); $wrapFormat->set_align('vjustify'); my $labels=<>; chomp $labels; # sadf starts header lines with '# ' - remove that $labels =~ s/^#\s+//; my @labels = split(/,\s*/,$labels); if ($debug) { print qq{LABELS:\n}; print join("\n",@labels); print "\n"; } # get the X series category if ( $categoryColName ) { my $want = $categoryColName; my $index = 0; ++$index until ($labels[$index] eq $want) or ($index > $#labels); $categoryColNum = $index; } #print join("\n",@labels); # get the element number of the column used to segregate into worksheets my $workSheetColPos; if ($workSheetCol) { my $i=0; foreach my $label ( @labels) { if ($label eq $workSheetCol) { $workSheetColPos = $i; last; } $i++; } } print "\nworkSheetColPos: $workSheetColPos\n" if $debug; # validate the columns to be charted # use as an index into the labels my @chartColPos=(); { my $i=0; foreach my $label ( @labels) { foreach my $chartCol ( @chartCols ) { if ($label eq $chartCol) { push @chartColPos, $i; last; } } $i++; } } if ($debug) { print "\nChartCols:\n", Dumper(\@chartCols); print "\nChartColPos:\n", Dumper(\@chartColPos); print "\nLabels:\n", Dumper(\@labels); } my %lineCount=(); my %workSheets=(); # the first worksheet is a directory my $directoryName='Directory'; my $directory; my $noDirectory=0; my $directoryLineCount=0; unless ($noDirectory) { $directory = $workBook->add_worksheet($directoryName) ; $directory->set_column(0,0,30); $directory->write_row($directoryLineCount++,0,['Directory'],$boldFormat); } while (<>) { chomp; my @data=split(/,/); my $currWorkSheetName; if ($workSheetCol) { $currWorkSheetName=$data[$workSheetColPos]; if (length($currWorkSheetName) > 31 ) { # cut some out of the middle of the name as the Excel worksheet name has max size of 31 $currWorkSheetName = substr($currWorkSheetName,0,14) . '...' . substr($currWorkSheetName,-14); } } else { $currWorkSheetName='DynaChart'; } print "Worksheet Name: $currWorkSheetName\n" if $debug; unless (defined $workSheets{$currWorkSheetName}) { $workSheets{$currWorkSheetName} = $workBook->add_worksheet($currWorkSheetName); $workSheets{$currWorkSheetName}->write_row($lineCount{$currWorkSheetName}++,0,\@labels,$boldFormat); # freeze pane at header $workSheets{$currWorkSheetName}->freeze_panes($lineCount{$currWorkSheetName},0); } # setup column widths #$workSheet->set_column($el,$el,$colWidth); $workSheets{$currWorkSheetName}->write_row($lineCount{$currWorkSheetName}++,0,\@data, $stdFormat); } if ($debug) { print "Worksheets:\n"; print "$_\n" foreach keys %workSheets; print Dumper(\%lineCount); } # each row consumes about 18 pixels my $rowHeight=18; # pixels my $chartHeight=23; # rows # triple from default width of 480 my $chartWidth = 480 * 3; =head1 Write the Charts The default mode is to create a separate chart for each metric By specifying the command line option --combined-chart the values will be charted in a single chart Doing so is probably useful only for a limited number of sets of values Some may question the apparent duplication of code in the sections to combine or not combine the charts Doing so would be a bit convoluted - this is easier to read and modify =cut foreach my $workSheet ( keys %workSheets ) { print "Charting worksheet: $workSheet\n" if $debug; my $chartNum = 0; if ($combinedChart) { my $chart = $workBook->add_chart( type => 'line', name => "Combined" . '-' . $workSheets{$workSheet}->get_name(), embedded => 1 ); $chart->set_size( width => $chartWidth, height => $chartHeight * $rowHeight); # each chart consumes about 16 rows $workSheets{$workSheet}->insert_chart((($chartNum * $chartHeight) + 2),3, $chart); foreach my $colPos ( @chartColPos ) { my $col2Chart=$labels[$colPos]; # [ sheet, row_start, row_end, col_start, col_end] $chart->add_series( name => $col2Chart, #categories => [$workSheet, 1,$lineCount{$workSheet},2,2], categories => [$workSheet, 1,$lineCount{$workSheet},$categoryColNum,$categoryColNum], values => [$workSheet, 1,$lineCount{$workSheet},$colPos,$colPos] ); } } else { foreach my $colPos ( @chartColPos ) { my $col2Chart=$labels[$colPos]; print "\tCharting column: $col2Chart\n" if $debug; my $chart = $workBook->add_chart( type => 'line', name => "$col2Chart" . '-' . $workSheets{$workSheet}->get_name(), embedded => 1 ); $chart->set_size( width => $chartWidth, height => $chartHeight * $rowHeight); # each chart consumes about 16 rows $workSheets{$workSheet}->insert_chart((($chartNum * $chartHeight) + 2),3, $chart); # [ sheet, row_start, row_end, col_start, col_end] $chart->add_series( name => $col2Chart, #categories => [$workSheet, 1,$lineCount{$workSheet},2,2], categories => [$workSheet, 1,$lineCount{$workSheet},$categoryColNum,$categoryColNum], values => [$workSheet, 1,$lineCount{$workSheet},$colPos,$colPos] ); $chartNum++; } } } # write the directory page my $urlFormat = $workBook->add_format( color => 'blue', underline => 1 ); my %sheetNames=(); foreach my $worksheet ( $workBook->sheets() ) { my $sheetName = $worksheet->get_name(); next if $sheetName eq $directoryName; $sheetNames{$sheetName} = $worksheet; } foreach my $sheetName ( sort keys %sheetNames ) { $directory->write_url($directoryLineCount++,0, qq{internal:'$sheetName'!A1} ,$urlFormat, $sheetName); } __END__ =head1 NAME dynachart.pl --help brief help message --man full documentation --spreadsheet-file output file name - defaults to asm-metrics.xlsx --worksheet-col name of column used to segragate data into worksheets defaults to a single worksheet if not supplied --chart-cols list of columns to chart dynachart.pl accepts input from STDIN This script will read CSV data created by asm-metrics-collector.pl or asm-metrics-aggregator.pl =head1 SYNOPSIS dynachart.pl [options] [file ...] Options: --help brief help message --man full documentation --spreadsheet-file output file name - defaults to asm-metrics.xlsx --worksheet-col name of column used to segragate data into worksheets defaults to a single worksheet if not supplied --chart-cols list of columns to chart --category-col specify the column for the X vector - a timestamp is typically used the name must exactly match that in the header --combined-chart create a single chart rather than a chart for each value specified in --chart-cols dynachart.pl accepts input from STDIN dynachart.pl --worksheet-col DISKGROUP_NAME < my_input_file.csv dynachart.pl --spreadsheet-file sar-disk-test.xlsx --combined-chart --worksheet-col DEV --category-col 'timestamp' --chart-cols 'rd_sec/s' --chart-cols 'wr_sec/s' < sar-disk-test.csv =head1 OPTIONS =over 8 =item B<-help> Print a brief help message and exits. =item B<-man> Prints the manual page and exits. =item B<--spreadsheet-file> The name of the Excel file to create. The default name is asm-metrics.xlsx =item B<--worksheet-col> By default a single worksheet is created. When this option is used the column supplied as an argument will be used to segragate data into separate worksheets. =item B<--chart-cols> List of columns to chart This should be the last option on the command line if used. It may be necessary to tell Getopt to stop processing arguments with '--' in some cases. eg. dynachart.pl dynachart.pl --worksheet-col DISKGROUP_NAME --chart-cols READS WRITES -- logs/asm-oravm-20150512_01-agg-dg.csv =item B<--category-col> Column to use as the category for the X line in the chart - default to the first column The name must exactly match a column from the CSV file Typically this line is a timestamp =back =head1 DESCRIPTION B<dynachart.pl> creates an excel file with charts for selected columns> =head1 EXAMPLES dynachart.pl accepts data from STDIN dynachart.pl --worksheet-col DISKGROUP_NAME --spreadsheet-file mywork.xlsx =cut
Creating Multiple Charts with Bash
A number of invocations of dynachart.pl can be found in sar-chart.sh as seen in the following example. This allows creating a number of Excel files from sar data for later analysis, all with one command.
#!/bin/bash usage() { cat <<-EOF usage: $0 destination-directory example script that charts CSV data that has been generated from sar Excel XLXS files with charts are produced applicable to any CSV data example: sar-chart.sh data-dir This script is using data generated by asp.sh EOF } while getopts h arg do case $arg in h) usage;exit 0;; *) usage;exit 1;; esac done destDir=$1 [ -d "$destDir" -a -w "$destDir" ] || { echo echo cannot read and/or write directory destDir: $destDir echo usage echo exit 1 } ### Template #echo working on sar- #dynachart.pl --spreadsheet-file ${destDir}/ --worksheet-col hostname --category-col 'timestamp' ################ # if dynachart.pl is in the path then use it. # if not then check for local directory copy or link # otherwise error exit dynaChart=$(which dynachart.pl) if [[ -z $dynaChart ]]; then if [[ -f ./dynachart.pl ]]; then dynaChart='./dynachart.pl' [ -x "$dynaChart" ] || { echo echo $dynaChart is not executable echo exit 2 } else echo echo "dynachart.pl not found" echo exit 1 fi fi # default of 1 chart per metric echo working on sar-disk-default.xlsx dynachart.pl --spreadsheet-file ${destDir}/sar-disk-default.xlsx --worksheet-col DEV --category-col 'timestamp' --chart-cols 'rd_sec/s' --chart-cols 'wr_sec/s' < sar-disk.csv # combine metrics into one chart echo working on sar-disk-combined.xlsx dynachart.pl --spreadsheet-file ${destDir}/sar-disk-combined.xlsx --combined-chart --worksheet-col DEV --category-col 'timestamp' --chart-cols 'rd_sec/s' --chart-cols 'wr_sec/s' < sar-disk.csv echo working on sar-network-device.xlsx dynachart.pl --spreadsheet-file ${destDir}/sar-network-device.xlsx --combined-chart --worksheet-col IFACE --category-col 'timestamp' --chart-cols 'rxkB/s' --chart-cols 'txkB/s' < sar-net-dev.csv echo working on sar-network-error-device.xlsx dynachart.pl --spreadsheet-file ${destDir}/sar-network-error-device.xlsx --combined-chart --worksheet-col IFACE --category-col 'timestamp' --chart-cols 'rxerr/s' --chart-cols 'txerr/s' < sar-net-ede.csv echo working on sar-network-nfs.xlsx dynachart.pl --spreadsheet-file ${destDir}/sar-network-nfs.xlsx --worksheet-col hostname --category-col 'timestamp' --chart-cols 'call/s' --chart-cols 'retrans/' --chart-cols 'read/s' --chart-cols 'write/s' --chart-cols 'access/s' --chart-cols 'getatt/s' < sar-net-nfs.csv echo working on sar-network-nfsd.xlsx dynachart.pl --spreadsheet-file ${destDir}/sar-network-nfsd.xlsx --worksheet-col hostname --category-col 'timestamp' --chart-cols 'scall/s' --chart-cols 'badcall/s' --chart-cols 'packet/s' --chart-cols 'udp/s' --chart-cols 'tcp/s' --chart-cols 'hit/s' --chart-cols 'miss/s' --chart-cols 'sread/s' --chart-cols 'swrite/s' --chart-cols 'saccess/s' --chart-cols 'sgetatt/s' < sar-net-nfsd.csv echo working on sar-network-socket.xlsx dynachart.pl --spreadsheet-file ${destDir}/sar-network-socket.xlsx --worksheet-col hostname --category-col 'timestamp' --chart-cols 'totsck' --chart-cols 'tcpsck' --chart-cols 'udpsck' --chart-cols 'rawsck' --chart-cols 'ip-frag' --chart-cols 'tcp-tw' < sar-net-sock.csv echo working on sar-context.xlsx dynachart.pl --spreadsheet-file ${destDir}/sar-context.xlsx --worksheet-col hostname --category-col 'timestamp' --chart-cols 'proc/s' --chart-cols 'cswch/s' < sar-context.csv echo working on sar-cpu.xlsx # extracted with -u ALL, so all CPU on one line dynachart.pl --spreadsheet-file ${destDir}/sar-cpu.xlsx --worksheet-col hostname --category-col 'timestamp' --chart-cols '%usr' --chart-cols '%nice' --chart-cols '%sys' --chart-cols '%iowait' --chart-cols '%steal' --chart-cols '%irq' --chart-cols '%soft' --chart-cols '%guest' --chart-cols '%idle' < sar-cpu.csv echo working on sar-io-default.xlsx dynachart.pl --spreadsheet-file ${destDir}/sar-io-default.xlsx --worksheet-col hostname --category-col 'timestamp' --chart-cols 'tps' --chart-cols 'rtps' --chart-cols 'wtps' --chart-cols 'bread/s' --chart-cols 'bwrtn/s' < sar-io.csv echo working on sar-io-tps-combined.xlsx dynachart.pl --spreadsheet-file ${destDir}/sar-io-tps-combined.xlsx --combined-chart --worksheet-col hostname --category-col 'timestamp' --chart-cols 'tps' --chart-cols 'rtps' --chart-cols 'wtps' < sar-io.csv echo working on sar-io-blks-per-second-combined.xlsx dynachart.pl --spreadsheet-file ${destDir}/sar-io-blks-per-second-combined.xlsx --combined-chart --worksheet-col hostname --category-col 'timestamp' --chart-cols 'bread/s' --chart-cols 'bwrtn/s' < sar-io.csv echo working on sar-load-runq-threads.xlsx dynachart.pl --spreadsheet-file ${destDir}/sar-load-runq-threads.xlsx --combined-chart --worksheet-col hostname --category-col 'timestamp' --chart-cols 'runq-sz' --chart-cols 'plist-sz' --chart-cols 'ldavg-1' --chart-cols 'ldavg-5' --chart-cols 'ldavg-15' < sar-load.csv echo working on sar-load-runq.xlsx dynachart.pl --spreadsheet-file ${destDir}/sar-load-runq.xlsx --combined-chart --worksheet-col hostname --category-col 'timestamp' --chart-cols 'runq-sz' --chart-cols 'ldavg-1' --chart-cols 'ldavg-5' --chart-cols 'ldavg-15' < sar-load.csv echo working on sar-memory.xlsx dynachart.pl --spreadsheet-file ${destDir}/sar-memory.xlsx --combined-chart --worksheet-col hostname --category-col 'timestamp' --chart-cols 'frmpg/s' --chart-cols 'bufpg/s' < sar-mem.csv echo working on sar-paging-rate.xlsx dynachart.pl --spreadsheet-file ${destDir}/sar-paging-rate.xlsx --combined-chart --worksheet-col hostname --category-col 'timestamp' --chart-cols 'pgpgin/s' --chart-cols 'pgpgout/s' < sar-paging.csv echo working on sar-swap-rate.xlsx dynachart.pl --spreadsheet-file ${destDir}/sar-swap-rate.xlsx --combined-chart --worksheet-col hostname --category-col 'timestamp' --chart-cols 'pswpin/s' --chart-cols 'pswpout/s' < sar-swap-stats.csv
Filter, rename and aggregate the data
Perhaps you want to see a subset of some data. Let’s say you wish too see the data for only a select group of devices; those making up the DATA diskgroup in an Oracle database. In addition you would like the sar report to use names that match those seen by ASM. Rather than /dev/mapper/mp1, mp2,…, you would like to see DATA00, DATA01,…
In addition you would like to see the IOPS, read rate and write rate data for a set of disks aggregated per timestamp.
We can do that. The script remap.sh has an example (real life BTW) of how to map the ASM disk names to the linux device names, and create a new CSV file.
This query was used to map device names to disk names:
select 'dm-name-' || substr(d.path,instr(d.path,'/',-1)+1,length(d.path) - instr(d.path,'/',-1)) path , g.name || lpad(d.disk_number,2,'0') name from v$asm_disk d join v$asm_diskgroup g on g.group_number = d.group_number order by 2 15:10:17 SYSDBA> / PATH NAME ------------------------------ -------------------- dm-name-mpathmp1 DATA00 dm-name-mpathnp1 DATA01 dm-name-mpathop1 DATA02 dm-name-mpathqp1 DATA03 dm-name-mpathrp1 DATA04 dm-name-mpathsp1 DATA05 dm-name-mpathtp1 DATA06 dm-name-mpathvp1 DATA07 dm-name-mpathwp1 DATA08 dm-name-mpathcp1 FRA00 10 rows selected.
This information was used to then create a sed command to rewrite the CSV data as needed.
originalFile=sar-disk-original.csv newFile=sar-disk.csv sed \ -e 's/dm-name-mpathmp1/DATA00/g' \ -e 's/dm-name-mpathnp1/DATA01/g' \ -e 's/dm-name-mpathop1/DATA02/g' \ -e 's/dm-name-mpathqp1/DATA03/g' \ -e 's/dm-name-mpathrp1/DATA04/g' \ -e 's/dm-name-mpathsp1/DATA05/g' \ -e 's/dm-name-mpathtp1/DATA06/g' \ -e 's/dm-name-mpathvp1/DATA07/g' \ -e 's/dm-name-mpathwp1/DATA08/g' \ -e 's/dm-name-mpathcp1/FRA00/g' \ < $originalFile > $newFile
Following that csv-aggregator.sh and csv-aggregator.pl were used to filter and aggregate the data. The result is a much smaller CSV file with data only for the DATA diskgroup, with the disk data rolled up to the disk group level.
Rather than post several hundred more lines of Perl here, I will recommend you to follow the links if you want to see how this aggregation and filtering is done. The command line though is shown here:
csvFile=sar-csv/sar-disk-test.csv ./csv-aggregator.pl --filter-cols DEV --filter-vals 'DATA..' --key-cols hostname --key-cols timestamp --grouping-cols DEV --agg-cols tps --agg-cols 'rd_sec/s' --agg-cols 'wr_sec/s' --agg-cols 'avgrq-sz' --agg-cols 'avgqu-sz' < $csvFile
This is a chart taken directly from the Excel file created from this CSV data via dynachart.pl.
Installing Excel::Writer::XLSX
So that the dynachart.pl script will work, the Excel::Writer::XLSX Perl module must be installed.
There are several possible methods by which this may be done, but if you are not already fairly familiar with the Perl environment some of the methods are a little daunting. As Oracle by default always installs Perl in ORACLE_HOME, you can make use of that existing installation. These instructions will install the Excel::Writer::XLSX module and it’s dependencies directly into the Oracle Perl installation. Because of that, you would want to do this in a test environment, such as Virtual Machine running on your laptop. It isn’t too likely that this would break something in the Oracle home, but then again, why take the chance?
An even better option would be to install your own Perl, and not worry about affecting any other software. This is not too difficult to do via Perlbrew. Check out https://perlbrew.pl/ for the simple instructions.
The following example installs Excel::Writer::XLSX in Oracle’s copy of Perl on a test server. For my own use, I do use Perlbrew, but am just including the following as an example
[oracle@oravm01 ~]$ $ORACLE_HOME/perl/bin/perl -MCPAN -e shell ... Would you like me to configure as much as possible automatically? [yes] yes ...
These next commands tell CPAN to automatically answer appropriately rather than prompt you to continually type ‘yes’.
This lasts the duration of this CPAN session. Following that is the command to install Excel::Writer::XLSX.
cpan> o conf prerequisites_policy 'follow' cpan> o conf build_requires_install_policy yes cpan> install Excel::Writer::XLSX ... cpan> exit Terminal does not support GetHistory. Lockfile removed.
Now test the installation. This command simply tells Perl the we require version 99 of the module. The failure is normal, as only version 0.95 is installed. A different message would appear if no version of the module was installed.
[oracle@oravm01 ~]$ $ORACLE_HOME/perl/bin/perl -MExcel::Writer::XLSX=99 Excel::Writer::XLSX version 99 required--this is only version 0.95 at /u01/app/oracle/product/11.2.0/db_1/perl/lib/5.10.0/Exporter/Heavy.pm line 122. BEGIN failed--compilation aborted.