Softpanorama
May the source be with you, but remember the KISS principle ;-)
The SAR suite of utilities originated in System V Unix and is now bundled with all major flavors of Unix. It is not enabled by default. To enable SAR, you must run some utilities at periodic intervals through the cron facility.
On Solaris 9 and 10 it is preinstalled, but you need to uncomment lines in the start script (/etc/rc2.d/S21perf) and the crontab file (/var/spool/cron/crontabs/sys) associated with the tool. On Linux you need the sysstat package, which contains the sar, sadf, iostat, mpstat, and pidstat commands. Red Hat has sysstat preinstalled and collects system activity data routinely, logging it in the /var/log/sa directory.
sar exists because gathering system activity data by hand from vmstat and iostat can be a time-consuming job. You need to automate the gathering of system activity data, and the tool to use is sar. sar stands for system activity reporter and is used to gather performance data from all the major components of the system. On Solaris, sar can be enabled by uncommenting the appropriate lines in the system crontab, as described later in this section.
sar is part of the system activity reporter package that is included in the Solaris 9 operating environment. This package consists of three commands that are involved in automatic system activity data collection: sadc, sa1, and sa2.
When run, sadc creates binary files in the /usr/adm/sa directory. Each file is named sa<dd>, where <dd> is the current date. The syntax for the sadc command is as follows:
/usr/lib/sa/sadc [t n] <outputfile>
t is the interval (in seconds) between samples, and n is the number of samples to take. The value of t should be greater than five seconds; otherwise, the activity of sadc itself might affect the sample. sadc then writes, in binary format, to the file named <outputfile>, or to standard output if a filename is not specified. If t and n are omitted, a special record is written once.
The sadc command should be run first at system boot time to record the statistics from when the counters are reset to zero. To make sure sadc is run at boot time, the /etc/init.d/perf file must contain a command line that writes a record to the daily data file.
The command entry has the following format:
su sys -c "/usr/lib/sa/sadc /usr/adm/sa/sa`date +%d`"
This entry is already in the /etc/init.d/perf file, but it needs to be uncommented.
Typically, the system administrator sets up sadc to run at boot time when the system enters multiuser mode and then periodically, usually once each hour. To generate periodic records, you need to run sadc regularly. The simplest way to do this is to put a line into the /var/spool/cron/crontabs/sys file that calls the shell script sa1. This script invokes sadc and writes to the daily data files, /var/adm/sa/sa<dd>. The sa1 script is installed with the other sar packages and has the following syntax:
/usr/lib/sa/sa1 [t n]
As with sadc, the arguments t and n cause records to be written n times at an interval of t seconds. If these arguments are omitted, the records are written only one time.
The sar command is used to report what has been collected in the daily activity files created by sadc. sadc creates binary files, and the only way to read this data is using the sar utility.
In addition, the sar command can be used to gather system activity data from the command line to look at performance either over different time slices or over a constricted period of time.
The syntax for the sar command is as follows:
sar [-aAbcdgkmpqruvwy] [-o <outputfile>] [t n ]
If no option is given, sar behaves as if it had been called with the -u option.
For example:
sar -A -o outfile 5 500
This will display all system activity data every 5 seconds and will gather 500 samples. The file named outfile is the output file where the data will get stored. The data is stored in binary format but can be read by using the -f option, as follows:
sar -f outfile
The following information is displayed:
SunOS ultra5 5.9 sun4u 05/14/2002
15:44:26 %usr %sys %wio %idle
15:44:31 1 2 0 97
15:44:36 1 2 0 96
15:44:41 9 2 9 80
15:44:46 2 3 0 95
15:44:51 0 1 0 99
15:44:56 3 2 0 96
15:45:01 0 0 0 100
15:45:06 15 0 0 85
15:45:11 2 0 0 97
15:45:16 3 2 0 95
15:45:21 1 1 0 98
15:45:26 2 2 0 97
15:45:31 0 0 0 100
15:45:36 0 0 0 100
15:45:41 8 1 0 92
15:45:46 1 4 0 95
15:45:51 2 3 0 96
15:45:56 2 0 0 98
15:46:01 0 0 0 100
15:46:06 15 0 0 85
Average 3 1 0 95
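A report like the one above gets long quickly, so it can help to filter it down to the busy samples. The following is a minimal sketch, not part of the sar package: busy_samples is a hypothetical helper name, and it assumes the five-column Solaris layout shown above (time, %usr, %sys, %wio, %idle).

```shell
# busy_samples THRESHOLD
# Reads Solaris-style `sar -u` text on stdin (columns: time %usr %sys %wio %idle)
# and prints the time and %idle of every sample whose %idle is below THRESHOLD.
# busy_samples is a hypothetical helper, not a command shipped with sar.
busy_samples() {
    awk -v limit="$1" '
        # Data rows start with an HH:MM:SS timestamp and have exactly 5 fields;
        # the header row is skipped because its second field is "%usr".
        $1 ~ /^[0-9][0-9]:[0-9][0-9]:[0-9][0-9]$/ && NF == 5 && $2 != "%usr" {
            if ($5 + 0 < limit + 0) print $1, $5
        }
    '
}
```

It would typically be fed from a saved file, for example sar -f outfile | busy_samples 90. Other sar options produce different columns, so the field positions would need adjusting for them.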
To start collecting system activity data with sar, you need to edit the /etc/init.d/perf file (uncommenting the sadc line described above) and create entries in the sys crontab file. The crontab entries look like this:
0 * * * 0-6 /usr/lib/sa/sa1
20,40 8-17 * * 1-5 /usr/lib/sa/sa1
5 18 * * 1-5 /usr/lib/sa/sa2 -s 8:00 -e 18:01 -i 1200 -A
The first entry writes a record to /var/adm/sa/sa<dd> on the hour, every hour, seven days a week.
The second entry writes a record to /var/adm/sa/sa<dd> twice each hour during peak working hours: at 20 minutes and 40 minutes past the hour, from 8 a.m. to 5 p.m., Monday through Friday.
Thus, these two crontab entries cause a record to be written to /var/adm/sa/sa<dd> every 20 minutes from 8 a.m. to 5 p.m., Monday through Friday, and every hour on the hour otherwise. You can change these defaults to meet your needs.
The third entry runs the shell script named sa2 at 6:05 p.m., Monday through Friday. The sa2 script writes reports from the binary data.
The shell script sa2, a variant of sar, writes a daily report in the file /var/adm/sa/sar<dd>. The report will summarize hourly activities for the day.
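To check how the two sa1 entries combine, the schedule can be written out as a small predicate. This is only a sketch of the cron fields "0 * * * 0-6" and "20,40 8-17 * * 1-5" as quoted above; sa1_scheduled is a hypothetical name, and the function does not read any real crontab.

```shell
# sa1_scheduled HOUR MINUTE DOW
# Returns success (0) if either of the two sa1 crontab entries would fire at
# the given time. DOW uses cron numbering: 0 = Sunday ... 6 = Saturday.
# Hypothetical helper: the schedule is hard-coded from the entries in the text.
sa1_scheduled() {
    hour=$1 minute=$2 dow=$3
    # Entry 1: "0 * * * 0-6" -> minute 0 of every hour, every day.
    if [ "$minute" -eq 0 ]; then
        return 0
    fi
    # Entry 2: "20,40 8-17 * * 1-5" -> minutes 20 and 40, hours 8..17, Mon-Fri.
    if { [ "$minute" -eq 20 ] || [ "$minute" -eq 40 ]; } &&
       [ "$hour" -ge 8 ] && [ "$hour" -le 17 ] &&
       [ "$dow" -ge 1 ] && [ "$dow" -le 5 ]; then
        return 0
    fi
    return 1
}
```

For instance, sa1_scheduled 9 20 2 succeeds (Tuesday, 9:20 a.m.), while sa1_scheduled 9 20 6 fails because the second entry does not run on Saturdays. Note that the hour range 8-17 also covers 17:20 and 17:40.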
The atsar command can be used to detect performance bottlenecks on Linux systems. It is similar to the sar command on other UNIX platforms. Atsar has the ability to show what is happening on the system at a given moment. It also keeps track of the past system load by maintaining history files from which information can be extracted. Statistics about the utilization of CPUs, disks and disk partitions, memory and swap, tty's, TCP/IP (v4/v6), NFS, and FTP/HTTP traffic are gathered. Most of the functionality of atsar has been incorporated in the atop project.
Author: Gerlof Langeveld
Sar P Plot is a simple application which takes the output of the atsar application and puts it into Gnuplot data files. It can be useful on server systems for performance analysis.
BSDsar generates a history of usage on a FreeBSD machine. It logs data such as CPU usage, disk activity, network bandwidth usage and activity, NFS information, memory, and swap. It is similar to atsar (for Linux) and sar (for Solaris).
2.5.4.3. The sadc command
As stated earlier, the sadc command collects system utilization data and writes it to a file for later analysis. By default, the data is written to files in the /var/log/sa/ directory. The files are named sa<dd>, where <dd> is the current day's two-digit date.
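Because the data files follow the sa<dd> naming pattern, the file for a given day of the month can be located mechanically. The sketch below assumes the /var/log/sa default named above; sa_data_file is a hypothetical helper name, and the optional second argument exists only so other directories (such as /var/adm/sa on Solaris) can be substituted.

```shell
# sa_data_file DAY [DIR]
# Prints the path of the sadc data file for the given day of the month.
# DIR defaults to /var/log/sa, the Red Hat location named in the text.
# Hypothetical helper: it only builds the name and does not check the file.
sa_data_file() {
    day=${1#0}                 # drop one leading zero so "08" is handled as 8
    dir=${2:-/var/log/sa}
    printf '%s/sa%02d\n' "$dir" "$day"
}
```

A past day's data could then be read with something like sar -f "$(sa_data_file 17)".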
sadc is normally run by the sa1 script. This script is periodically invoked by cron via the file sysstat, which is located in /etc/cron.d. The sa1 script invokes sadc for a single one-second measuring interval. By default, cron runs sa1 every 10 minutes, adding the data collected during each interval to the current /var/log/sa/sa<dd> file.
2.5.4.4. The sar command
The sar command produces system utilization reports based on the data collected by sadc. As configured in Red Hat Linux, sar is run automatically to process the files collected by sadc. The report files are written to /var/log/sa/ and are named sar<dd>, where <dd> is the two-digit representation of the previous day's date.
sar is normally run by the sa2 script. This script is periodically invoked by cron via the file sysstat, which is located in /etc/cron.d. By default, cron runs sa2 once a day at 23:53, allowing it to produce a report for the entire day's data.
2.5.4.4.1. Reading sar Reports
The format of a sar report produced by the default Red Hat Linux configuration consists of multiple sections, with each section containing a specific type of data, ordered by the time of day that the data was collected. Since sadc is configured to perform a one-second measurement interval every ten minutes, the default sar reports contain data in ten-minute increments, from 00:00 to 23:50[2].
Each section of the report starts with a heading that illustrates the data contained in the section. The heading is repeated at regular intervals throughout the section, making it easier to interpret the data while paging through the report. Each section ends with a line containing the average of the data reported in that section.
Here is a sample section of a sar report, with the data from 00:30 through 23:40 removed to save space:
00:00:01 CPU %user %nice %system %idle
00:10:00 all 6.39 1.96 0.66 90.98
00:20:01 all 1.61 3.16 1.09 94.14
…
23:50:01 all 44.07 0.02 0.77 55.14
Average: all 5.80 4.99 2.87 86.34
In this section, CPU utilization information is displayed. This is very similar to the data displayed by iostat.
Other sections may have more than one line's worth of data per time, as shown by this section generated from CPU utilization data collected on a dual-processor system:
00:00:01 CPU %user %nice %system %idle
00:10:00 0 4.19 1.75 0.70 93.37
00:10:00 1 8.59 2.18 0.63 88.60
00:20:01 0 1.87 3.21 1.14 93.78
00:20:01 1 1.35 3.12 1.04 94.49
…
23:50:01 0 42.84 0.03 0.80 56.33
23:50:01 1 45.29 0.01 0.74 53.95
Average: 0 6.00 5.01 2.74 86.25
Average: 1 5.61 4.97 2.99 86.43
There are a total of seventeen different sections present in reports generated by the default Red Hat Linux sar configuration; many are discussed in upcoming chapters. For more information about the data contained in each section, see the sar(1) man page.
Notes
[1] Device major numbers can be found by using ls -l to display the desired device file in /dev/. Here is sample output from ls -l /dev/hda:
brw-rw---- 1 root disk 3, 0 Aug 30 19:31 /dev/hda
The major number in this example is 3, and appears between the file's group and its minor number.
[2] Due to changing system loads, the actual time that the data was collected may vary by a second or two.
An underused tool for looking into system performance, the sar command samples system activity counters available in the Unix kernel and prepares reports. Like most tools for measuring performance, sar provides a lot of data but little analysis, which probably explains why it doesn't get much more of a workout. It's up to the user to interpret the numbers and determine how a system is performing (or what is slowing it down).
Some companies bridge the gap between an excessive amount of available data and the bottom line system performance by creating or employing evaluation tools for the raw numbers and preparing a report that provides conclusions, not just numbers. SarCheck (a tool available from Aptitune Corporation) is one such tool. It provides some of the performance insights that might otherwise only be available to those staffs blessed by the presence of a performance specialist.
The sar command can be thought of as running in two modes: interactive ("real-time") mode, which reports on the system's current activity, and "historical" mode, which uses data previously collected and stored in log files. In both cases, the reports reflect data that is routinely collected in the kernel but, in the latter case, this data is sampled and stored so that past performance can be analyzed.
sar is not strictly a Solaris tool, either. It's available in other flavors of Unix as well, though configuration and default behavior may vary between implementations. RedHat Linux systems collect system activity data routinely and save it in files in the /var/log/sa directory. Solaris systems come prepared for running sar in either mode, but collection of data in performance logs must be specifically invoked by un-commenting lines in the start script (/etc/rc2.d/S21perf) and crontab file (/var/spool/cron/crontabs/sys) associated with the tool.
The Solaris package containing the sar commands is called SUNWaccu. The interactive and historical versions of the sar command differ only in where the data is coming from -- from the kernel moment by moment or from one of the log files containing previously collected performance data.
A common task for System Administrators is to monitor and care for a server. That's fairly easy to do at a moment's notice, but how to keep a record of this information over time? One way to monitor your server is to use the Sysstat package.
Sysstat is actually a collection of utilities designed to collect information about the performance of a linux installation, and record them over time.
It's fairly easy to install too, since it is included as a package on many distributions.
To install on CentOS 4.3, just type the following:
yum install sysstat
We now have the sysstat scripts installed on the system. Let's try the sar command.
sar
Linux 2.6.16-xen (xen30) 08/17/2006
11:00:02 AM CPU %user %nice %system %iowait %idle
11:10:01 AM all 0.00 0.00 0.00 0.00 99.99
Average: all 0.00 0.00 0.00 0.00 99.99
Several bits of information, such as the Linux kernel, hostname, and date, are reported.
More importantly, the various ways CPU time is being spent on the system are shown.
- %user, %nice, %system, %iowait, and %idle describe ways that the CPU may be utilized.
- %user and %nice refer to your software programs, such as MySQL or Apache.
- %system refers to the kernel's internal workings.
- %iowait is time spent waiting for Input/Output, such as a disk read or write.
Finally, since the kernel accounts for 100% of the runnable time it can schedule, any unused time goes into %idle.
The information above is shown for a 1 second interval. How can we keep track of that information over time?
If our system was consistently running heavy in %iowait, we might surmise that a disk was getting overloaded, or going bad.
At least, we would know to investigate.
So how do we track the information over time? We can schedule sar to run at regular intervals, say, every 10 minutes.
We then direct it to send the output to sysstat's special log files for later reports.
The way to do this is with the cron daemon.
By creating a file called sysstat in /etc/cron.d, we can tell cron to run sar every day.
Fortunately, the sysstat package that yum installed already did this step for us.
more /etc/cron.d/sysstat
# run system activity accounting tool every 10 minutes
*/10 * * * * root /usr/lib/sa/sa1 1 1
# generate a daily summary of process accounting at 23:53
53 23 * * * root /usr/lib/sa/sa2 -A
The sa1 script logs sar output into sysstat's binary log file format, and sa2 reports it back in human readable format. The report is written to a file in /var/log/sa.
ls /var/log/sa
sa17 sar17
sa17 is the binary sysstat log, sar17 is the report. (Today's date is the 17th.)
There is quite a lot of information contained in the sar report, but there are a few values that can tell us how busy the server is.
Values to watch are swap usage, disk IO wait, and the run queue. These can be obtained by running sar manually, which will report on those values.
sar
Linux 2.6.16-xen (xen30) 08/17/2006
11:00:02 AM CPU %user %nice %system %iowait %idle
11:10:01 AM all 0.00 0.00 0.00 0.00 99.99
11:20:01 AM all 0.00 0.00 0.00 0.00 100.00
11:30:02 AM all 0.01 0.26 0.19 1.85 97.68
11:39:20 AM all 0.00 2.41 2.77 0.53 94.28
11:40:01 AM all 1.42 0.00 0.18 3.24 95.15
Average: all 0.03 0.62 0.69 0.64 98.02
There were a few moments where disk activity was high in the %iowait column, but it didn't stay that way for too long. An average of 0.64 is pretty good.
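When only one value such as %iowait matters, it can be pulled out of the report by header name instead of by eye. The following is a rough sketch under the assumption that the report has a single header row like the one above; sar_column is a hypothetical helper name, not part of sysstat.

```shell
# sar_column NAME
# Reads sar text output on stdin and prints "timestamp value" for the column
# whose header is NAME (for example %iowait). The column position is taken
# from the header row, so the filter does not depend on a fixed column order.
# Hypothetical helper; the AM/PM marker is not carried into the output.
sar_column() {
    awk -v name="$1" '
        # Until the header row is seen, look for the requested column name.
        col == 0 {
            for (i = 1; i <= NF; i++) if ($i == name) col = i
            next
        }
        # Data rows: skip the Average line and any repeated header rows.
        $1 != "Average:" && NF >= col && $col != name { print $1, $col }
    '
}
```

For example, sar | sar_column %iowait lists the I/O wait percentage for each interval, which makes a sustained rise easier to spot than in the full table.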
How about my swap usage? Am I running out of RAM? Being swapped out is normal for the Linux kernel, which will swap from time to time. Constant swapping is bad, and generally means you need more RAM.
sar -W
Linux 2.6.16-xen (xen30) 08/17/2006
11:00:02 AM pswpin/s pswpout/s
11:10:01 AM 0.00 0.00
11:20:01 AM 0.00 0.00
11:30:02 AM 0.00 0.00
11:39:20 AM 0.00 0.00
11:40:01 AM 0.00 0.00
11:50:01 AM 0.00 0.00
Average: 0.00 0.00
Nope, we are looking good. No persistent swapping has taken place.
How about system load? Are my processes waiting too long to run on the CPU?
sar -q
Linux 2.6.16-xen (xen30) 08/17/2006
11:00:02 AM runq-sz plist-sz ldavg-1 ldavg-5 ldavg-15
11:10:01 AM 0 47 0.00 0.00 0.00
11:20:01 AM 0 47 0.00 0.00 0.00
11:30:02 AM 0 47 0.28 0.21 0.08
11:39:20 AM 0 45 0.01 0.24 0.17
11:40:01 AM 0 46 0.07 0.22 0.17
11:50:01 AM 0 46 0.00 0.02 0.07
Average: 0 46 0.06 0.12 0.08
No, an average load of .06 is really good. Notice that there is a 1, 5, and 15 minute interval on the right.
Having the three time intervals gives you a feel for how much load the system is carrying.
A 3 or 4 in the 1 minute average is ok, but the same number in the 15 minute column may indicate that work is not clearing out, and that a closer look is warranted.
This was a short look at the Sysstat package.
We only looked at the output of three of sar's attributes, but there are others.
Now, armed with sar in your toolbox, your system administration job just became a little easier.
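The rule of thumb about the 15 minute load average can be turned into a quick filter over sar -q output. This is a sketch only: sustained_load is a hypothetical helper name, the limit is whatever the administrator considers too high (the text suggests 3 or 4), and the column is located by its ldavg-15 header as in the listing above.

```shell
# sustained_load LIMIT
# Reads `sar -q` text on stdin and prints the samples whose 15-minute load
# average (ldavg-15) is at or above LIMIT.
# Hypothetical helper; LIMIT is a judgment call, not a value defined by sar.
sustained_load() {
    awk -v limit="$1" '
        # Until the header row is seen, look for the ldavg-15 column.
        col == 0 {
            for (i = 1; i <= NF; i++) if ($i == "ldavg-15") col = i
            next
        }
        # Data rows: skip Average and repeated headers, compare numerically.
        $1 != "Average:" && NF >= col && $col != "ldavg-15" && $col + 0 >= limit + 0 {
            print $1, $col
        }
    '
}
```

Running sar -q | sustained_load 3 on a healthy system should print nothing; any lines that do appear mark intervals worth a closer look.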
Using sar
The next command, sar, is the UNIX System Activity Reporting tool (part of the bos.acct fileset). It has been around for what seems like forever in the UNIX world. This command essentially writes to standard output the cumulative activity counters selected by its flags. For example, the following command using the -u flag reports CPU statistics. As with vmstat, if you are using shared partitioning in a virtualized environment, it reports back two additional columns of information, physc and entc, which give the number of physical processors consumed by the partitions as well as the percentage of entitled capacity utilized.
I ran this command on the system (see Listing 3) when there were no users around. Unless there were some batch jobs running, I would not expect to see a lot of activity.
Listing 3. Running sar with no users around
# sar -u 1 5 (or sar 1 5)
AIX test01 3 5 03/18/07
System configuration: lcpu=2
17:36:53 %usr %sys %wio %idle physc
17:36:54 0 0 0 100 2.00
17:36:55 1 0 0 99 2.00
17:36:56 0 0 0 100 2.00
17:36:57 0 0 0 100 2.00
17:36:58 0 0 0 100 2.00
Average 0 0 0 100 2.00
Clearly, this system also shows no CPU bottleneck to speak of.
The columns used above are similar to vmstat entry outputs. The following table correlates sar and vmstat descriptives (see Table 1).
Table 1. sar output fields and the corresponding vmstat field
sar vmstat
%usr us
%sys sy
%wio wa
%idle id
One of the reasons I prefer vmstat to sar is that it gives you the CPU utilization information, and it provides overall monitoring information on memory and I/O. With sar, you need to run separate commands to pull the information. One advantage that sar gives you is the ability to capture daily information and to run reports on this information (without writing your own script to do so). It does this by using a process called the System Activity Data Collector, which is essentially a back-end to the sar command. When enabled, usually through cron (on a default AIX partition, you would usually find it commented out), it collects data periodically in binary format.
Solaris ps presents per-process stats.
Solaris prstat presents thread-level microstate accounting (with high-resolution time stamps) and per-project stats for resource management.
sar -v [entries/size for each table, evaluated once at sampling point, not available in SiteScope]:
proc-sz = number of process entries (proc structures) that are
currently being used, or allocated in the kernel.
inod-sz = total number of inodes in memory versus the maximum number
of inodes that are allocated in the kernel. This number is not a strict
high water mark. The number can overflow.
file-sz = size of the open system file table. The sz is given as 0, since
space is allocated dynamically for the file table.
ov = overflows that occur between sampling points for each table.
lock-sz = number of shared memory record table entries currently
being used or allocated in the kernel. The sz is given as 0 because space
is allocated dynamically for the shared memory record table.
sar -c [System calls]:
sar -m [Message and semaphore activities (for Interprocess Communication)]:
sar -t [translation lookaside buffer (TLB) activities, not available in SiteScope]:
sar -I [interrupt statistics, not available in SiteScope]:
sar -a [File access system routines]:
Extracting useful information
Data is being collected, but it must be queried to be useful. Running the sar command without options generates basic statistics about CPU usage for the current day. Listing 2 shows the output of sar without any parameters. (You might see different column names depending on the platform. In some UNIX flavors, sadc collects more or less data based on what's available.) The examples here are from Sun Solaris 10; whatever platform you're using will be similar, but might have slightly different column names.
Listing 2. Default output of sar (showing CPU usage)
-bash-3.00$ sar
SunOS unknown 5.10 Generic_118822-23 sun4u 01/20/2006
00:00:01 %usr %sys %wio %idle
00:10:00 0 0 0 100
. cut ...
09:30:00 4 47 0 49
Average 0 1 0 98
Each line in the output of sar is a single measurement, with the timestamp in the left-most column. The other columns hold the data. (These columns vary depending on the command-line arguments you use.) In Listing 2, the CPU usage is broken into four categories:
- %usr: The percentage of time the CPU is spending on user processes, such as applications, shell scripts, or interacting with the user.
- %sys: The percentage of time the CPU is spending executing kernel tasks. In this example, the number is high, because I was pulling data from the kernel's random number generator.
- %wio: The percentage of time the CPU is waiting for input or output from a block device, such as a disk.
- %idle: The percentage of time the CPU isn't doing anything useful.
The last line is an average of all the datapoints. However, because most systems experience busy periods followed by idle periods, the average doesn't tell the entire story.
Disk activity is also monitored. High disk usage means that there will be a greater chance that an application requesting data from disk will block (pause) until the disk is ready for that process. The solution typically involves splitting file systems across disks or arrays; however, the first step is to know that you have a problem.
The output of sar -d shows various disk-related statistics for one measurement period. For the sake of brevity, Listing 3 shows only hard disk drive activity.
Listing 3. Output of sar -d (showing disk activity)
$ sar -d
SunOS unknown 5.10 Generic_118822-23 sun4u 01/22/2006
00:00:01 device %busy avque r+w/s blks/s avwait avserv
. cut ...
14:00:02 dad0 31 0.6 78 16102 1.9 5.3
dad0,c 0 0.0 0 0 0.0 0.0
dad0,h 31 0.6 78 16102 1.9 5.3
dad1 0 0.0 0 1 1.6 1.3
dad1,a 0 0.0 0 1 1.6 1.3
dad1,b 0 0.0 0 0 0.0 0.0
dad1,c 0 0.0 0 0 0.0 0.0
As in the previous example, the time is along the left. The other columns are as follows:
- device: This is the disk, or disk partition, being measured. In Sun Solaris, you must translate this disk into a physical disk by looking up the reported name in /etc/path_to_inst, and then cross-reference that information to the entries in /dev/dsk. In Linux®, the major and minor numbers of the disk device are used.
- %busy: This is the percentage of time the device is being read from or written to.
- avque: This is the average depth of the queue that is used to serialize disk activity. The higher the avque value, the more blocking is occurring.
- r+w/s, blks/s: This is disk activity per second in terms of read or write operations and disk blocks, respectively.
- avwait: This is the average time (in milliseconds) that a disk read or write operation waits before it is performed.
- avserv: This is the average time (in milliseconds) that a disk read or write operation takes to execute.
Some of these numbers, such as avwait and avserv values, correlate directly into user experience. High wait times on the disk likely point to several people contending for the disk, which should be confirmed with high avque numbers. High avserv values point to slow disks.
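Spotting the busy devices in a long sar -d report is easier with a filter. The sketch below assumes the Solaris layout from Listing 3, where only the first device row of each sample carries a timestamp, and it assumes that an averaged block at the end of the report begins with the word Average; busy_disks is a hypothetical helper name.

```shell
# busy_disks LIMIT
# Reads Solaris-style `sar -d` text on stdin and prints device, %busy, avque,
# avwait and avserv for every device whose %busy is at or above LIMIT.
# Hypothetical helper; the row layout is assumed from Listing 3.
busy_disks() {
    awk -v limit="$1" '
        NF != 7 && NF != 8 { next }      # not a device row (banner, cut marker)
        # Rows with 8 fields start with a timestamp (or Average); rows with
        # 7 fields are continuation rows that belong to the same sample.
        NF == 8 { off = 1; avg = ($1 == "Average") }
        NF == 7 { off = 0 }
        # Skip the averaged block and the header row ("device" column title).
        avg || $(1 + off) == "device" { next }
        $(2 + off) + 0 >= limit + 0 {
            print $(1 + off), $(2 + off), $(3 + off), $(6 + off), $(7 + off)
        }
    '
}
```

For example, sar -d | busy_disks 30 would list only the devices that were busy at least 30 percent of the time, together with the queue depth and the wait and service times needed to judge whether the disk is contended or simply slow.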
Other metrics
Many other items are collected, with corresponding arguments to view them:
- The -b argument shows information on buffers and the efficiency of using a buffer versus having to go to disk.
- The -c argument shows system calls broken down into some of the popular calls, such as fork(), exec(), read(), and write(). High process creation can lead to poor performance and is a sign that you might need to move some applications to another computer.
- The -g, -p, and -w arguments show paging (swapping) activity. High paging is a sign of memory starvation. In particular, the -w argument shows the number of process switches: A high number can mean too many things are running on the computer, which is spending more time switching than working.
- The -q argument shows the size of the run queue, which is the same as the load average for the time.
- The -r argument shows free memory and swap space over time.
Each UNIX flavor implements its own set of measurements and command-line arguments for sar. Those I've shown are common and represent the elements that I find more useful.
ITworld.com - UNIX SYSTEM ADMINISTRATION - Introducing SAR
[PDF] Solaris Performance monitoring - sar - NCCCS Systems Office Wiki
freshmeat.net Project details for BSDsar -- BSDsar generates a history of usage on a FreeBSD machine. It logs data such as CPU usage, disk activity, network bandwidth usage and activity, NFS information, memory, and swap. It is similar to atsar (for Linux) and sar (for Solaris).
The word "sar" is used to refer to two related items:
- The system activity report package
- The system activity reporter
System Activity Report Package
This facility stores a great deal of performance data about a system. This information is invaluable when attempting to identify the source of a performance problem.
The Report Package can be enabled by uncommenting the appropriate lines in the sys crontab. The sa1 program stores performance data in the /var/adm/sa directory. sa2 writes reports from this data, and sadc is a more general version of sa1.
In practice, I do not find that the sa2-produced reports are terribly useful in most cases. Depending on the issue being examined, it may be sufficient to run sa1 at intervals that can be set in the sys crontab.
Alternatively, sar can be used on the command line to look at performance over different time slices or over a constricted period of time:
sar -A -o outfile 5 2000
(Here, "5" represents the time slice and "2000" represents the number of samples to be taken. "outfile" is the output file where the data will be stored.)
The data from this file can be read by using the "-f" option (see below).
System Activity Reporter
sar has several options that allow it to process the data collected by sa1 in different ways:
- -a: Reports file system access statistics. Can be used to look at issues related to the DNLC.
- iget/s: Rate of requests for inodes not in the DNLC. An iget will be issued for each path component of the file's path.
- namei/s: Rate of file system path searches. (If the directory name is not in the DNLC, iget calls are made.)
- dirbk/s: Rate of directory block reads.
- -A: Reports all data.
- -b: Buffer activity reporter:
- bread/s, bwrit/s: Transfer rates (per second) between system buffers and block devices (such as disks).
- lread/s, lwrit/s: System buffer access rates (per second).
- %rcache, %wcache: Cache hit rates (%).
- pread/s, pwrit/s: Transfer rates between system buffers and character devices.
- -c: System call reporter:
- scall/s: System call rate (per second).
- sread/s, swrit/s, fork/s, exec/s: Call rate for these calls (per second).
- rchar/s, wchar/s: Transfer rate (characters per second).
- -d: Disk activity (actually, block device activity):
- %busy: % of time servicing a transfer request.
- avque: Average number of outstanding requests.
- r+w/s: Rate of reads+writes (transfers per second).
- blks/s: Rate of 512-byte blocks transferred (per second).
- avwait: Average wait time (ms).
- avserv: Average service time (ms). (For block devices, this includes seek, rotation, and data transfer times. Note that the iostat svc_t is equivalent to avwait+avserv.)
- -e HH:MM: CPU usage up to time specified.
- -f filename: Use filename as the source for the binary sar data. The default is to use today's file from /var/adm/sa.
- -g: Paging activity (see "Paging" for more details):
- pgout/s: Page-outs (requests per second).
- ppgout/s: Page-outs (pages per second).
- pgfree/s: Pages freed by the page scanner (pages per second).
- pgscan/s: Scan rate (pages per second).
- %ufs_ipf: Percentage of UFS inodes removed from the free list while still pointing at reusable memory pages. This is the same as the percentage of igets that force page flushes.
- -i sec: Set the data collection interval to sec seconds.
- -k: Kernel memory allocation:
- sml_mem: Amount of virtual memory available for the small pool (bytes). (Small requests are less than 256 bytes)
- lg_mem: Amount of virtual memory available for the large pool (bytes). (512 bytes-4 Kb)
- ovsz_alloc: Memory allocated to oversize requests (bytes). Oversize requests are dynamically allocated, so there is no pool. (Oversize requests are larger than 4 Kb)
- alloc: Amount of memory allocated to a pool (bytes). The total KMA usage is the sum of these columns.
- fail: Number of requests that failed.
- -m: Message and semaphore activities.
- msg/s, sema/s: Message and semaphore statistics (operations per second).
- -o filename: Saves output to filename.
- -p: Paging activities.
- atch/s: Attaches (per second). (This is the number of page faults that are filled by reclaiming a page already in memory.)
- pgin/s: Page-in requests (per second) to file systems.
- ppgin/s: Page-ins (per second). (Multiple pages may be affected by a single request.)
- pflt/s: Page faults from protection errors (per second).
- vflt/s: Address translation page faults (per second). (This happens when a valid page is not in memory. It is comparable to the vmstat-reported page/mf value.)
- slock/s: Faults caused by software lock requests that require physical I/O (per second).
- -q: Run queue length and percentage of the time that the run queue is occupied.
- -r: Unused memory pages and disk blocks.
- freemem: Pages available for use. (Use pagesize to determine the size of the pages.)
- freeswap: Disk blocks available in swap (512-byte blocks).
- -s time: Start looking at data from time onward.
- -u: CPU utilization.
- %usr: User time.
- %sys: System time.
- %wio: Waiting for I/O (does not include time when another process could be scheduled to the CPU).
- %idle: Idle time.
- -v: Status of process, inode, file tables.
- proc-sz: Number of process entries (proc structures) currently in use, compared with
max_nprocs.
- inod-sz: Number of inodes in memory compared with the number currently allocated in the kernel.
- file-sz: Number of entries in and size of the open file table in the kernel.
- lock-sz: Shared memory record table entries currently used/allocated in the kernel. This size is reported as 0 for standards compliance (space is allocated dynamically for this purpose).
- ov: Overflows between sampling points.
- -w: System swapping and switching activity.
- swpin/s, swpot/s, bswin/s, bswot/s: Number of LWP transfers or 512-byte blocks per second.
- pswch/s: Process switches (per second).
- -y: TTY device activity.
- rawch/s, canch/s, outch/s: Input character rate, character rate processed by canonical queue, output character rate.
- rcvin/s, xmtin/s, mdmin/s: Receive, transmit and modem interrupt rates.
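The -r counters listed above are reported in pages (freemem) and in 512-byte blocks (freeswap), so they usually need converting before they can be compared with other figures. A minimal sketch of that arithmetic, assuming a POSIX shell; the function names are hypothetical, and the 8192-byte page size in the usage line is only an example value (run pagesize on the target machine to get the real one):

```shell
#!/bin/sh
# Convert sar -r counters to megabytes (integer division, rounds down).
# Function names are illustrative; they are not part of the sar package.

# pages_to_mb PAGES PAGE_SIZE_BYTES
pages_to_mb() {
    echo $(( $1 * $2 / 1048576 ))
}

# blocks_to_mb BLOCKS   (freeswap is counted in 512-byte blocks)
blocks_to_mb() {
    echo $(( $1 * 512 / 1048576 ))
}

pages_to_mb 12800 8192     # 12800 pages of 8192 bytes: prints 100
blocks_to_mb 2097152       # 2097152 blocks of 512 bytes: prints 1024
```

On a live system the second argument to pages_to_mb could be taken from the pagesize command instead of being typed in.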
Gathering system activity data from vmstat and iostat can be a time-consuming job. You need to automate the gathering of system activity data, and the tool to use is sar. sar refers to the system activity reporter and is used to gather performance data from all the major components of the system. sar comes standard on most UNIX systems, including Solaris. sar can be enabled by uncommenting the appropriate lines in the system crontab and is described later in this section.
sar is part of the system activity reporter package that is included in the Solaris 9 operating environment. This package consists of three commands that are involved in automatic system activity data collection: sadc, sa1, and sa2.
sadc, a utility that also is part of the system activity reporter, collects system activity data and saves it in a binary format—one file for each 24-hour period. This data includes information on CPU utilization, buffer usage, disk and tape I/O activity, TTY device activity, switching and system-call activity, file access, queue activity, interprocess communications, and paging. The information saved is very much like the information you see displayed with the vmstat and iostat commands.
When run, sadc creates binary files in the /usr/adm/sa directory. Each file is named sa<dd>, where <dd> is the current date. The syntax for the sadc command is as follows:
/usr/lib/sa/sadc [t n] <outputfile>
t is the interval (seconds) between samples, and n is the number of samples to take. The value of t should be greater than five seconds; otherwise, the activity of sadc itself might affect the sample. sadc then writes, in binary format, to the file named <outputfile> or to standard output if a filename is not specified. If t and n are omitted, a special record is written once.
The sadc command should be run first at system boot time to record the statistics from when the counters are reset to zero. To make sure sadc is run at boot time, the /etc/init.d/perf file must contain a command line that writes a record to the daily data file.
The command entry has the following format:
su sys -c "/usr/lib/sa/sadc /usr/adm/sa/sa`date +%d`"
This entry is already in the /etc/init.d/perf file, but it needs to be uncommented.
Typically, the system administrator sets up sadc to run at boot time when the system enters multiuser mode and then periodically, usually once each hour. To generate periodic records, you need to run sadc regularly. The simplest way to do this is by putting a line into the /var/spool/cron/crontabs/sys file that calls the shell script sa1. This script invokes sadc and writes to the daily data files, /var/adm/sa/sa<dd>. The sa1 script gets installed with the other sar packages and has the following syntax:
/usr/lib/sa/sa1 [t n]
As with sadc, the arguments t and n cause records to be written n times at an interval of t seconds. If these arguments are omitted, the records are written only one time.
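The relationship between sa1, sadc, and the daily file can be sketched as a small wrapper. This is a simplified illustration, not the script shipped with Solaris: sa1_command is a hypothetical name, it only prints the sadc command line instead of running it, and the "1 1" default used when no arguments are given is an assumption of this sketch.

```shell
#!/bin/sh
# Simplified sketch of an sa1-style wrapper (dry run: prints, does not execute).
# SADC and SA_DIR default to the paths named in the text and can be overridden.
SADC=${SADC:-/usr/lib/sa/sadc}
SA_DIR=${SA_DIR:-/var/adm/sa}

# sa1_command [t n]
# Print the sadc command line that would append to today's sa<dd> file.
sa1_command() {
    dfile="$SA_DIR/sa$(date +%d)"
    if [ $# -eq 0 ]; then
        # Assumption: with no arguments, take one sample at a 1-second interval.
        echo "$SADC 1 1 $dfile"
    else
        # t = interval in seconds, n = number of samples
        echo "$SADC $1 $2 $dfile"
    fi
}

sa1_command           # one record
sa1_command 1200 3    # three records, 20 minutes apart
```

Printing the command rather than executing it keeps the sketch safe to run on a machine where sadc is not installed.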
The sar command is used to report what has been collected in the daily activity files created by sadc. sadc creates binary files, and the only way to read this data is using the sar utility.
In addition, the sar command can be used to gather system activity data from the command line to look at performance either over different time slices or over a constricted period of time.
The syntax for the sar command is as follows:
sar [-aAbcdgkmpqruvwy] [-o <outputfile>] [t n]
The options to the sar command are described in Table 19.3.
Table 19.3 Options for the sar Command
| Option | Action |
| --- | --- |
| a | Checks file access operations |
| b | Checks buffer activity |
| c | Checks system calls |
| d | Checks activity for each block device |
| g | Checks page-out and memory freeing |
| k | Checks kernel memory allocation |
| m | Checks interprocess communication |
| p | Checks paging activity |
| q | Checks queue activity |
| r | Checks unused memory |
| u | Checks CPU utilization |
| v | Checks system table status |
| w | Checks swapping and switching volume |
| y | Checks terminal activity |
| A | Reports overall system performance (same as entering all options) |
NOTE
If no option is used, it is equivalent to calling the command with the -u option.
For example:
sar -A -o outfile 5 500
This will display all system activity data every 5 seconds and will gather 500 samples. The file named outfile is the output file where the data will get stored. The data is stored in binary format but can be read by using the -f option, as follows:
sar -f outfile
The following information is displayed:
SunOS ultra5 5.9 sun4u 05/14/2002
15:44:26    %usr    %sys    %wio   %idle
15:44:31       1       2       0      97
15:44:36       1       2       0      96
15:44:41       9       2       9      80
15:44:46       2       3       0      95
15:44:51       0       1       0      99
15:44:56       3       2       0      96
15:45:01       0       0       0     100
15:45:06      15       0       0      85
15:45:11       2       0       0      97
15:45:16       3       2       0      95
15:45:21       1       1       0      98
15:45:26       2       2       0      97
15:45:31       0       0       0     100
15:45:36       0       0       0     100
15:45:41       8       1       0      92
15:45:46       1       4       0      95
15:45:51       2       3       0      96
15:45:56       2       0       0      98
15:46:01       0       0       0     100
15:46:06      15       0       0      85
Average        3       1       0      95
Setting Up sar
To start collecting system activity data with sar, you need to edit the /etc/init.d/perf file and create an entry in the crontab file. Here are the steps:
- Become superuser.
- Edit the /etc/init.d/perf file and follow instructions in that file to uncomment lines that enable system activity gathering.
- The sadc command writes a special record that marks the time when the counters are reset to zero (boot time). The sadc output is put into the file sa<dd> (where dd is the current date), which acts as the daily system activity record.
- Edit the /var/spool/cron/crontabs/sys file (the system crontab file) and uncomment the following lines:
0 * * * 0-6 /usr/lib/sa/sa1
20,40 8-17 * * 1-5 /usr/lib/sa/sa1
5 18 * * 1-5 /usr/lib/sa/sa2 -s 8:00 -e 18:01 -i 1200 -A
The first entry writes a record to /var/adm/sa/sa<dd> on the hour, every hour, seven days a week.
The second entry writes a record to /var/adm/sa/sa<dd> twice each hour during peak working hours: at 20 minutes and 40 minutes past each hour from 8 a.m. through the 5 p.m. hour (8:20 a.m. to 5:40 p.m.), Monday through Friday.
Thus, these two crontab entries cause a record to be written to /var/adm/sa/sa<dd> every 20 minutes from 8 a.m. to 5 p.m., Monday through Friday, and every hour on the hour otherwise. You can change these defaults to meet your needs.
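To check how the two sa1 entries combine, the weekday schedule can be enumerated with a short loop. This is only an illustrative sketch: weekday_samples is a hypothetical helper that mirrors the minute and hour fields of the two crontab lines; it does not read the crontab file.

```shell
#!/bin/sh
# List the times at which the two sa1 crontab entries fire on a weekday.
# weekday_samples is a hypothetical helper, not part of the sar package.
weekday_samples() {
    h=0
    while [ "$h" -le 23 ]; do
        # Entry 1: minute 0 of every hour
        printf '%02d:00\n' "$h"
        if [ "$h" -ge 8 ] && [ "$h" -le 17 ]; then
            # Entry 2: minutes 20 and 40 of hours 8 through 17
            printf '%02d:20\n%02d:40\n' "$h" "$h"
        fi
        h=$((h + 1))
    done
}

# 24 hourly records plus 20 peak-hour records
weekday_samples | wc -l
```

The count comes to 44 records per weekday, with the last 20-minute sample at 17:40.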
The third entry runs the shell script named sa2 at 6:05 p.m., Monday through Friday. The sa2 script writes reports from the binary data.
The shell script sa2, a variant of sar, writes a daily report in the file /var/adm/sa/sar<dd>. The report will summarize hourly activities for the day.
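Once CPU data is available in text form, busy intervals can be picked out with a small filter. The sketch below assumes the five-column layout (time, %usr, %sys, %wio, %idle) of the sample sar output shown earlier; low_idle is a hypothetical helper, and on Solaris nawk or /usr/xpg4/bin/awk may be needed in place of awk, since the -v option is assumed here.

```shell
#!/bin/sh
# Print the samples in a sar -u style report whose %idle is below a threshold.
# low_idle is a hypothetical helper, not part of the sar package.
low_idle() {
    # $1 = threshold in percent; the report is read from standard input.
    # Only lines that start with hh:mm:ss and have a numeric fifth field
    # are considered, so the banner, header, and Average lines are skipped.
    awk -v limit="$1" '
        $1 ~ /^[0-9][0-9]:[0-9][0-9]:[0-9][0-9]$/ && $5 ~ /^[0-9]+$/ && $5 + 0 < limit + 0 {
            print $1, $5
        }'
}

# A few lines taken from the sample output; prints the 80 and 85 percent samples.
low_idle 90 <<'EOF'
SunOS ultra5 5.9 sun4u 05/14/2002
15:44:26 %usr %sys %wio %idle
15:44:36 1 2 0 96
15:44:41 9 2 9 80
15:45:06 15 0 0 85
Average 3 1 0 95
EOF
```

In practice the input would come from a pipeline such as sar -u -f on a daily data file rather than from a here-document.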
SyMON 3.0 was developed by the server division of Sun to provide a graphical user interface to its server products. It is deliberately very hardware specific, and it covers functionality that generic tools from other vendors do not address. In particular, it contains detailed knowledge of the configuration of a system and identifies low-level faults (such as correctable memory and disk errors) or high temperature levels that might lead to downtime if they are not identified and acted on. SyMON has a configuration browser that contains full-color images of system components as well as a simple hierarchy view.
The GUI monitors only a single system at a time but is a distributed application. A data collection process is all that runs on the server. An event monitor process runs on a separate machine and looks for problems. In the event of system downtime, the event monitor can raise the alarm even if the GUI is not in use. The Motif GUI can be installed on a separate desktop system. As a performance management tool, SyMON is equivalent to HP's GlancePlus product in that it does not store long-term data (a rolling two-hour history is available), but it does graph all the usual performance data. Its configurable alert capability has many predefined rules and can pass SNMP traps to a network management package; it also has an event log viewer.
SyMON can monitor any of Sun's UltraSPARC-based systems. It is supported only on server configurations, but because the Ultra 1 and Ultra 2 workstations are the basis for the Enterprise 150 and Enterprise 2, it works on them just as well. (The graphical view of an E150 appears if you use it on an Ultra 1, however.)
Solaris 9 System Monitoring and Tuning sar
Copyright © 1996-2008 by Dr. Nikolai Bezroukov. www.softpanorama.org was created as a service to the UN Sustainable Development Networking Programme (SDNP) in the author's free time. This document is an industrial compilation designed and created exclusively for educational use and is placed under the copyright of the Open Content License (OPL). Original materials copyright belongs to respective owners. Quotes are made for educational purposes only in compliance with the fair use doctrine.
Standard disclaimer: The statements, views and opinions presented on this web page are those of the author and are not endorsed by, nor do they necessarily reflect, the opinions of the author's present and former employers, SDNP or any other organization the author may be associated with. We do not warrant the correctness of the information provided or its fitness for any purpose.
Last modified: October 23, 2008