Linux+ TechNotes
Performance Monitoring and Troubleshooting

 

Index
Syslog
Manipulating Log Files
Hardware Troubleshooting
Resource Monitoring


One of the key duties of any system administrator is to keep a system running smoothly and to correct problems when something goes wrong. Checking log files for errors, monitoring disk usage, and identifying trends in system performance are tasks that you should be comfortable doing. Like any other operating system, Linux provides the administrator with a number of methods for determining the health of a system.

Syslog

Syslog is a centralized service for generating message and error logs. While similar in concept to the Event Viewer found in Windows, syslog is a powerful tool that can be used to customize the logging of kernel and application events. The main configuration file, /etc/syslog.conf, is used to specify the locations where specific types of events are logged. The standard syslog configuration logs messages to these files:

/var/log/messages    All system messages except authentication and mail
/var/log/maillog     All mail-related messages
/var/log/spooler     All critical NNTP messages
/var/log/secure      All authentication messages
/var/log/cron        All messages generated by cron (the task scheduler)


An entry in the syslog.conf file takes the following format:

facility.priority      destination

The facility is one of the following keywords designating the part of the system that generated the message: auth, authpriv, cron, daemon, kern, lpr, mail, news, syslog, user, uucp, and local0 through local7. The priority is one of the following keywords, listed from highest to lowest: emerg, alert, crit, err, warning, notice, info, and debug (error is accepted as a deprecated synonym for err). Messages are logged if their priority is equal to or higher than the priority specified. The destination is the location where the messages are logged. The following syslog entry states that all kernel messages with a priority of err or higher will be logged to the file /var/log/kernel:

kern.err              /var/log/kernel

This example states that ALL messages with a priority of emergency will be logged to /var/log/errors:

*.emerg                 /var/log/errors
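The "equal to or higher" rule can be illustrated with a short shell sketch. It is not how syslogd itself is implemented, and the function names severity and would_log are made up for this example. It relies only on the standard syslog numbering, in which emerg is 0 and debug is 7, so a lower number means a higher priority.

```shell
#!/bin/sh
# Illustration only: severity and would_log are hypothetical helper names.

# Map a priority keyword to its severity number (0 = highest priority).
severity() {
    case "$1" in
        emerg)     echo 0 ;;
        alert)     echo 1 ;;
        crit)      echo 2 ;;
        err|error) echo 3 ;;
        warning)   echo 4 ;;
        notice)    echo 5 ;;
        info)      echo 6 ;;
        debug)     echo 7 ;;
        *)         return 1 ;;
    esac
}

# would_log MESSAGE_PRIORITY SELECTOR_PRIORITY
# Succeeds when the message priority is equal to or higher than the selector.
would_log() {
    msg=$(severity "$1") || return 2
    sel=$(severity "$2") || return 2
    [ "$msg" -le "$sel" ]
}

# Check a few kernel message priorities against the selector kern.err.
for p in emerg crit err warning debug; do
    if would_log "$p" err; then
        echo "a kern message at priority $p is logged"
    else
        echo "a kern message at priority $p is ignored"
    fi
done
```

With a selector priority of err, messages at emerg, crit and err are logged, while warning and debug messages are ignored.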

In most cases the destination is a file on the local machine. However, syslog also supports logging to consoles and even to remote machines (a destination of the form @hostname forwards the messages to the syslog daemon on that host). Sending messages to another machine makes it easier to detect and clean up the mess caused by a hacker. When a hacker has found a way into a system, they will often attempt to conceal the break-in, commonly by deleting traces of their activity from log files. If all system events are immediately sent to another machine, the hacker cannot erase that copy without breaking into the log host as well.
 

Manipulating Log Files

Once you have configured the logging of various system events, you need to be able to view and obtain information from the log files. Several tools make it easier to sift through the pages and pages of text in a log. Cat is the simplest command for viewing the contents of a text file. Unfortunately, cat displays the entire file from start to finish, so unless the file is small (or your terminal has a large scrollback buffer) you will only see the last page of information. You may also have to wait a while as hundreds of pages of text fly by on the screen.

To view a file one page at a time, use the commands more and less. When using more, the space bar displays the next screen of data. More can also search for a string and skip forward a specified number of lines. Less is an improved version of more. Its most significant improvement is the ability to scroll backwards as well as forwards through a file.

Often when using log files, you are looking for information on a problem that has occurred recently. Since most log files are ordered chronologically, the most recent information is at the end of the file. The tail command is useful for looking at the last few lines of a file without having to page through the entire document. By default, tail will display the last ten lines of a file. You can change the number of lines displayed with the -n option. For example:

tail -n 30 /var/log/messages

Tail can also display everything from a given line number to the end of the file. This is done by putting a + in front of the number:

tail -n +200 /var/log/messages

This command displays the messages file from the two hundredth line onward. (Older versions of tail also accept the short form tail +200, but newer versions may treat +200 as a file name.) Perhaps the most useful feature of tail is the ability to watch messages that are appended to a file in real time. This is done with the -f switch:

tail -f /var/log/xferlog

This command will initially display the last ten lines of the FTP transfer log but as new entries are added to the log, they will immediately be displayed on the screen. This can be very useful when troubleshooting a problem to determine the sequence of events or to find out which error messages are generated by a specific command. The tail command will continue to display updates to the file until you press Ctrl-C.

The head command performs a similar function to tail except that it operates on the beginning of a file, displaying the first ten lines by default.
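The following sketch tries the line-selection options of tail and head on a throwaway file rather than on a real log. The 250-line sample file and the variable names are assumptions made for the example.

```shell
#!/bin/sh
# Build a throwaway 250-line file so the commands can be tried safely.
sample=$(mktemp)
awk 'BEGIN { for (i = 1; i <= 250; i++) print "line " i }' > "$sample"

tail -n 3 "$sample"        # the last three lines
tail -n +249 "$sample"     # everything from line 249 to the end
head -n 2 "$sample"        # the first two lines

# Capture a few results to show exactly what each form selects.
last=$(tail -n 1 "$sample")
first=$(head -n 1 "$sample")
from_200=$(tail -n +200 "$sample" | wc -l | tr -d ' ')
echo "first: $first, last: $last, lines from 200 onward: $from_200"

rm -f "$sample"
```

Note that tail -n +200 includes line 200 itself, so on a 250-line file it prints 51 lines.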

In many cases it is quicker to do a keyword search in a file than to scroll through hundreds of pages of text. The more and less commands both support searches while viewing a file. After using more or less to begin viewing a file, type a '/' followed by a search string. Press Enter and the display will skip to the first instance of that string after the current position. Press the letter n to jump to the next match.

For more powerful search options, you should use the grep command. Grep is an extremely powerful command and has multiple uses, one of which is searching a file for a given string. Its basic syntax is:

grep string file

Grep will output every line in the file that contains the search string. Grep has several options that allow an administrator to perform very complicated searches in a single step:

-A    Displays a given number of lines of context after each match
-B    Displays a given number of lines of context before each match
-C    Displays a given number of lines of context before and after each match
-c    Displays a count of the matching lines instead of the normal output
-i    Makes the search case insensitive
-n    Displays the line number in front of each matching line
-v    Does an inverse search, displaying all lines that do not match the search string
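A few of these options can be tried on a small sample file, as in the sketch below. The log entries are invented for illustration and do not come from a real system.

```shell
#!/bin/sh
# Throwaway sample log; the entries are invented for illustration only.
log=$(mktemp)
cat > "$log" <<'EOF'
Sep  2 10:00:01 host1 sshd[101]: Accepted password for alice
Sep  2 10:00:05 host1 sshd[102]: Failed password for bob
Sep  2 10:00:09 host1 kernel: eth0: link up
Sep  2 10:00:12 host1 sshd[103]: FAILED password for carol
Sep  2 10:00:20 host1 crond[200]: (root) CMD (run-parts /etc/cron.hourly)
EOF

grep -n "Failed" "$log"         # matching lines, prefixed with line numbers
grep -i -A 1 "failed" "$log"    # ignore case, add one line of context after

failed=$(grep -c -i "failed" "$log")    # number of lines that match
not_sshd=$(grep -v -c "sshd" "$log")    # number of lines without "sshd"
echo "failed logins: $failed, lines not from sshd: $not_sshd"

rm -f "$log"
```

Options can be combined, as the -c -i and -v -c pairs show. On this sample, both counts come out as 2.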


Hardware Troubleshooting

The dmesg command and the /proc filesystem are particularly useful for identifying hardware issues. The dmesg command displays the kernel message buffer, which holds the messages generated at startup and any hardware messages logged since. Its -n option sets the level of kernel messages that are printed to the console.

The /proc filesystem is actually a direct interface to kernel memory that takes the form of a normal directory tree. Even though it appears as just another filesystem, it does not use any hard disk space. This filesystem contains runtime information about currently running processes and the hardware being used in the system. Files in /proc also contain information about resources such as IRQs and I/O addresses. Some of the more useful files are listed below. Each of these can be viewed just like any other text file:

/proc/devices       Lists the character and block device drivers currently registered (e.g. ide0, ttyS)
/proc/dma           Shows which DMA channels are in use
/proc/interrupts    Shows which IRQs are in use
/proc/ioports       Shows which I/O ports are in use
/proc/meminfo       Gives information about real and virtual memory usage
/proc/modules       Displays which kernel modules are currently loaded
/proc/pci           Displays information about each device attached to the PCI bus (absent on newer kernels, where the lspci command provides this)
/proc/scsi/scsi     Displays information about each SCSI device
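Because these files are plain text, the usual text tools work on them. The sketch below pulls single values out of /proc/meminfo with awk. The helper name meminfo_kb is hypothetical, and the script assumes the usual "Name: value kB" layout of that file.

```shell
#!/bin/sh
# meminfo_kb FIELD [FILE]
# Print the value (in kB) of one field from a meminfo-style file.
# FILE defaults to /proc/meminfo; meminfo_kb is a made-up helper name.
meminfo_kb() {
    awk -v key="$1:" '$1 == key { print $2 }' "${2:-/proc/meminfo}"
}

if [ -r /proc/meminfo ]; then
    echo "MemTotal: $(meminfo_kb MemTotal) kB"
    echo "MemFree:  $(meminfo_kb MemFree) kB"
fi

# Other files in /proc can be read the same way, for example the
# first few lines of the IRQ table.
if [ -r /proc/interrupts ]; then
    head -n 5 /proc/interrupts
fi
```

The checks with -r keep the script from failing on a system where /proc is not mounted.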



Resource Monitoring


Not all system problems show up in log files or are due to a specific failure. Performance bottlenecks and limited system resources can also cause headaches for users and administrators. Linux provides a number of commands for monitoring hard disk, memory and processor usage.

The df command is one of the most useful for checking the amount of free disk space. Df lists each mounted filesystem (partition or volume) on a system and displays, in a table, its total disk space, used disk space, free disk space, and percentage used. By default, df reports space in 1K blocks; the -h switch shows sizes in more readable units such as megabytes or gigabytes.

To see the exact size of a particular file, use the command ls -l. If you run ls -l on a directory, however, you will get a listing of the files in the directory rather than the space the directory takes up. To find the amount of space consumed by a directory, use the du command. Using du without any options lists the space used by each subdirectory of the current directory, followed by the total for the current directory itself. Du can also report on a single file, but because it measures allocated disk blocks rather than the exact length in bytes, ls -l is the better choice when you need the precise size. Du, like df, recognizes the -h option to display sizes in a more readable format.
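The difference between these commands is easiest to see side by side. The sketch below works in a throwaway directory holding one 5000-byte file. The directory, the file name sample.dat and the size are all assumptions made for the example.

```shell
#!/bin/sh
# Work in a throwaway directory containing one 5000-byte file.
dir=$(mktemp -d)
dd if=/dev/zero of="$dir/sample.dat" bs=1000 count=5 2>/dev/null

df -h "$dir"               # free space on the filesystem that holds $dir
ls -l "$dir/sample.dat"    # exact length of the file in bytes
du -h "$dir/sample.dat"    # disk space allocated to the file
du -sh "$dir"              # a single summary line for the whole directory

# wc -c reports the same byte count that ls -l shows.
bytes=$(wc -c < "$dir/sample.dat" | tr -d ' ')
echo "exact size: $bytes bytes"

rm -rf "$dir"
```

The figure from du will usually differ from the 5000 bytes reported by ls -l, because du counts whole disk blocks. The -s option makes du print only the total instead of one line per subdirectory.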

The most useful tool for monitoring processor and memory usage is called top. Top is an interactive summary of resource usage information. It displays a list of running processes, sorted by CPU usage by default. This list is automatically refreshed every few seconds (five in the version described here; the default varies between versions of top). The top display contains the following information:
  • System time
  • System uptime
  • Number of users currently logged in
  • Load average (explained below)
  • Number of processes
  • Percentage of CPU usage allocated to system, user or idle time
  • Summary of physical memory usage
  • Summary of swap memory usage
  • Process information with the most CPU intensive tasks listed at the top


The load average is a somewhat vague statistic but is often a good indicator that some process has gone wildly out of control and is hogging CPU time. The three numbers listed are the average number of processes that were running or waiting for CPU time over the last one, five, and fifteen minutes (on Linux, processes blocked in uninterruptible disk I/O are counted as well). As a rule of thumb, on a single-processor system each of these values should stay below 1; on a multiprocessor system, compare them with the number of CPUs instead.
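The same three numbers can be read without top from /proc/loadavg. The sketch below compares the one-minute figure with the number of processors. The helper name load_state and the use of getconf to count CPUs are choices made for this example, and the "busy" threshold is only the rule of thumb of one runnable process per CPU.

```shell
#!/bin/sh
# load_state LOAD CPUS
# Print "busy" when LOAD exceeds the CPU count, otherwise "ok".
# load_state is a hypothetical helper name used only in this sketch.
load_state() {
    awk -v load="$1" -v cpus="$2" 'BEGIN {
        if (load + 0 > cpus + 0) print "busy"; else print "ok"
    }'
}

if [ -r /proc/loadavg ]; then
    # The first three fields are the 1-, 5- and 15-minute load averages.
    read -r one five fifteen rest < /proc/loadavg
    # Count online processors; fall back to 1 if getconf cannot tell us.
    cpus=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
    echo "load $one $five $fifteen on $cpus CPU(s): $(load_state "$one" "$cpus")"
fi
```

For example, a load of 3.10 on a two-processor machine is reported as busy, while 0.42 on a single processor is reported as ok.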

Since the top display is interactive, there are several single-key commands that can be issued. The most common are listed below:

c Displays the full path and file name for commands
h Displays a help menu
k Prompts for a process to kill
n Used to change the number of processes shown
q Quits
r Used to adjust the priority of a process
<space> Immediately refreshes the display

To find more detailed information on memory, paging and CPU usage, use the vmstat command. Vmstat by default displays a snapshot of system statistics but can easily be configured to generate recurring snapshots separated by a specified delay. This output can be directed to a file and is most commonly used to generate benchmarking statistics.
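Recurring snapshots are requested by giving vmstat a delay in seconds and, optionally, a count. The sketch below takes three samples one second apart and saves them to a file. The output path is only an example, and the script checks first whether vmstat is installed at all.

```shell
#!/bin/sh
# Take three samples one second apart and keep them in a file.
# The output path below is only an example.
out=${TMPDIR:-/tmp}/vmstat-sample.txt

if command -v vmstat >/dev/null 2>&1; then
    vmstat 1 3 > "$out"    # arguments are: delay in seconds, number of reports
    cat "$out"
else
    echo "vmstat is not installed on this system" > "$out"
    cat "$out"
fi
```

The first report line shows averages since the system was booted; each later line covers only the interval since the previous one. For a benchmarking run, a longer delay and a larger count would normally be used.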

For even more detailed reporting, use the sar command. Sar allows you to select specific activity counters to track and can save them to a data file (with the -o switch). The file can then be read using sar with the -f switch. Sar allows you to track statistics from the following categories: disk I/O, swap memory, block device I/O, network I/O, physical memory, CPU, and filesystem inodes.


 
 

Current related exam topics for the Linux+ exam:

DOMAIN 3.0 Configuration

3.10 Configure log files (for example: syslog, remote logfile storage)

DOMAIN 5.0 Documentation

5.1 Establish and monitor system performance baseline (for example: top, sar, vmstat, pstree)

5.4 Troubleshoot errors using systems logs (for example: tail, head, grep)

5.5 Troubleshoot application errors using application logs (for example: tail, head, grep)

DOMAIN 6.0 Hardware

6.2 Diagnose hardware issues using Linux tools (for example: /proc, disk utilities, ifconfig, /dev, liveCD rescue disk, dmesg)


Date: September 02, 2005
TechExams.Net
Author: Drew Miller
Comptia A+ Network+ I-net+ Linux+ MCP