At first thought, grep ought to be able to perform a task like searching mailboxes for specific text. You can search mail files for text, but using grep has at least two disadvantages. First, you may want to retrieve the entire message(s) that matches the pattern, and grep only returns matching lines by default. Second, if any of the mailboxes contain lengthy MIME attachments, searching with grep can produce voluminous output arising from an unlucky false positive within the binary attachment.
A better tool for this job is grepmail, an open source utility written by David Coppit (see http://grepmail.sourceforge.net for more information). grepmail is designed specifically for searching mail folders. Here is a simple example of its use:
% grepmail -R -i -l hilton ~/Mail
Mail/conf/acs_w01
I was looking for the phone number of a specific Hilton hotel, which was in a mail message somewhere, but I couldn’t remember where I’d filed it. This command searches for the string “hilton” (-i says to perform a case-insensitive search) in all mail folders under the specified starting directory (-R means recursive), and lists the names of files containing messages that match (-l option). The advantage of this approach is that I can search for the string I remember and find the telephone number even though the two items may be lines apart in the actual message. This command yields the phone number:
% grepmail -i hilton `!!` | grep -i telephone
Telephone: 619-231-4040
This grepmail command searches for the same string in the mail folder returned by the previous command. This time, grepmail will return the entire message as its output (since -l is omitted). The result is then piped to grep to isolate the phone number.
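To make the difference from plain grep concrete, here is a minimal sketch of whole-message matching in a Bourne-style shell with awk. It is not how grepmail is implemented; `mbox_grep` is a hypothetical helper name, and the sketch assumes a classic mbox file in which every message starts with a line beginning `From ` (and in which body lines starting that way have been escaped), plus a simple pattern that is matched case-insensitively. It does no MIME decoding, so it does nothing about the attachment problem described above.

```shell
# mbox_grep PATTERN FILE  (hypothetical helper, not part of grepmail)
# Print every complete message in an mbox FILE that contains PATTERN.
mbox_grep() {
    awk -v pat="$1" '
        # Print the buffered message if it matched, then reset the buffer.
        function emit() {
            if (msg != "" && hit) printf "%s", msg
            msg = ""
            hit = 0
        }
        # A line starting with "From " opens a new message.
        /^From / { emit() }
        {
            msg = msg $0 "\n"
            if (tolower($0) ~ tolower(pat)) hit = 1
        }
        END { emit() }
    ' "$2"
}
```

With this in place, something like `mbox_grep hilton Mail/conf/acs_w01 | grep -i telephone` would play the role of the second command above.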
Here is a somewhat more complicated command that uses grepmail twice. Its goal is to find messages from user nadia that mention something related to Naples, Italy:
% grepmail -R -h "^From: .*nadia" ~/Mail | grepmail -b -i \
"naples|napoli|neapolit"
The first command searches mail headers (-h) for “From” lines including “nadia” somewhere in their text. The second command searches only the body (-b) of the matching messages for the specified strings.
grepmail has several other useful options:
-d date — Limit search to messages on the specified date or within the specified date range. The date format is very flexible; see the manual page for details.
-v — Display only non-matching messages.
-u — Display only unique messages.
-M — Don’t search non-text MIME attachments.
-r — Display a report listing each folder searched and the total number of matching messages within it.
-m — Add an X-Mailfolder header to displayed messages; the header’s text will be the path to the message’s mail folder.
-H — Display only the headers of matching messages.
It is also very easy to forward a mail message located in this manner. Here is a simple method:
% grepmail -m -u ... | mail -s subject someone@somewhere
Finally, some people prefer to view the search results from a mail client. This is usually easy to accomplish via a simple script that redirects grepmail’s output to the mailer’s default folder. Several have been created for this purpose:
Search Operations for Software Packages
Software packages are another item whose contents are hard to search with grep. More specifically, I often want to answer questions like these:
On many systems, one or more of these questions can be answered using the package management tools supplied with the operating system. For example, the following commands can be used to list all currently installed packages on various UNIX systems:
Linux: rpm -q -a
FreeBSD: pkg_info -a -I
Solaris: pkginfo
HP-UX: swlist
AIX: lslpp -l all
You can pipe any of these commands to grep to determine whether a specific package is present or to find its actual package name. For example, the following command lists all packages related to LDAP installed on a Linux system:
% rpm -q -a | grep -i ldap
nss_ldap-184-1
openldap-2.0.23-4
openldap-clients-2.0.23-4
openldap-servers-2.0.23-4
This system has the OpenLDAP servers and client utilities installed, as well as the modules that interface LDAP to PAM and to the name service switch file, /etc/nsswitch.conf.
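The per-platform listing commands above can be folded into one small dispatcher. This is only a sketch, assuming a Bourne-style shell and that `uname -s` reports `Linux`, `FreeBSD`, `SunOS`, `HP-UX`, or `AIX` on the respective systems; the function name `pkg_list_cmd` is made up for illustration, and the Linux branch assumes an RPM-based distribution, as the article does.

```shell
# Print the package-listing command for an OS name.
# With no argument, use the name reported by `uname -s` on this host.
pkg_list_cmd() {
    os=${1:-$(uname -s)}
    case "$os" in
        Linux)   echo "rpm -q -a" ;;        # assumes an RPM-based system
        FreeBSD) echo "pkg_info -a -I" ;;
        SunOS)   echo "pkginfo" ;;          # Solaris
        HP-UX)   echo "swlist" ;;
        AIX)     echo "lslpp -l all" ;;
        *)       echo "pkg_list_cmd: unsupported system: $os" >&2
                 return 1 ;;
    esac
}
```

A search such as the LDAP one could then be written as `$(pkg_list_cmd) | grep -i ldap`, relying on the shell to split the printed command into words.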
It’s often useful to find out which package a particular file is part of (e.g., when you delete it accidentally and need to restore it). These command forms will indicate which package installed the specified file:
Linux: rpm -q --whatprovides path
Solaris: pkgchk -l -p path
AIX: lslpp -w path
Here is an example from a Solaris system:
% pkgchk -l -p /etc/init.d/sendmail
Pathname: /etc/init.d/sendmail
Type: editted file
Expected mode: 0744
Expected owner: root
Expected group: sys
Referenced by the following packages:
SUNWsndmr
Current status: installed
When you want to know what is contained in an installed package, use these commands:
Linux: rpm -q -l name
FreeBSD: pkg_info -L name
Solaris: pkgchk -l name | grep "^Pathname:"
HP-UX: swlist -l file name
AIX: lslpp -f name
Here is an example from a FreeBSD system:
% pkg_info -L grub-0.91_1
Information for grub-0.91_1:
Files:
/usr/local/bin/mbchk
/usr/local/info/grub.info
/usr/local/info/multiboot.info
/usr/local/sbin/grub
...
In general, if you want to list the contents of an uninstalled package, you can replace the package name with the path to the package file in the preceding commands. On Linux systems, however, you must also precede the package file's path with the -p option.
Only HP-UX and AIX have easy-to-use commands for listing the packages available on CDs or other media:
HP-UX: swlist -s path-or-device
AIX: installp -l -d device
On Linux, FreeBSD, and Solaris systems, you must rely on GUI package management tools to handle this function. On Linux systems, you can use gnorpm and similar packages (as well as yast2 on SuSE Linux systems). Under FreeBSD systems, you can use the sysinstall utility and select the Configure=>Packages menu path. On Solaris systems, the Supplementary Software CD includes a GUI installation tool that starts automatically when the CD is inserted, and it can be used to view the contents of the CD as well. On all three systems, you can also examine the directory containing the package files with ls for a quick listing of what is available.
Searching Net-SNMP MIBs
The Simple Network Management Protocol (SNMP) can be used to monitor and reconfigure a wide variety of computer systems and other network devices. The items that can be queried or set are defined in Management Information Bases (MIBs). A MIB is a collection of value and property definitions, and the various items are organized as a tree structure. This hierarchical organizational scheme serves to group related data together. MIB definitions are stored in files and are implemented in the software on the actual computers and devices. The MIB does not hold any data — it is a schema, not a database.
Here is an example MIB item:
iso.org.dod.internet.mgmt.mib-2.system.sysLocation = "Machine Room"
The long string on the left is the setting’s name, and its value is the string to the right of the equals sign. The name is separated into components by periods, and each corresponds to successive levels of the MIB tree. Thus, we can see that the sysLocation node is eight levels from the top of the tree.
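The level count can be checked mechanically by splitting the name on periods. A small sketch, assuming a Bourne-style shell with awk; `mib_depth` is a hypothetical helper name, and a single leading period (as used in the snmptranslate examples later in this article) is stripped first so that it is not counted as an empty level.

```shell
# mib_depth NAME  (hypothetical helper)
# Print the number of period-separated components in a MIB name or OID.
mib_depth() {
    # Drop one leading period, if present, then count the fields.
    printf '%s\n' "${1#.}" | awk -F. '{ print NF }'
}

mib_depth iso.org.dod.internet.mgmt.mib-2.system.sysLocation   # prints 8
```

The same count applies to the numeric form of a name, since each number stands for one level of the tree.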
Although the MIB is organized as a tree, it is not uniformly populated. The top four levels of the standardized MIB tree exist mainly for historical reasons. Given this rather ad hoc structure, searching the MIB tree for specific items is often essential. However, it is not a job for grep.
Most SNMP implementations provide utilities for examining MIBs. The open source SNMP implementation Net-SNMP is used on Linux and FreeBSD systems (and other UNIX systems, if desired). The tool the package provides to examine the MIB structure is snmptranslate. This command provides information about the MIB structure and its items. For example, you can use it to display a MIB subtree, as in this example:
% snmptranslate -Tp .iso.org.dod.internet.mgmt.mib-2.system
+--system(1)
   |
   +-- -R-- String    sysDescr(1)
   |        Textual Convention: DisplayString
   |        Size: 0..255
   +-- -R-- ObjID     sysObjectID(2)
   +-- -R-- TimeTicks sysUpTime(3)
   +-- -RW- String    sysContact(4)
   |        Textual Convention: DisplayString
   |        Size: 0..255
   ...
I’ve truncated the output after four entries.
snmptranslate can also provide detailed information about a specific MIB item, as in this example using the sysLocation leaf:
% snmptranslate -Td .iso.org.dod.internet.mgmt.mib-2.system.sysLocation
1.3.6.1.2.1.1.6
sysLocation OBJECT-TYPE
-- FROM SNMPv2-MIB, RFC1213-MIB
-- TEXTUAL CONVENTION DisplayString
SYNTAX OCTET STRING (0..255)
DISPLAY-HINT "255a"
MAX-ACCESS read-write
STATUS current
DESCRIPTION "The physical location of this
node (e.g., 'telephone closet, 3rd
floor'). If the location is unknown,
the value is the zero-length string."
::= { iso(1) org(3) dod(6) internet(1) mgmt(2) mib-2(1) system(1) 6 }
However, the most important searching feature — finding the location within the tree of a specific leaf — is not provided automatically by snmptranslate. This command will provide that information for the memTotalReal item:
% snmptranslate -Ts | grep memTotalReal\$
.iso.org.dod.internet.private.enterprises.ucdavis.memory.memTotalReal
This item, the total real memory present on a system, is located at the specified point within the hierarchy. A slightly more complex command can provide both the full location and a description for a MIB leaf:
% snmptranslate -Td `snmptranslate -Ts | grep memTotalReal\$`
I use it often enough that I’ve defined an alias for this command:
% alias snmpwhat 'snmptranslate -Td `snmptranslate -Ts | grep \!:1\$`'
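The alias above uses csh syntax (`\!:1` stands for the first argument given to the alias). For Bourne-style shells such as sh or bash, a function can do the same job. This is a sketch rather than a drop-in replacement: it anchors the match with a leading period and takes only the first hit, neither of which the original alias does, and it assumes snmptranslate is on the PATH with the relevant MIBs loaded.

```shell
# snmpwhat LEAF
# Show the full location and description of the MIB leaf named LEAF.
snmpwhat() {
    # Find the full dotted name whose last component is exactly LEAF.
    oid=$(snmptranslate -Ts | grep "\.$1\$" | head -n 1)
    if [ -z "$oid" ]; then
        echo "snmpwhat: no MIB leaf named $1" >&2
        return 1
    fi
    # Print the detailed definition for that name.
    snmptranslate -Td "$oid"
}
```

It would be invoked as `snmpwhat memTotalReal`.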
Unusual Pattern Matching Requirements
I’ll conclude this article with a quick look at two searching/pattern matching topics that can be a bit tricky.
Filtering Foreign Language Email
Like many people, I use procmail to preprocess mail messages, including attempting to remove spam. My current recipes work reasonably well for mail messages in Western languages, but they fail for ones in many other languages (e.g., Japanese, Chinese, Russian). Currently, I get 15-20 such spam messages each day.
Some people deal with this situation by discarding all email from the corresponding countries, but this approach does not work for me as I get legitimate mail from these countries on a regular basis (from non-predictable senders). What I needed was a procmail recipe to identify the foreign characters, which are above the normal ASCII range. The trick here is to get all of these characters into the .procmailrc file. This is easiest to do by entering them on a system/application that supports two-byte characters. The next step is to copy that file in binary mode to the system where procmail runs, where its contents can be pasted into the initialization file.
A quick and dirty procmail recipe will look something like this when viewed with most text editors:
:0BH:
* [\200\201\202...\377][\200\201\202...\377][\200\201\202...\377]
$MAILDIR/foreign_spam
For me, three such characters in a row was a good enough first attempt at solving this problem. There are many more elegant solutions available on the Web. One of the best is by Walter Dnes, and it is available at:
http://www.waltdnes.org/email/chinese/index.html
It takes advantage of procmail’s weighting capabilities to detect messages containing more than 5% non-ASCII characters.
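The percentage idea can be illustrated outside procmail. The sketch below is not the recipe from the page cited above; it is a rough shell equivalent that counts bytes with the high bit set (octal 200 to 377, the same range as in the recipe shown earlier) and reports them as a whole-number percentage of the message size. `high_byte_percent` is a hypothetical name, and counting bytes rather than decoded characters is a simplifying assumption.

```shell
# high_byte_percent FILE  (hypothetical helper)
# Print the percentage (rounded down) of bytes in FILE outside 7-bit ASCII.
high_byte_percent() {
    # Total size in bytes; strip the padding some wc versions add.
    total=$(wc -c < "$1" | tr -d ' ')
    # Delete every 7-bit byte and count what is left.
    high=$(LC_ALL=C tr -d '\000-\177' < "$1" | wc -c | tr -d ' ')
    if [ "$total" -eq 0 ]; then
        echo 0
        return
    fi
    echo $(( high * 100 / total ))
}
```

A wrapper script could then treat a message as probable foreign-language spam when the printed value exceeds 5.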
Copyright © 1996-2008 by Dr. Nikolai Bezroukov. www.softpanorama.org was created as a service to the UN Sustainable Development Networking Programme (SDNP) in the author's free time. This document is an industrial compilation designed and created exclusively for educational use and is placed under the copyright of the Open Content License (OPL). Original materials copyright belongs to the respective owners. Quotes are made for educational purposes only in compliance with the fair use doctrine.
Standard disclaimer: The statements, views and opinions presented on this web page are those of the author and are not endorsed by, nor do they necessarily reflect, the opinions of the author's present and former employers, SDNP or any other organization the author may be associated with. We do not warrant the correctness of the information provided or its fitness for any purpose.
Last modified: September 15, 2008