
4.8 grep and regular expressions

Regular expressions are used in many UNIX tools. Searching in grep and string substitution in ex and vi all use regular expressions. The shell, in its file name expansion, uses a simplified regular expression format. awk and egrep use extended regular expressions.
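To make the difference between shell file name expansion and a regular expression concrete, here is a minimal sketch. The scratch directory and the file names are made up for illustration and are not part of the lab.

```shell
#!/bin/bash
# Hypothetical scratch directory and file names, used only for illustration.
dir=$(mktemp -d)
touch "$dir/main.c" "$dir/util.c" "$dir/notes.txt"

# Shell file name expansion: "*" by itself matches any run of characters.
glob_count=$(ls "$dir"/*.c | wc -l)

# Regular expression: "." matches any one character, "*" repeats the
# previous item, "\." is a literal period, and "$" anchors the end of line.
regex_count=$(ls "$dir" | grep -c '.*\.c$')

echo "glob matched $glob_count files, regex matched $regex_count files"
rm -r "$dir"
```

Both counts come out the same here, but the patterns are written differently: the shell pattern *.c and the regular expression .*\.c$ describe the same set of names.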

grep is a program used to search a file, or a group of files, for lines that match a regular expression. It does wonders in finding lost functions and subroutines in a group of program sources, or in checking that you have removed all references to an obsolete function or variable. The -n option causes the line number of each matching line to be printed.
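As a small sketch of hunting for references to an obsolete function, the script below builds a hypothetical six-line source file (it is not cman6.c, and old_helper is an invented name) and searches it with -n.

```shell
#!/bin/bash
# Hypothetical sample source file; it stands in for a real program.
src=$(mktemp)
cat > "$src" <<'EOF'
#include <stdio.h>
int old_helper(int x);
int main(void)
{
    return old_helper(2);
}
EOF

# -n prefixes each matching line with its line number and a colon.
matches=$(grep -n 'old_helper' "$src")
echo "$matches"
rm "$src"
```

With this sample the output shows lines 2 and 5, each printed as the line number, a colon, and the full text of the line.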

wc is a program used to count lines, words, and characters, usually all three. Its options are -l, -w and -c, for lines, words and characters. The default is -lwc.
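The three wc options can be tried one at a time. This is only a sketch on a made-up three-line input; note that -c counts bytes, which equals the character count for plain ASCII text.

```shell
#!/bin/bash
# Hypothetical three-line input, generated with printf.
sample=$(mktemp)
printf 'one two\nthree\nfour five six\n' > "$sample"

lines=$(wc -l < "$sample")   # count lines only
words=$(wc -w < "$sample")   # count words only
chars=$(wc -c < "$sample")   # count characters (bytes) only

echo "lines=$lines words=$words chars=$chars"
rm "$sample"
```

Reading the file through "<" keeps the file name out of the output, so each variable holds just the number.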

In this lab you will use grep to search an old "C" program of mine named cman6.c. It is a Mandelbrot set program written for the Borland "C" compiler with some graphics library additions. The file is available at:
http://uml.lt.tucson.az.us/hl2.2007-fall/files/cman6.c

Pipes use the "|" character to direct the output of one command into the input of another command. For example:

grep 'define' ~cis137/cman6.c | wc -l

This extracts all lines from ~cis137/cman6.c that contain the string define and sends them on to wc, which then counts the lines. grep also has a -c option that counts the matching lines itself.
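The two ways of counting can be compared side by side. The sketch below uses a hypothetical four-line file rather than cman6.c, so the numbers apply only to this sample.

```shell
#!/bin/bash
# Hypothetical sample file; three of its four lines contain "define".
src=$(mktemp)
cat > "$src" <<'EOF'
#define WIDTH 640
#define HEIGHT 480
int x;
/* define the palette */
EOF

# Pipe: grep writes the matching lines, wc counts them.
piped=$(grep 'define' "$src" | wc -l)

# -c: grep counts the matching lines itself.
counted=$(grep -c 'define' "$src")

echo "pipe: $piped  -c: $counted"
rm "$src"
```

Both approaches count lines, not occurrences, so a line with the string on it twice would still count once.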

Another interesting option to grep is -v, which inverts the condition. A line that matches the regular expression is NOT output; all the lines that do not match are output instead.
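A short sketch of -v on a made-up four-line input:

```shell
#!/bin/bash
# Hypothetical input; two of the four lines contain "alpha".
src=$(mktemp)
printf 'alpha\nbeta\ngamma\nalphabet\n' > "$src"

# -v keeps only the lines that do NOT match.
kept=$(grep -v 'alpha' "$src")

# -v combines with -c to count the non-matching lines.
kept_count=$(grep -v -c 'alpha' "$src")

echo "$kept"
echo "non-matching lines: $kept_count"
rm "$src"
```

Here "alphabet" is dropped as well, because the pattern matches anywhere within a line.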

The shell, when it is processing a command line, first looks for pipe and redirection symbols. If it finds any, it acts on them and effectively removes them from the command line. The individual commands are unaware that they ever existed. Then it searches for and replaces variables and file names with wild cards. Afterwards it executes the individual commands.
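One way to see what a command actually receives is to count its arguments. In this sketch count_args is a hypothetical helper function, and the scratch directory and file names are made up.

```shell
#!/bin/bash
# Hypothetical scratch directory holding two C files.
dir=$(mktemp -d)
touch "$dir/a.c" "$dir/b.c"

# Hypothetical helper: prints how many arguments it was given.
count_args() { echo "$#"; }

# The shell handles "> file" itself; count_args sees only "one" and "two".
count_args one two > "$dir/out.txt"
unquoted=$(cat "$dir/out.txt")

# Quoted, ">" is ordinary text and arrives as a third argument.
quoted=$(count_args one two '>')

# The wild card is replaced by the matching file names before the command runs.
expanded=$(count_args "$dir"/*.c)

echo "unquoted=$unquoted quoted=$quoted expanded=$expanded"
rm -r "$dir"
```

The same reasoning applies to grep: an unquoted ">" in a pattern never reaches grep, while a quoted one does.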

Write a shell script called greplab that searches ~cis137/cman6.c for the various items below.

Use the echo command to print out your name and TABER CIS137. For each item also provide a short description of what you are printing out.

Start your shell script with #!/bin/bash to use the bash shell.

  1. Use grep to find all lines that have the substring "closegraph" within them. Print out line numbers for the lines that contain the substring along with the complete content of the line. Look at the grep manual pages for the option -n.
  2. The number of lines that have the string "include". Look at the grep manual pages for the option -c.
  3. The number of lines with a period ".". Be careful: "." is a meta character that needs to be escaped.
  4. The number of lines with a greater than symbol ">". Be careful: ">" is a shell redirection character. Make sure that the character makes it to grep by quoting the regular expression.
  5. Print the line number and the actual line where the string "main(" is. This function is where a "C" program starts running.
  6. The number of lines that have the string "colormap".
  7. The number of lines that have the string "/*". Be careful: "*" is a meta character, so be sure to escape it.

When running grep it is best to always protect the regular expression from being interpreted by the shell by placing it within a pair of single quotes. Also remember to escape every special character that needs to retain its normal meaning within the regular expression.
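The effect of escaping shows up when the same pattern is counted with and without the backslash. This is a sketch on a made-up four-line file, not on cman6.c, and the patterns are chosen only for illustration.

```shell
#!/bin/bash
# Hypothetical input chosen so that escaped and unescaped patterns differ.
src=$(mktemp)
printf 'pi = 3.14\nid = 3514\nx = a*b\nx = ab\n' > "$src"

# Unescaped "." matches any one character, so "3514" matches too.
dot_any=$(grep -c '3.14' "$src")

# Escaped "\." matches only a literal period.
dot_literal=$(grep -c '3\.14' "$src")

# Unescaped "*" means "zero or more of the previous character".
star_repeat=$(grep -c 'a*b' "$src")

# Escaped "\*" matches only a literal asterisk.
star_literal=$(grep -c 'a\*b' "$src")

echo "dot: $dot_any vs $dot_literal   star: $star_repeat vs $star_literal"
rm "$src"
```

The single quotes keep the shell from touching the backslash and the "*", so grep receives each pattern exactly as typed.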

Turn in a copy of your shell script and its results. Make a printout of your output and shell script, and mark it with:

your name
TABER CIS137
Lab 4.8: grep & regular expressions

Please turn in your lab to Louis Taber or to a Pima Community College employee in room A-115 of the Santa Rita Building. Ask them to place it in the dark blue folder in Louis Taber's mailbox.

4.8.1 shell rules

See the sh(1) & csh(1) manual pages for a complete description.

4.8.2 regular expression rules

See the ed(1) manual pages for a complete description.

4.8.3 extended regular expression rules

See the egrep(1) manual pages for a complete description. egrep can run up to 10 times faster than grep. Its memory usage is less predictable. awk also uses extended regular expressions.
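As a brief sketch of what the extended syntax adds, the example below uses grep -E, the POSIX spelling for a grep that takes extended regular expressions, in place of egrep. The sample lines are made up.

```shell
#!/bin/bash
# Hypothetical input: four C-like declarations.
src=$(mktemp)
printf 'int main(void)\nvoid setup(void)\nchar buf[8];\nintt x;\n' > "$src"

# Extended syntax: "(int|char)" means either word; "^" anchors the start of line.
ere_count=$(grep -E -c '^(int|char) ' "$src")

# In a basic regular expression "|" is an ordinary character, so this
# looks for the literal text "int|char" and finds nothing.
bre_count=$(grep -c 'int|char' "$src")

echo "extended: $ere_count  basic: $bre_count"
rm "$src"
```

The line "intt x;" is not counted by the extended pattern because the pattern requires a space directly after "int" or "char".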


Instructor: Louis Taber, louis.taber.at.pima at gmail dot com (520) 206-6850

