4.8 grep and regular expressions
grep, ex, and vi use regular expressions. The shell, in its file name expansion, uses a simplified regular expression format. awk and egrep use extended regular expressions.
grep is a program used to search a file, or a group of files, for a specific regular expression. It does wonders in finding lost functions and subroutines in a group of program sources, or in checking whether you have removed all references to an obsolete function or variable. The -n option causes the line number of each matching line to be printed.
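As a small illustration of -n, the sketch below builds a throwaway file and searches it. The file contents and the name old_func are made up for this example and are not part of cman6.c.

```shell
#!/bin/bash
# Hypothetical sample file standing in for a program source.
sample=$(mktemp)
printf '%s\n' '#include <stdio.h>' 'int old_func(void);' \
    'int main(void) {' '    return old_func();' '}' > "$sample"

# -n prefixes every matching line with its line number and a colon.
numbered=$(grep -n 'old_func' "$sample")
echo "$numbered"
# prints:
# 2:int old_func(void);
# 4:    return old_func();

rm -f "$sample"
```

If the search prints nothing, no reference to the name remains in the file.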
wc is a program used to count lines, words, and characters, usually all three. Its options are -l, -w, and -c, for lines, words, and characters. The default is -lwc.
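A minimal sketch of the three wc options, run on a made-up three-line file. Reading from standard input keeps the file name out of the output. Note that -c counts bytes, which equals the character count for plain ASCII text like this.

```shell
#!/bin/bash
# Hypothetical three-line sample file.
sample=$(mktemp)
printf 'one two\nthree\nfour five six\n' > "$sample"

lines=$(wc -l < "$sample")   # 3 lines
words=$(wc -w < "$sample")   # 6 words
chars=$(wc -c < "$sample")   # 28 bytes, newlines included

echo "lines=$lines words=$words chars=$chars"
rm -f "$sample"
```

Some versions of wc pad the numbers with leading spaces, so the exact spacing of the echoed line can differ between systems.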
In this lab you will use the program grep to search an old "C" program of mine named cman6.c. It is a Mandelbrot set program written for the Borland "C" compiler and some graphics library additions. The file cman6.c is available by anonymous ftp at ftp://lt.tucson.az.us/pub/cman6.c.
A pipe uses the "|" character to direct the output of one command into the input of another command. For example:
    grep 'define' ~cis137/cman6.c | wc -l
will extract all lines from ~cis137/cman6.c that contain the string define and send them on to wc. wc will then count the number of lines.
grep also has a -c option that will count the lines that match.
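The two ways of counting can be compared side by side. This is only a sketch on a made-up file, not on cman6.c itself.

```shell
#!/bin/bash
# Hypothetical sample file with two lines containing "define".
sample=$(mktemp)
printf '%s\n' '#define WIDTH 640' '#define HEIGHT 480' \
    'int main(void) { return 0; }' > "$sample"

# Pipe the matching lines into wc and let wc count them.
piped=$(grep 'define' "$sample" | wc -l)

# Let grep count the matching lines itself.
counted=$(grep -c 'define' "$sample")

echo "pipe: $piped  -c: $counted"   # both counts are 2
rm -f "$sample"
```

Both forms count matching lines, not occurrences: a line containing the string twice is still counted once.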
Another interesting option to grep is -v. This will cause grep to invert the condition: if the regular expression matches a line, that line will NOT be output, and all the lines that did not match are output instead.
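A short sketch of the inverted match, using a made-up four-line word list:

```shell
#!/bin/bash
# Hypothetical sample file.
sample=$(mktemp)
printf '%s\n' 'alpha' 'beta' 'alphabet' 'gamma' > "$sample"

matching=$(grep 'alpha' "$sample")      # alpha and alphabet
inverted=$(grep -v 'alpha' "$sample")   # beta and gamma

echo "with -v:"
echo "$inverted"
rm -f "$sample"
```

Every line of the file lands in exactly one of the two results, so the two line counts always add up to the length of the file.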
The shell, when it is processing a command line, first looks for pipe and redirection symbols. If it finds any, it acts on them and effectively removes them from the command line, so the individual commands are unaware that they ever existed. Then it searches for and replaces variables and file names with wildcards. Afterwards it executes the individual commands.
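One way to see this for yourself is to run a command that only reports how many arguments it received. In the sketch below, count_args is a hypothetical helper function written for this example, and the file names are made up.

```shell
#!/bin/bash
# count_args is a hypothetical helper: it prints the number of arguments it got.
count_args() { echo "$#"; }

dir=$(mktemp -d)
touch "$dir/a.c" "$dir/b.c"

# The shell strips "> file" before running the command: 2 arguments arrive.
count_args one two > "$dir/out.txt"
redirected=$(cat "$dir/out.txt")

# A quoted ">" is an ordinary argument: 4 arguments arrive.
quoted=$(count_args one two '>' "$dir/out.txt")

# The wildcard is replaced by the two matching file names before the command runs.
expanded=$(count_args "$dir"/*.c)

echo "redirected=$redirected quoted=$quoted expanded=$expanded"
# prints: redirected=2 quoted=4 expanded=2
rm -rf "$dir"
```

This is why characters such as ">", "*", and "?" need quoting when they are meant for grep rather than for the shell.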
Write a shell script called greplab that searches ~cis137/cman6.c for the various items below.
Use the echo command to print out your name and TABER CIS137.
For each item also provide a short description of what you are printing out.
Start your shell script with #!/bin/bash to use the bash shell.
">". (Be careful: ">" is a shell redirection character.)
Turn in a copy of your shell script and its results. Make a printout of your output and shell script, and mark it with:
your name TABER CIS137 Lab 4.8: grep & regular expressions
Please turn in your lab to Louis Taber or to a Pima Community College employee in room A-115 of the Santa Rita Building. Ask them to place it in the dark blue folder in Louis Taber's mailbox.
See the sh(1) & csh(1) manual pages for a complete description.
*      Matches any string including a null string.
?      Matches any single character.
[ ]    Matches any enclosed character. A range of characters can be specified with a "-": [a-d] == [abcd]. If the first character following a "[" is a "^" then any character NOT enclosed is matched. bash also lets you use ! to negate the list.
"/" must be matched explicitly.
sh only).
"*" & "?" to look for "*" & "?".
"{i1,i2,...}" expands list - bash, csh, and tcsh.
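The file name patterns above can be tried out safely in a scratch directory. The sketch below assumes bash, and the file names are invented for the example.

```shell
#!/bin/bash
# Hypothetical scratch directory with four made-up files.
dir=$(mktemp -d)
touch "$dir/a1.c" "$dir/a2.c" "$dir/b1.c" "$dir/notes.txt"

star=$(cd "$dir" && echo *.c)          # a1.c a2.c b1.c
question=$(cd "$dir" && echo a?.c)     # a1.c a2.c
class=$(cd "$dir" && echo [a-b]1.c)    # a1.c b1.c
negated=$(cd "$dir" && echo [!a]*.c)   # b1.c
literal=$(cd "$dir" && echo 'a?.c')    # a?.c (quotes switch expansion off)

# Brace lists expand even when no such files exist (bash, csh, tcsh).
braces=$(echo x.{c,h})                 # x.c x.h

printf '%s\n' "$star" "$question" "$class" "$negated" "$literal" "$braces"
rm -rf "$dir"
```

Using echo rather than ls shows exactly what the shell hands to a command after expansion.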
See the ed(1) manual pages for a complete description.
.      Is a one character regular expression that matches any character.
*      Matches 0 or more of the preceding one character regular expression.
[ ]    Matches any enclosed character. A range can be specified with a "-": [a-d] == [abcd]. If the first character following a "[" is a "^" then any character not enclosed is matched.
^      At the beginning of the regular expression, forces the regular expression to match at the beginning of a line.
$      At the end of the regular expression, forces the regular expression to match the final segment of a line.
* ^ $ [ ] .
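Each of the regular expression elements above can be checked with grep on a small word list. In this sketch the word list is made up, and count is a hypothetical helper that reports how many lines match a pattern.

```shell
#!/bin/bash
# Hypothetical five-line sample file.
sample=$(mktemp)
printf '%s\n' 'cat' 'cot' 'ct' 'coat' 'the cat sat' > "$sample"

# count is a hypothetical helper: number of sample lines matching pattern $1.
count() { grep -c "$1" "$sample"; }

dot=$(count 'c.t')          # 3: cat, cot, the cat sat
star=$(count 'co*t')        # 2: cot, ct
class=$(count 'c[ao]t')     # 3: cat, cot, the cat sat
negated=$(count 'c[^a]t')   # 1: cot
begins=$(count '^cat')      # 1: cat
ends=$(count 'sat$')        # 1: the cat sat

echo "dot=$dot star=$star class=$class negated=$negated begins=$begins ends=$ends"
rm -f "$sample"
```

The patterns are in single quotes so that the shell passes them to grep unchanged instead of treating "*" and "[ ]" as file name patterns.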
See the egrep(1) manual pages for a complete description. egrep can run up to 10 times faster than grep. Its memory usage is less predictable. awk also uses extended regular expressions.
+      Matches 1 or more of the preceding regular expression.
?      Matches 0 or 1 of the preceding regular expression.
|      Between two regular expressions, matches if either expression matches.
( )    Expressions may be enclosed in parentheses for grouping.
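The extended operators can be tried the same way. This sketch uses grep -E, which POSIX defines as the extended regular expression form of grep and which accepts the same patterns as egrep. The word list and the count helper are made up for the example.

```shell
#!/bin/bash
# Hypothetical seven-line sample file.
sample=$(mktemp)
printf '%s\n' 'ac' 'abc' 'abbc' 'color' 'colour' 'cat' 'dog' > "$sample"

# count is a hypothetical helper: number of sample lines matching extended pattern $1.
count() { grep -E -c "$1" "$sample"; }

plus=$(count 'ab+c')          # 2: abc, abbc
optional=$(count 'colou?r')   # 2: color, colour
either=$(count 'cat|dog')     # 2: cat, dog
grouped=$(count 'a(bb)+c')    # 1: abbc

echo "plus=$plus optional=$optional either=$either grouped=$grouped"
rm -f "$sample"
```

The quotes matter even more here, since "|", "(", ")", and "?" all have a meaning to the shell.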