4.8 grep and regular expressions
grep, ex, and vi use regular expressions. The shell, in its file name expansion, uses a simplified regular expression format. awk and egrep use extended regular expressions.
grep is a program used to search a file, or a group of files, for a specific regular expression. It does wonders in finding lost functions and subroutines in a group of program sources, or in checking whether you have removed all references to an obsolete function or variable. The -n option causes the line number of each matching line to be printed.
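As a minimal sketch of the -n option, the commands below build a small stand-in file and search it. The file name sample.c and the function name old_func are made up for illustration; they are not part of cman6.c.

```shell
# Build a small hypothetical C file to search.
tmpdir=$(mktemp -d)
cat > "$tmpdir/sample.c" <<'EOF'
#include <stdio.h>
int old_func(int x);
int main(void) {
    return old_func(3);
}
EOF

# -n prefixes each matching line with its line number and a colon.
matches=$(grep -n 'old_func' "$tmpdir/sample.c")
echo "$matches"

rm -rf "$tmpdir"
```

Here the two references to old_func are reported on lines 2 and 4, which is how you would spot leftover uses of an obsolete function.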
wc is a program used to count lines, words, and characters, usually all three. Its options are -l, -w and -c, for lines, words and characters. The default is -lwc.
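The three wc options can be tried on a tiny file. This is only a sketch: words.txt is a made-up file name, and on most current systems -c counts bytes, which is the same as characters for plain ASCII text like this.

```shell
tmpdir=$(mktemp -d)
printf 'one two\nthree four five\n' > "$tmpdir/words.txt"

# Reading from standard input keeps the file name out of wc's output.
lines=$(wc -l < "$tmpdir/words.txt")
words=$(wc -w < "$tmpdir/words.txt")
chars=$(wc -c < "$tmpdir/words.txt")
echo "lines=$lines words=$words chars=$chars"

rm -rf "$tmpdir"
```

The file holds 2 lines, 5 words and 24 characters (newlines included). Some versions of wc pad the numbers with leading spaces.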
In this lab you will use the program grep to search an old "C" program of mine named cman6.c. It is a Mandelbrot set program written for the Borland "C" compiler and some graphics library additions. The file cman6.c is available by anonymous ftp at ftp://phRed.dcccd.edu/pub/ltaber/cman6.c .
Pipes are the use of the "|" character to direct the output of one command into the input of another command. For example:

    grep 'define' ~csc137/cman6.c | wc -l

will extract all lines from ~csc137/cman6.c that have the string define and send them on to wc. wc will then count the number of lines.
grep also has a -c option that will count the lines that match.
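The pipe into wc -l and the -c option should report the same count. The sketch below checks this on a made-up file, sample.c, rather than on cman6.c itself.

```shell
tmpdir=$(mktemp -d)
cat > "$tmpdir/sample.c" <<'EOF'
#define WIDTH 640
#define HEIGHT 480
int main(void) { return 0; }
EOF

# Count matching lines two ways: with a pipe into wc, and with grep -c.
piped=$(grep 'define' "$tmpdir/sample.c" | wc -l)
counted=$(grep -c 'define' "$tmpdir/sample.c")
echo "pipe: $piped  -c: $counted"

rm -rf "$tmpdir"
```

Both report 2 here. The pipe form remains useful when the second command is something other than a line count.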
Another interesting option to grep is -v. This will cause grep to invert the condition: if the regular expression matches a line, that line will NOT be output, and all the lines that did not match will be output instead.
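A short sketch of -v, again on a made-up sample.c: only the line without the string define survives.

```shell
tmpdir=$(mktemp -d)
cat > "$tmpdir/sample.c" <<'EOF'
#define WIDTH 640
int main(void) { return 0; }
#define HEIGHT 480
EOF

# -v prints the lines that do NOT match the expression.
kept=$(grep -v 'define' "$tmpdir/sample.c")
echo "$kept"

rm -rf "$tmpdir"
```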
The shell, when it is processing a command line, first looks for pipe and redirection symbols. If it finds any, it effectively removes them from the command line, so the individual commands are unaware that they ever existed. Then it searches for variables and for file names with wild cards and replaces them with their values and expansions. Afterwards it executes the individual commands.
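One way to see the effect described above is to run a command that reports how many arguments it was given. In this sketch, count_args is a hypothetical helper written for the demonstration, and a.c and b.c are made-up files. It is written for a Bourne-style shell (sh), not tcsh, because it defines a shell function.

```shell
tmpdir=$(mktemp -d)
touch "$tmpdir/a.c" "$tmpdir/b.c"

# count_args is a hypothetical helper: it prints the number of arguments it received.
count_args() { echo "$#"; }

# The shell handles "> out.txt" itself and expands *.c to two file names,
# so the command sees exactly two arguments and never sees the ">".
count_args "$tmpdir"/*.c > "$tmpdir/out.txt"
seen=$(cat "$tmpdir/out.txt")
echo "arguments seen: $seen"

rm -rf "$tmpdir"
```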
Write a shell script called greplab that searches ~csc137/cman6.c for the various items below. Use the echo command to print out your name and TABER CSC137. For each item also provide a short description of what you are printing out. Start your shell script with #!/bin/tcsh to use the tcsh shell.
">". (Be careful: ">" is a shell redirection character.)
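The warning about ">" comes down to quoting. As a sketch on a made-up file, single quotes make the shell pass ">" to grep as the pattern instead of treating it as redirection. Without the quotes the shell would try to redirect output into a file named after the next word on the line.

```shell
tmpdir=$(mktemp -d)
cat > "$tmpdir/sample.c" <<'EOF'
#include <stdio.h>
int x;
int main(void) { return 3 > 2; }
EOF

# Quoted: grep receives > as its regular expression.
found=$(grep -c '>' "$tmpdir/sample.c")
echo "lines containing >: $found"

rm -rf "$tmpdir"
```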
Turn in a copy of your shell script and its results. Make a printout of your output and shell script, and mark it with:

    your name TABER CSC137 Lab 4.8: grep & regular expressions

Place the lab in the instructor hand-in box in BUS R6E, the "terminal room".
Shell file name expansion: see the sh(1) & csh(1) manual pages for a complete description.
*
Matches any string including a null string.
?
Matches any single character.
[ ]
Matches any enclosed character. A range of characters can be specified with a "-": [a-d] == [abcd]. If the first character following a "[" is a "^" then any character NOT enclosed is matched. (In sh the portable negation character is "!" rather than "^".)
"/" must be matched explicitly.
(sh only).
Use "\*" & "\?" to look for a literal "*" & "?".
"{i1,i2,...}" expands the list (csh and tcsh).
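The wild cards above can be tried safely with echo, which just prints whatever the shell expanded. The file names in this sketch are made up, and only the patterns common to sh, csh and tcsh are shown.

```shell
tmpdir=$(mktemp -d)
touch "$tmpdir/a1.c" "$tmpdir/b1.c" "$tmpdir/b22.c" "$tmpdir/e1.c"

# Each command substitution runs in its own subshell, so the cd does not leak out.
star=$(cd "$tmpdir" && echo *.c)         # * matches any string
quest=$(cd "$tmpdir" && echo ?1.c)       # ? matches exactly one character
range=$(cd "$tmpdir" && echo [a-d]1.c)   # [a-d] matches one of a, b, c, d
literal=$(cd "$tmpdir" && echo \*.c)     # backslash turns off the special meaning

echo "$star"
echo "$quest"
echo "$range"
echo "$literal"

rm -rf "$tmpdir"
```

The first three lines list the matching files. The last prints *.c unchanged because the escaped "*" is no longer a wild card.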
Regular expressions: see the ed(1) manual pages for a complete description.
.
Is a one character re that matches any character.
*
Matches 0 or more of the preceding one character re.
[ ]
Matches any enclosed character. A range can be specified with a "-": [a-d] == [abcd]. If the first character following a "[" is a "^" then any character not enclosed is matched.
^
At the beginning of the re, forces the re to match at the beginning of a line.
$
At the end of the re, forces the re to match the final segment of a line.
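Each of the regular expression characters above can be tested with grep -c on a small file. This is a sketch on a made-up sample.c; the patterns are quoted so the shell leaves them alone.

```shell
tmpdir=$(mktemp -d)
cat > "$tmpdir/sample.c" <<'EOF'
int main(void)
    int x = 0;
return x;
/* int */
EOF

starts=$(grep -c '^int' "$tmpdir/sample.c")    # int at the beginning of a line
ends=$(grep -c ';$' "$tmpdir/sample.c")        # ; at the end of a line
anychar=$(grep -c 'x.=' "$tmpdir/sample.c")    # x, any one character, then =
star=$(grep -c '^ *int' "$tmpdir/sample.c")    # zero or more spaces, then int
class=$(grep -c '[0-9]' "$tmpdir/sample.c")    # any digit

echo "starts=$starts ends=$ends anychar=$anychar star=$star class=$class"

rm -rf "$tmpdir"
```

Note how '^int' matches only the first line, while '^ *int' also matches the indented declaration because "*" allows zero or more of the preceding space.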
Special characters: * ^ $ [ ] / .
Extended regular expressions: see the egrep(1) manual pages for a complete description. egrep can run up to 10 times faster than grep, but its memory usage is less predictable. awk also uses extended regular expressions.
+
Matches 1 or more of the preceding regular expression.
?
Matches 0 or 1 of the preceding regular expression.
|
Placed between two regular expressions, | will match if either expression matches.
( )
Expressions may be enclosed in parentheses for grouping.
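The extended operators can be tried the same way. This sketch uses grep -E, which on current systems accepts the same extended regular expressions as egrep; the file sample.txt and its contents are made up.

```shell
tmpdir=$(mktemp -d)
cat > "$tmpdir/sample.txt" <<'EOF'
color
colour
colouur
abab
red or blue
EOF

plus=$(grep -Ec 'colou+r' "$tmpdir/sample.txt")     # one or more u
opt=$(grep -Ec 'colou?r' "$tmpdir/sample.txt")      # zero or one u
alt=$(grep -Ec 'red|green' "$tmpdir/sample.txt")    # either word
group=$(grep -Ec '^(ab)+$' "$tmpdir/sample.txt")    # ab repeated, whole line

echo "plus=$plus opt=$opt alt=$alt group=$group"

rm -rf "$tmpdir"
```

With plain grep these patterns would be read as basic regular expressions, where "+", "?", "|" and "( )" are ordinary characters.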