1 GREP GREP is GNU grep, the Free Software Foundation's implementation of grep, which prints lines matching a regular expression. The command syntax is UNIX-like, not VMS-like. Grep searches the files listed in the arguments (or SYS$INPUT: if no files are given) for all lines that contain a match for the given expression. If any lines match, they are printed. Also, if any matches were found, grep exits with a status of 1, but if no matches found it exits with a status of 3. This is useful for building command procedures that use grep as a condition for, for example, the IF statement. The grep family also includes EGREP, FGREP and GGREP. When invoked as egrep, the syntax of the expression is slightly different; See the subtopic "Options". When invoked as fgrep, the expression can consist only of a fixed string (regular expressions cannot be used). Multiple search strings can be used with fgrep by creating a file containing one string per line and using the "-f" option. When invoked as ggrep, the syntax of the expression is the same as for egrep, but input files may be specified as a comma-separated list, and the device and directory of the preceding file are used as defaults if they are not specified. This simulates the "sticky" defaults used by VMS programs such as search. Also, if the "l" option is specified so as to output only the names of files which contain the expression, their complete specifications (device:[directory]name.extension;version) are always reported, uppercase, as for the output of the VMS search program. Ggrep was developed for use by VMS gopher servers. Format: grep [-options] [-num] [-AB num] [[-e] expr|-f file] ["files ..."] 2 Options -A print lines of context after every matching line -B print lines of context before every matching line -C print 2 lines of context on each side of every match - print lines of context on each side of every match -V print the version number on the diagnostic output -b print every match preceded by its byte offset -c print a total count of matching lines only -e search for ; useful if begins with - -f search for the expression contained in file -h don't display filenames on matches -i ignore case difference when comparing strings -l list files containing matches only -n print each match preceded by its line number -s run silently producing no output except error messages -v print only lines that contain no matches for the -w print only lines where the match is a complete word -x print only lines where the match is a whole line 2 Regular_Expressions The grep expression can include the following characters: grep egrep Explanation ---- ----- ----------- c c a single (non-meta) character matches itself. . . matches any single character except newline. \? ? postfix operator; preceeding item is optional. * * postfix operator; preceeding item 0 or more times. \+ + postfix operator; preceeding item 1 or more times. \| | infix operator; matches either argument. ^ ^ matches the empty string at the beginning of a line. $ $ matches the empty string at the end of a line. \< \< matches the empty string at the beginning of a word. \> \> matches the empty string at the end of a word. [chars] [chars] match any character in the given class; if the first character after [ is ^, match any character not in the given class; a range of characters may be specified by "first-last"; for example, \W (below) is equivalent to the class [^A-Za-z0-9] \( \) ( ) parentheses are used to override operator precedence. \digit \digit \n matches a repeat of the text matched earlier in the regexp by the subexpression inside the nth opening parenthesis. \ \ any special character may be preceded by a backslash to match it literally. (the following are for compatibility with GNU Emacs) \b \b matches the empty string at the edge of a word. \B \B matches the empty string if not at the edge of a word. \w \w matches word-constituent characters (letters & digits). \W \W matches characters that are not word-constituent. Operator precedence is (highest to lowest) ?, **, and +, concatenation, and finally |. All other constructs are syntactically identical to normal characters. For the truly interested, the file dfa.c describes (and implements) the exact grammar understood by the parser. 2 Incompatibilities The following incompatibilities with UNIX grep exist: o The context-dependent meaning of ** is not quite the same (grep only). o -b prints a byte offset instead of a block offset. o The {m,nfP} construct of System V grep is not implemented. o Exit statuses are different from UNIX grep: Meaning UNIX VMS ------- ---- --- matches found 0 1 no matches found 1 3 error 2 44 (SS$_ABORT) 2 Bugs GNU e?grep has been thoroughly debugged and tested over a period of several years; we think it's a reliable beast or we wouldn't distribute it. If by some fluke of the universe you discover a bug, send a detailed description (including options, regular expressions, and a copy of an input file that can reproduce it) to mike@ai.mit.edu. 2 Availability GNU grep is free; anyone may redistribute copies of grep to anyone under the terms stated in the GNU General Public License, a copy of which may be found in each copy of "GNU Emacs". See also the comment at the beginning of the source code file grep.c. Copies of GNU grep may sometimes be received packaged with distributions of Unix systems, but it is never included in the scope of any license covering those systems. Such inclusion violates the terms on which distribution is permitted. In fact, the primary purpose of the General Public License is to prohibit anyone from attaching any other restrictions to redistribution of any of the Free Software Foundation programs. 2 Authors Mike Haertel wrote the deterministic regexp code and the bulk of the program. Mike also wrote fgrep. James A. Woods is responsible for the hybridized search strategy of using Boyer-Moore-Gosper fixed-string search as a filter before calling the general regexp matcher. Arthur David Olson contributed code that finds fixed strings for the aforementioned BMG search for a large class of regexps. Richard Stallman wrote the backtracking regexp matcher that is used for \digit backreferences, as well as the getopt that is provided for 4.2BSD sites. The backtracking matcher was originally written for GNU Emacs. D. A. Gwyn wrote the C alloca emulation that is provided so System V machines can run this program. (Alloca is used only by RMS' backtracking matcher, and then only rarely, so there is no loss if your machine doesn't have a "real" alloca.) Scott Anderson and Henry Spencer designed the regression tests used in the "regress" script. Paul Placeway wrote the original version of this manual page. Hunter Goatley ported grep 1.6 and fgrep 1.1 to VMS and produced the on-line help for VMS. Foteos Macrides added ggrep for use with VMS gopher servers. 2 Examples 1. $ grep int *.c In this example, all files with the extension .C will be searched for the string "int". 2. $ grep -i "^..int" *.c In this example, all files with the extension .C are searched for lines that begin with any two characters followed by "int". The "-i" option tells grep to ignore case. 3. $ grep -ih "^..int" *.c In this example, the "h" option is specified to prevent grep from displaying the names of matching files. 4. $ create tmp.tmp search-string1 search-string2 ^Z $ fgrep -f tmp.tmp *.c In this example, a file containing more than one search string is created and used as input to fgrep. 5. $ ggrep -il >output gopher gopher_root:[_search]*.c;,*.h;,*.com; In this example, the "i" option is specified for case-insensitive searches, and the "l" option is specified to list only the names of files which contain the expression "gopher". The list will be piped to the file "OUTPUT." rather than to the screen. The wildcarded file names have been entered as comma-separated rather than space-separated lists, and all of the files will be sought in "GOPHER_ROOT:[_SEARCH]" (in contrast, grep, egrep and fgrep require spaces as input file separators, and would search files with the extension .H or .COM in the current default directory, rather than "sticking" to GOPHER_ROOT:[_SEARCH]).