Minimal Perl for Unix and Linux People: Part 2/Page 2 | WebReference

Perl as a (Better) Find Command: Part 2

6.4.1 Defending against grep's messes

A valuable feature provided by the Shell is its ability to replace a command in backward quotes with that command's own output. This facility, called command substitution, and its Perl counterpart, called command interpolation, are covered in detail in section 8.5. In this section, we'll look briefly at how this powerful feature is used and how you sometimes need to use a command called xargs in its place.

The following command uses the Shell's command substitution facility to execute an ls|perl pipeline and deliver its output to grep as a set of filename arguments:

Even better, here's a version using the tiny Perl script presented earlier that embodies the code of that pipeline's Perl command:

The -d option tells ls to list directory names themselves (rather than their contents), which limits the generated filenames to those residing in the current directory. Because it has backward quotes around it, the ls|textfiles pipeline is replaced on the command line by its own output, causing the names of the resulting text files from the current directory to become the arguments to grep.

With that command, if the only text files in the current directory were ones named Larry, Moe, and Curly, the end result would be exactly as if the user had been willing and able to type those "Stoogeadelic" filenames as arguments to grep in the first place, like so:

The kinds of pipelines you've just seen are relevant to our current discussion because they provide a simple workaround for the screen-corruption problem discussed earlier. As a case in point, consider this command, which finds a match in the file named Moe:

This command is effectively a screen-safe version of the following, which is suitable only for extreme optimists (and gamblers) when the POSIX or classic grep is used:

Why is the first command of this pair superior? Because it filters the filenames generated by the "*" to remove any troublemakers that don't contain text.

For those restricted to using versions of grep that have the "search in a binary file and corrupt the screen" problem, a scripted version of that pipeline might come in handy.

A screen-safe grepper: text_grep

text_grep implements a case-insensitive grepper for text files:

Contrary to what you might expect, the textfiles script can't be used directly in implementing text_grep, because the former operates on filenames presented to its input, whereas the latter accepts them as arguments (like a real grep).

But you can easily implement text_grep using the techniques covered in chapter 2. First, test that the current input file has text contents using -T $ARGV. Then, if that test fails, close the filehandle (ARGV) to terminate the processing of the file before any matching is attempted, and to trigger the opening of the next file (if there is one).

Because the script accepts multiple filename arguments, it's important that it identifies each matching line with the name of its associated file, as shown in the earlier run that used $ARGV to prefix "Moe" to the matching line.

Here's the text_grep script:

This script even has value for those who already have access to improved GNU greppers, because it provides a framework for accessing Perl's superior collection of regex metacharacters and matching options (see table 3.2).

We'll look next at a convenient way to direct a grepper to search within an entire branch of the file-system tree for matches.

