Minimal Perl for Unix and Linux People: Part 2/Page 4 | WebReference


Perl as a (Better) Find Command: Part 2

6.5 Using find|xargs vs. Perl Alternatives

As shown earlier, find can be used to generate filenames that ultimately become arguments to another command. This is such an important service that find has its own option for processing such commands, called -exec.

How does it work? You insert {} symbols anywhere the current filename should be inserted within the -exec command clause, followed by a "\;" sequence to mark the end of the command's argument list. The usual command format is therefore
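To make that format concrete, here is a minimal sketch rather than the original listing. The scratch directories and file names are hypothetical, created only so the command has something harmless to work on; cp is used to show that {} need not be the last argument.

```shell
# Hypothetical scratch directories so the demonstration touches nothing real.
src=$(mktemp -d)
dest=$(mktemp -d)
touch "$src/a.txt" "$src/b.txt" "$src/c.log"

# Shape: find START_DIR CRITERIA -exec COMMAND ... {} ... \;
# Here {} sits in the middle of cp's argument list, and \; ends that list.
find "$src" -name '*.txt' -exec cp {} "$dest" \;

# Count what arrived in the destination (arithmetic strips wc's padding).
copied=$(( $(ls "$dest" | wc -l) ))
echo "files copied: $copied"
```

The backslash keeps the shell from treating the semicolon as its own command separator, so find receives it as the end-of-command marker.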

For simplicity, let's first consider the common task of removing those pesky files named core—which can be produced when a program dies—from the branch of the file-system tree rooted at the current directory. The appropriate command is
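A command of the kind described would plausibly look like the find line in the sketch below. The temporary tree and its file names are assumptions added so the example can be run safely, and -type f is an extra precaution not mentioned in the text.

```shell
# Build a throwaway tree (hypothetical names) so no real files are deleted.
demo=$(mktemp -d)
mkdir -p "$demo/a/b"
touch "$demo/core" "$demo/a/core" "$demo/a/b/core" "$demo/a/keep.txt"

# From the root of that tree, remove every regular file named core.
# -type f is an added safety check so directories named core are skipped.
(cd "$demo" && find . -name core -type f -exec rm {} \;)

# Verify that no core files remain.
remaining=$(( $(find "$demo" -name core | wc -l) ))
echo "core files left: $remaining"
```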

If the following three pathnames were found, that find -exec command would execute a separate rm for each one, just as if you had manually typed these commands:
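One way to see those separate commands is to put echo in front of rm, so that find prints each command instead of running it. The three pathnames below are hypothetical stand-ins, not the ones from the original listing.

```shell
# Hypothetical tree containing three files named core.
demo=$(mktemp -d)
mkdir -p "$demo/a/b"
touch "$demo/core" "$demo/a/core" "$demo/a/b/core"

# echo prints the rm command that would be run for each pathname;
# every output line corresponds to one separate process started by find.
commands=$(cd "$demo" && find . -name core -type f -exec echo rm {} \; | LC_ALL=C sort)
printf '%s\n' "$commands"

# One line per pathname found.
count=$(( $(printf '%s\n' "$commands" | wc -l) ))
echo "separate commands: $count"
```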

Although this approach gets the job done, it's not economical. Why? Because if 100 pathnames were found, it would take 100 processes, one for each rm command, to handle them all.

As a rule, the more processes a task on Unix requires, the longer it takes to run, so you should think about minimizing the number of processes you use, especially if a single rm command (i.e., one process with 100 arguments) could do all the work by itself!

Thanks to the efforts of generations before us who have grappled with this problem, modern Unix systems come equipped with a utility program designed to solve it, called xargs. Its job is to convert its input lines into arguments for the designated command, allowing the following rewrite of the earlier find -exec command:
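A rewrite along those lines might look like the pipeline in this sketch. As before, the scratch tree is a hypothetical addition for safe experimentation, and the file names are assumed to contain no whitespace.

```shell
# Throwaway tree (hypothetical names); none of the paths contain whitespace.
demo=$(mktemp -d)
mkdir -p "$demo/a/b"
touch "$demo/core" "$demo/a/core" "$demo/a/b/core" "$demo/a/keep.txt"

# find only prints the pathnames; xargs gathers them from its input
# and passes them to rm as arguments.
(cd "$demo" && find . -name core -type f | xargs rm)

# Verify that no core files remain.
remaining=$(( $(find "$demo" -name core | wc -l) ))
echo "core files left: $remaining"
```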

With this approach, xargs bundles together as many filename arguments as possible for submission to each invocation of rm that's needed, in compliance with the OS's maximum allowed size for an argument list. This means xargs is guaranteed not only to handle all the arguments, but also to use the smallest possible number of processes in doing so. For example, if each command can handle 100 arguments, and there are 110 filenames to process, there will be two invocations of the command, respectively handling 100 and 10 arguments.
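The 100-and-10 arithmetic can be checked with a small sketch. It assumes the seq utility is available, and it uses xargs's -n option to impose an artificial limit of 100 arguments per command, standing in for the much larger limit a real system would apply.

```shell
# 110 numbers stand in for 110 filenames. echo prints one line each
# time it is invoked, so counting lines counts invocations.
invocations=$(( $(seq 1 110 | xargs -n 100 echo | wc -l) ))

# Number of arguments handed to the final invocation.
last_batch=$(( $(seq 1 110 | xargs -n 100 echo | tail -n 1 | wc -w) ))

echo "invocations: $invocations, arguments in last batch: $last_batch"
```

Run as written, this should report two invocations, with the second one receiving the 10 leftover arguments.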

As is the case with any powerful tool, you must be careful not to use it improperly. After all, a rocket-propelled grenade is an appropriate device for punching holes in tanks, but it's not recommended for manicuring toenails. Unfortunately, Unix users are in constant danger of shooting themselves in the foot by using xargs in places where it doesn't belong. For a thorough briefing on how to use Perl to avoid these types of friendly fire situations, report to the next section—pronto!

