Sams Teach Yourself XML in 24 Hours, Complete Starter Kit, 3rd Edition. Part 1 | 2
Files and Directories in Perl
Exercise: The Unix grep
As you get further along in this book, the exercises will present you with more and
more useful tools. This exercise presents a stripped-down version of the Unix grep
utility. The Unix grep (not to be confused with Perl's grep function, introduced in
Hour 6, "Pattern Matching") searches files for patterns. This exercise presents a utility
that will prompt for a directory name and a pattern. Every file in that directory
will be searched for that pattern, and lines matching that pattern will be printed.
In future exercises, this utility will be modified to search subdirectories (see Hour 15, "Finding Permanence") and to take command-line arguments (see Hour 12, "Using Perl's Command-Line Tools"). Stay tuned for details.
Using your text editor, type the program from Listing 10.1 and save it as mygrep. If
possible, be sure to make the program executable according to the instructions you
learned in Hour 1, "Getting Started with Perl." Also, make sure that you don't
rename this file to grep on a Unix system because it could be mistaken for the real
grep utility.
When you're all done, try running the program by typing the following at a command line:
perl -w mygrep
or, if your system enables you to make the file executable,
mygrep
Line 1: This line contains the path to the interpreter (you can change it so that
it's appropriate to your system) and the -w switch. Always have warnings enabled!
Line 3: The use strict directive means that all variables must be declared
with my and that bare words must be quoted.
Lines 5-8: $dir, the directory to be searched, and $pat, the pattern to search
for, are retrieved from STDIN. The newlines at the end of each are removed.
Line 10: $file is declared as private to satisfy use strict. $file is used later
in this program.
Line 12: The directory $dir is opened; an error message is printed if this operation
fails.
Line 13: The entries are retrieved from the directory one at a time and stored
in $file.
Line 14: Any directory entry that's really a directory itself (-d) is rejected.
Notice that the pathname checked is $dir/$file. This path must be checked
because $file doesn't necessarily exist in the current directory; it exists in
$dir. So the full pathname to the file is $dir/$file.
Lines 15-18: The file is opened, again using the full pathname $dir/$file,
and rejected if it does not open.
Lines 19-23: The file is searched, line by line, for a line that contains $pat. A
matching line is printed.
Listing 10.2 shows a sample of the mygrep program's output.
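The steps described above can be sketched as follows. This is not the book's Listing 10.1, so its line numbers will not match the walkthrough; it is a minimal reconstruction from the description. To keep it runnable without prompts, the search is wrapped in a hypothetical subroutine named search_dir, and the demonstration builds a scratch directory with File::Temp (an assumption, not part of the original program) instead of reading $dir and $pat from STDIN.

```perl
#!/usr/bin/perl -w
# Sketch of a mygrep-like search; not the book's Listing 10.1.
use strict;
use File::Temp qw(tempdir);

# search_dir (hypothetical name) returns every line from the plain
# files in $dir that matches the pattern $pat.
sub search_dir {
    my ($dir, $pat) = @_;
    my @matches;
    opendir(my $dh, $dir) || die "Cannot open directory $dir: $!";
    while (defined(my $file = readdir($dh))) {
        next if -d "$dir/$file";              # skip ., .., and subdirectories
        my $fh;
        if (!open($fh, '<', "$dir/$file")) {  # always use the full pathname
            warn "Cannot open $dir/$file: $!";
            next;
        }
        while (my $line = <$fh>) {
            push @matches, "$file: $line" if $line =~ /$pat/;
        }
        close($fh);
    }
    closedir($dh);
    return @matches;
}

# Demonstration: one file with a matching line, plus a subdirectory
# that the search should ignore.
my $dir = tempdir(CLEANUP => 1);
open(my $out, '>', "$dir/notes.txt") || die "Cannot create notes.txt: $!";
print $out "perl is handy\nnothing here\n";
close($out);
mkdir("$dir/sub", 0755) || die "Cannot create $dir/sub: $!";

my @found = search_dir($dir, 'perl');
print @found;
```

In the interactive version, $dir and $pat would instead come from STDIN with the newlines chomped off, as the description of lines 5-8 explains.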
Directories
Thus far in this hour, I've been hand-waving over the topic of directory structure.
Full pathnames are sometimes needed to open files, and the readdir function
can read directories. But actually navigating directories, adding or removing them,
and cleaning them out takes a little bit more Perl.
Navigating Directories
When you run software, your operating system keeps track of what directory you're
in when you run the software. When you log in to a Unix machine and run a software
package, you are usually placed in your home directory. If you type the operating
system command pwd, the shell shows you what directory you are in. If you
use MS-DOS or Windows and open a command prompt, the prompt reflects what
directory you are in at the time (for example, C:\WINDOWS). Alternatively, you can
type the operating system command cd at the MS-DOS prompt, and MS-DOS tells
you what directory you are in. The directory that you're currently using is called
your current directory or your current working directory.
If you do not specify a full pathname when you try to open a file (for example,
open(FH, "file") || die), Perl will attempt to open the file in your current
working directory. To change your current directory, you can use the Perl chdir
function, as follows:
chdir newdir;
The chdir function changes the current working directory to newdir. If the newdir
directory does not exist, or you don't have permission to access it, chdir
returns false and sets $! to the reason. The directory change from chdir applies only
to your Perl program; as soon as the program ends, you are back in the directory that
you were working in before you ran it.
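As a minimal sketch of this behavior, the following program creates a file with a full pathname, changes directory, and then opens the same file with a relative name. The scratch directory comes from File::Temp and the filename file is only illustrative; neither is part of the original text.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $newdir = tempdir(CLEANUP => 1);   # scratch directory for the demo

# Create a file inside $newdir using its full pathname.
open(my $out, '>', "$newdir/file") || die "Cannot create $newdir/file: $!";
print $out "hello\n";
close($out);

# chdir returns false on failure, so check it like any other system call.
chdir($newdir) || die "Cannot change to $newdir: $!";

# A relative name is now looked up in $newdir.
open(my $in, '<', 'file') || die "Cannot open file: $!";
my $line = <$in>;
close($in);
print "Read: $line";

# A directory that doesn't exist makes chdir return false and set $!.
my $ok = chdir("$newdir/no_such_directory");
print "chdir failed: $!\n" unless $ok;

# Step back out so the scratch directory can be cleaned up at exit.
chdir('..') || die "Cannot leave $newdir: $!";
```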
Running the chdir function without a directory as an argument causes chdir to
change to your home directory. On a Unix system, the home directory is usually the
directory that you were placed in when you logged in. On a Windows 95, Windows
NT, or MS-DOS machine, chdir takes you to the directory indicated in the HOME
environment variable. If HOME isn't set, chdir doesn't change your current directory
at all.
Perl doesn't have a built-in function for figuring out what your current directory is;
because of the way some operating systems are written, it's not easy to tell. To find
the current directory, you must use two statements together. Somewhere in your program,
preferably near the beginning, you must use the statement use Cwd and
then, when you want to retrieve the current directory, use the cwd function:
use Cwd;
print "Your current directory is: ", cwd, "\n";
chdir '/tmp' or warn "Directory /tmp not accessible: $!";
print "You are now in: ", cwd, "\n";
You have to execute the use Cwd statement only once; afterward, you can use the
cwd function as often as necessary.
Creating and Removing Directories
To create a new directory, you can use the Perl mkdir function. The mkdir function's
syntax is as follows:
mkdir newdir, permissions;
The mkdir function returns true if the directory newdir can be created. Otherwise, it
returns false and sets $! to the reason that mkdir failed. The permissions are really
important only on Unix implementations of Perl. Older versions of Perl require them
on every platform; more recent versions let you omit them, in which case they
default to 0777. For the following example, use the value 0755; this value will be
explained in the section "Unix Stuff" later in this hour. For MS-DOS and Windows
users, 0755 is good enough and will spare you a long explanation.
print "Directory to create? ";
my $newdir=<STDIN>;
chomp $newdir;
mkdir( $newdir, 0755 ) || die "Failed to create $newdir: $!";
To remove a directory, you use the rmdir function. The syntax for rmdir is as follows:
rmdir pathname;
The rmdir function returns true if the directory pathname can be removed. If pathname
cannot be removed, rmdir returns false and sets $! to the reason that rmdir
failed, as shown here:
print "Directory to be removed? ";
my $baddir=<STDIN>;
chomp $baddir;
rmdir($baddir) || die "Failed to remove $baddir: $!";
The rmdir function removes only directories that are completely empty. This means
that before a directory can be removed, all the files and subdirectories must be
removed first.
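The following sketch shows that rule in action: the first rmdir fails because the directory still holds files, and the second succeeds once they are gone. The directory and filenames are made up for the demonstration, the scratch area comes from File::Temp, and the files are deleted with unlink, which is described in the next section.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Build a scratch directory holding one subdirectory with two files.
my $top    = tempdir(CLEANUP => 1);
my $baddir = "$top/old";
mkdir($baddir, 0755) || die "Failed to create $baddir: $!";
for my $name ('a.txt', 'b.txt') {
    open(my $fh, '>', "$baddir/$name") || die "Cannot create $baddir/$name: $!";
    close($fh);
}

# rmdir refuses to remove a directory that still has entries in it.
my $first_try = rmdir($baddir);
print "First rmdir failed: $!\n" unless $first_try;

# Read the whole directory listing first, then remove each plain file.
opendir(my $dh, $baddir) || die "Cannot open $baddir: $!";
my @entries = readdir($dh);
closedir($dh);
for my $file (@entries) {
    next if -d "$baddir/$file";     # skips . and .. (and any subdirectory)
    unlink("$baddir/$file") || warn "Cannot remove $baddir/$file: $!";
}

# Now that the directory is empty, rmdir can remove it.
my $second_try = rmdir($baddir);
print "Second rmdir succeeded\n" if $second_try;
```

This sketch handles only one level; a directory that contains subdirectories would need each of them emptied and removed in the same way first.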
Removing Files
To remove files from a directory, you use the unlink function:
unlink list_of_files;
The unlink function removes all the files in the list_of_files and returns the
number of files removed. If list_of_files is omitted, the filename named in $_ is
removed. Consider these examples:
unlink <*.bat>;
$erased=unlink 'old.exe', 'a.out', 'personal.txt';
unlink @badfiles;
unlink; # Removes the filename in $_
To check whether the list of files was removed, you must compare the number of files you tried to remove with the number of files removed, as in this example:
my @files=<*.txt>;
my $erased=unlink @files;
# Compare actual erased number, to original number
if ($erased != @files) {
    print "Files failed to erase: ",
        join(',', grep { -e } @files), "\n";
}
In the preceding code, the number of files actually removed by unlink is stored in
$erased. After the unlink, $erased is compared to the number of elements in
@files: They should be the same. If they're not, an error message is printed showing
the "leftover" files.
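Comparing counts tells you that something went wrong but not why. One alternative, sketched here, is to call unlink on one file at a time so that $! can be reported for each failure. The filenames are hypothetical and the scratch directory comes from File::Temp; the subdirectory is included only because unlink will not remove a directory, which guarantees one failure.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);

# Two plain files that unlink can remove...
my @files = map { "$dir/$_" } ('one.txt', 'two.txt');
for my $path (@files) {
    open(my $fh, '>', $path) || die "Cannot create $path: $!";
    close($fh);
}
# ...and a subdirectory, which unlink will refuse to remove.
mkdir("$dir/subdir", 0755) || die "Cannot create $dir/subdir: $!";
push @files, "$dir/subdir";

# Remove the entries one at a time, recording the reason for each failure.
my @failed;
for my $file (@files) {
    unless (unlink $file) {
        push @failed, "$file ($!)";
    }
}

if (@failed) {
    print "Files failed to erase: ", join(', ', @failed), "\n";
}
```

The trade-off is one unlink call per file instead of a single call for the whole list, which rarely matters for a handful of files.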
Created: March 27, 2003
Revised: February 3, 2006
URL: https://webreference.com/programming/perl_24/1