Perl Pragma Primer (2/2) | WebReference

Compile-Time Hints

On the previous page, I mentioned that the use statement is executed during the compile-phase of a script, and that standard modules can therefore export their various symbols (variables, functions, etc.) so they're available anywhere in the current package. As we've now learned, several of the pragmas only have an effect on the current code block. What is it about these pragmas that causes this difference in scope?

It turns out that most pragma modules are simple: they contain only a few lines of code and concentrate their efforts almost entirely on the manipulation of global variables. One such key variable, which can be accessed via $^H (see footnote 1), is a bitmap that the compiler consults as it compiles code; its various bits serve as flags that tell the compiler whether strictness is on or off (and at what level), whether strings should be treated as utf8 by character-related operations, etc. In other words, this variable contains hints that the compiler uses to know how the user would like their code compiled, and with what specific features enabled. The import routine of several pragmas therefore only needs to alter key bits of this single global variable; the compiler then does the bulk of the work when it compiles the remaining code according to the resulting compiler-hints settings.
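To make the bitmap idea concrete, here is a minimal sketch that peeks at $^H from inside BEGIN blocks, once with strict off and once with it on. The mask 0x602 is my assumption of the three strict bits (refs, subs and vars) as they are defined in the strict.pm I'm familiar with, and the %strict_bits hash is purely illustrative.

```perl
# Illustrative only: look at the strict-related bits of $^H at compile time.
# 0x602 = refs (0x002) | subs (0x200) | vars (0x400); assumed from strict.pm.
our %strict_bits;
{
    no strict;                                   # start from a known state
    BEGIN { $strict_bits{off} = $^H & 0x602 }    # no strict bits set here

    use strict;                                  # import() flips the bits on
    BEGIN { $strict_bits{on} = $^H & 0x602 }     # the same mask is now set
}
printf "strict off: %#x, strict on: %#x\n", @strict_bits{qw(off on)};
```

The BEGIN blocks are needed because $^H is only meaningful while code is being compiled; reading it in ordinary runtime code tells you very little.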

The sharp-eyed among you may have noticed that I just said that $^H is a global variable; and that's true. In fact, $^H is one of the few truly global special variables in Perl, shared across all scripts, packages and modules. A change to one of these variables changes it throughout your application! That being the case, how is it that some pragmas--which we just learned accomplish their goals by manipulating this global variable--are only effective for the block they are called in? If their import routines change this variable, then won't the change be in effect for the remainder of the application?

It would--if not for the fact that Perl steps in and applies a special rule to the compiler-hints flag (and others, such as the warning bits flag) during code compilation. Specifically, as Perl compiles a block of code, it stores the existing setting for the compiler-hints flag and then restores this saved value when it finishes compiling the block. Knowing this, the pragma can alter the value within the block and be assured that its residual effects will not remain outside the block, since the compiler effectively undoes its changes as soon as the block is complete. In other words:

# before compiling the following block, 
# the interpreter saves the current value of $^H.
if ($foo) {
   # The strict pragma now changes the settings of 
   # $^H, enabling strict-checking
   use strict;
   my $bar = 1;
   my $baz = 2;
   my $foobar = $bar + $baz;
   # ... etc.
}
# Now that the block is compiled, the interpreter
# restores the original settings for $^H and continues; 
# strict checking is therefore no longer in place
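The save-and-restore behaviour described in those comments can be observed directly. The following is a rough sketch rather than anything official: it records the strict bits of $^H before, inside and after a block, again assuming that 0x602 covers the refs, subs and vars bits. The @snapshots array is a name I made up for the example.

```perl
# Illustrative only: watch $^H being restored when a block finishes compiling.
our @snapshots;
no strict;                                        # known starting state

BEGIN { push @snapshots, $^H & 0x602 }            # before the block
if (1) {
    use strict;                                   # sets the strict bits
    BEGIN { push @snapshots, $^H & 0x602 }        # inside the block
}
BEGIN { push @snapshots, $^H & 0x602 }            # after the block

print join(", ", map { sprintf "%#x", $_ } @snapshots), "\n";
```

The middle snapshot should be the only non-zero one, which is the compiler undoing the pragma's change at the closing brace.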

This mechanism is also what allows the pragmas to support the lexically scoped no directive. That is, you can use no to turn off a pragma within a code block, and the previous state of the pragma is restored at the end of the block. For example:

# Set strict checking in our script
use strict;
my $foo = 1;
if ($foo) {
   # but turn it off for this block
   no strict;
   # no error here, since strict is turned off
   $bar = 2;
   print $bar;
}
# This will produce a compiler error, since 
# strict is still enforced outside the block 
$baz = 3;
print $baz;
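Because the listing above deliberately fails to compile, it can't be run to completion. As a runnable variation on the same idea, this sketch uses strict's refs category, which is enforced at run time, so the failure outside the block can be caught with eval. The variable names are hypothetical.

```perl
use strict;

our $greeting = "hello";
my $name = __PACKAGE__ . "::greeting";   # the variable's name, held as a string

my $inside;
{
    no strict 'refs';
    $inside = ${$name};                  # symbolic reference is allowed here
}

# Outside the block strict refs applies again, so the same lookup dies.
my $outside = eval { ${$name} };
my $error   = $@;

print "inside:  $inside\n";
print "outside: ", (defined $outside ? "$outside\n" : "died: $error");
```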

Conditional Pragmas, Revisited

Let's put these various tidbits of information together and see if we can create a solution to the original problem: How do we conditionally load the utf8 pragma so its effects remain in place for the remainder of the script?

As we discovered at the end of the previous page, placing the use statement in a conditional block doesn't help us. The pragma is executed during compile-time, which is what we want; but its effects don't remain outside the block it is called within. Further, the actual condition won't be evaluated until the script is run, making the entire construct moot. Let's examine some other possibilities which are equally ineffective. See if you can spot the problems in these before the explanation is given.

BEGIN {
   if ($] < 5.008) {
      use utf8;
   }
}

This is just another way of saying the same thing. If you haven't seen it before, in Perl a BEGIN block is executed as soon as it has been completely compiled, even before the remaining code of the script is compiled (see footnote 2). This example solves one of our problems--namely that the code is fully compiled and therefore the if logic will be properly examined and executed; but our other problems remain. The use statement is still enclosed within a block, so it won't be effective outside of it. It is also executed while the BEGIN block is being compiled, regardless of the condition, and would cease to be effective as soon as the BEGIN block finished compiling anyway.
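Both failure modes of the BEGIN attempt can be seen in a small experiment. This sketch uses strict as a stand-in for utf8, only because its hint bits are easy to check (0x602 is again my assumption for the refs, subs and vars bits), and the condition is hard-wired to false.

```perl
# Illustrative only: a use statement inside a false conditional still runs
# at compile time, and its effect ends with the enclosing block.
our ($inside, $after);
no strict;                                   # known starting state

BEGIN {
    if (0) {                                 # never true when the block runs
        use strict;                          # executed anyway, during compilation
        BEGIN { $inside = $^H & 0x602 }      # strict bits are set in here
    }
}
BEGIN { $after = $^H & 0x602 }               # and they are gone again out here

printf "inside the if: %#x, after the BEGIN block: %#x\n", $inside, $after;
```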

if ($] < 5.008) {
   require utf8;
   utf8->import();
}

require is an alternate code inclusion method; like use, it allows us to pull an external chunk of code into our own script. The significant difference is that require statements are executed at runtime, not compile-time. Also, a require won't automatically call the fetched code's import function, which explains why we had to add that step in the example above. Since it's a runtime construct, the require-based snippet solves the problem of needing the code to execute so that the conditional is properly evaluated, and it bypasses the problem of the compile-time hints flag being restored when the block is complete (which only happens as the code is compiled). Unfortunately, this last point also means that the code is completely ineffective, since it's the manipulation of the compile-time hints settings during code compilation that actually triggers the effect of the pragma in the first place! By the time the import function runs, the rest of the script has already been compiled.
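The ineffectiveness of a runtime import is easy to demonstrate. In this sketch I again substitute strict for utf8 because its absence is simple to observe, and the always-true condition is a hypothetical placeholder for the real version test.

```perl
# Illustrative only: importing a pragma at run time is too late to matter.
no strict;                 # known starting state

if (1) {                   # stand-in for the real version test
    require strict;
    strict->import();      # runs only after the whole file has been compiled
}

$undeclared = 42;          # would be a compile error under strict vars
print "still not strict: $undeclared\n";
```

The assignment to an undeclared variable compiles without complaint, because the compiler had already passed that line before import was ever called.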

Solution: Combine BEGIN and require

The solution to the problem combines the two approaches above:

BEGIN {
   if ($] < 5.008) {
      require utf8;
      utf8->import();
   }
}

When the compiler hits this block, it saves the current setting of the compiler-hints flag. It then proceeds to compile the BEGIN block itself. Since the pragma is being loaded via a require and not a use, the compiler does not immediately execute the utf8 pragma code; it simply compiles the require and import calls as ordinary statements and moves on.

When the compilation of the BEGIN block is complete, the compiler-hints flag is restored to its original state (the state it was in before the compilation of the BEGIN block started). At this point, the compiler sees that compilation of the BEGIN block is finished and proceeds to execute it immediately, before the remaining code of the script is compiled. Our conditional is evaluated, the utf8 pragma is loaded (assuming we're on a Perl interpreter with a version number less than 5.8), and its import function called. The import function, in turn, modifies the compiler-hints flag to enable utf8-based processing. With the execution of the BEGIN block now completed, the interpreter returns to the process of compiling the remainder of the script--but now with the compiler-hints flag set the way we want it. Whew!
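The sequence just described can be checked by reading the utf8 hint bit before and after such a BEGIN block. This is a sketch under two assumptions: that 0x00800000 is the bit utf8.pm sets (it is in the versions I'm aware of), and that a hard-wired $needs_utf8 flag stands in for the real version test so the branch actually runs on a modern Perl.

```perl
# Illustrative only: BEGIN + require leaves the hint set for the code that follows.
our ($before, $after);
no utf8;                                   # known starting state
BEGIN { $before = $^H & 0x00800000 }       # assumed utf8 hint bit

BEGIN {
    my $needs_utf8 = 1;                    # hypothetical stand-in for ($] < 5.008)
    if ($needs_utf8) {
        require utf8;
        utf8->import();                    # runs after the block's hints were restored
    }
}

BEGIN { $after = $^H & 0x00800000 }        # the bit is still set out here
printf "utf8 hint before: %#x, after: %#x\n", $before, $after;
```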

An Alternate: use if

An alternate, more elegant solution to the problem presents itself in the form of another pragma: if. The if pragma can be used to load a module (including another pragma) based on a supplied condition, which is exactly what we need to do here. use if takes a condition followed by the name of the module, optionally followed by the arguments to pass to that module's import function; the module and its arguments are conventionally joined with the => operator, much like a hash entry:

use if CONDITION, MODULE => ARGUMENTS;

or in our case:

use if ($] < 5.008), "utf8";

Note that utf8 is a string argument to the if pragma here, which is why it's quoted.
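For a version that can be run on a current Perl, here is a small sketch of both outcomes of use if. The choice of List::Util and POSIX is mine, picked only because both ship with Perl; the conditions are arranged so that the first is true and the second false on any Perl from 5.8 onward.

```perl
use strict;
use if $] >= 5.008, 'List::Util' => qw(sum);    # condition true: sum() is imported
use if $] <  5.008, 'POSIX'      => qw(floor);  # condition false: nothing is imported

my $total     = sum(1, 2, 3);
my $has_floor = __PACKAGE__->can('floor') ? "yes" : "no";

print "sum: $total, floor imported: $has_floor\n";
```

Because use if is itself a use statement, the condition is evaluated at compile time, so it can only depend on values that are already known at that point.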

So why not just use if instead of the relatively long-winded BEGIN and require method? While the if pragma is a more elegant solution to the problem, it doesn't appear to be available (by default) in Perl versions prior to version 5.8, which are the exact versions I'm testing for in my condition in the first place. If use if is available on your system, you might prefer to use it instead. See the perldoc documentation on your system for more detail.

Conclusion

I often find it helpful in my own development efforts to take extra time to learn why something doesn't work, as well as why the correct solution does; and it was exactly that type of analysis that led to the information presented in this article. I encourage you to perform the same type of analysis in your own projects, since the information you uncover may prove to be beneficial in other types of tasks and scripts--information you might not have known when you started.

For further information on Perl pragmas in general I refer you to your perldoc documentation, specifically:

perldoc perlmodlib
perldoc perllexwarn
perldoc perlvar

If you're unfamiliar with perldoc, my earlier primer on the subject should bring you up to speed.

As always, good luck with your own Perl projects!


1. But probably shouldn't be; that's what the pragmas are for.

2. And conversely an END block is executed as late as possible in a script, typically when the script is shutting down.


Created: May 2, 2008
Revised: May 2, 2008

URL: https://webreference.com/programming/perl/pragmas/2.html