© Copyright 1999 The Perl Journal. Reprinted with permission.
In my article Coping With Scoping I offered the advice "Always use my; never use local." The most common use for both is to provide your subroutines with private variables, and for this application you should always use my, and never local. But many readers (and the tech editors) noted that local isn't entirely useless; there are cases in which my doesn't work, or doesn't do what you want. So I promised a followup article on useful uses for local. Here they are.
my makes most uses of local obsolete. So it's not surprising that the most common useful uses of local arise because of peculiar cases where my happens to be illegal.
The most important examples are the punctuation variables such as $", $/, $^W, and $_. Long ago Larry decided that it would be too confusing if you could my them; they're exempt from the normal package scheme for the same reason. So if you want to change them, but have the change apply to only part of the program, you'll have to use local.
As an example of where this might be useful, let's consider a function whose job is to read in an entire file and return its contents as a single string:
    sub getfile {
        my $filename = shift;
        open F, "< $filename" or die "Couldn't open `$filename': $!";
        my $contents = '';
        while (<F>) {
            $contents .= $_;
        }
        close F;
        return $contents;
    }
This is inefficient, because the <F> operator makes Perl go to all the trouble of breaking the file into lines and returning them one at a time, and then all we do is put them back together again. It's cheaper to read the file all at once, without all the splitting and reassembling. (Some people call this slurping the file.) Perl has a special feature to support this: If the $/ variable is undefined, the <...> operator will read the entire file all at once:
    sub getfile {
        my $filename = shift;
        open F, "< $filename" or die "Couldn't open `$filename': $!";
        $/ = undef;                 # Read entire file at once
        my $contents = <F>;         # Return file as one single `line'
        close F;
        return $contents;
    }
There's a terrible problem here, which is that $/ is a global variable that affects the semantics of every <...> in the entire program. If getfile doesn't put it back the way it was, some other part of the program is probably going to fail disastrously when it tries to read a line of input and gets the whole rest of the file instead. Normally we'd like to use my, to make the change local to the function. But we can't here, because my doesn't work on punctuation variables; we would get the error
    Can't use global $/ in "my" ...
if we tried. Also, more to the point, Perl itself knows that it should look in the global variable $/ to find the input record separator; even if we could create a new private variable with the same name, Perl wouldn't know to look there. So instead, we need to set a temporary value for the global variable $/, and that is exactly what local does:
    sub getfile {
        my $filename = shift;
        open F, "< $filename" or die "Couldn't open `$filename': $!";
        local $/ = undef;           # Read entire file at once
        my $contents = <F>;         # Return file as one single `line'
        close F;
        return $contents;
    }
The old value of $/ is restored when the function returns. In this example, that's enough for safety. In a more complicated function that might call some other functions in a library somewhere, we'd still have to worry that we might be sabotaging the library with our strange $/. It's probably best to confine changes to punctuation variables to the smallest possible part of the program:
    sub getfile {
        my $filename = shift;
        open F, "< $filename" or die "Couldn't open `$filename': $!";
        my $contents;
        {
            local $/ = undef;       # Read entire file at once
            $contents = <F>;        # Return file as one single `line'
        }                           # $/ regains its old value
        close F;
        return $contents;
    }
This is a good practice, even for simple functions like this that don't call any other subroutines. By confining the changes to $/ to just the one line we want to affect, we've prevented the possibility that someone in the future will insert some calls to other functions that will break because of the change. This is called defensive programming.
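To see that the change really does stay confined, here is a minimal sketch that reads from an in-memory file instead of a real one. It assumes Perl 5.8 or later (in-memory files and lexical filehandles are newer than the examples above), and slurp_handle is a made-up name, not something from any library.

```perl
use strict;
use warnings;

# Hypothetical helper: return everything still unread on an open handle.
sub slurp_handle {
    my $fh = shift;
    my $contents;
    {
        local $/ = undef;       # slurp mode, but only inside this block
        $contents = <$fh>;
    }                           # $/ regains its old value here
    return $contents;
}

my $text = "first line\nsecond line\nthird line\n";
open my $in, '<', \$text or die "Couldn't open in-memory file: $!";

my $first = <$in>;                  # ordinary line-at-a-time read
my $rest  = slurp_handle($in);      # everything that is left, as one string
my $separator_after = $/;           # the caller's $/ was never disturbed
close $in;
```

After this runs, $first holds only the first line, $rest holds the other two, and $separator_after is still a newline.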
Although you may not think about it much, localizing $_ this way can be very important. Here's a slightly different version of getfile, one which throws away comments and blank lines from the file that it gets:
    sub getfile {
        my $filename = shift;
        local *F;
        open F, "< $filename" or die "Couldn't open `$filename': $!";
        my $contents;
        while (<F>) {
            s/#.*//;                # Remove comments
            next unless /\S/;       # Skip blank lines
            $contents .= $_;        # Save current (nonblank) line
        }
        return $contents;
    }
This function has a terrible problem. Here's the terrible problem: If you call it like this:
    foreach (@array) {
        ...
        $f = getfile($filename);
        ...
    }
it clobbers the elements of @array. Why? Because inside a foreach loop, $_ is aliased to the elements of the array; if you change $_, it changes the array. And getfile does change $_. To prevent itself from sabotaging the $_ of anyone who calls it, getfile should have local $_ at the top.
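Here is a small sketch of the fix, again using in-memory files (Perl 5.8 or later assumed) so that it needs no files on disk. The function count_lines is invented for the illustration; the only part that matters is the local $_ at the top.

```perl
use strict;
use warnings;

# Hypothetical function: count the lines in a string by reading it
# as an in-memory file. The while loop assigns to $_ on every read.
sub count_lines {
    my $text = shift;
    local $_;                   # protect the caller's $_
    open my $fh, '<', \$text or die "Couldn't open in-memory file: $!";
    my $n = 0;
    while (<$fh>) {
        $n++;
    }
    close $fh;
    return $n;
}

my @texts = ("a\nb\n", "c\n");
my @counts;
foreach (@texts) {              # $_ is aliased to each element in turn
    push @counts, count_lines($_);
}
```

With the local $_ in place, @texts is unchanged after the loop. If you delete that one line, every element of @texts ends up undefined, because the last read in the while loop stores undef into the aliased $_.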
Other special variables present similar problems. For example, it's sometimes convenient to change $", $,, or $\ to alter the way print works, but if you don't arrange to put them back the way they were before you call any other functions, you might get a big disaster:
    # Good style:
    {
        local $" = ')(';
        print "Array a: (@a)\n";
    }
    # Program continues safely...
Another common situation in which you want to localize a special variable is when you want to temporarily suppress warning messages. Warnings are enabled by the -w command-line option, which in turn sets the variable $^W to a true value. If you reset $^W to a false value, that turns the warnings off. Here's an example: My Memoize module creates a front-end to the user's function and then installs it into the symbol table, replacing the original function. That's what it's for, and it would be awfully annoying to the user to get the warning
    Subroutine factorial redefined at Memoize.pm line 113
every time they tried to use my module to do what it was supposed to do. So I have
    {
        local $^W = 0;                      # Shut UP!
        *{$name} = $tabent->{UNMEMOIZED};   # Otherwise this issues a warning
    }
which turns off the warning for just the one line. The old value of $^W is automatically restored after the chance of getting the warning is over.
Let's look back at that getfile function. To read the file, it opened the filehandle F. That's fine, unless some other part of the program happened to have already opened a filehandle named F, in which case the old file is closed, and when control returns from the function, that other part of the program is going to become very confused and upset. This is the `filehandle clobbering problem'.
This is exactly the sort of problem that local variables were supposed to solve. Unfortunately, there's no way to localize a filehandle directly in Perl.
Well, that's actually a fib. There are three ways to do it. One of them is to use the FileHandle or IO::Handle modules, which cast the necessary spell and present you with the results, so that you don't have to perform any sorcery yourself.
The simplest and cheapest way to solve the `filehandle clobbering problem' is a little bit obscure. You can't localize the filehandle itself, but you can localize the entry in Perl's symbol table that associates the filehandle's name with the filehandle. This entry is called a `glob'. In Perl, variables don't have names directly; instead the glob has a name, and the glob gathers together the scalar, array, hash, subroutine, and filehandle with that name. In Perl, the glob named F is denoted with *F.
To localize the filehandle, we actually localize the entire glob, which is a little hamfisted:
    sub getfile {
        my $filename = shift;
        local *F;
        open F, "< $filename" or die "Couldn't open `$filename': $!";
        local $/ = undef;           # Read entire file at once
        my $contents = <F>;         # Return file as one single `line'
        close F;
        return $contents;
    }
local on a glob does the same as any other local: It saves the current value somewhere, creates a new value, and arranges that the old value will be restored at the end of the current block. In this case, that means that any filehandle that was formerly attached to the old *F glob is saved, and the open will apply to the filehandle in the new, local glob. At the end of the block, filehandle F will regain its old meaning again.
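The following sketch shows the save-and-restore behaviour in action. It uses in-memory files (Perl 5.8 or later assumed) so that it is self-contained, and slurp_inner is a name made up for the example.

```perl
use strict;
use warnings;

my $outer_text = "one\ntwo\n";
my $inner_text = "inner contents\n";

# Hypothetical function that also wants to use the name F.
sub slurp_inner {
    local *F;                   # fresh glob; the caller's F is saved
    open F, '<', \$inner_text or die "Couldn't open in-memory file: $!";
    local $/ = undef;           # slurp mode for this function only
    return scalar <F>;
}                               # the caller's F comes back here

open F, '<', \$outer_text or die "Couldn't open in-memory file: $!";
my $first  = <F>;               # reads "one\n" from the outer file
my $inner  = slurp_inner();     # opens and reads its own F
my $second = <F>;               # outer F carries on where it left off
close F;
```

The outer filehandle is neither closed nor repositioned by the call: $second is the second line of the outer text. Without the local *F, the open inside slurp_inner would have closed the outer file and reopened F on the inner one.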
This works pretty well most of the time, except that you still have the usual local worries about called subroutines changing the localized values on you. You can't use my here because globs are all about the Perl symbol table; the lexical variable mechanism is totally different, and there is no such thing as a lexical glob.
With this technique, you have the new problem that getfile() can't get at $F, @F, or %F either, because you localized them all, along with the filehandle. But you probably weren't using any global variables anyway. Were you? And getfile() won't be able to call &F, for the same reason. There are a few ways around this, but the easiest one is that if getfile() needs to call &F, it should name the local filehandle something other than F.
use FileHandle does have fewer strange problems. Unfortunately, it also sucks a few thousand lines of code into your program. Now someone will probably write in to complain that I'm exaggerating, because it isn't really 3,000 lines, some of those are white space, blah blah blah. OK, let's say it's only 300 lines to use FileHandle, probably a gross underestimate. It's still only one line to localize the glob. For many programs, localizing the glob is a good, cheap, simple way to solve the problem.
When a localized glob goes out of scope, its open filehandle is automatically closed. So the close F in getfile is unnecessary:
    sub getfile {
        my $filename = shift;
        local *F;
        open F, "< $filename" or die "Couldn't open `$filename': $!";
        local $/ = undef;           # Read entire file at once
        return <F>;                 # Return file as one single `line'
    }  # F is automatically closed here
That's such a convenient feature that it's worth using even when you're not worried that you might be clobbering someone else's filehandle.
The filehandles that you get from FileHandle and IO::Handle do this also.
As I was researching this article, I kept finding common uses for local that turned out not to be useful, because there were simpler and more straightforward ways to do the same thing without using local. Here is one that you see far too often:
People sometimes want to pass a filehandle to a subroutine, and they know that you can pass a filehandle by passing the entire glob, like this:
    $rec = read_record(*INPUT_FILE);
    sub read_record {
        local *FH = shift;
        my $record;
        read FH, $record, 1024;
        return $record;
    }
Here we pass in the entire glob INPUT_FILE, which includes the filehandle of that name. Inside of read_record, we temporarily alias FH to INPUT_FILE, so that the filehandle FH inside the function is the same as whatever filehandle was passed in from outside. Then when we read from FH, we're actually reading from the filehandle that the caller wanted. But actually there's a more straightforward way to do the same thing:
    $rec = read_record(*INPUT_FILE);
    sub read_record {
        my $fh = shift;
        my $record;
        read $fh, $record, 1024;
        return $record;
    }
You can store a glob into a scalar variable, and you can use such a variable in any of Perl's I/O functions wherever you might have used a filehandle name. So the local here was unnecessary.
Filehandles and dirhandles are stored in the same place in Perl, so everything this article says about filehandles applies to dirhandles in the same way.
Often you want to put filehandles into an array, or treat them like regular scalars, or pass them to a function, and you can't, because filehandles aren't really first-class objects in Perl. As noted above, you can use the FileHandle or IO::Handle packages to construct a scalar that acts something like a filehandle, but there are some definite disadvantages to that approach.
Another approach is to use a glob as a filehandle; it turns out that a glob will fit into a scalar variable, so you can put it into an array or pass it to a function. The only problem with globs is that they are apt to have strange and magical effects on the Perl symbol table. What you really want is a glob that has been disconnected from the symbol table, so that you can just use it like a filehandle and forget that it might once have had an effect on the symbol table. It turns out that there is a simple way to do that:
    my $filehandle = do { local *FH };
do just introduces a block which will be evaluated, and will return the value of the last expression that it contains, which in this case is local *FH. The value of local *FH is a glob. But what glob? local takes the existing FH glob and temporarily replaces it with a new glob. But then it immediately goes out of scope and puts the old glob back, leaving the new glob without a name. But then it returns the new, nameless glob, which is then stored into $filehandle. This is just what we wanted: A glob that has been disconnected from the symbol table.
You can make a whole bunch of these, if you want:
    for $i (0 .. 99) {
        $fharray[$i] = do { local *FH };
    }
You can pass them to subroutines, return them from subroutines, put them in data structures, and give them to Perl's I/O functions like open, close, read, print, and <...> and they'll work just fine.
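As a rough illustration, the sketch below makes three of these anonymous globs, opens each one on a different in-memory file (Perl 5.8 or later assumed), and reads one line from each. The data and variable names are invented for the example.

```perl
use strict;

my @texts = ("alpha\n", "beta\n", "gamma\n");
my @handles;

for my $i (0 .. $#texts) {
    my $fh = do { local *FH };      # a new glob with no name
    open $fh, '<', \$texts[$i] or die "Couldn't open in-memory file: $!";
    push @handles, $fh;             # globs fit into ordinary scalars
}

# Each handle is independent of the others, even though every one of
# them was briefly called FH.
my @lines = map { scalar readline($_) } @handles;
close $_ for @handles;
```

If the globs were not really separate, the second and third open would have reopened the first handle, and @lines would not contain one line from each text.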
Globs turn out to be very useful. You can assign an entire glob, as we saw above, and alias an entire symbol in the symbol table. But you don't have to do it all at once. If you say
    *GLOB = $reference;
then Perl only changes the meaning of part of the glob. If the reference is a scalar reference, it changes the meaning of $GLOB, which now means the same as whatever scalar the reference referred to; @GLOB, %GLOB and the other parts don't change at all. If the reference is a hash reference, Perl makes %GLOB mean the same as whatever hash the reference referred to, but the other parts stay the same. Similarly for other kinds of references.
You can use this for all sorts of wonderful tricks. For example, suppose you have a function that is going to do a lot of operations on $_[0]{Time}[2] for some reason. You can say
    *arg = \$_[0]{Time}[2];
and from then on, $arg is synonymous with $_[0]{Time}[2], which might make your code simpler, and probably more efficient, because Perl won't have to go digging through three levels of indirection every time. But you'd better use local, or else you'll permanently clobber any $arg variable that already exists. (Gurusamy Sarathy's Alias module does this, but without the local.)
You can create locally-scoped subroutines that are invisible outside a block by saying
    *mysub = sub { ... } ;
and then call them with mysub(...). But you must use local, or else you'll permanently clobber any mysub subroutine that already exists.
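Here is a sketch of both tricks with the local included. The function names add_seconds and squares, the temporary subroutine sq, and the sample data are all invented; the our declaration is only there to keep strict 'vars' quiet.

```perl
use strict;
use warnings;

our $arg = 'untouched';             # a pre-existing global we must not clobber

sub add_seconds {
    local *arg = \$_[0]{Time}[2];   # $arg is now an alias for the nested element
    $arg += $_[1];
}                                   # the old *arg is restored here

sub squares {
    local *sq = sub { $_[0] * $_[0] };   # temporary subroutine &sq
    return map { sq($_) } @_;
}                                   # &sq vanishes again here

my %record = (Time => [12, 0, 15]);
add_seconds(\%record, 30);

my @squares = squares(1, 2, 3);
my $sq_still_defined = defined &sq ? 1 : 0;
```

After the calls, $record{Time}[2] has been changed through the alias, the global $arg still has its original value, and there is no subroutine named sq left behind.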
local introduces what is called dynamic scope, which means that the `local' variable that it declares is inherited by other functions called from the one with the declaration. Usually this isn't what you want, and it's rather a strange feature, unavailable in many programming languages. To see the difference, consider this example:
    first();
    sub first {
        local $x = 1;
        my $y = 1;
        second();
    }
    sub second {
        print "x=", $x, "\n";
        print "y=", $y, "\n";
    }
The variable $y is a true local variable. It's available only from the place that it's declared up to the end of the enclosing block. In particular, it's unavailable inside of second(), which prints "y=", not "y=1". This is called lexical scope.
local, in contrast, does not actually make a local variable. It creates a new `local' value for a global variable, which persists until the end of the enclosing block. When control exits the block, the old value is restored. But the variable, and its new `local' value, are still global, and hence accessible to other subroutines that are called before the old value is restored. second() above prints "x=1", because $x is a global variable that temporarily happens to have the value 1. Once first() returns, the old value will be restored. This is called dynamic scope, which is a misnomer, because it's not really scope at all.
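The restoration is easy to check. In this sketch, a variation on the example above, second() records what it sees instead of printing it, and the global $x is given a starting value so that there is something to restore; both changes are mine.

```perl
use strict;
use warnings;

our $x = 'global';      # a package variable, visible everywhere in the package
my @seen;

sub second { push @seen, "x=$x" }

sub first {
    local $x = 1;       # temporary value for the global $x
    my $y = 1;          # private to first(); second() has no way to see it
    second();
}

first();                # second() sees the temporary value
second();               # the old value is back again
```

The first call records x=1 and the second records x=global.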
For `local' variables, you almost always want lexical scope, because it ensures that variables that you declare in one subroutine can't be tampered with by other subroutines. But every once in a strange while, you actually do want dynamic scope, and that's the time to get local out of your bag of tricks.
Here's the most useful example I could find, and one that really does bear careful study. We'll make our own iteration syntax, in the same family as Perl's grep and map. Let's call it `listjoin'; it'll combine two lists into one:
    @list1 = (1,2,3,4,5);
    @list2 = (2,3,5,7,11);
    @result = listjoin { $a + $b } @list1, @list2;
Now the @result is (3,5,8,11,16). Each element of the result is the sum of the corresponding terms from @list1 and @list2. If we wanted differences instead of sums, we could have put { $a - $b }. In general, we can supply any code fragment that does something with $a and $b, and listjoin will use our code fragment to construct the elements in the result list.
Here's a first cut at listjoin:
    sub listjoin (&\@\@) {
Oops! The first line already has a lot of magic. Let's stop here and sightsee a while before we go on. The (&\@\@) is a prototype. In Perl, a prototype changes the way the function is parsed and the way its arguments are passed.
In (&\@\@), the & warns the Perl compiler to expect to see a brace-delimited block of code as the first argument to this function, and tells Perl that it should pass listjoin a reference to that block. The block behaves just like an anonymous function. The \@\@ says that listjoin should get two other arguments, which must be arrays; Perl will pass listjoin references to these two arrays. If any of the arguments are missing, or have the wrong type (a hash instead of an array, for example) Perl will signal a compile-time error.
The result of this little wad of punctuation is that we will be able to write
    listjoin { $a + $b } @list1, @list2;
and Perl will behave as if we had written
    listjoin(sub { $a + $b }, \@list1, \@list2);
instead. With the prototype, Perl knows enough to let us leave out the parentheses, the sub, the first comma, and the backslashes. Perl has too much punctuation already, so we should take advantage of every opportunity to use less.
Now that that's out of the way, the rest of listjoin is straightforward:
    sub listjoin (&\@\@) {
        my $code = shift;           # Get the code block
        my $arr1 = shift;           # Get reference to first array
        my $arr2 = shift;           # Get reference to second array
        my @result;
        while (@$arr1 && @$arr2) {
            my $a = shift @$arr1;   # Element from array 1 into $a
            my $b = shift @$arr2;   # Element from array 2 into $b
            push @result, &$code(); # Execute code block and get result
        }
        return @result;
    }
listjoin simply runs a loop over the elements in the two arrays, putting elements from each into $a and $b, respectively, and then executing the code and pushing the result into @result. All very simple and nice, except that it doesn't work: By declaring $a and $b with my, we've made them lexical, and they're unavailable to the $code.
Removing the my's from $a and $b makes it work:
    $a = shift @$arr1;
    $b = shift @$arr2;
But this solution is boobytrapped. Without the my declaration, $a and $b are global variables, and whatever values they had before we ran listjoin are lost now.
The correct solution is to use local. This preserves the old values of the $a and $b variables, if there were any, and restores them when listjoin() is finished. But because of dynamic scoping, the values set by listjoin() are inherited by the code fragment. Here's the correct solution:
    sub listjoin (&\@\@) {
        my $code = shift;
        my $arr1 = shift;
        my $arr2 = shift;
        my @result;
        while (@$arr1 && @$arr2) {
            local $a = shift @$arr1;
            local $b = shift @$arr2;
            push @result, &$code();
        }
        return @result;
    }
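One thing to watch for: because the prototype passes references to the caller's arrays, the shift calls in the version above empty @list1 and @list2 as they go. If that matters, a variation along the following lines walks the arrays by index instead. This is only a sketch of one way to do it, not the version the rest of this article discusses.

```perl
use strict;
use warnings;

sub listjoin (&\@\@) {
    my ($code, $arr1, $arr2) = @_;
    my ($n1, $n2) = (scalar @$arr1, scalar @$arr2);
    my $n = $n1 < $n2 ? $n1 : $n2;      # stop at the end of the shorter list
    my @result;
    for my $i (0 .. $n - 1) {
        # Dynamic scope: the code block sees these temporary values.
        local ($a, $b) = ($arr1->[$i], $arr2->[$i]);
        push @result, $code->();
    }
    return @result;
}

my @list1 = (1, 2, 3, 4, 5);
my @list2 = (2, 3, 5, 7, 11);
my @result = listjoin { $a + $b } @list1, @list2;
```

Here @result is (3,5,8,11,16) as before, and both input arrays still have all five of their elements afterwards.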
You might worry about another problem: Suppose you had strict 'vars' in force. Shouldn't listjoin { $a + $b } be illegal? It should be, because $a and $b are global variables, and the purpose of strict 'vars' is to forbid the use of unqualified global variables.
But actually, there's no problem here, because strict 'vars' makes a special exception for $a and $b. These two names, and no others, are exempt from strict 'vars', because if they weren't, sort wouldn't work either, for exactly the same reason. We're taking advantage of that here by giving listjoin the same kind of syntax. It's a peculiar and arbitrary exception, but one that we're happy to take advantage of.
Here's another example in the same vein:
    sub printhash (&\%) {
        my $code = shift;
        my $hash = shift;
        local ($k, $v);
        while (($k, $v) = each %$hash) {
            print &$code();
        }
    }
Now you can say
    printhash { "$k => $v\n" } %capitals;
and you'll get something like
    Athens => Greece
    Moscow => Russia
    Helsinki => Finland
or you can say
    printhash { "$k," } %capitals;
and you'll get
    Athens,Moscow,Helsinki,
Note that because I used $k and $v here, you might get into trouble with strict 'vars'. You'll either have to change the definition of printhash to use $a and $b instead, or you'll have to use vars qw($k $v).
Here's another possible use for dynamic scope: You have some subroutine whose behavior depends on the setting of a global variable. This is usually a result of bad design, and should be avoided unless the variable is large and widely used. We'll suppose that this is the case, and that the variable is called %CONFIG.
You want to call the subroutine, but you want to change its behavior. Perhaps you want to trick it about what the configuration really is, or perhaps you want to see what it would do if the configuration were different, or you want to try out a fake configuration to see if it works. But you don't want to change the real global configuration, because you don't know what bizarre effects that will have on the rest of the program. So you do
    local %CONFIG = (new configuration here);
    the_subroutine();
The changed %CONFIG is inherited by the subroutine, and the original configuration is restored automatically when the declaration goes out of scope.
Actually in this kind of circumstance you can sometimes do better. Here's how: Suppose that the %CONFIG hash has lots and lots of members, but we only want to change $CONFIG{VERBOSITY}. The obvious thing to do is something like this:
    my %new_config = %CONFIG;         # Copy configuration
    $new_config{VERBOSITY} = 1000;    # Change one member
    local %CONFIG = %new_config;      # Copy changed back, temporarily
    the_subroutine();                 # Subroutine inherits change
But there's a better way:
    local $CONFIG{VERBOSITY} = 1000;  # Temporary change to one member!
    the_subroutine();
You can actually localize a single element of an array or a hash. It works just like localizing any other scalar: The old value is saved, and restored at the end of the enclosing scope.
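A short sketch makes the save-and-restore visible. The COLOR member and the verbosity_seen function are stand-ins invented for the example; verbosity_seen plays the part of the_subroutine().

```perl
use strict;
use warnings;

our %CONFIG = (VERBOSITY => 1, COLOR => 'off');

# Stand-in for the_subroutine(): report the setting it would act on.
sub verbosity_seen { return $CONFIG{VERBOSITY} }

my ($during, $color_during);
{
    local $CONFIG{VERBOSITY} = 1000;    # only this one member changes
    $during       = verbosity_seen();   # the callee sees the temporary value
    $color_during = $CONFIG{COLOR};     # other members are untouched
}
my $after = verbosity_seen();           # the old value is back
```

Inside the block the subroutine sees 1000; once the block ends it sees 1 again, and COLOR was never disturbed.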
As with local filehandles, I kept finding examples of dynamic scoping that seemed to require local, but on further reflection didn't. Lest you be tempted to make one of these mistakes, here they are.
One application people sometimes have for dynamic scoping is like this: Suppose you have a complicated subroutine that does a search of some sort and locates a bunch of items and returns a list of them. If the search function is complicated enough, you might like to have it simply deposit each item into a global array variable when it's found, rather than returning the complete list from the subroutine, especially if the search subroutine is recursive in a complicated way:
    sub search {
        # do something very complicated here
        if ($found) {
            push @solutions, $solution;
        }
        # do more complicated things
    }
This is dangerous, because @solutions is a global variable, and you don't know who else might be using it.
In some languages, the best answer is to add a front-end to search that localizes the global @solutions variable:
    sub search {
        local @solutions;
        realsearch(@_);
        return @solutions;
    }
    sub realsearch {
        # ... as before ...
    }
Now the real work is done in realsearch, which still gets to store its solutions into the global variable. But since the user of realsearch is calling the front-end search function, any old value that @solutions might have had is saved beforehand and restored again afterwards.
There are two other ways to accomplish the same thing, and both of them are better than this way. Here's one:
    { my @solutions;  # This is private, but available to both functions
      sub search {
          realsearch(@_);
          return @solutions;
      }
      sub realsearch {
          # ... just as before ...
          # but now it modifies a private variable instead of a global one.
      }
    }
Here's the other:
    sub search {
        my @solutions;
        realsearch(\@solutions, @_);
        return @solutions;
    }
    sub realsearch {
        my $solutions_ref = shift;
        # do something very complicated here
        if ($found) {
            push @$solutions_ref, $solution;
        }
        # do more complicated things
    }
One or the other of these strategies will solve most problems where you might think you would want to use a dynamic variable. They're both safer than the solution with local because you don't have to worry that the global variable will `leak' out into the subroutines called by realsearch.
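To make the second strategy concrete, here is a runnable sketch in which the "very complicated" search is replaced by something simple and recursive: collecting the even numbers from a nested array. The search itself is invented for the illustration; only the way the results are passed around comes from the text above.

```perl
use strict;
use warnings;

sub search {
    my @solutions;                      # private to this call
    realsearch(\@solutions, @_);
    return @solutions;
}

# Hypothetical search: collect every even number in a nested structure.
sub realsearch {
    my ($solutions_ref, @items) = @_;
    for my $item (@items) {
        if (ref $item eq 'ARRAY') {
            realsearch($solutions_ref, @$item);     # recurse, same accumulator
        }
        elsif ($item % 2 == 0) {
            push @$solutions_ref, $item;
        }
    }
}

my @found = search(1, [2, 3, [4, 5]], 6);
```

Every level of the recursion pushes onto the same array through the reference, and no global variable is involved, so two searches can never interfere with each other.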
One final example of a marginal use of local: I can imagine an error-handling routine that examines the value of some global error message variable such as $! or $DBI::errstr to decide what to do. If this routine seems to have a more general utility, you might want to call it even when there wasn't an error, because you want to invoke its cleanup behavior, or you like the way it issues the error message, or whatever. It should accept the message as an argument instead of examining some fixed global variable, but it was badly designed and now you can't change it. If you're in this kind of situation, the best solution might turn out to be something like this:
    local $DBI::errstr = "Your shoelace is untied!";
    handle_error();
Probably a better solution is to find the person responsible for the routine and to sternly remind them that functions are more flexible and easier to reuse if they don't depend on hardwired global variables. But sometimes time is short and you have to do what you can.
A lot of the useful uses for local became obsolete with Perl 5; local was much more useful in Perl 4. The most important of these was that my wasn't available, so you needed local for private variables.
If you find yourself programming in Perl 4, expect to use a lot of local. my hadn't been invented yet, so we had to do the best we could with what we had.
Useful uses for local fall into two classes: First, places where you would like to use my, but you can't because of some restriction, and second, rare, peculiar or contrived situations.
For the vast majority of cases, you should use my, and avoid local whenever possible. In particular, when you want private variables, use my, because local variables aren't private.
Even the useful uses for local are mostly not very useful.
Revised rule of when to use my and when to use local:
Always use my; never use local unless you get an error when you try to use my.