Sample solutions and discussion
Perl Quiz of The Week #11 (20030206)

Question #1: Why does Perl have the 'defined' function? If you want to
see if a variable contains an undefined value, why not just use
something like this?

        if ($var == undef) { ... }

'==' is for comparing numbers. If its operands aren't numbers to begin
with, they are converted to numbers before being compared. The 'undef'
on the right is always converted to 0, so this test is the same as
comparing for numeric equality with 0. In particular, the test returns
true when $var is 0, even though it is not undefined.

The test also fails for many strings, because a string that does not
begin with a number is also converted to 0:

        $var = "oops";
        if ($var == undef) { die }

This dies even though $var is certainly not undefined.

----------------------------------------------------------------

Question #2: What's wrong with this code?

        %hash = ...;

        while (<>) {
          chomp;
          for my $key (keys %hash) {
            if ($key eq $_) {
              print "$key: $hash{$key}\n";
            }
          }
        }

The 'for' loop scans the hash looking for a particular key. But the
whole point of a hash is that you *don't* have to scan it to find out
if it contains a certain key or not. Hashes are organized so that Perl
can look up any given key instantly, without having to examine each
one.

The code here is analogous to searching the telephone book one name at
a time, starting from the first page, even though the telephone book
is carefully organized (in alphabetical order) so that you don't have
to do that.

A better way to write the code would be:

        %hash = ...;

        while (<>) {
          chomp;
          print "$_: $hash{$_}\n" if exists $hash{$_};
        }

The 'exists' test keeps the behavior of the original, which prints
nothing for an input line that is not a key in the hash.

This error is common in code written by beginning Perl programmers.
Here's some code that one of my interns once wrote:

        foreach $k (keys %in) {
          if ($k eq q1) {
            if ($in{$k} eq agree) {
              $count{q10} = $count{q10} + 1;
            }
            if ($in{$k} eq disaagree) {
              $count{q11} = $count{q11} + 1;
            }
          }
          if ($k eq q2) {
            @q2split = split(/\0/, $in{$k});
            foreach (@q2split) {
              $count{$_} = $count{$_} + 1;
            }
          }
          if ($k eq q3) {
            $count{$in{$k}} = $count{$in{$k}} + 1;
          }
          if ($k eq q4a) {
            if ($in{$k} eq care) {
              $count{q4a0} = $count{q4a0} + 1;
            }
            if ($in{$k} eq dontcare) {
              $count{q4a1} = $count{q4a1} + 1;
            }
          }
          if ($k eq q4b) {
            if ($in{$k} eq wish) {
              $count{q4b0} = $count{q4b0} + 1;
            }
            if ($in{$k} eq dontwish) {
              $count{q4b1} = $count{q4b1} + 1;
            }
          }
          if ($k eq q5) {
            if ($in{$k} eq yes) {
              $count{q50} = $count{q50} + 1;
            }
            if ($in{$k} eq "no") {
              $count{q51} = $count{q51} + 1;
            }
          }
          if ($k eq q6) {
            if ($in{$k} eq yes) {
              $count{q60} = $count{q60} + 1;
            }
            if ($in{$k} eq "no") {
              $count{q61} = $count{q61} + 1;
            }
          }
          if ($k eq q7) {
            if ($in{$k} eq "accept") {
              $count{q70} = $count{q70} + 1;
            }
            if ($in{$k} eq understand) {
              $count{q71} = $count{q71} + 1;
            }
            if ($in{$k} eq other) {
              $count{q72} = $count{q72} + 1;
              $htmlout = comments;
              open(COMMENTS, ">> /tmp/comments") || die "cant open comments";
              print COMMENTS "$in{q7a}\n\n";
              close (COMMENTS);
            }
          }
          if ($k eq q8) {
            if ($in{$k} eq yes) {
              $count{q80} = $count{q80} + 1;
            }
            if ($in{$k} eq "no") {
              $count{q81} = $count{q81} + 1;
            }
          }
        } #end of foreach loop

Larry Wall, the inventor of Perl, has said:

        Doing linear scans over an associative array is like trying to
        club someone to death with a loaded Uzi.

----------------------------------------------------------------

Question #3: What's wrong with this code?

        @matching_words = grep search_for($_, $text_file), @words;

        sub search_for {
          my ($target, $file) = @_;
          return unless open F, "<", $file;
          while (<F>) {
            return 1 if index($_, $target) >= 1;
          }
          close F;
          return;
        }

There are several things wrong with the code.
Probably the biggest problem is that the search_for function
inadvertently destroys the contents of @words. Inside a 'grep' loop or
a 'foreach' loop with no control variable, the $_ variable is
'aliased' to the elements of the array. This means that you can look
at $_ to see the current array element, and also that you can modify
$_ to modify the current array element. A simpler example is:

        @n = (1,2,3);
        for (@n) {
          $_ = 'blah';
        }
        print "@n\n";

This prints "blah blah blah".

The 'while (<F>)' loop inside 'search_for' implicitly assigns each
line it reads to $_. Since $_ is a global variable, that assignment
overwrites the aliased values in @words.

Other possible criticisms include:

(a) search_for performs a repeated search that is probably wasteful;
    it would be better to convert it into a hash lookup of some sort,
    if possible.

(b) If the rest of the program happened to have a filehandle named
    'F', calling search_for will close it. For example, this doesn't
    work:

        open F, "myfile" or die ...;
        if (search_for("carrot", "otherfile")) {
          ...
        }
        my $next = <F>;

    because F has been closed by 'search_for'. (Even the 'open' at
    the top of search_for closes the old F before reattaching it to
    the other file.) This is a violation of function encapsulation
    rules. If the programmer who had F open before is not the same as
    the one who wrote search_for, this is going to create a bug that
    will be very difficult to track down.
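
The two problems above (the clobbered $_ and the shared filehandle F)
can both be avoided with lexical variables. What follows is only a
sketch of one possible repair, not the quiz's official answer. It
assumes that a match at the very start of a line should count, so it
tests 'index(...) >= 0' where the original tests '>= 1'. The temporary
file and the sample words are made up for the demonstration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# One possible repair of search_for (a sketch, not the quiz's own answer).
sub search_for {
    my ($target, $file) = @_;

    # A lexical filehandle cannot collide with a caller's F.
    open my $fh, "<", $file or return;

    # Reading into a lexical leaves the caller's $_ (and so @words) alone.
    while (my $line = <$fh>) {
        # Assumption: '>= 0' so that a match at position 0 also counts.
        if (index($line, $target) >= 0) {
            close $fh;
            return 1;
        }
    }
    close $fh;
    return;
}

# Hypothetical sample data for the demonstration.
my ($tmp_fh, $text_file) = tempfile(UNLINK => 1);
print {$tmp_fh} "carrot cake\napple pie\n";
close $tmp_fh;

my @words = qw(carrot apple turnip);
my @matching_words = grep { search_for($_, $text_file) } @words;

print "@matching_words\n";   # prints "carrot apple"
print "@words\n";            # prints "carrot apple turnip": @words survives
```

Because $fh and $line are private to search_for, the caller's $_ and
any filehandle named F are left exactly as they were.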