Sample solutions and discussion
Perl Quiz of The Week #4 (20021106)

        Two words are said to be 'anagrams' if the letters of one word can be
        rearranged to form the other word.  For example, in English, 'ascot'
        and 'tacos' are anagrams; so are 'tacos' and 'coats'.  A set of words
        that are all anagrams of one another is an 'anagram set'.  For
        example,

                ascot tacos coast coats

        is an anagram set.

        Letter case doesn't matter, so, for example, 'liberating' and
        'Gilbertian' are considered to be anagrams.  

        Write a program, make_anagrams, which reads a list of words, one per
        line, and finds all the anagrams in the word list.  It should
        output an anagram listing, as follows:

        * 'Words' that contain digits or punctuation characters should be
          ignored.

        * Anagram sets that contain only one word should be omitted from the output.

        * If an anagram set contains two words, say 'halls' and 'shall', the
          output should contain two lines:

                halls shall
                shall halls

        * If an anagram set contains more than two words, the entire set
          should be listed under the alphabetically first word; the others
          should cross-reference it.  For example:

                headskin nakedish sinkhead
                nakedish (See 'headskin')
                sinkhead (See 'headskin')

        * Finally, the output lines should be in alphabetic order.

        For example, if the input was

                5th
                ascot
                ate
                carrot
                coast
                coats
                cots
                Dorian
                eat
                halls
                headskin
                inroad
                nakedish
                ordain
                Ronald's
                shall
                sinkhead
                tacos
                tea

        then the output should be:

                Dorian inroad ordain
                ascot coast coats tacos
                ate eat tea
                coast (See 'ascot')
                coats (See 'ascot')
                eat (See 'ate')
                halls shall
                headskin nakedish sinkhead
                inroad (See 'Dorian')
                nakedish (See 'headskin')
                ordain (See 'Dorian')
                shall halls
                sinkhead (See 'headskin')
                tacos (See 'ascot')
                tea (See 'ate')

        If you need a sample input, you may obtain English word lists from 
                http://perl.plover.com/qotw/words/

        If you prefer to do this quiz in a language other than English, please
        substitute whatever conventions are appropriate for that language.


----------------------------------------------------------------

Here's some sample code:

        #!/usr/bin/perl

        while (<>) {
          chomp;
          next if /[^A-Za-z]/;
          my $key = join "", sort(split //, lc $_);
          $sets{$key} .= "$_:";
        }

        for my $wl (values %sets) {
          my @wl = split /:/, $wl;
          if (@wl == 2) {
            push @output, "$wl[0] $wl[1]";
            push @output, "$wl[1] $wl[0]";
          } elsif (@wl > 2) {
            my ($first, @rest) = sort insensitive @wl;
            push @output, "$first @rest";
            for (@rest) {
              push @output, "$_ (See '$first')";
            }
          }
        }

        for my $line (sort insensitive @output) {
          print $line, "\n";
        }

        sub insensitive {lc($a) cmp lc($b)}

The key to solving this problem is figuring out how to decide whether
two words are anagrams.  Two words are anagrams if they have the same
letters, possibly in a different order.  The easy way to find out if
this is the case is to take the letters and put them in some canonical
order, for example by sorting them; if the sorted letter lists are the
same, then the words are anagrams.

Nearly everyone who solved this problem did so by splitting the
incoming words into letters, sorting the letters, and joining them
back together to form a hash key.  For example, the hash key for the
word 'Ethiopian' would be 'aehiinopt'.  Then you just list each word
under its own hash key, and you have a hash where the hash values are
lists of anagrammatic words.
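
To make the canonical-form idea concrete, here is a minimal sketch of
the key computation on its own.  The name 'anagram_key' is made up for
this illustration; the sample code above does the same thing inline.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# anagram_key is a hypothetical helper name, not part of the sample code.
# It lowercases the word, splits it into single letters, sorts them,
# and joins them back into one string.
sub anagram_key {
  my ($word) = @_;
  return join "", sort(split //, lc $word);
}

print anagram_key("Ethiopian"), "\n";     # prints "aehiinopt"

# All four words of the example anagram set share the key "acost".
for my $word (qw(ascot tacos coast coats)) {
  print "$word => ", anagram_key($word), "\n";
}
```

Two words are anagrams exactly when this function returns the same
string for both of them.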

The first 'while' loop reads the input, computes the key for each
word, and installs it into the %sets hash under the appropriate key.
At the end of the loop, the %sets values are lists of anagrams,
separated by ':' characters.  It would have been preferable to use a
hash of arrays, but _Learning Perl_ doesn't cover that.  A typical
value: $sets{'einrs'} is "resin:rinse:risen:siren:".
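
For readers who do know references, the hash-of-arrays version might
look something like the following sketch.  The function name
'group_anagrams' is invented here, and it takes its words as arguments
instead of reading them from <>, purely so that it is easy to try out.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# group_anagrams is a hypothetical name.  It returns a reference to a
# hash whose keys are sorted-letter strings and whose values are
# references to arrays of the words that share that key.
sub group_anagrams {
  my %sets;
  for my $word (@_) {
    next if $word =~ /[^A-Za-z]/;          # same filter as the sample code
    my $key = join "", sort(split //, lc $word);
    push @{ $sets{$key} }, $word;          # the array springs into existence
  }
  return \%sets;
}

my $sets = group_anagrams(qw(resin rinse risen siren 5th cots));
print "@{ $sets->{einrs} }\n";             # prints "resin rinse risen siren"
```

With this layout the second loop no longer needs to split on ':'; it
can copy each list out with @$wl instead.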

The second loop prepares the output.  It scans over all the anagram
lists ('$wl' stands for 'word list') and decides what the output will
look like for each list.  It appends lines of output to @output; later
these lines will be sorted and printed.

If there are exactly two words in $wl, then two output lines are
appended.  If there are more than two, then they're sorted
alphabetically, the entire word list is listed under the first word,
and the rest get cross-references to it.

Finally, the output lines are printed in alphabetical order.  The
utility function 'insensitive', which compares two strings without
regard to case, is used in two places to produce alphabetically sorted
lists.
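
The effect of the comparison function is easiest to see next to a
plain sort.  This is only a small demonstration; the word list is
chosen for illustration and the variable names are not from the sample
code.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Same comparison function as in the sample code.
sub insensitive { lc($a) cmp lc($b) }

my @words  = qw(eat Dorian coats ascot);
my @plain  = sort @words;                  # default string order
my @folded = sort insensitive @words;      # case ignored

print "@plain\n";                          # prints "Dorian ascot coats eat"
print "@folded\n";                         # prints "ascot coats Dorian eat"
```

A plain sort puts every capitalized word ahead of every lowercase one,
which is the order shown in the example output of the quiz statement;
with 'insensitive', 'Dorian' lands between 'coats' and 'eat' instead.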

There was some discussion about what to do when the input contained
two words with the same spelling but different case, as with 'Polish'
(pertaining to Poland) and 'polish' (to make something shiny).  The
version above keeps both words, which means that there are some
silly outputs that inform you that 'polish' is an anagram of
'Polish'.  If you don't like this, add

        next if $seen{lc $_}++;

in the top loop; this discards any word that has been seen before with
any capitalization.  (This is another application of the 'canonical
form' idea; this time the canonical form of a string is the
all-lowercase version.)
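
Here is the same %seen trick in isolation, as a rough sketch.  The
helper name 'drop_case_duplicates' is made up for this example; in the
real program the one-line 'next if' above is all you need.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# drop_case_duplicates is a hypothetical helper.  It keeps the first
# occurrence of each word and drops later ones that differ only in
# capitalization.
sub drop_case_duplicates {
  my %seen;
  return grep { !$seen{lc $_}++ } @_;
}

my @words = drop_case_duplicates(qw(Polish polish shall halls POLISH));
print "@words\n";                          # prints "Polish shall halls"
```

Whichever capitalization appears first in the input is the one that
survives.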

The 'expert' quiz postmortem is delayed because the results are so
very interesting.  There were many solutions posted to the -discuss
list and much discussion, and I don't want to leave out anything
interesting.  Expect it tomorrow, along with a pair of new quizzes.

Thanks to everyone who participated, whether or not they sent mail
about it.