Sample solutions and discussion
Perl Expert Quiz of The Week #5 (20021113)


        You will write a function that lays out tree diagrams.

        layout_tree's arguments will be (1) an object that represents a tree,
        and (2) an optional hash containing some configuration parameters.
        The tree object is opaque to your function, but is guaranteed to
        support at least the following methods:

                $tree->children

                  Returns a list of tree objects that represent the subtrees
                  of this tree

                $tree->text

                  Returns a reference to an array of strings that show how the
                  root node of this tree should be displayed.

        An example $tree object might be manufactured by the following class:

                package ExampleTree;

                # Create a new tree object
                # arguments: ->(label of root node,
                #               [arguments for child 1],
                #               [arguments for child 2],
                #               ...);
                sub new {
                  my ($pack, $label, @subnodes) = @_;
                  my @subobjects = map $pack->new(@$_), @subnodes;
                  bless [$label, @subobjects] => $pack;
                }

                # If the label of the tree node is 'foo',
                # we format it like this:
                #    +-----+
                #    | foo |
                #    +-----+
                sub text {
                  my $self = shift;
                  my $label = $self->[0];
                  my $ln = length $label;
                  $label .= " " x (5 - $ln) if $ln < 5;  # pad with spaces to width 5
                  $ln = length $label;
                  my $top = "+" . "-" x $ln . "+";
                  return [$top, "|$label|", $top];
                }

                sub children {
                  my $self = shift;
                  my @children = @$self;
                  shift @children;  # throw away the label
                  @children;
                }

        And then one might create such an object by calling

                my $t = ExampleTree->new("5",
                                         ["4 1", 
                                          ["3 1 1", 
                                           ["2 1 1 1",
                                            ["1 1 1 1 1"],
                                           ],
                                          ],
                                          ["2 2 1"]
                                         ], 
                                         ["3 2"]);


        layout_tree($t) would then manufacture an array of strings, which,
        when printed, would look something like this:

                           +-----+
                           |5    |
                           +-----+
                           ,-' `--------.
                       +-----+        +-----+
                       |4 1  |        |3 2  |
                       +-----+        +-----+
                      ,-'  `----.
                  +-----+    +-----+
                  |3 1 1|    |2 2 1|
                  +-----+    +-----+
                     |
                 +-------+
                 |2 1 1 1|
                 +-------+
                     |
                +---------+
                |1 1 1 1 1|
                +---------+

        Details of exactly how this is laid out and what it looks like are up
        to you; you might produce output like this instead:


                +-----+   +-----+   +-----+   +-------+   +---------+
                |5    |-+-|4 1  |-+-|3 1 1|---|2 1 1 1|---|1 1 1 1 1|
                +-----+ | +-----+ | +-----+   +-------+   +---------+
                        |         |
                        |         | +-----+
                        |         +-|2 2 1|
                        |           +-----+
                        | +-----+
                        +-|3 2  |
                          +-----+

        However, the tree node displays returned by ->text must appear in the
        output exactly as they were returned, and it must be possible to
        understand the tree structure from looking at the output.  

----------------------------------------------------------------

I really like programs that draw.  Drawing is fun.  

I got really tired of hand-drawing all the ASCII-art for the diagrams
for my book, and there were a lot of tree diagrams.  So I wrote a
tree-drawing package, and hence my interest in the problem.  In the
original question, the first of the two example outputs was generated
by my package.  I drew the second one by hand.

I had hoped that this problem would bring up a lot of discussion about
the best way to draw trees, comparison of different methods for laying
out the nodes, and so on.  It didn't.  But there was some interesting
variety in the output produced by the programs that appeared on
qotw-perl-discuss.  I thought at first that the programs were almost
all the same, but they turned out to be rather different on the
inside.

Here are some sample outputs:

        barbie.pl

        +-----+   +-----+   +-----+
        |top  |-+-|a    |-+-|a1   |
        +-----+ | +-----+ | +-----+
                |         | +-----+
                |         +-|a2   |
                |           +-----+
                | +-----+
                +-|b    |
                | +-----+
                | +-----+
                +-|c    |
                  +-----+


        --------


        dax.pl

                            +-----+
                          ,-|a1   |
                  +-----+ | +-----+
                ,-|a    |-+
                | +-----+ | +-----+
                |         `-|a2   |
                |           +-----+
                | 
        +-----+ | +-----+
        |top  |-+-|b    |
        +-----+ | +-----+
                | 
                | +-----+
                `-|c    |
                  +-----+


        Notice how a1 and a2 are centered around the right edge of a,
        and how a, b, and c are centered around 'top', instead of
        hanging down below it.

        --------


        mjd.pl

                     +-----+
                     |top  |
                     +-----+
                  ,---' `-.-----.
               --'         `-    `----
            +-----+       +-----+  +-----+
            |a    |       |b    |  |c    |
            +-----+       +-----+  +-----+
           ,-'   `--
        +-----+  +-----+
        |a1   |  |a2   |
        +-----+  +-----+

        --------

        isaacson.pl

                      +-----+
                      |top  |
                      +-----+
               .---------^-.-------.   
            +-----+     +-----+ +-----+
            |a    |     |b    | |c    |
            +-----+     +-----+ +-----+
           .---^---.           
        +-----+ +-----+        
        |a1   | |a2   |        
        +-----+ +-----+        


        --------


        sainio.pl

        +-----+   +-----+   +-----+
        |top  |-+-|a    |-+-|a1   |
        +-----+ | +-----+ | +-----+
                |         | 
                |         | +-----+
                |         +-|a2   |
                |           +-----+
                | +-----+
                +-|b    |
                | +-----+
                | 
                | +-----+
                +-|c    |
                  +-----+


        --------


        schmidt.pl

        +-----+   +-----+   +-----+   
        |top  |-+-|a    |-+-|a1   |   
        +-----+ | +-----+ | +-----+   
                |         |           
                | +-----+ | +-----+   
                +-|b    | +-|a2   |   
                | +-----+   +-----+   
                |                     
                | +-----+             
                +-|c    |             
                  +-----+             


        Notice how the 'b' node has has cuddled in behind 'a2'. 

        --------

Unfortunately, all of the posted examples failed on one example or
another, for various reasons.  Ron Isaacson's code produced
good-looking output for eight of the nine examples I tried, but
aborted with a fatal error on one of them.  I would have tried to fix
it, but unfortunately he posted it only about half an hour ago.
Still, it's nearly the shortest of the solutions, and it might not be
hard to fix.


                 (Lines of code)

        sainio.pl      42
        isaacson.pl    44
        barbie.pl      47
        schmidt.pl     95
        dax.pl        168
        mjd.pl        219

It was difficult to decide which sample code to show.  The code I had
beforehand is much too big and complicated to be a good example.  Also
I had designed the interface a little differently, and had to attach a
rather peculiar glue module to the original code to get it to conform
to the interface required by the question.

I finally compromised and hacked on Garth Sainio's code, which was both
short and straightforward.  The result generated reasonably
good-looking outputs even for the weird test cases, and was even
shorter than the original.  (35 lines!  Woo hoo!)

        sub layout_tree {
            my($tree) = @_;

            # Get the representation of the root node
            my $node_lines = $tree->text;

            # Get the children of the node
            my @children = $tree->children;
            my $child_count = scalar(@children);

            # If no children, return the existing lines
            return @$node_lines unless($child_count);

            # Get the layout for each sub tree.
            my @child_lines;
            foreach my $child (@children) {
                my @lines = layout_tree($child);
        #        push @lines, "";
                push(@child_lines, \@lines);
            }


            # Combine the children with the current node
            my $node_height = scalar(@$node_lines);

            # find the index of the line nearest the middle  of the array
            my $middle = int(($node_height-1)/2);

            # Determine the width of the current node and create a string of
            # spaces to pad lines in the subtrees with.
            my $node_width = length($node_lines->[0]);
            my $node_pad = " " x $node_width;

            # Printing children to the right of the parent.
            # So prepend the lines to the child lines. Need to prepend spaces
            # to the other lines.
            # If there is only one child join with ---
            # multiple children are joined with -+- for the first and +- for subsequent

            my $yoff = $middle;
            for my $i (0..$child_count-1) {
                my $c_lines = $child_lines[$i];
                my $c_height = scalar(@$c_lines);

                # now need to combine node_lines and c_lines
                for my $j (0..$c_height-1) {
                    my $ypos = $yoff + $j;
                    my $separator = "   ";

                    if ($j == 0) {
                      if ($child_count == 1) {
                        $separator = " --";
                      } else {
                        $separator = " +-";
                      }
                    } elsif ($i < $child_count - 1) {
                      $separator = " | ";
                    }

                    # Attach on left side
                    substr($separator, 0, 1) = "-" if $ypos == $middle;
                    # Prepend the lines of the current node to the children lines
                    if($ypos < $node_height) {
                        $node_lines->[$ypos] = $node_lines->[$ypos] . $separator . $c_lines->[$j];
                    } else {
                        $node_lines->[$ypos] = $node_pad . $separator .  $c_lines->[$j];
                    }
                }
                $yoff += $c_height;
            }

            return @$node_lines;
        }

As with all the submissions, this program calls itself recursively to
lay out the subtrees of the current root node, and then tries to
assemble the subtree layouts and the root node text into a larger
diagram.

The program loops over the children (the 'for my $i' loop) and then
over each line in the child (the 'for my $j' loop) adding each line in
turn to $node_lines, which is its ultimate return value.  As it adds
the children, it inserts 'separator' text in between the root node (on
the left) and the child node (on the right).  Typically, this is just
three spaces.  If the current line happens to be exactly halfway down
the root node, the separator text is adjusted so that it attaches to
the root node; if it happens to be exactly at the top of a child node,
the separator is adjusted to that it attaches to the child node.
Inside the 'for my $j' loop, $j is the index of the current line
relative to the top of the current child node, and $ypos is the index
of the current line relative to the top of the parent node.

Here's a typical output:

        top-+-a-+-a1
            |   +-a2
            +-b
            +-c


If you want more space, you can uncomment the

        push @lines, "";

line; the resulting diagram will have extra space below each node:

        top-+-a-+-a1
            |   | 
            |   +-a2
            |     
            | 
            +-b
            | 
            +-c

----------------------------------------------------------------

1. If any of the child nodes are more than one line high, the output
   becomes skewed, although still accurate:

        +-----+
        |top  |-+-+-----+
        +-----+ | |a    |-+-+-----+
                | +-----+ | |a1   |
                |         | +-----+
                |         +-+-----+
                |           |a2   |
                |           +-----+
                +-+-----+
                | |b    |
                | +-----+
                +-+-----+
                  |c    |
                  +-----+

   My version of this program attaches each horizontal connector to
   the middle of the parent node, but to the top of the child nodes.
   Mr. Sainio's original code attached it to the middle of both nodes.

   Why did I change it?  Because the original program only worked when
   all the nodes were the same height; it assumed that a child was the
   same height as its parent, and didn't otherwise keep track of child
   node heights.  When the assumption was correct, the output looked
   fine.  When it wasn't, all sorts of odd effects appeared.  For
   example:

        Example 1                       Example 2


        +----+   +-+             +-----------------+   +-----+   +-+   
        |****|   |*|             |*****************|   |*****|   |*|   
        |****|-+-+-+             |*****************|   |*****|   +-+   
        |****|                   |*****************|   |*****|         
        |****|                   |*****************|   |*****|         
        +----+                   |*****************|   |*****|         
               |                 |*****************|   +-----+         
               | +-+             |*****************|                   
               | |*|             |*****************|                   
               +-+-+             |*****************|                   
               |                 |*****************|                   
               | +-+             |*****************|                   
               | |*|             |*****************|                   
               +-+-+             |*****************|                   
               |                 |*****************|                   
               | +-+             |*****************|                   
               | |*|             |*****************|                   
               +-+-+             |*****************|                   
               |                 +-----------------+                   
               | +-+                                 |                 
               | |*|                                 | +-+             
               +-+-+                                 | |*|             
                                                     | +-+             


   In example 1, the parent node has height 6, so the horizontal
   connector is attached to it is on the third row from the top;
   Sainio's program has attached the connector to the child nodes at
   _their_ third rows also.  In example 2, the root node has height
   18, so the horizontal connector is attached to it at the 9th row
   from the top; the program wants to attach the connector to the 9th
   row of the child nodes also, but they don't have 9th rows.  Because
   of a fluke in the design, the connector disappears entirely, except
   for a vestigial bit at the very bottom.

*. There are at least two ways to fix this.  One is to give up on
   attaching the horizontal connector to the middle of the child nodes,
   and attach it at the top instead.  This is what I did in the
   example program.

   The other way to fix this is to have the layout_tree() function
   return, a second value, in addition to the lines to be printed, to
   indicate how far down the picture the horizontal connector should
   be attached to the root node.  I did this, just to see how it would
   come out.  I had to rework some of the insides of the function,
   but the end result was not too much bigger than the original.  For
   the two examples above, this version of the program produces:

        +-----+   +-+    +-----------------+   +-----+   +-+      
        |*****| +-|*|    |*****************|   |*****| +-|*|      
        |*****| | +-+    |*****************|   |*****| | +-+      
        |*****|-+ +-+    |*****************| +-|*****|-+          
        |*****| +-|*|    |*****************| | |*****|            
        |*****| | +-+    |*****************| | |*****|            
        +-----+ | +-+    |*****************| | +-----+            
                +-|*|    |*****************| | +-+                
                | +-+    |*****************| +-|*|                
                | +-+    |*****************|-+ +-+                
                +-|*|    |*****************|                      
                | +-+    |*****************|                      
                | +-+    |*****************|                      
                +-|*|    |*****************|                      
                  +-+    |*****************|                      
                         |*****************|                      
                         |*****************|                      
                         |*****************|                      
                         +-----------------+                      

                                                          
   Similarly, the output produced by the original program is on the
   left; the output from the revised version is on the right:


        *-+-+-+   +-------+              *-+ +-+   +-------+
          | |*|-+-|*******|                +-|*|-+ |*******|
          | +-+ | |*******|                | +-+ | |*******|
          |     | |*******|                |     | |*******|
          |     | |*******|                |     +-|*******|
          |     | |*******|                |     | |*******|
          |     | |*******|                |     | |*******|
          |     | |*******|                |     | |*******|
          |     | +-------+                |     | +-------+
          |     | +-------+                |     | +-------+
          |     +-|*******|                |     | |*******|
          |       |*******|                |     | |*******|
          |       |*******|                |     | |*******|
          |       |*******|                |     +-|*******|
          |       |*******|                |       |*******|
          |       |*******|                |       |*******|
          |       |*******|                |       |*******|
          |       +-------+                |       +-------+
          +-+-+   +-------+                | +-+   +-------+
            |*|-+-|*******|                +-|*|-+ |*******|
            +-+ | |*******|                  +-+ | |*******|
                | |*******|                      | |*******|
                | |*******|                      +-|*******|
                | |*******|                      | |*******|
                | |*******|                      | |*******|
                | |*******|                      | |*******|
                | +-------+                      | +-------+
                | +-------+                      | +-------+
                +-|*******|                      | |*******|
                  |*******|                      | |*******|
                  |*******|                      | |*******|
                  |*******|                      +-|*******|
                  |*******|                        |*******|
                  |*******|                        |*******|
                  |*******|                        |*******|
                  +-------+                        +-------+


   Almost everyone got into trouble over this issue.
   http://perl.plover.com/qotw/misc/e005/outputs/out9 is a gallery of
   the results for example 2 above, a rather curious collection of
   trees with teratomas, fractures, and spina bifida.

2. Code and test harnesses are at

        http://perl.plover.com/qotw/misc/e005/

   Garth Sainio's original program is sainio.pl.  The modified version
   that I presented above is sainio-mjd.pl.  The version that always
   attaches the connectors in the centers of the nodes is
   sainio-mjd2.pl.

   Sample outputs are at 

        http://perl.plover.com/qotw/misc/e005/outputs/

3. dax.pl assumes that the return from ->text will always be exactly
   three lines high:

            ...
            elsif(@children == 1) {
                #Easy.
                return horiz_line_child(
                       "$space   ",
                    ["$text[0]   ",
                     "$text[1]---",
                     "$text[2]   "],
                       "$space   ",
                    $children[0]
                );
            }
            ...

   If the text was a different, size, the output would be garbled:


          barbie.pl     DAX.PL       isaacson.pl   sainio.pl    
                                                              
        top-+-a-+-a1     ,-  ,-a1a       top     top-+-a-+-a1 
            |   +-a2     | -+          .--^.-.       |   |    
            +-b          |   `-a2      a   b c       |   +-a2 
            +-c          |           .-^.            +-b      
                     -+-top | b      a1 a2           |        
                      |                              +-c      
                         |        
                         `-c      

   I think the lesson to learn from this is that if you have a magic
   constant like '2' in your program, you should pause and ask
   yourself 'why 2?'  


4. Dan Schmidt's cuddling algorithm was sometimes a little too
   affectionate.  For this tree:

        top-+-a-+-b-+-c   
            |   |   |     
            |   |   +-d   
            |   |         
            |   +-e---f   
            |
            +-g---h

   It produced this output:

        top-+-a-+-b-+-c   
            |   |   |     
            +-g-|-h +-d   
                |         
                +-e---f   


   See what happened? The g-h branch got moved up into the empty
   spaces below the a-b branch.  It fits there, but as a result the
   lines overlap.  One can almost imagine this as a feature, since the
   broken line from "g-|-h" correctly suggests that it passes
   underneath the a-b-e branch.  Unfortunately, it's *not* a feature,
   because for this tree:

        11-+-12-+-13-+-14
           |    |    | 
           |    |    +-24
           |    |    | 
           |    |    +-34
           |    |
           |    +-43---44
           |
           +-22-+-23
                | 
                +-33

   It produces this output:

        11-+-12-+-13-+-14        
           |    |    |           
           +-22-+-23 +-24        
                |    |           
                +-33 +-34        
                |                
                +-43---44


   Which isn't even a tree.  (Is 23 a child of 12 or 22?)

   I wanted to fix this, but I couldn't find an easy way to do it in
   the time available.

   In spite of this flaw, the algorithm is straightforward and
   interesting and the code was very clear.  I recommend that you have
   a look.   

5. Dan Schmidt used this construction to make an empty canvas with
   $total_height lines of $total_width spaces each:

     my @field = map {" " x ($total_width)} (0..($total_height-1));

   Here's a shortcut:

     my @field = (" " x $total_width) x $total_height;


Thanks to everyone who participated in the quiz, including both those
who mailed the -discuss list and those who did not.  I will send along
sample solutions for regular quiz #5 tomorrow sometime.

The perl-qotw list now has over 1,000 subscribers!  Thank you all for
your support and interest.