Sample solutions and discussion Perl Quiz of The Week #6 (20021120) Write a function, format_number_list, whose argument is a list of integers. It then returns a string which represents the input list in compact, human-readable form. For example, format_number_list(1, 2, 4, 5, 6, 7, 9, 13, 24, 25, 26, 27) will return "1-2, 4-7, 9, 13, 24-27" Also write a function, 'expand_number_list', which does the conversion in the opposite direction, so that expand_number_list("1-2, 4-7, 9, 13, 24-27") will return (1, 2, 4, 5, 6, 7, 9, 13, 24, 25, 26, 27) ---------------------------------------------------------------- I'll show solutions for format_number_list first. I'm going to present two sample solutions this week. One was contributed by Andreas Koenig: use Set::IntSpan; # CPAN rules :-) use strict; sub format_number_list { my(@n) = @_; my $set = Set::IntSpan->new(join ",", @n); my $run = $set->run_list; $run =~ s/,/, /g; # give them the spaces $run; } To summarize this solution: Set::IntSpan is a CPAN module that already does almost exactly what was requested; Andreas simply wrapped it. Andreas said: "I surely was amazed that nobody found it till monday." I was amazed also. Here's a synthetic but straightforward solution, from James Gray, slightly modified by me: sub format_number_list { my @output ; while (@_) { my $range_start = shift; my $range_end = $range_start; # check if the numbers go in sequence from here $range_end = shift while (@_ && $_[0] == $range_end + 1); # ...and add to output accordingly if ($range_start == $range_end) { push @output, $range_start; } else { push @output, "$range_start-$range_end"; } } join ", ", @output; } 1. This quiz generated a large amount of discussion about what to do if the input list was out of order, if the input list contained repeated numbers, if the input list contained negative numbers, and so on. had I thought about the problem more carefully before posting, I would have said something about some of these situations. The application I originally had in mind was .newsrc files. Lines of a .newsrc file indicate which news articles in a newsgroup have already beenread. News articles are numbered starting from 1, so the negative-number issue never comes up, and the .newsrc line represents a set of integers, not a list, so order and repetition is a nonissue. But I didn't say this in the question. 2. Then there was addition discussion about what the function should do if given numbers out of order. For example, given (1,2,3,7,4,5,6), should it produce "1-3, 7, 4-6" or "1-7" or something else? There were good arguments in both directions: * Maybe the order is significant; then you don't want to alter it. * If you sort the input before processing, you foreclose the possibility of the function ever treating (1, 3, 2) differently from (1, 2, 3). But if you're careful not to change the order, the user who wants you to treat them the same can still call format_number_list(sort numerically @nums); * If the purpose of the function is to generate 'human-readable lists' then reordering the numbers for maximum compression is more likely to achieve that. For the .newsrc case, since the lists are actually sets, reordering makes sense. For other applications, it may not. I tested these separately. There was also some discussion about whether (3, 2, 1) should turn into "3-1" or "3, 2, 1", supposing that the input was not to be reordered. 3. If negative numbers are allowed, the required output format is ugly and hard to read. (-3, -2, -1) would turn into "-3--1". Some people opted for "-3..-1" instead. 4. Some people also modified the output formats in various other ways. Then I had to hack on them to get them to work with the test harness. You'd be surprised at how difficult it was to make trivial formatting changes in some of these programs. In several cases I had to hunt down and change the same punctuation in several places. 5. With all this discussion, I was surprised to see how few of the submitted solutions actually worked properly for the straightforward cases: Postive integers in ascending order with no repeats. Of the 21 samples I tested, several failed some of the basic cases. (See http://perl.plover.com/qotw/misc/r006/RESULTS/format/NOTOK . ) 6. Even the programs which were advertised by their authors as handling certain special cases, often didn't. 7. My conclusion from all this is that maybe people would do well to pay more attention to basic correctness in the simple cases before worrying a lot about making the functionality as complete as possible. Now 'expand_number_list'. Here's Andreas Koenig's Set::IntSpan version: use Set::IntSpan; # CPAN rules :-) use strict; sub expand_number_list { my $run = shift; my $set = Set::IntSpan->new($run); $set->elements; } Here's Tom Varga's, cleaned up a little: sub expand_number_list { my @result ; for my $num (split(/\s* , \s*/x, $_[0])) { push(@result, ($num =~ /(\d+) \s* - \s* (\d+)/x) ? ($1 .. $2) : $num ) ; } @result ; } 8. Several people seemed to misunderstand that 'expand_number_list' was supposed to return a list of numbers, not a string. That's all for regular quiz #6. I'll send a postmortem of the expert quiz later on today, and new quizzes tomorrow. Thanks to everyone who participated.