Sample solutions and discussion
Perl Expert Quiz of The Week #2 (20021023)
The local high school baseball team, the Randal Schwartz High School
Phoenixes, will be playing a series of games against their rivals, the
Richard M. Nixon Memorial High School Growlin' Fungus. The series
lasts at most five games, and ends when one team has won three games.
You want to bet $80 on the Phoenixes to win the series, but your
bookie will only take bets on individual games. (The bookie pays even
money on all bets.)
A mathematically-inclined friend solves the problem for you, giving
you the following instructions:
Bet $30 on each of the first two games.
Bet $20 on the third game if either team has won both of the
first two games, $40 otherwise.
Bet $40 on the fourth game, if there is one. Bet $80 on the
fifth game, if there is one.
At the end of the series, you will be ahead by exactly $80
if the Phoenixes have won the series, and behind by exactly
$80 if the Growlin' Fungus have won.
We could summarize the instructions in a table like this:
If the game score is:
0 to 0, bet $30
1 to 0, bet 30
1 to 1, bet 40
2 to 0, bet 20
2 to 1, bet 40
2 to 2, bet 80
Write a function which calculates the appropriate bet for any such
contest, given the total amount you want to risk on the series, the
length of the series, and the current game score. For example,
bet(80, 5, 2, 1)
should return 40, because if you want to risk $80 on a 5-game
series, and one team is presently ahead 2 games to 1, you should bet
$40.
Similarly
bet(1000, 7, 2, 1)
should return 375.
(That is, if you're trying to bet $1000 on the current World Series
baseball match, you need to bet $375 on the outcome of tonight's game.
If you started with $1000, and followed this function's advice, you'd
have $625 left if you had been betting on the Giants and $1375 if you
had been betting on the Angels. For people living in places where the
World Series is irrelevant: the match is a best-four-of-seven series
of games; at present, the Anaheim Angels are beating the San Francisco
Giants two games to one, with the fourth game scheduled for tonight.)
There are two challenges here: first, how can you figure out the right
bet at all, and second, what's a good way to get the computer to
calculate it?
It seems that some folks made a table of the bets for the Phoenixes -
Fungus series:
Phoenix Wins:
0 1 2
Fungus 0 30 30 20
Wins 1 30 40 40
2 20 40 80
And these clever folks noted that each value was the average of values
just below it and just to the right of it. In fact, the pattern extends
further, since you stop betting once one team has 3 wins:
Phoenix Wins:
0 1 2 3
Fungus 0 30 30 20 0
Wins 1 30 40 40 0
2 20 40 80 0
3 0 0 0
In fact, this observation is correct in all cases. But suppose you
didn't have this fortunate inspiration? How might you solve the
problem?
Here's how I solved it. First, I considered the case where each team
had won two games; it is the final game of the series. If the Phoenixes
win this game, I should have a total of $160 in my pocket; if the Fungus
wins, I should have $0. How much must I have before the game starts?
Clearly the only possible answer is that I must have $80, and I must bet
it all. If I have any amount other than $80, then no bet can leave me
with the right amounts in both cases. For example, if I started with
$60, I'd have to bet $100 in order to have the right amount if the
Phoenixes won; but then I'd have -$40 if they lost, which is wrong.
Now suppose the Fungus have won 1 game and the Phoenixes have won 2.
How much money must I have? If the Phoenixes win another game, they win
the series and I want to have $160. If the Fungus wins another game,
the series will be tied 2-2 and I should have exactly $80 left, by the
previous paragraph. Clearly, the only way this can happen is if I start
with $120 and bet $40 of it.
Similarly I can work out the amount of money I must have at each stage.
Let's let [a, b] represent the situation in which the Phoenixes have won
a games and the Fungus have won b games, and let's say that the amount
of money I must have in situation [a, b] is the 'value' of [a, b]. Thus
the value of [3, 0] is $160; the value of [0, 3] is $0, and the value of
[2, 2] is $80.
To find out the value [a, b], I just average the values of [a+1, b] and
[a, b+1]. The value of [a, b] must be exactly halfway between the value
of [a+1, b] and the value of [a, b+1], because if I bet $X, then
value([a+1, b ]) = value([a, b]) + $X
value([a , b+1]) = value([a, b]) - $X
so therefore, by simple algebra:
value([a+1, b ]) +
value([a , b+1]) = 2 * value([a, b])
and the bet, $X, is value([a+1, b]) - value([a, b]).
Here's the first code I wrote to apply these ideas:
sub bet {
my ($amount, $length, $a, $b) = @_;
return value($amount, $length, $a+1, $b) -
value($amount, $length, $a, $b);
}
sub value {
my ($amount, $length, $a, $b) = @_;
return $amount*2 if $a > $length / 2; # Phoenixes win
return 0 if $b > $length / 2; # Fungus win
return (value($amount, $length, $a+1, $b) +
value($amount, $length, $a , $b+1))/2;
}
The 'value' function computes the value of a particular situation; 'bet'
computes the amount I should bet.
1. With the algebraic observations above, one can find a simpler solution:
bet([a, b]) = value([a+1, b]) - value([a, b])
= (value([a+2, b ]) + value([a+1, b+1]))/2 -
(value([a+1, b ]) + value([a , b+1]))/2
= ((value([a+2, b ]) + value([a+1, b ])) -
(value([a+1, b+1]) + value([a , b+1])))/2
= (bet([a+1, b]) + bet([a, b+1]))/2
which is what the clever folks I mentioned earlier were able to
obtain through inspiration. The resulting simpler code is:
sub bet {
my ($amount, $length, $a, $b) = @_;
return 0 if $a > $length/2 || $b > $length/2; # series over
return $amount if $a + $b == $length - 1; # series tied at final game
return (bet($amount, $length, $a+1, $b) +
bet($amount, $length, $a, $b+1))/2;
}
2. Although the code here is simple and straightforward, and works well
for the examples, it fails badly for longer tournaments. As Piers
Cawley pointed out, if you try to use it to compute bets on the
outcome of the final round of the World Snooker Tournament, in which
Peter Ebdon beat Stephen Hendry in a best-18-of-35-frames struggle,
it takes a long, long time. I estimate about 111 hours on my
computer.
The problem is too much redundant recursion, and as usual, an easy
solution is to memoize it:
use Memoize;
memoize 'bet';
sub bet {
# ... as before ...
}
It now quickly computes that your bet in the first frame should be 13
pounds 58 pence if you want to risk 100 pounds on the outcome of the
entire round.
3. The memoization approach happily computes the entire bet table and
stores it internally, but a number of people preferred to compute it
manually. Here's Michael Carman's solution:
{
my %tbl;
sub bet {
my ($amount, $games, $won, $lost) = @_;
my $s = int($games / 2);
unless (exists($tbl{$s})) {
my @t;
foreach my $i (reverse (0 .. $s)) {
foreach my $j (reverse (0 .. $s)) {
if ($i == $s && $j == $s) {
$t[$i][$j] = 1;
}
elsif ($i == $s) {
$t[$i][$j] = $t[$i][$j+1] / 2;
}
elsif ($j == $s) {
$t[$i][$j] = $t[$i+1][$j] / 2;
}
else {
$t[$i][$j] = ($t[$i+1][$j] + $t[$i][$j+1]) / 2;
}
}
}
$tbl{$s} = \@t;
}
return $amount * $tbl{$s}->[$won][$lost];
}
}
Michael realized that you don't have to compute a different table for
every possible bet amount; it's enough to compute the betting tables
for a $1 total risk, and then if you want to risk $X instead, just
multiply all the bets by X. Applying the same improvement to the
'Memoize' solution yields:
use Memoize;
memoize 'bet1';
sub bet {
my $total_risk = shift;
$total_risk * bet1(@_);
}
sub bet1 {
my ($length, $a, $b) = @_;
return 0 if $a > $length/2 || $b > $length/2; # series over
return $amount if $a + $b == $length - 1; # series tied at final game
return (bet1($length, $a+1, $b) +
bet1($length, $a, $b+1))/2;
}
4. John Macdonald then observed that you don't need to compute a
different betting table for each different length series, as Michael
was doing, since:
If you index the table with the number of games each team needs
to win, rather than the number they have won so far, one table
works for all situations. (For example, a one-game "series" has
the same characteristics as the final game of any longer series
that goes to the limit. Similarly, if the score is 2 games to 1
in a 7 game series, it is at exactly the same position as a 11
game series that has reached 4 games to 3.)
John provided a 'needwinner' function that computed the table out to
a specified number of games; then his 'bet' function was simple:
sub bet {
my( $amt, $games, $schwartz, $nixon ) = @_;
my $winner = int( ($games+1) / 2 );
die "invalid args" unless $schwartz < $winner && $nixon < $winner;
needwinner( $winner );
return $amt * $bet[$winner-$schwartz][$winner-$nixon];
}
5. While this was going on, John also noticed that the bet amounts obey
a very simple mathematical formula. Supposing that the total risk is
$1, the bet table is computed by putting $1 in the upper left corner,
and then each entry is half the sum of the two entries just above it
and jut to the left of it. Except for the 'half' part, this is
exactly a recipe for computing "Pascal's Triangle", a well-known
mathematical object. The 'half' part just as the effect of
multiplying the Pascal's Triangle entries by 2**(-n-1), where n is
the number of games left to play in the series. Pascal's Triangle
looks like this:
1 1 1 1 1 ...
1 2 3 4 ...
1 3 6 ...
1 4 ...
1 ...
...
The bets for the Phoenixes-Fungus series look like this:
1 1/2 1/4
1/2 1/2 3/8
1/4 3/8 3/8
The resemblance isn't obvious unless you rewrite the bets a little bit:
1/1 1/2 1/4
1/2 2/4 3/8
1/4 3/8 6/16
The numerators are the corresponding numbers from Pascal's Triangle;
the denominators are successive powers of 2.
The powers of 2 are of course trivial, and the entries in Pascal's
Triangle are also easy to compute: the value with n+1 games left to
play where the Phoenixes need to win k+1 more games is called
choose(n, k), and it's equal to
n! / (k! * (n-k)!)
where ! represents the factorial function. John's code based on this
observation is simple and efficient, even without memoization:
sub bet {
my ($amount, $games, $won, $lost) = @_;
my $s = int($games / 2);
my $n = 2 * $s - $won - $lost;
my $k = $s - $won;
return $amount * n_choose_k($n, $k) / 2 ** $n;
}
sub n_choose_k {
my ($n, $k) = @_;
return factorial($n) / (factorial($k) * factorial($n - $k));
}
sub factorial {
return 1 if $_[0] <= 1;
return $_[0] * factorial($_[0] - 1);
}
6. One problem with John's implementation is that the
n! / (k! * (n-k)!) formula is not a very good way to compute entries
from Pascal's Triangle. The reason is that factorials are very
large, while the Triangle entries are not so large; when you compute
n! / (k! * (n-k)!) you're computing a very large number divided by
another very large number, and the computer tends to lose precision.
Even though choose(52, 26) fits in a 32-bit integer, the intermediate
values of 52! and 26! are way too big. Should you ever need to
compute entries from Pascal's Triangle, here's the way to do it:
sub choose {
my ($n, $k) = @_;
my $t = 1;
for my $i (1 .. $k) {
$t *= $n - $k + $i;
$t /= $i;
}
$t;
}
$t here is always an integer, and never gets bigger than necessary.
John later posted a solution that fixed this problem; without this
fix, the program produces slightly inaccurate results for series of
33 games and longer.
7. While I was working on this earlier this month, I realized that you
can use the same techniques to manufacture unusual bets. For
example, suppose you'd like to bet $1000 that the World Series lasts
a full seven games. But your bookie won't take such a bet. No
problem!
VALUE
Angels |Giants wins
Wins |0 1 2 3 | 4
+--------------------------------+----
0 |1000 1000 800 400 | 0
| |
1 |1000 1200 1200 800 | 0
| |
2 | 800 1200 1600 1600 | 0
| |
3 | 400 800 1600 3200 |3200
--+--------------------------------+----
4 | 0 0 0 3200 |
As before, the bet in situation [a, b] is just value([a+1, b]) -
value([a, b]). So in the first game, you bet nothing; in the second
game you must bet $200 on the team that lost the first game; in the
third game you bet $400 on the team that lost the first two games,
or, if each team lost one game, you bet nothing; and so on.
Note that you must start with $1000, and you risk losing it all
against the chance of winning $3200; this means that the odds of the
World Series lasting seven games are 3200:1000, which is a
probability of 23.8%. (Assuming that the teams are equally matched.)
Similarly, if you'd like to manufacture a betting schedule that pays
off $37 if your team wins in exactly six games, pays of $19 if your
team wins in 4, 5, or 7 games, and loses $23 if your team loses, you
can do that.
8. The problem was originally posed to me by someone who expects to go
work for Goldman-Sachs, and investment bank. There is a very serious
application of this problem. Suppose you want to risk a total of
$1,000,000 in the stock market over the next year; you want to bet
that IBM stock goes up. Unfortunately, you can't make such a bet.
What you can do is buy and sell IBM stock each day for the next year.
If you buy more stock, you're increasing your bet on IBM to win; if
you sell stock, you are reducing your bet. When the stock goes up,
that's like your team winning, because your stock is now more
valuable; when the stock goes down, your team lost and you lose the
bet. By adjusting your holdings appropriately, you can arrange
whatever balance of payoffs you like. You can construct a
buy-or-sell schedule that pays off if, and only if, IBM stock goes
down between 5 and 10 percent.
Such a schedule will have at least a few hundred elements, and
perhaps thousands, since you can fine-tune it to tell you what to do
every hour. So Piers's comment about the 35-frame World Snooker
Championship finals isn't so far-fetched!
Thanks again to all the subscribers, and to those who participated in
the discussion. I will send another quiz on Wednesday.