Are the names counter-intuitive? Yes. Whatever. Let's show an example of a program that uses one. (BTW, if you want more practice, do the 500-point problem from division one of Topcoder SRM 337).
If you took CS202 from me, you may have done the Keno lab. I'll refresh your memory with the preamble to the lab:
You wander around the casino, and see this game called Keno. It's a bit like a lottery. There are 80 balls numbered 1 through 80, and they will pick 20 of them randomly. They have a catchy little flier about all the Keno bets you can make:
[Figure: the Keno flier]
Now, we're talking entertainment! You know there's no way that these tempting little bets are going to make you money in the long run, but you have a little mathematical problem to solve, and that's better than gambling!
The Lucky Loser Bet
[Figure: the flier's description of the Lucky Loser bet]
Mr. Thump thinks that this is a catchy game. People may well take out "insurance" on their Keno bets by choosing all of their Keno picks as Lucky Losers. At a very high level, it looks like a good bet for the following reason:
If your number is picked, nothing matters. However, if it is not picked, then it would seem as though you have a roughly 50-50 chance of having the closest number picked be higher than yours. You're getting $1.25 instead of $1.00 on what seems to be a 50-50 chance. I'd take those odds!
Now, of course, the odds aren't 50-50. Why? Suppose the numbers 4 and 8 are picked, but 5, 6 and 7 are not. 7 is a winning bet, but 6 and 5 are not: 6 is equally close to 4 and 8, and you only win when the higher ball is strictly closer. In other words, when the number of unpicked balls between two picked balls is an odd number i, there are (i-1)/2 winners and (i+1)/2 losers. Of course, if that number is even, then there are an equal number of winners and losers.
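To make that counting argument concrete, here is a small brute-force sketch. The function name count_gap is mine and is not part of the lab code; it simply applies the rule that a ball wins only when the higher picked ball is strictly closer than the lower one.

```cpp
#include <cassert>
#include <utility>

/* Hypothetical helper, not part of the lab code: counts winners and losers
   among the unpicked balls strictly between picked balls lo and hi.
   A ball b wins only when hi is strictly closer to it than lo. */
std::pair<int, int> count_gap(int lo, int hi)
{
  assert(lo < hi);
  int winners = 0, losers = 0;
  for (int b = lo + 1; b < hi; b++) {
    if (hi - b < b - lo) winners++; else losers++;
  }
  return std::make_pair(winners, losers);
}
```

With lo = 4 and hi = 8, this reports one winner (7) and two losers (5 and 6). With lo = 4 and hi = 9, the four unpicked balls split evenly.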
It would take some good math to figure out the closed-form probability of the Lucky Loser bet. However, we can simulate, and that's just as good, at least to Mr. Thump. We'll go through a sequence of programs that illustrate a number of points.
class Keno_LL {
  public:
    int NB;                        // Initial parameter: Number of balls in play (80 in our example)
    int NP;                        // Initial parameter: Number of balls picked each time (20 in our example)
    double Payout;                 // This is how much you win in the Lucky Loser bet (1.25 in our example)
    int Iterations;                // Number of iterations for the simulation. 0 if interactive
    int Verbose;                   // Output on each iteration, or just at the end?

    set <int> Picked;              // This is used for the balls picked at each iteration
    int Wins, Losses, Ties;        // Stats - total wins, losses and ties
    double Winnings;               // Total winnings (yes, we could calculate from above)
    double N;                      // Iteration so far

    void Do_Picking();             // Creates Picked randomly
    void Calculate_Payout(int b);  // Given a ball b, and set Picked, calculates the payout and updates the stats.
};
All these are straightforward. At each iteration, we'll call Do_Picking(), which will put NP random numbers from 1 to NB into Picked. Then we'll get a value of b, either from standard input or randomly, and call Calculate_Payout(), which will determine whether b is a winner, loser or tie, and update all those stats accordingly. I won't show the code for the main() -- it simply parses command line arguments, sets up an instance of the Keno_LL class, and makes the appropriate calls to Do_Picking() and Calculate_Payout().
Let's take a look at those methods:
/* Procedure to pick the keno balls randomly.  The balls are put into the set
   "Picked," which is sentinelized so that the first ball is at the end of the set,
   after the maximum numbered ball, and the last ball is at the beginning of the
   set, before ball 1. */

void Keno_LL::Do_Picking()
{
  int i, j, first, last;
  set <int>::iterator pbit;

  Picked.clear();
  for (i = 0; i < NP; i++) {
    do j = random()%NB+1; while (Picked.find(j) != Picked.end());
    Picked.insert(j);
  }

  if (Verbose) {
    cout << "Balls Picked:";
    for (pbit = Picked.begin(); pbit != Picked.end(); pbit++) {
      cout << " " << *pbit;
    }
    cout << ".\n";
  }

  first = *(Picked.begin());        /* Sentinelize Picked */
  last = *(Picked.rbegin());
  Picked.insert(NB+first);
  Picked.insert(last-NB);
}
Do_Picking() puts random numbers into a set. It takes care not to put duplicates into the set (although of course, the set will reject duplicates -- you could easily write this to take advantage of that fact). If specified, it prints out the balls picked. At the end, it sentinelizes the set. If the smallest element is s and the largest l, then it inserts (l-NB) and (NB+s) into the set. This is to handle the case when the ball that the contestant picks is lower than all the picked balls, or higher than all the picked balls. Then, you don't have to have any special case code in Calculate_Payout(). (I'll go over this in greater detail below.)
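As a sketch of that alternative, the loop below keeps inserting until the set holds the right number of balls, and lets the set silently drop duplicates. The function name pick_balls and the use of std::mt19937 in place of random() are my choices to keep the example self-contained; they are not what the lab code does, and the sentinels are left out here.

```cpp
#include <cassert>
#include <random>
#include <set>

/* Hypothetical stand-alone version of the picking loop. It relies on
   set::insert() ignoring values that are already in the set. */
std::set<int> pick_balls(int nb, int np, std::mt19937 &rng)
{
  assert(np <= nb);   /* Otherwise the loop would never finish. */
  std::uniform_int_distribution<int> ball(1, nb);
  std::set<int> picked;

  while ((int) picked.size() < np) picked.insert(ball(rng));
  return picked;
}
```

The loop condition tests the size of the set rather than counting insertions, so a rejected duplicate simply costs one more trip around the loop.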
Here's Calculate_Payout():
void Keno_LL::Calculate_Payout(int b)
{
  set <int>::iterator pbit;
  int u, l;
  double win;

  /* Determine whether b is picked (a tie), a winner or a loser. */

  pbit = Picked.lower_bound(b);
  if (*pbit == b) {
    win = 0;
    Ties++;
  } else {
    u = *pbit;
    pbit--;
    l = *pbit;
    if (u - b < b - l) {
      win = Payout;
      Wins++;
    } else {
      win = -1;
      Losses++;
    }
  }

  /* Update stats, and print out what happened, if desired. */

  Winnings += win;
  N++;

  if (Verbose) {
    if (win == 0) {
      printf("  Your ball was picked. +0: ");
    } else {
      printf("  D to higher: %d. D to lower: %d. %+.2lf: ", u-b, b-l, win);
    }
    printf("Total = %.2lf. Avg = %.6lf\n", Winnings, Winnings/N);
  }
}
I use the lower_bound method to find the smallest element in Picked that is greater than or equal to b. Since we sentinelized Picked, we're guaranteed that there will be an element greater than or equal to b and an element less than b. That's nice, because we don't have to test whether pbit is equal to Picked.end() or Picked.begin().
If b is not in the set, then we find the closest picked balls greater than and less than b with "u = *pbit; pbit--; l = *pbit;". This works because you're allowed to increment and decrement iterators to move around the set.
The rest of the code is straightforward.
Let me go over the sentinels in more detail. Suppose that the following balls are picked:

2 3 5 12 13 26 31 35 36 38 44 45 51 54 60 65 67 68 70 76
And suppose that we don't use a sentinel in Picked. Let me give three different examples of numbers that the user chooses. Suppose that the user chooses 37. We'll look that up in Picked using Picked.lower_bound(). It will return an iterator to 38, because 38 is the smallest value greater than or equal to 37. We can decrement the iterator to find the greatest value less than 37 -- that is 36. Thus, we can determine that the user's pick was a loser quite easily.
In example 2, suppose that the user chooses 1. When we call Picked.lower_bound(), it will return an iterator to 2. Since 2 is the smallest value in the set, we'll need to write special-purpose code to calculate that 76 is the "greatest value less than" 1, and that its distance from 1 is 5. That's a drag.
In example 3, suppose that the user chooses 77. Now, Picked.lower_bound() is going to return Picked.end(), and we have to write more special-purpose code to determine that 76 is the lower value and 2 is the "higher" value.
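To show what that special-purpose code might look like, here is a sketch of the lookup without sentinels. The function distances_no_sentinel and its parameters are hypothetical names of mine, not code from the lab; it assumes the set is non-empty and that b itself was not picked.

```cpp
#include <cassert>
#include <set>

/* Hypothetical helper: finds the distances from unpicked ball b to the
   nearest picked balls above and below it, wrapping around nb, when the
   set has no sentinels. */
void distances_no_sentinel(const std::set<int> &picked, int nb, int b,
                           int &to_higher, int &to_lower)
{
  assert(!picked.empty() && picked.count(b) == 0);
  std::set<int>::const_iterator it = picked.lower_bound(b);

  /* Special case: nothing picked is above b, so wrap to the smallest ball. */
  if (it == picked.end()) to_higher = *picked.begin() + nb - b;
  else to_higher = *it - b;

  /* Special case: nothing picked is below b, so wrap to the largest ball. */
  if (it == picked.begin()) {
    to_lower = b + nb - *picked.rbegin();
  } else {
    it--;
    to_lower = b - *it;
  }
}
```

Both wraparound branches have to be written and tested separately, which is exactly the drag the three examples describe.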
We use the sentinels to avoid writing all of that special-purpose code. We insert 82 = 2+80 and -4 = 76-80 into the set, which now becomes:
-4 2 3 5 12 13 26 31 35 36 38 44 45 51 54 60 65 67 68 70 76 82
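Here is a minimal sketch of the same three lookups with the sentinels in place. The free functions sentinelize and classify are stand-ins of mine that mirror the logic of Do_Picking() and Calculate_Payout(); they are not the lab's code.

```cpp
#include <cassert>
#include <set>

/* Hypothetical helper: adds the two sentinels described in the text to a
   non-empty set of picked balls numbered 1 through nb. */
void sentinelize(std::set<int> &picked, int nb)
{
  assert(!picked.empty());
  int first = *picked.begin();
  int last = *picked.rbegin();
  picked.insert(nb + first);
  picked.insert(last - nb);
}

/* Returns +1 for a winner, -1 for a loser and 0 when b was picked.
   Assumes 1 <= b <= nb and that picked has been sentinelized. */
int classify(const std::set<int> &picked, int b)
{
  std::set<int>::const_iterator it = picked.lower_bound(b);
  if (*it == b) return 0;
  int u = *it;    /* Closest picked ball (or sentinel) above b. */
  it--;
  int l = *it;    /* Closest picked ball (or sentinel) below b. */
  return (u - b < b - l) ? 1 : -1;
}
```

For the set above, 37 sits between 36 and 38 and loses, 1 sits between the sentinel -4 and 2 and wins, and 77 sits between 76 and the sentinel 82 and loses. None of the three needs a special case.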
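We are now guaranteed two things when we call Picked.lower_bound() on any ball b from 1 to NB. First, it will never return Picked.end(), because the high sentinel is greater than b. Second, if b was not picked, it will never return Picked.begin(), because the low sentinel is less than b, so it is always safe to decrement the iterator.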
UNIX> g++ -o bin/keno-ll src/keno-ll.cpp
UNIX> bin/keno-ll
usage: keno-ll #balls #picked payout iterations-(zero-to-play) verbose(y|n)
UNIX> bin/keno-ll 80 20 1.25 0 y # Play the game interactively
Pick your ball: 8
Balls Picked: 2 3 5 12 13 26 31 35 36 38 44 45 51 54 60 65 67 68 70 76.
  D to higher: 4. D to lower: 3. -1.00: Total = -1.00. Avg = -1.000000
Pick your ball: 8
Balls Picked: 1 2 4 7 8 20 31 35 36 37 39 48 51 58 65 67 70 73 75 80.
  Your ball was picked. +0: Total = -1.00. Avg = -0.500000
Pick your ball: 8
Balls Picked: 3 12 15 26 32 33 35 36 39 40 49 50 54 59 65 66 69 70 72 73.
  D to higher: 4. D to lower: 5. +1.25: Total = 0.25. Avg = 0.083333
Pick your ball: 8
Balls Picked: 1 7 8 12 16 22 28 30 35 41 45 46 51 52 62 64 69 71 73 75.
  Your ball was picked. +0: Total = 0.25. Avg = 0.062500
Pick your ball: 8
Balls Picked: 5 11 15 17 20 21 22 27 29 32 35 43 47 50 56 57 61 66 67 68.
  D to higher: 3. D to lower: 3. -1.00: Total = -0.75. Avg = -0.150000
Pick your ball: <CNTL-D>
UNIX> bin/keno-ll 80 20 1.25 0 n # Play the game from the command line, output at the end.
8 8 8 8 8
<CNTL-D>
Total = 4.00. Avg = 0.800000. W/L/T: 4 1 0
UNIX> bin/keno-ll 80 20 1.25 5 y # Five random games with verbose output
Picked 59
Balls Picked: 5 8 9 14 15 19 23 24 33 45 54 61 66 69 70 71 75 78 79 80.
  D to higher: 2. D to lower: 5. +1.25: Total = 1.25. Avg = 1.250000
Picked 58
Balls Picked: 1 3 7 10 13 17 31 35 40 42 46 49 57 59 65 68 70 71 75 80.
  D to higher: 1. D to lower: 1. -1.00: Total = 0.25. Avg = 0.125000
Picked 34
Balls Picked: 6 7 8 15 22 23 26 35 39 41 49 55 56 59 64 69 72 75 76 77.
  D to higher: 1. D to lower: 8. +1.25: Total = 1.50. Avg = 0.500000
Picked 7
Balls Picked: 1 7 11 13 24 26 29 33 34 37 46 50 52 54 58 64 66 67 75 79.
  Your ball was picked. +0: Total = 1.50. Avg = 0.375000
Picked 68
Balls Picked: 4 6 8 10 12 20 23 26 28 31 33 38 43 49 58 67 71 74 76 77.
  D to higher: 3. D to lower: 1. -1.00: Total = 0.50. Avg = 0.100000
UNIX> bin/keno-ll 80 20 1.25 5 n # Five random games with output at the end
Total = -2.00. Avg = -0.400000. W/L/T: 0 2 3
UNIX>

If we choose a large number of iterations, we can start to see whether this is a good or bad bet over the long run. I use time to show how long each run takes (on my Macintosh):
UNIX> time bin/keno-ll 80 20 1.25 10 n
Total = 3.00. Avg = 0.300000. W/L/T: 4 2 4
0.000u 0.000s 0:00.00 0.0% 0+0k 0+1io 0pf+0w
UNIX> time bin/keno-ll 80 20 1.25 100 n
Total = -21.00. Avg = -0.210000. W/L/T: 24 51 25
0.003u 0.001s 0:00.00 0.0% 0+0k 0+0io 0pf+0w
UNIX> time bin/keno-ll 80 20 1.25 1000 n
Total = -31.25. Avg = -0.031250. W/L/T: 315 425 260
0.022u 0.001s 0:00.02 100.0% 0+0k 0+0io 0pf+0w
UNIX> time bin/keno-ll 80 20 1.25 10000 n
Total = -566.75. Avg = -0.056675. W/L/T: 3069 4403 2528
0.170u 0.001s 0:00.17 100.0% 0+0k 0+0io 0pf+0w
UNIX> time bin/keno-ll 80 20 1.25 100000 n
Total = -3068.00. Avg = -0.030680. W/L/T: 32096 43188 24716
1.637u 0.001s 0:01.63 100.0% 0+0k 0+0io 0pf+0w
UNIX> time bin/keno-ll 80 20 1.25 1000000 n
Total = -26723.75. Avg = -0.026724. W/L/T: 321273 428315 250412
16.364u 0.009s 0:16.37 99.9% 0+0k 0+0io 0pf+0w
UNIX>

Well, it appears to be converging slightly, but man, that's slow. First, let's use the optimizer -- that usually speeds things up. There are four levels of optimization, and usually the -O3 flag gives you the best bang for the buck.
UNIX> g++ -o bin/keno-ll -O src/keno-ll.cpp
UNIX> time bin/keno-ll 80 20 1.25 1000000 n
Total = -28030.00. Avg = -0.028030. W/L/T: 321128 429440 249432
6.415u 0.001s 0:06.41 100.0% 0+0k 0+0io 0pf+0w
UNIX> g++ -o bin/keno-ll -O2 src/keno-ll.cpp
UNIX> time bin/keno-ll 80 20 1.25 1000000 n
Total = -27554.00. Avg = -0.027554. W/L/T: 321096 428924 249980
6.185u 0.003s 0:06.18 100.0% 0+0k 0+0io 0pf+0w
UNIX> g++ -o bin/keno-ll -O3 src/keno-ll.cpp
UNIX> time bin/keno-ll 80 20 1.25 1000000 n
Total = -27053.75. Avg = -0.027054. W/L/T: 321521 428955 249524
6.150u 0.003s 0:06.15 100.0% 0+0k 0+0io 0pf+0w
UNIX> g++ -o bin/keno-ll -O4 src/keno-ll.cpp
UNIX> time bin/keno-ll 80 20 1.25 1000000 n
Total = -27600.75. Avg = -0.027601. W/L/T: 320969 428812 250219
6.150u 0.002s 0:06.15 100.0% 0+0k 0+0io 0pf+0w
UNIX>

(BTW, the makefile compiles with -O3).
A better way is to put all of the numbers from 1 to 80 into an array, and then randomly pull them out. Each time you "pull a number out", you move it to the end of the array, and then don't consider it for the next pick.
The new code is in src/keno-ll2.cpp. I've added a vector Balls to the Keno_LL class, and I have initialized it to hold the numbers 1 through NB. Then, Do_Picking() works as follows:
void Keno_LL::Do_Picking()
{
  int i, j, first, last, tmp;
  set <int>::iterator pbit;

  Picked.clear();
  for (i = 0; i < NP; i++) {
    j = random()%(NB-i);
    tmp = Balls[j];
    Balls[j] = Balls[NB-i-1];
    Balls[NB-i-1] = tmp;
    Picked.insert(Balls[NB-i-1]);
  }
  ...
It runs faster, but not by a huge amount (9 percent):
UNIX> make bin/keno-ll2
g++ -O3 -o bin/keno-ll2 src/keno-ll2.cpp
UNIX> time bin/keno-ll2 80 20 1.25 1000000 n
Total = -27893.75. Avg = -0.027894. W/L/T: 321237 429440 249323
5.652u 0.005s 0:05.65 100.0% 0+0k 0+0io 0pf+0w
UNIX>

void Keno_LL::Calculate_All()
{
  int b;

  for (b = 1; b <= NB; b++) Calculate_Payout(b);
}
We'll compile with optimization and run it:
UNIX> make bin/keno-ll3
g++ -O3 -o bin/keno-ll3 src/keno-ll3.cpp
UNIX> time bin/keno-ll3 80 20 1.25 1000000 n
Total = -27149.25. Avg = -0.027149. W/L/T: 321267 428733 250000
0.108u 0.001s 0:00.10 100.0% 0+0k 0+0io 0pf+0w
UNIX>

Dang, that was fast! It's because we're iterating 1000000/80 times instead of 1000000. Granted, we're doing a little more work at each iteration, but not much.
The code is in src/keno-ll4.cpp:
void Keno_LL::Calculate_All()
{
  set <int>::iterator low, high;
  int x, highest;
  int nw, nl;

  highest = *Picked.rbegin();
  low = Picked.begin();
  high = low;
  high++;
  nw = 0;
  nl = 0;
  while (*high != highest) {
    x = *high - *low - 1;
    nw += (x/2);
    nl += (x - x/2);
    low++;
    high++;
  }

  Wins += nw;
  Losses += nl;
  Ties += NP;
  Winnings += nw*Payout;
  Winnings -= nl;
  N += NB;
}
Note once again how the sentinels help us keep our code clean. It speeds matters up even further:
UNIX> make bin/keno-ll4
g++ -O3 -o bin/keno-ll4 src/keno-ll4.cpp
UNIX> time bin/keno-ll4 80 20 1.25 1000000 n
Total = -27682.50. Avg = -0.027682. W/L/T: 321030 428970 250000
0.076u 0.000s 0:00.07 100.0% 0+0k 0+0io 0pf+0w
UNIX>

0.076 vs 0.108 may not seem like much, but it is over 25 percent.
The code is in src/keno-ll5.cpp, where we write a separate Do_Picking_Array() which puts the balls into an array PBalls:
void Keno_LL::Do_Picking_Array()
{
  int i, j, first, last, tmp;

  for (i = 0; i < NP; i++) {
    j = random()%(NB-i);
    tmp = Balls[j];
    Balls[j] = Balls[NB-i-1];
    Balls[NB-i-1] = tmp;
    PBalls[i] = Balls[NB-i-1];
  }
  PBalls[NP] = NB+1;                    /* Make room for the sentinel at the end */
  sort(PBalls.begin(), PBalls.end());
  PBalls[NP] = NB+PBalls[0];            /* Put a sentinel at the end */

  if (Verbose) {
    cout << "Balls Picked:";
    for (i = 0; i < NP; i++) cout << " " << PBalls[i];
    cout << ".\n";
  }
}

void Keno_LL::Calculate_All()
{
  int i, x;
  int nw, nl;

  nw = 0;
  nl = 0;
  for (i = 0; i < NP; i++) {
    x = PBalls[i+1] - PBalls[i] - 1;
    nw += (x/2);
    nl += (x - x/2);
  }

  Wins += nw;
  Losses += nl;
  Ties += NP;
  Winnings += nw*Payout;
  Winnings -= nl;
  N += NB;
}
Note how the array is again sentinelized. The array PBalls has NP+1 elements. After putting the random balls in unsorted order into PBalls, we set PBalls[NP] to equal NB+1, and then we sort it. That way, we know that PBalls[NP] will remain equal to NB+1. After the sort, we know that the minimum element is in PBalls[0], so we can put PBalls[0]+NB into the last PBalls[NP]. Now we have all the intervals represented easily in PBalls.
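As a worked check of the interval counting, here is a stand-alone sketch that takes a list of picked balls and returns the winner and loser counts. The function count_all is a name of mine, and it appends the sentinel with push_back() after sorting rather than reserving a slot beforehand, so it is a variation on the idea and not the code from src/keno-ll5.cpp.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

/* Hypothetical stand-alone version of the interval counting. Takes the
   picked balls in any order and returns (winners, losers) among the
   unpicked balls numbered 1 through nb. */
std::pair<int, int> count_all(std::vector<int> balls, int nb)
{
  assert(!balls.empty());
  std::sort(balls.begin(), balls.end());
  balls.push_back(nb + balls[0]);        /* Sentinel: lowest ball, wrapped around. */

  int nw = 0, nl = 0;
  for (int i = 0; i + 1 < (int) balls.size(); i++) {
    int x = balls[i+1] - balls[i] - 1;   /* Unpicked balls in this interval. */
    nw += x / 2;
    nl += x - x / 2;
  }
  return std::make_pair(nw, nl);
}
```

For the twenty balls 2 3 5 12 13 26 31 35 36 38 44 45 51 54 60 65 67 68 70 76 with nb = 80, this returns 25 winners and 35 losers, which accounts for all 60 unpicked balls.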
When we run it, it's much, much faster -- this is the best code for the problem:
UNIX> make bin/keno-ll5
g++ -O3 -o bin/keno-ll5 src/keno-ll5.cpp
UNIX> time bin/keno-ll5 80 20 1.25 1000000 n
Total = -28366.50. Avg = -0.028366. W/L/T: 320726 429274 250000
0.015u 0.001s 0:00.01 100.0% 0+0k 0+0io 0pf+0w
UNIX> time bin/keno-ll5 80 20 1.25 10000000 n
Total = -278195.25. Avg = -0.027820. W/L/T: 3209691 4290309 2500000
0.114u 0.001s 0:00.11 100.0% 0+0k 0+0io 0pf+0w
UNIX> time bin/keno-ll5 80 20 1.25 100000000 n
Total = -2778566.25. Avg = -0.027786. W/L/T: 32098415 42901585 25000000
1.103u 0.001s 0:01.10 100.0% 0+0k 0+0io 0pf+0w
UNIX> time bin/keno-ll5 80 20 1.25 1000000000 n
Total = -27759207.00. Avg = -0.027759. W/L/T: 320995908 429004092 250000000
10.999u 0.004s 0:11.00 99.9% 0+0k 0+0io 0pf+0w
UNIX>

We can confidently tell Mr. Thump that this bet will make him 2.78 cents on every dollar bet. My guess is that he'd like a little more. If you make the payout $1.20 instead of $1.25, his profit goes to 4.38 cents. How does that compare? Well, Roulette is a profit of 5 cents. Three card poker goes anywhere from 1.96 cents to about 10, depending on the odds. I'd say you've invented a pretty good game!
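The two profit figures can be checked from the win, loss and tie counts alone. The sketch below assumes that each evaluated ball is a one dollar bet and that a tie returns nothing; the function name average_return is mine.

```cpp
#include <cassert>
#include <cmath>

/* Hypothetical helper: the player's average return per dollar bet, given
   simulation counts and the payout on a win. Ties contribute nothing. */
double average_return(double wins, double losses, double ties, double payout)
{
  assert(wins + losses + ties > 0);
  return (wins * payout - losses) / (wins + losses + ties);
}
```

Plugging in the counts from the billion-ball run (320995908 wins, 429004092 losses, 250000000 ties) gives roughly -0.0278 with a payout of 1.25 and roughly -0.0438 with a payout of 1.20, which is where the 2.78 and 4.38 cent figures come from.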
I'll leave you with this summary graph of the original program and the improvements. Each percentage is the improvement over the previous version.