SRM 702, D2, 1000-Pointer (SubsetSumExtreme)

James S. Plank

Sun Nov 20 22:47:08 EST 2016

Problem Statement.
A main() with the examples compiled in.
A skeleton that compiles with main.cpp.
Problem Given in Topcoder: November, 2016
Competitors who opened the problem: 186
Competitors who submitted a solution: 18
Number of correct solutions: 11
Submission Accuracy (percentage correct vs those who opened): 5.9%
Submission Accuracy (percentage correct vs those who submitted): 61.1%
Average Correct Time: 35 minutes, 48 seconds.

This one feels like dynamic programming, and indeed it is, but you have to set up your data structures correctly. The constraints on the size of the block vector should give you a clue. How many collections of blocks are there? If there are B blocks, then the answer is 2B, and with B limited to 12, that's a small number (4096).

How can we represent a collection of blocks? Bit arithmetic, of course? If you have B blocks, you can use a B-bit number to represent any subset of the B blocks. (See the "Power Set" section of the CS302 lecture notes on enumeration).

To solve this problem, we are going to write a recursive method:

double GE(int setid);

This method takes as input the specification of a subset of the blocks (using bit arithmetic), and it returns the expected value of Bob playing the game when he starts with that subset.

Let's use Example 0 to illustrate. In this example, block is {1,2,3}, and face is {6,5}. Here are some calculations of GE():

GE(0x0) is 0. Bob has no blocks.
GE(0x1) is 1. Since bit (1 << 0) is set, this setid stands for block[0]. In other words the subset is { 1 }. The two die faces are 6 and 5, so Bob can't use either of them -- the game will end with him having the block with the number 1.
GE(0x2) is 2. Now, bit (1 << 1) is set, this setid stands for block[1]. In other words the subset is { 2 }. As in the last example, the die rolls are useless, and the game will end with Bob having the block with the number 2.
GE(0x3) is 3. Now, both bits (1 << 0) and (1 << 1) are set, so this setid stands for the subset { 1, 2 }. As in the previous examples, the game ends, and the sum of the blocks is three.
GE(0x4) is also 3. Now, bit (1 << 2) is set, so this setid stands for block[2]. In other words, the subset is { 3 }. So the game ends, and Bob has 3.
GE(0x5) returns 4. 0x5 represents the subset { 1, 3 }, which again will end the game.
GE(0x6) is the first interesting case. This is the subset { 2, 3 }, which means that when Bob rolls the 5, he has blocks that sum to 5. In that case, his remaining subset is 0x0, so his value will be GE(0x0), which is zero. Since he rolls a 5 half of the time, the contribution of the answer to the expected value is 0.5*0.
The other half of the time, he rolls a 6, and his sum is 5. So the contribution of rolling a six to the expected value is 0.5*5. Therefore, the exptected value is 0.5*0 + 0.5*5 = 2.5.
GE(0x7) is the last case. This is the subset { 1, 2, 3 }. And the expected value will be:

0.5 * GE(0x1) + 0.5 * GE(0x0) = 0.5*1 + 0.5*0 = 0.5

When Bob first plays the game, he has all of the pieces, so the answer to the problem is going to be GE(0x7).

The general structure of GE()

The way GE(s) works is as follows. Maintain a sum, which starts at zero. For each value f of face, you need to decide:

Are there any subsets that result when you remove blocks from s whose total is f? If so, then you need to call GE() of all of those subsets. The one that returns the smallest value is the best one that Bob can choose, so you will add that value to a sum.
If not, then you simply add the sum of the blocks in s to the sum.

At the end, divide the sum by faces.size(). There's your expected value.

There's a small subtlety

Given a set s and a value f, how do you determine all of the subsets that result when you remove blocks from s whose total is f?

Here's how I did it -- at the beginning of my program, I created two vectors:

SizesOfSets: This is a vector of integers. It is indexed by set id's, and it contains the sum of the blocks in the subset specified by the setid. So, in the example above:
- SizesOfSets[0x0] = 0.
- SizesOfSets[0x1] = 1.
- SizesOfSets[0x2] = 2.
- SizesOfSets[0x3] = 3.
- SizesOfSets[0x4] = 3.
- SizesOfSets[0x5] = 4.
- SizesOfSets[0x6] = 5.
- SizesOfSets[0x7] = 6.
SetsBySize: This is a vector of vector of integers. SetsBySize[x] is a vector of all the set ids s where SizesOfSets[s] = x. So, in the example above:
- SetsBySize[0] = { 0x0 }.
- SetsBySize[1] = { 0x1 }.
- SetsBySize[2] = { 0x2 }.
- SetsBySize[3] = { 0x3, 0x4 }.
- SetsBySize[4] = { 0x5 }.
- SetsBySize[5] = { 0x6 }.
- SetsBySize[6] = { 0x7 }.

Now, in GE(s), let's assume that SizesOfSets[s] = x. And, assume I'm considering a roll of f. What I do is look at all of the set ids in SetsBySize[x-f]. These are all of the subsets that have blocks sum to x-f, which of course, will be the sum when I remove blocks from x that sum to f. The problem is, though, you may not have the proper blocks to remove from s to get all of these sets.

Here's an example. Suppose that my blocks are { 1, 3, 2, 2 }, my faces are { 2, 6 }, and I am in GE(0xb), considering the face of size 2. The id 0xb corresponds to { 1, 3, 2 }, which has a sum of 6. So, when I consider f=2, I need to look at all sets whose blocks sum to 4. There will be two of these in SetsBySize[2]: { 0x3 and 0xc }. Just so we're on the same page, 0x3 is equal to { 1, 3 }, and 0xc is equal to { 2, 2 }.

Now, you can get to set { 1, 3 } from { 1, 3, 2 }, but you can't get to { 2, 2 }. Therefore, when you are considering face 2, you should be calling GE(0x3) and not GE(0xc). How do you determine that? With bit arithmetic! If you are at set s, and you are considering whether you can get to subset t, you take their intersection and see if it equals t. If so, then you can get to from s by removing blocks. If not, you can't. You can detemine intersection with bitwise AND.

In our example: (0xb & 0x3) is equal to 0x3, which means you can get to 0x3 from 0xb. On the other hand, (0xb & 0xc) is equal to 0x8, so you can't get to 0xc from 0xb.

Putting it all together

So, here are the steps:

Calculate the two vectors, SizesOfSets and SetsBySize.
Write the method GE(int s). The method has the following steps:
- Let x equal the sum of the blocks in s.
- Set a sum to equal zero.
- For each face f in faces, you want to calculate the smallest value GE(t) for a set t which is in SetsBySize[x-f], where the intersection of s and t is equal to t. Let that value be v. If there is no set t that works, then set v to equal x. Add x to the sum.
- After processing each face, divide the sum by face.size() (as a double, of course).
- That is your return value.
Your answer will be GE((1 << block.size())-1).
You'll have to memoize.

My solution.

/* Topcoder SRM 702, D2, 1000-pointer: SubsetSumExtreme.
   James S. Plank
   November 20, 2016
 */

#include <string>
#include <vector>
#include <iostream>
#include <cstdio>
#include <cstdlib>
using namespace std;

class SubsetSumExtreme {
  public:
    vector <int> SizesOfSets;                /* For each set, this is the sum of its blocks. */
    vector < vector <int> > SetsBySize;      /* All of the sets that have the same size. */
    vector <int> F;                          /* This is a copy of "face". */
    vector <double> Cache;                   /* The memoization cache. */

    double GE(int setid);
    double getExpectation(vector <int> block, vector <int> face);
};

double SubsetSumExtreme::GE(int s)
{
                    /* These variable names follow the writeup: */
  double sum;       /* This is the sum for the expected value computation. */
  int x;            /* This is the sum of the blocks in s */
  int f;            /* This is the face of the die that we are currently considering. */
  int t;            /* This holds set ids of sets whose blocks sum to x-f. */
  
  double best;      /* For each value of f, this is the best way to do x-f */

  int i, j;         /* These are temporary variables. */
  double r;

  if (s == 0) return 0;
  if (Cache[s] >= 0) return Cache[s];

  sum = 0;
  x = SizesOfSets[s];

  /* For each face f on the die: */

  for (i = 0; i < F.size(); i++) {

    /* Look at each set t whose blocks sum to x-f. */

    f = F[i];
    best = x;
    if (x-f >= 0) {
      for (j = 0; j < SetsBySize[x-f].size(); j++) {
        t = SetsBySize[x-f][j];

        /* If t is a proper subset of s, then call GE recursively. */

        if ((s & t) == t) {
          r = GE(t);
          if (r < best) best = r;
        } 
      }
    }

    /* Add the best answer for f to the sum. */

    sum += best;
  }

  /* At the end, divide sum by F.size() to complete the expected value computation. */

  sum /= (double) F.size();
  Cache[s] = sum;
  return sum;
}

double SubsetSumExtreme::getExpectation(vector <int> B, vector <int> face)
{
  int i, j;
  int sum;

  F = face;

  /* Do a power set enumeration to enumerate all subsets of the blocks.  
     In this enumeration, you're going to set all of the values of
     SizesOfSets and SetsBySize. */

  SetsBySize.resize(1000*12+1);

  for (i = 0; i < (1 << B.size()); i++) {
    sum = 0;
    for (j = 0; j < B.size(); j++) {
      if (i & (1 << j)) {
        sum += B[j];
      }
    }
    SizesOfSets.push_back(sum);
    SetsBySize[sum].push_back(i);
  }

  /* This loop prints out SetsBySize, in case you want to see it. 

  for (i = 0; i < SetsBySize.size(); i++) {
    if (SetsBySize[i].size() > 0) {
      printf("Size: %4d.  Sets:", i);
      for (j = 0; j < SetsBySize[i].size(); j++) printf(" 0x%x", SetsBySize[i][j]);
      printf("\n"); 
    }
  }
  */

  /* Set the memoization cache, and then call GE() on the set which contains
     all of the blocks. */

  Cache.resize(1 << B.size(), -1);
  return GE( (1 << B.size()) - 1 );
}