Weighted Random Numbers

I am trying to implement weighted random numbers. Currently, I am just banging my head against the wall and cannot figure this out.

In my project (Hold'em hand-range, subjective all-in-one analysis) I use Boost's random functions. So, let's say I want to select a random number from 1 to 3 (either 1, 2, or 3). Boost's Mersenne Twister generator works like a charm for this. However, I want the selection to be weighted, for example, as follows:

1 (weight: 90)
2 (weight: 56)
3 (weight: 4)

Does Boost have any features for this?

+74
c++ boost random
Nov 19 '09 at 7:56
7 answers

There is a simple algorithm for choosing an item at random, where the elements have individual weights:

1) calculate the sum of all weights

2) select a random number equal to 0 or more and less than the sum of the weights

3) go through the items one at a time, subtracting each item's weight from your random number, until you reach an item whose weight is greater than the remaining random number.

Pseudocode illustrating this:

    int sum_of_weight = 0;
    for(int i=0; i<num_choices; i++) {
        sum_of_weight += choice_weight[i];
    }
    int rnd = random(sum_of_weight);
    for(int i=0; i<num_choices; i++) {
        if(rnd < choice_weight[i])
            return i;
        rnd -= choice_weight[i];
    }
    assert(!"should never get here");

It should be easy to adapt to your Boost containers and the like.
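The three steps above can be sketched in standard C++ roughly as follows. This is a minimal sketch, not the answerer's code: it assumes integer weights held in a std::vector, and the names scan_for and weighted_pick are made up for illustration. The scan is split out so it can be checked by hand with a known random value.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

// Step 3: walk the weights, subtracting each one from rnd until rnd
// falls inside the current item's weight. rnd must be in [0, sum).
// (scan_for is a hypothetical name, not part of any library.)
std::size_t scan_for(const std::vector<int>& weights, int rnd) {
    for (std::size_t i = 0; i < weights.size(); ++i) {
        if (rnd < weights[i]) return i;
        rnd -= weights[i];
    }
    assert(!"rnd must be below the sum of the weights");
    return weights.size();
}

// Steps 1 and 2: sum the weights, draw an integer in [0, sum), then scan.
template <class Gen>
std::size_t weighted_pick(const std::vector<int>& weights, Gen& gen) {
    const int sum = std::accumulate(weights.begin(), weights.end(), 0);
    assert(sum > 0);
    std::uniform_int_distribution<int> dist(0, sum - 1);  // inclusive bounds
    return scan_for(weights, dist(gen));
}
```

With the weights from the question, {90, 56, 4}, random values 0 to 89 map to index 0, 90 to 145 map to index 1, and 146 to 149 map to index 2.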




If your weights rarely change but you often pick one at random, and as long as your container stores pointers to the objects or holds more than a few dozen items (basically, you have to profile to know whether this helps or hinders), then there is an optimization:

By storing the cumulative weight sum in each item, you can use a binary search to pick the item that corresponds to the random number.
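One way to sketch this optimization, assuming integer weights and keeping the running totals in a separate vector rather than inside each item (CumulativePicker is a hypothetical name used only for this example):

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

// Precomputes running totals once (O(n)), then picks in O(log n).
// Hypothetical helper, not part of Boost or the standard library.
class CumulativePicker {
public:
    explicit CumulativePicker(const std::vector<int>& weights)
        : cumulative_(weights.size()) {
        assert(!weights.empty());
        // cumulative_[i] = weights[0] + ... + weights[i]
        std::partial_sum(weights.begin(), weights.end(), cumulative_.begin());
        assert(cumulative_.back() > 0);
    }

    int total() const { return cumulative_.back(); }

    // Maps a value in [0, total()) to the index that owns it:
    // the first running total strictly greater than the value.
    std::size_t index_for(int value) const {
        assert(value >= 0 && value < total());
        auto it = std::upper_bound(cumulative_.begin(), cumulative_.end(), value);
        return static_cast<std::size_t>(it - cumulative_.begin());
    }

    template <class Gen>
    std::size_t pick(Gen& gen) const {
        std::uniform_int_distribution<int> dist(0, total() - 1);
        return index_for(dist(gen));
    }

private:
    std::vector<int> cumulative_;
};
```

The totals have to be rebuilt whenever a weight changes, which is why this only pays off when picks are much more frequent than updates.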




If you do not know the number of items in the list, there is a very neat algorithm called reservoir sampling, which can be adapted to be weighted.
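As a rough illustration of one weighted variant for picking a single item in one pass: keep a running total of the weights seen so far and replace the current choice with probability weight / total. The function name weighted_reservoir_pick is an assumption for this sketch, and it expects non-negative weights readable as double.

```cpp
#include <cassert>
#include <random>

// Single-item weighted reservoir sampling over any input range of weights.
// Returns the index of the chosen item, or -1 if no item has positive weight.
// (Hypothetical helper name; the range length need not be known up front.)
template <class It, class Gen>
long weighted_reservoir_pick(It first, It last, Gen& gen) {
    std::uniform_real_distribution<double> unit(0.0, 1.0);
    double total = 0.0;  // sum of the weights seen so far
    long chosen = -1;
    long index = 0;
    for (; first != last; ++first, ++index) {
        const double w = *first;
        assert(w >= 0.0);
        if (w == 0.0) continue;  // zero-weight items can never be chosen
        total += w;
        // Replace the current choice with probability w / total.
        if (unit(gen) * total < w) chosen = index;
    }
    return chosen;
}
```

Item i survives to the end with probability w_i divided by the final total, so the result follows the same distribution as the two-pass algorithm while reading the list only once.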

+135
Nov 19 '09 at 8:00

Updated answer to an old question. You can easily do this in C++11 with just the std::lib:

    #include <iostream>
    #include <random>
    #include <iterator>
    #include <ctime>
    #include <type_traits>
    #include <cassert>

    int main()
    {
        // Set up distribution
        double interval[] = {1,   2,   3,   4};
        double weights[] =  {  .90, .56, .04};
        std::piecewise_constant_distribution<> dist(std::begin(interval),
                                                    std::end(interval),
                                                    std::begin(weights));
        // Choose generator
        std::mt19937 gen(std::time(0));  // seed as wanted
        // Demonstrate with N randomly generated numbers
        const unsigned N = 1000000;
        // Collect number of times each random number is generated
        double avg[std::extent<decltype(weights)>::value] = {0};
        for (unsigned i = 0; i < N; ++i)
        {
            // Generate random number using gen, distributed according to dist
            unsigned r = static_cast<unsigned>(dist(gen));
            // Sanity check
            assert(interval[0] <= r && r <= *(std::end(interval)-2));
            // Save r for statistical test of distribution
            avg[r - 1]++;
        }
        // Compute averages for distribution
        for (double* i = std::begin(avg); i < std::end(avg); ++i)
            *i /= N;
        // Display distribution
        for (unsigned i = 1; i <= std::extent<decltype(avg)>::value; ++i)
            std::cout << "avg[" << i << "] = " << avg[i-1] << '\n';
    }

The output on my system is:

    avg[1] = 0.600115
    avg[2] = 0.373341
    avg[3] = 0.026544

Note that most of the code above is devoted to simply displaying and analyzing the output. The actual generation is just a few lines of code. The output shows that the requested "probabilities" were obtained, once you divide the requested weights by 1.5, which is what they add up to (0.90/1.5 = 0.60, 0.56/1.5 ≈ 0.373, 0.04/1.5 ≈ 0.027).

+40
Apr 12 '11 at 1:10

What I do when I need to weight numbers is use a random number for the weight.

For example: I need to generate random numbers from 1 to 3 with the following weights:

  • 10% of the time the random number should be 1
  • 30% of the time the random number should be 2
  • 60% of the time the random number should be 3

Then I use:

    weight = rand() % 10;

    switch( weight ) {

        case 0:
            randomNumber = 1;
            break;
        case 1:
        case 2:
        case 3:
            randomNumber = 2;
            break;
        case 4:
        case 5:
        case 6:
        case 7:
        case 8:
        case 9:
            randomNumber = 3;
            break;
    }

With this, the random number has a 10% probability of being 1, 30% of being 2, and 60% of being 3.

You can adjust it to suit your needs.

Hope this helps. Good luck!

+8
Nov 28 '13 at 21:49

If your weights change more slowly than they are drawn, C++11 std::discrete_distribution is going to be the easiest:

    #include <ctime>
    #include <iterator>
    #include <random>
    #include <vector>

    int main() {
        std::vector<double> weights{90,56,4};
        std::discrete_distribution<int> dist(std::begin(weights), std::end(weights));
        std::mt19937 gen;
        gen.seed(std::time(0)); // if you want different results from different runs
        int N = 100000;
        std::vector<int> samples(N);
        for(auto & i: samples)
            i = dist(gen);
        // do something with your samples...
    }

Note, however, that the C++11 discrete_distribution computes all of the cumulative sums on initialization. Usually you want that, because it speeds up sampling for a one-time O(N) cost. But for a rapidly changing distribution it will incur a heavy calculation (and memory) cost. For instance, if the weights represent how many items there are and you remove an item every time you draw one, you will probably want a custom algorithm.

Will's answer https://stackoverflow.com/a/2129609/29023/ ... avoids this overhead, but will be slower to draw from than the C++11 version because it cannot use binary search.

To see that it does this, look at the relevant lines (/usr/include/c++/5/bits/random.tcc on my Ubuntu 16.04 + GCC 5.3 install):

    template<typename _IntType>
      void
      discrete_distribution<_IntType>::param_type::
      _M_initialize()
      {
        if (_M_prob.size() < 2)
          {
            _M_prob.clear();
            return;
          }

        const double __sum = std::accumulate(_M_prob.begin(),
                                             _M_prob.end(), 0.0);
        // Now normalize the probabilites.
        __detail::__normalize(_M_prob.begin(), _M_prob.end(), _M_prob.begin(),
                              __sum);
        // Accumulate partial sums.
        _M_cp.reserve(_M_prob.size());
        std::partial_sum(_M_prob.begin(), _M_prob.end(),
                         std::back_inserter(_M_cp));
        // Make sure the last cumulative probability is one.
        _M_cp[_M_cp.size() - 1] = 1.0;
      }
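
For the draw-and-remove case described in this answer, a custom algorithm might look roughly like the sketch below. It is only an illustration under stated assumptions: the weights are integer item counts, ShrinkingPool is a hypothetical name, and each draw is a linear scan that decrements the chosen count in place, so nothing has to be rebuilt between draws.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <random>
#include <utility>
#include <vector>

// counts[i] is how many copies of item i are still in the pool.
// (Hypothetical class for illustration; not part of any library.)
class ShrinkingPool {
public:
    explicit ShrinkingPool(std::vector<int> counts)
        : counts_(std::move(counts)),
          remaining_(std::accumulate(counts_.begin(), counts_.end(), 0)) {}

    int remaining() const { return remaining_; }

    // Draws one item with probability proportional to its remaining count,
    // then removes that copy. O(n) per draw, O(1) to update.
    template <class Gen>
    std::size_t draw(Gen& gen) {
        assert(remaining_ > 0);
        std::uniform_int_distribution<int> dist(0, remaining_ - 1);
        int r = dist(gen);
        for (std::size_t i = 0; i < counts_.size(); ++i) {
            if (r < counts_[i]) {
                --counts_[i];
                --remaining_;
                return i;
            }
            r -= counts_[i];
        }
        assert(!"unreachable: r is always below the remaining total");
        return counts_.size();
    }

private:
    std::vector<int> counts_;  // declared first so remaining_ can sum it
    int remaining_;
};
```

Whether this beats rebuilding a std::discrete_distribution depends on the number of items and the ratio of draws to updates, so it is worth profiling both.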
+5
Jul 6 '16 at 7:33

Create a bag (or std::vector) of all the items you can select.
Make sure the quantity of each item is proportional to the weight.

Example:

  • 1 60%
  • 2 35%
  • 3 5%

So, you have a bag of 100 items with 60 1's, 35 2's and 5 3's.
Now randomly sort the bag (std::random_shuffle).

Pick items from the bag sequentially until it is empty.
Once it is empty, re-randomize the bag and start again.
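
A minimal sketch of the bag approach, assuming integer item values and counts (ShuffleBag is a hypothetical name). It uses std::shuffle with an explicit generator rather than std::random_shuffle, because std::random_shuffle was deprecated in C++14 and removed in C++17.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Holds counts[i] copies of values[i]; hands them out in shuffled order
// and reshuffles once every copy has been handed out.
// (Hypothetical class for illustration.)
class ShuffleBag {
public:
    ShuffleBag(const std::vector<int>& values, const std::vector<int>& counts) {
        assert(values.size() == counts.size());
        for (std::size_t i = 0; i < values.size(); ++i) {
            assert(counts[i] >= 0);
            bag_.insert(bag_.end(), static_cast<std::size_t>(counts[i]), values[i]);
        }
        assert(!bag_.empty());
        pos_ = bag_.size();  // forces a shuffle on the first draw
    }

    template <class Gen>
    int next(Gen& gen) {
        if (pos_ == bag_.size()) {  // bag exhausted: re-randomize and restart
            std::shuffle(bag_.begin(), bag_.end(), gen);
            pos_ = 0;
        }
        return bag_[pos_++];
    }

private:
    std::vector<int> bag_;
    std::size_t pos_ = 0;
};
```

Unlike independent weighted draws, every run of 100 consecutive picks that lines up with a full bag contains exactly 60 1's, 35 2's and 5 3's, so the picks are not independent of each other.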

+3
Nov 19 '09 at 10:48

Choose a random number on [0,1), which should be what the default operator() of a Boost RNG gives you. Choose the item whose cumulative probability density function is >= that number:

    template <class It,class P>
    It choose_p(It begin,It end,P const& p)
    {
        if (begin==end) return end;
        double sum=0.;
        for (It i=begin;i!=end;++i)
            sum+=p(*i);
        double choice=sum*random01();
        for (It i=begin;;) {
            choice -= p(*i);
            It r=i;
            ++i;
            if (choice<0 || i==end) return r;
        }
        return begin; //unreachable
    }

Here random01() returns a double >= 0 and < 1. Note that the above does not require the probabilities to sum to 1; it normalizes them for you.

p is just a function that assigns a probability to an item in the collection [begin, end). You can omit it (or use an identity function) if you just have a sequence of probabilities.

0
Nov 19 '09 at 8:05


