(Python) for randomly selecting a key based on proportionality / weight

I don’t understand a bit how to find a clean algorithm to do the following:

Suppose I have dict k:

>>> k = {'A': 68, 'B': 62, 'C': 47, 'D': 16, 'E': 81} 

Now I want to randomly select one of these keys based on the "weight" that they have in the sum (i.e. the sum) of the keys.

 >>> sum(k.values()) >>> 274 

So there

 >>> 68.0/274.0 >>> 0.24817518248175183 

24.81% percentage change, selected A.

How do you write an algorithm that takes care of this? In other words, does it guarantee that with 10,000 random samples A will be selected 2.481 times?

+8
python
Apr 03 '10 at 8:41
source share
6 answers

It uses a weighted selection function with some code that executes it.

 import random def WeightedPick(d): r = random.uniform(0, sum(d.itervalues())) s = 0.0 for k, w in d.iteritems(): s += w if r < s: return k return k def Test(): k = {'A': 68, 'B': 62, 'C': 47, 'D': 16, 'E': 81} results = {} for x in xrange(10000): p = WeightedPick(k) results[p] = results.get(p, 0) + 1 print results Test() 
+10
Apr 03 '10 at 9:40
source share

This should do the trick:

 >>> k = {'A': 68, 'B': 62, 'C': 47, 'D': 16, 'E': 81} >>> import random >>> def weighted_pick(dic): ... total = sum(dic.itervalues()) ... pick = random.randint(0, total-1) ... tmp = 0 ... for key, weight in dic.iteritems(): ... tmp += weight ... if pick < tmp: ... return key 
+9
Apr 3 2018-10-10T00:
source share

The algorithm will be like that ..

Choose a random number from 1 to 274. To do this, call the rand () function (suppose it returns a value from 0 to 1), multiply rand () by 274. The resulting value should now be in the range, If it is between 1 and 68, select A if between 69 and 130 select B and so on. That way, your likelihood will survive, and your operation will be successful.

PS: I'm a Java guy, I don't know the Python syntax.

+2
Apr 3 '10 at 8:52
source share

The easiest way to do this, when your weights are relatively small integers (for example, in your example), is to build a long line containing all the characters in the corresponding weights and select a random case from it:

 import random d = {'A': 68, 'B': 62, 'C': 47, 'D': 16, 'E': 81} s = ''.join(k*v for k,v in d.items()) random.choice(s) 

Please note that this method will use a lot of memory if your scales are large, in which case you may prefer a different solution.

+2
Apr 03 '10 at 8:52
source share

should also take a look at this link

create two lists for k, say xk and yk

 from scipy import stats custm = stats.rv_discrete(name='test', values=(xk, yk)) custm.rvs(size=1) 
+2
Jan 22 '15 at 6:58
source share

I developed the algorithm several years ago, with an application in Perl and SQL, you can read about it here , complete with analysis and tests, why this is (most likely) correct.

The concept is simple: for each element, choose a random number, drag it through some function, which depends on the weight of the elements and selects the element with the smallest value.

This function:

 x[i] = -log(1 - rand())/weight[i] 
-one
Apr 03 '10 at 11:28
source share



All Articles