Very fast sampling from a set with a fixed number of elements in Python

I need to repeatedly pick a random number from a set of fixed size, do some calculations with it, and put a new number back into the set. (The number of samples required is very large.)

I tried storing the numbers in a list and using random.choice() to select an item, deleting it, and then appending a new item. But this is too slow!

I am thinking of storing the numbers in a NumPy array, sampling a list of indices, and performing the calculation for each index.

  • Is there a faster way to complete this process?
3 answers

Python lists are implemented internally as arrays (like Java's ArrayList or C++'s std::vector), so removing an element from the middle is relatively slow: all subsequent elements have to be shifted down by one position. (For more details see http://www.laurentluce.com/posts/python-list-implementation/ .) Since the order of the elements does not seem to matter to you, I would recommend simply using random.randint(0, len(L) - 1) to select an index i, then L[i] = calculation(L[i]) to update the i-th element in place.
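To make the difference concrete, here is a minimal sketch contrasting the delete-and-append step described in the question with the in-place update suggested above. The function names and the `double` calculation are hypothetical stand-ins, not part of the original posts.

```python
import random


def step_delete_append(values, calculation, rng=random):
    """Question's approach: remove a random element, append the result.

    list.pop(i) shifts every element after index i, so this is O(n) per step.
    """
    i = rng.randrange(len(values))
    x = values.pop(i)
    values.append(calculation(x))


def step_in_place(values, calculation, rng=random):
    """In-place approach: overwrite the chosen slot, O(1) per step."""
    i = rng.randrange(len(values))
    values[i] = calculation(values[i])


def double(x):
    # Hypothetical stand-in for the real calculation.
    return 2 * x


numbers = list(range(1, 11))
for _ in range(1000):
    step_in_place(numbers, double)
```

Both functions keep the list at a fixed length; only the in-place version avoids moving other elements, which should matter more as the list grows.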


I need to repeatedly pick a random number from a set of fixed size, do some calculations with it, and put a new number back into the set.

    from random import randrange

    s = list(someset)            # store the set as a list
    while True:
        i = randrange(len(s))    # choose a random element
        x = s[i]
        y = your_calculation(x)  # do some calculation
        s[i] = y                 # put the new number back into the set

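random.sample(population, Nsample), where population is a list or NumPy array (a set has to be converted to a list first in Python 3.11 and later), is very fast, but it is not clear to me whether you want something like this: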

    import random

    Setsize = 10000
    Samplesize = 100
    Max = 1 << 20
    bigset = set(random.sample(range(Max), Setsize))  # initial subset of 0 .. Max

    def calc(aset):
        return set(x + 1 for x in aset)  # << your code here

    # sample, calc a new subset of bigset, add it --
    for _ in range(3):
        # list(...) because random.sample requires a sequence in Python 3.11+
        asample = random.sample(list(bigset), Samplesize)
        newset = calc(asample)  # new subset of 0 .. Max
        bigset |= newset

You could use NumPy arrays or bitarray instead of set, but I would expect the time spent in calc() to dominate.
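As a rough illustration of the NumPy variant, the sketch below keeps the numbers in a fixed-size array and updates a batch of randomly chosen slots at once. It assumes the calculation can be applied to a whole array, and that updating a batch of distinct slots together is acceptable in place of strictly one-at-a-time draws; `calc`, the sizes, and the seed are placeholders.

```python
import numpy as np


def calc(x):
    # Hypothetical placeholder for the real calculation; must accept an array.
    return x + 1


rng = np.random.default_rng(seed=0)
values = rng.integers(0, 1 << 20, size=10_000)  # fixed-size pool of numbers
batch_size = 100

for _ in range(3):
    # replace=False yields distinct indices, so no slot is updated twice per batch
    idx = rng.choice(values.size, size=batch_size, replace=False)
    values[idx] = calc(values[idx])
```

The array never changes size, so nothing is deleted or appended; each batch simply overwrites the selected slots.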

What are your Setsize and Samplesize, roughly?
