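random.sample( population, Nsample ) is very fast. In Python 2 the population could be a set, list or NumPy array; recent Python 3 versions require a sequence such as a list, so convert a set first. It is not clear to me, though, whether you want something like this: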

    import random

    Setsize = 10000
    Samplesize = 100
    Max = 1 << 20

    # initial subset of 0 .. Max
    bigset = set( random.sample( range(Max), Setsize ))

    def calc( aset ):
        return set( x + 1 for x in aset )  # << your code here

    # sample, calc a new subset of bigset, add it --
    for _ in range(3):
        # random.sample needs a sequence in Python 3.11+, so copy the set to a list
        asample = random.sample( list(bigset), Samplesize )
        newset = calc( asample )  # new subset of 0 .. Max
        bigset |= newset

You can use NumPy or bitarray arrays instead of set, but I would expect the time in calc() to dominate.
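To see where the time really goes, a rough timing sketch along these lines might help. It assumes Python 3 and reuses the placeholder calc() from above; time_steps and its parameter names are made up for illustration. With the trivial placeholder the numbers say little (copying the set to a list will probably cost more than calc), so swap in your real calc() before drawing conclusions.

```python
import random
import time


def calc(aset):
    # Placeholder for the real work, same as in the snippet above.
    return set(x + 1 for x in aset)


def time_steps(setsize=10000, samplesize=100, maxval=1 << 20, rounds=3):
    """Return (seconds spent sampling, seconds spent in calc, final set).

    Hypothetical helper: the name and parameters are illustrative only.
    """
    bigset = set(random.sample(range(maxval), setsize))
    t_sample = 0.0
    t_calc = 0.0
    for _ in range(rounds):
        t0 = time.perf_counter()
        # Sampling time includes copying the set to a list.
        asample = random.sample(list(bigset), samplesize)
        t1 = time.perf_counter()
        newset = calc(asample)
        t2 = time.perf_counter()
        bigset |= newset
        t_sample += t1 - t0
        t_calc += t2 - t1
    return t_sample, t_calc, bigset


if __name__ == "__main__":
    t_sample, t_calc, bigset = time_steps()
    print("sample: %.6f s  calc: %.6f s  len(bigset): %d"
          % (t_sample, t_calc, len(bigset)))
```

If sampling does turn out to matter, keeping a list alongside the set (appending only the elements that are actually new) would avoid the copy on every round.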
What are your Setsize and Samplesize, roughly?
denis