This may be a special case, but you should be able to use numpy digitize . It should be indicated here that the bunkers should monotonously decrease or increase.
>>> import numpy >>> a = numpy.array([10,7,2,0]) >>> b = numpy.array([10,9,8,7,6,5,4,3,2,1]) >>> indices = [numpy.where(a<=x)[0][0] for x in b] [0, 1, 1, 1, 2, 2, 2, 2, 2, 3] >>> numpy.digitize(b,a) array([0, 1, 1, 1, 2, 2, 2, 2, 2, 3])
Setting for time test:
a = np.arange(50)[::-1] b = np.random.randint(0,50,1E3) np.allclose([np.where(a<=x)[0][0] for x in b],np.digitize(b,a)) Out[55]: True
Some timings:
%timeit [np.where(a<=x)[0][0] for x in b] 100 loops, best of 3: 4.97 ms per loop %timeit np.digitize(b,a) 10000 loops, best of 3: 48.1 Β΅s per loop
It seems that it is accelerating by two orders of magnitude, this will largely depend on the number of boxes. Your timings will be different.
To compare with Jamie's answer, I timed the following two code snippets. Since I basically wanted to focus on the speed of searchsorted vs digitize , I cut Jamie's code a bit. The corresponding snippet is here:
a = np.arange(size_a)[::-1] b = np.random.randint(0, size_a, size_b) ja = np.take(a, np.searchsorted(a, b, side='right', sorter=a)-1)
This slightly exceeds my limited matplotlib ability, so this is done in the DataGraph. I built a logarithmic relation timing_digitize/timing_searchsorted so that values ββgreater than zero are searchsorted faster and values ββless than zero digitize faster. Colors also give relative speeds. For example, it is shown that in the upper right corner (a = 1E6, b = 1E6), digitize is ~ 300 times slower than searchsorted , while for smaller sizes, digitize can be up to 10 times faster. The black line is the breakeven point:
It seems like raw searchsorted almost always faster for large cases for speed, but the simple digitize syntax is almost as good if the number of bins is small.