Fast way to count non-zero bits in a natural number

I need a quick way to count the number of bits in an integer in python. My current decisions

bin(n).count("1") 

but I wonder if there is a faster way to do this?

PS: (I represent a large two-dimensional binary array as a single list of numbers and perform bitwise operations, and this reduces the time from hours to minutes, and now I would like to get rid of these extra minutes.

Edit: 1. It should be in python 2.7 or 2.6

and optimization for small numbers doesn't really matter, as that would not be a clear bottleneck, but in some places I have numbers with 10,000 + bits.

For example, this is a 2000 bit case:

12448057941136394342297748548545082997815840357634948550739612798732309975923280685245876950055614362283769710705811182976142803324242407017104841062064840113262840137625582646683068904149296501029754654149991842951570880471230098259905004533869130509989042199261339990315125973721454059973605358766253998615919997174542922163484086066438120268185904663422979603026066685824578356173882166747093246377302371176167843247359636030248569148734824287739046916641832890744168385253915508446422276378715722482359321205673933317512861336054835392844676749610712462818600179225635467147870208L
+102
python binary counting
Mar 22 2018-12-22T00:
source share
10 answers

For integers of arbitrary length, bin(n).count("1") is the fastest I could find in pure Python.

I tried to adapt ร“scar and Adam solutions for processing integers in 64-bit and 32-bit blocks, respectively. Both were at least ten times slower than bin(n).count("1") (the 32-bit version took about twice as much time).

On the other hand, gmpy popcount() took roughly popcount() from bin(n).count("1") . So if you can install gmpy, use this.

To answer the question in the comments, for bytes I would use a lookup table. You can create it at runtime:

 counts = bytes(bin(x).count("1") for x in range(256)) # py2: use bytearray 

Or simply define it literally:

 counts = (b'\x00\x01\x01\x02\x01\x02\x02\x03\x01\x02\x02\x03\x02\x03\x03\x04' b'\x01\x02\x02\x03\x02\x03\x03\x04\x02\x03\x03\x04\x03\x04\x04\x05' b'\x01\x02\x02\x03\x02\x03\x03\x04\x02\x03\x03\x04\x03\x04\x04\x05' b'\x02\x03\x03\x04\x03\x04\x04\x05\x03\x04\x04\x05\x04\x05\x05\x06' b'\x01\x02\x02\x03\x02\x03\x03\x04\x02\x03\x03\x04\x03\x04\x04\x05' b'\x02\x03\x03\x04\x03\x04\x04\x05\x03\x04\x04\x05\x04\x05\x05\x06' b'\x02\x03\x03\x04\x03\x04\x04\x05\x03\x04\x04\x05\x04\x05\x05\x06' b'\x03\x04\x04\x05\x04\x05\x05\x06\x04\x05\x05\x06\x05\x06\x06\x07' b'\x01\x02\x02\x03\x02\x03\x03\x04\x02\x03\x03\x04\x03\x04\x04\x05' b'\x02\x03\x03\x04\x03\x04\x04\x05\x03\x04\x04\x05\x04\x05\x05\x06' b'\x02\x03\x03\x04\x03\x04\x04\x05\x03\x04\x04\x05\x04\x05\x05\x06' b'\x03\x04\x04\x05\x04\x05\x05\x06\x04\x05\x05\x06\x05\x06\x06\x07' b'\x02\x03\x03\x04\x03\x04\x04\x05\x03\x04\x04\x05\x04\x05\x05\x06' b'\x03\x04\x04\x05\x04\x05\x05\x06\x04\x05\x05\x06\x05\x06\x06\x07' b'\x03\x04\x04\x05\x04\x05\x05\x06\x04\x05\x05\x06\x05\x06\x06\x07' b'\x04\x05\x05\x06\x05\x06\x06\x07\x05\x06\x06\x07\x06\x07\x07\x08') 

Then it counts[x] to get the number 1 bit at x where 0 โ‰ค x โ‰ค 255.

+109
Mar 22 2018-12-22T00:
source share

You can adapt the following algorithm:

 def CountBits(n): n = (n & 0x5555555555555555) + ((n & 0xAAAAAAAAAAAAAAAA) >> 1) n = (n & 0x3333333333333333) + ((n & 0xCCCCCCCCCCCCCCCC) >> 2) n = (n & 0x0F0F0F0F0F0F0F0F) + ((n & 0xF0F0F0F0F0F0F0F0) >> 4) n = (n & 0x00FF00FF00FF00FF) + ((n & 0xFF00FF00FF00FF00) >> 8) n = (n & 0x0000FFFF0000FFFF) + ((n & 0xFFFF0000FFFF0000) >> 16) n = (n & 0x00000000FFFFFFFF) + ((n & 0xFFFFFFFF00000000) >> 32) # This last & isn't strictly necessary. return n 

This works for 64-bit positive numbers, but it easily expands and increases the number of operations with the argument logarithm (i.e. linearly with the argument bit size).

To understand how this works, imagine that you split the entire 64-bit string into 64 1-bit buckets. Each bucket value is equal to the number of bits set in the bucket (0 if no bits are set, and 1 if one bit is set). The first conversion leads to a similar state, but with 32 buckets each 2-bit. This is achieved by properly moving the buckets and adding their values โ€‹โ€‹(one addition takes care of all buckets, since transfer is not possible in all codes - an n-bit number is always long enough to encode number n). Further transformations lead to states with an exponentially decreasing number of buckets of exponentially growing size, until we reach one 64-bit long bucket. This gives the number of bits set in the original argument.

+22
Mar 22 2018-12-22T00:
source share

The implementation of the population counting algorithm in Python is implemented here, as described in this article:

 def numberOfSetBits(i): i = i - ((i >> 1) & 0x55555555) i = (i & 0x33333333) + ((i >> 2) & 0x33333333) return (((i + (i >> 4) & 0xF0F0F0F) * 0x1010101) & 0xffffffff) >> 24 

It will work for 0 <= i < 0x100000000 .

+13
Mar 22 2018-12-22T00:
source share

According to this post , this is apparently one of the fastest Hamming Weight implementations (unless you mind using about 64K of memory).

 #http://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetTable POPCOUNT_TABLE16 = [0] * 2**16 for index in range(len(POPCOUNT_TABLE16)): POPCOUNT_TABLE16[index] = (index & 1) + POPCOUNT_TABLE16[index >> 1] def popcount32_table16(v): return (POPCOUNT_TABLE16[ v & 0xffff] + POPCOUNT_TABLE16[(v >> 16) & 0xffff]) 

In Python 2.x, you have to replace range with xrange .

Edit

If you need better performance (and your numbers are larger integers), check out the GMP library. It contains handwritten assemblies for different architectures.

gmpy is a C-coded Python extension module that wraps the GMP library.

 >>> import gmpy >>> gmpy.popcount(2**1024-1) 1024 
+8
Mar 22 '12 at 20:19
source share

I really like this method. It is simple and fairly fast, but also not limited in bit length, since python has infinite integers.

This is actually trickier than it sounds because it avoids wasting time scanning zeros. For example, to count the set bits to 1000000000000000000000010100000001 will take as much time as in 1111.

 def get_bit_count(value): n = 0 while value: n += 1 value &= value-1 return n 
+3
Apr 25 '19 at 0:03
source share

You said that Naming was too slow. Do you use it to store individual bits? Why not extend the idea of โ€‹โ€‹using int as bitmaps, but use Numpy to store them?

Store n bits as an array from ceil(n/32.) 32-bit ints. Then you can work with the numpy array in the same (well, fairly similar) way to use ints, including using them to index another array.

The algorithm basically consists in calculating in parallel the number of bits set in each cell and summing the bitboat of each cell.

 setup = """ import numpy as np #Using Paolo Moretti answer http://stackoverflow.com/a/9829855/2963903 POPCOUNT_TABLE16 = np.zeros(2**16, dtype=int) #has to be an array for index in range(len(POPCOUNT_TABLE16)): POPCOUNT_TABLE16[index] = (index & 1) + POPCOUNT_TABLE16[index >> 1] def popcount32_table16(v): return (POPCOUNT_TABLE16[ v & 0xffff] + POPCOUNT_TABLE16[(v >> 16) & 0xffff]) def count1s(v): return popcount32_table16(v).sum() v1 = np.arange(1000)*1234567 #numpy array v2 = sum(int(x)<<(32*i) for i, x in enumerate(v1)) #single int """ from timeit import timeit timeit("count1s(v1)", setup=setup) #49.55184188873349 timeit("bin(v2).count('1')", setup=setup) #225.1857464598633 

Although I am surprised, no one suggested you write module C.

+2
May 03 '14 at 6:01
source share

You can use the algorithm to get the binary string [1] of an integer, but instead of concatenating the string by counting the number of units:

 def count_ones(a): s = 0 t = {'0':0, '1':1, '2':1, '3':2, '4':1, '5':2, '6':2, '7':3} for c in oct(a)[1:]: s += t[c] return s 

[1] https://wiki.python.org/moin/BitManipulation

+1
Aug 31 '17 at 6:17
source share

number = 789 number.bit_length ()

0
Apr 16 '19 at 17:22
source share
 #Python prg to count set bits #Function to count set bits def bin(n): count=0 while(n>=1): if(n%2==0): n=n//2 else: count+=1 n=n//2 print("Count of set bits:",count) #Fetch the input from user num=int(input("Enter number: ")) #Output bin(num) 
0
Jun 15 '19 at 5:14
source share

Turns out your initial view is a list of lists of integers that are either 1 or 0. Just recount them in that view.




The number of bits in integers is a constant in python.

However, if you want to count the number of bits set, the fastest way is to create a list that matches the following pseudo-code: [numberofsetbits(n) for n in range(MAXINT)]

This will provide you with a constant search for time after creating a list. See @PaoloMoretti's answer for a good implementation of this. Of course, you do not need to store all this in memory - you can use some kind of persistent keystore or even MySql. (Another option is to implement your own disk storage).

-2
Mar 22 2018-12-22T00:
source share



All Articles