Convert int to bytes in Python 3

I was trying to build this bytes object in Python 3:

b'3\r\n'

so I tried the obvious (for me) and found some weird behavior:

    >>> bytes(3) + b'\r\n'
    b'\x00\x00\x00\r\n'

Apparently:

    >>> bytes(10)
    b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'

Reading the documentation, I could not find any pointers on why the bytes conversion works this way. However, I did find some surprised comments in this Python issue about adding format to bytes (see also Formatting Python 3 bytes):

http://bugs.python.org/issue3982

This interacts even more poorly with oddities like bytes(int) returning zeroes now

and

It would be much more convenient for me if bytes(int) returned the ASCIIfication of that int; but honestly, even an error would be better than this behavior. (If I wanted this behavior, which I never have, I would rather it be a class method, invoked like "bytes.zeroes(n)".)

Can someone explain to me where this comes from?

+116
python
Jan 09 '14 at
12 answers

That is the way it was designed, and it makes sense, because usually you would call bytes on an iterable instead of a single integer:

    >>> bytes([3])
    b'\x03'

The docs state this, as does the docstring for bytes:

    >>> help(bytes)
    ...
    bytes(int) -> bytes object of size given by the parameter initialized with null bytes
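To make the distinction concrete, here is a small sketch contrasting the three call forms that come up in this thread. The variable names are illustrative only and do not come from the docs.

```python
n = 3

# An int argument is treated as a length: n null bytes.
zero_filled = bytes(n)

# An iterable of ints is treated as the byte values themselves.
single_byte = bytes([n])

# The decimal digits of n as ASCII text, which is what the question wanted.
ascii_digits = str(n).encode("ascii")

print(zero_filled)   # b'\x00\x00\x00'
print(single_byte)   # b'\x03'
print(ascii_digits)  # b'3'
```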
+121
Jan 09 '14 at 10:37

From Python 3.2 onwards you can do:

    >>> (1024).to_bytes(2, byteorder='big')
    b'\x04\x00'

https://docs.python.org/3/library/stdtypes.html#int.to_bytes

    def int_to_bytes(x: int) -> bytes:
        return x.to_bytes((x.bit_length() + 7) // 8, 'big')

    def int_from_bytes(xbytes: bytes) -> int:
        return int.from_bytes(xbytes, 'big')

Accordingly, x == int_from_bytes(int_to_bytes(x)). Note that this encoding works only for unsigned (non-negative) integers.
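As a quick sketch of the edge cases, the snippet below repeats the int_to_bytes helper from this answer and probes zero, a byte boundary, and a negative value. The variable names are made up for illustration.

```python
def int_to_bytes(x: int) -> bytes:
    # Minimal big-endian encoding; only valid for non-negative x.
    return x.to_bytes((x.bit_length() + 7) // 8, "big")


# (0).bit_length() is 0, so zero encodes to an empty bytes object.
empty = int_to_bytes(0)

# 255 still fits in one byte; 256 needs two.
one_byte = int_to_bytes(255)
two_bytes = int_to_bytes(256)

# Negative values are rejected because to_bytes defaults to signed=False.
try:
    int_to_bytes(-1)
    negative_error = None
except OverflowError as exc:
    negative_error = exc

print(empty, one_byte, two_bytes)
print(type(negative_error).__name__)
```

If a zero value must occupy at least one byte, appending `or b'\0'` to the result is one way to handle it.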

+133
May 21 '15 at

You can use the struct module's pack:

    In [11]: struct.pack(">I", 1)
    Out[11]: '\x00\x00\x00\x01'

The ">" is the byte order (big-endian), and the "I" is the format character (an unsigned int). So you can be specific if you want to do something else:

    In [12]: struct.pack("<H", 1)
    Out[12]: '\x01\x00'

    In [13]: struct.pack("B", 1)
    Out[13]: '\x01'

This works the same on both Python 2 and Python 3 (on Python 3 the result is a bytes object, so it is displayed with a b prefix, e.g. b'\x01').

Note: the inverse operation (bytes to int) can be done with struct.unpack.
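A minimal round-trip sketch for the inverse operation, run on Python 3. The variable names are illustrative.

```python
import struct

# Pack 1 as a 4-byte big-endian unsigned int.
packed = struct.pack(">I", 1)

# unpack always returns a tuple, even for a single value.
(value,) = struct.unpack(">I", packed)

# A value that does not fit the format raises struct.error.
try:
    struct.pack("B", 256)
    range_error = None
except struct.error as exc:
    range_error = exc

print(packed)  # b'\x00\x00\x00\x01'
print(value)   # 1
```

Unlike int.to_bytes with a computed length, the format string fixes the width up front, so the caller has to pick a format large enough for the value.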

+34
Nov 14 '14 at 0:25

Python 3.5+ introduces %-interpolation (printf-style formatting) for bytes:

    >>> b'%d\r\n' % 3
    b'3\r\n'

See PEP 0461 - Adding % Formatting to bytes and bytearray.

On earlier versions, you could format a str and then .encode('ascii') the result:

    >>> s = '%d\r\n' % 3
    >>> s.encode('ascii')
    b'3\r\n'

Note: this is different from what int.to_bytes produces:

    >>> n = 3
    >>> n.to_bytes((n.bit_length() + 7) // 8, 'big') or b'\0'
    b'\x03'
    >>> b'3' == b'\x33' != b'\x03'
    True
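The difference is easier to see with a larger number. The sketch below, which needs Python 3.5+ for the bytes %-formatting, puts the text form and the binary form side by side. The variable names are illustrative.

```python
n = 1024

# Text form: the decimal digits as ASCII, followed by CRLF.
text_form = b"%d\r\n" % n

# The same result via str formatting, for versions before 3.5.
fallback = ("%d\r\n" % n).encode("ascii")

# Binary form: the numeric value itself, big-endian, in two bytes.
binary_form = n.to_bytes(2, "big")

print(text_form)    # b'1024\r\n'
print(binary_form)  # b'\x04\x00'
```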
+19
Aug 01 '15 at 12:13

The documentation says:

 bytes(int) -> bytes object of size given by the parameter initialized with null bytes 

The sequence:

 b'3\r\n' 

This is the character "3" (decimal 51), the character "\r" (13) and "\n" (10).

Therefore, one way is to treat it as such, for example:

    >>> bytes([51, 13, 10])
    b'3\r\n'
    >>> bytes('3', 'utf8') + b'\r\n'
    b'3\r\n'
    >>> n = 3
    >>> bytes(str(n), 'ascii') + b'\r\n'
    b'3\r\n'

Tested on IPython 1.1.0 and Python 3.2.3

+10
Jan 09 '14 at 13:15

The ASCIIfication of 3 is "\x33", not "\x03"!

That is what Python does for str(3), but it would be totally wrong for bytes, as they should be considered arrays of binary data and not be abused as strings.

The easiest way to achieve what you want is bytes((3,)), which is better than bytes([3]) because initializing a list is much more expensive, so never use lists when you can use tuples. You can convert bigger integers with int.to_bytes, for example (1024).to_bytes(2, "little").

Initializing bytes with a given length makes sense and is most useful since they are often used to create some type of buffer, for which you need some allocated memory size. I often use this when initializing arrays or expanding a file by writing zeros to it.
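A small sketch of the buffer use case described above, assuming an in-memory stream stands in for a real file. The names `buffer`, `stream`, and `padded` are illustrative.

```python
import io

# Preallocate a mutable 8-byte buffer filled with null bytes.
buffer = bytearray(8)

# Fill the first two bytes in place; the rest stay zero.
buffer[0:2] = (1024).to_bytes(2, "big")

# Extend a file-like object by writing zeros to it.
stream = io.BytesIO()
stream.write(b"header")
stream.write(bytes(4))
padded = stream.getvalue()

print(buffer)
print(padded)
```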

+5
Aug 01 '15 at 10:40

An int (including Python 2's long) can be converted to bytes using the following function:

    import codecs

    def int2bytes(i):
        hex_value = '{0:x}'.format(i)
        # make length of hex_value a multiple of two
        hex_value = '0' * (len(hex_value) % 2) + hex_value
        return codecs.decode(hex_value, 'hex_codec')

The reverse conversion can be done by another function:

    import codecs
    import six  # should be installed via 'pip install six'

    long = six.integer_types[-1]

    def bytes2int(b):
        return long(codecs.encode(b, 'hex_codec'), 16)

Both functions work on both Python 2 and Python 3.
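If Python 2 support is not needed, the same hex round trip can be sketched without codecs or six, using bytes.fromhex and bytes.hex (the latter needs Python 3.5+). The function names below are hypothetical variants, not part of the answer above, and the sketch assumes non-negative integers and non-empty input bytes.

```python
def int2bytes_py3(i: int) -> bytes:
    # Hex digits of i, left-padded to an even count so they form whole bytes.
    hex_value = "{0:x}".format(i)
    if len(hex_value) % 2:
        hex_value = "0" + hex_value
    return bytes.fromhex(hex_value)


def bytes2int_py3(b: bytes) -> int:
    # Interpret the bytes as one big-endian hexadecimal number.
    return int(b.hex(), 16)


encoded = int2bytes_py3(1024)
decoded = bytes2int_py3(encoded)
print(encoded, decoded)
```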

+5
Aug 09 '17 at 8:57

From the bytes docs:

Accordingly, constructor arguments are interpreted as for bytearray().

Then, from the bytearray docs:

The optional source parameter can be used to initialize the array in several ways:

  • If it is an integer, the array will have that size and will be initialized with null bytes.

Note that this differs from 2.x (where x >= 6) behavior, where bytes is simply str:

    >>> bytes is str
    True

PEP 3112:

The 2.6 str differs from 3.0's bytes type in various ways; most notably, the constructor is completely different.

+3
Jan 09 '14 at 10:39

The behavior comes from the fact that in Python prior to version 3, bytes was just an alias for str. In Python 3.x, bytes is an immutable version of bytearray: a completely new type that is not backwards compatible.

+3
Jan 09 '14 at 10:44

I was curious about the performance of the various methods for a single int in the range [0, 255], so I decided to run some timing tests.

Based on the timings below, and on the general trend I observed from trying many different values and configurations, struct.pack seems to be the fastest, followed by int.to_bytes, then bytes, with str.encode (unsurprisingly) being the slowest. Note that the results show somewhat more variation than is represented here, and int.to_bytes and bytes sometimes switched speed ranking during testing, but struct.pack is clearly the fastest.

Results in CPython 3.7 for Windows:

    Testing with 63:
    bytes_: 100000 loops, best of 5: 3.3 usec per loop
    to_bytes: 100000 loops, best of 5: 2.72 usec per loop
    struct_pack: 100000 loops, best of 5: 2.32 usec per loop
    chr_encode: 50000 loops, best of 5: 3.66 usec per loop

Test module (named int_to_byte.py):

    """Functions for converting a single int to a bytes object with that int value."""

    import random
    import shlex
    import struct
    import timeit


    def bytes_(i):
        """From Tim Pietzcker answer:
        https://stackoverflow.com/a/21017834/8117067
        """
        return bytes([i])


    def to_bytes(i):
        """From brunsgaard answer:
        https://stackoverflow.com/a/30375198/8117067
        """
        return i.to_bytes(1, byteorder='big')


    def struct_pack(i):
        """From Andy Hayden answer:
        https://stackoverflow.com/a/26920966/8117067
        """
        return struct.pack('B', i)


    # Originally, jfs answer was considered for testing,
    # but the result is not identical to the other methods
    # https://stackoverflow.com/a/31761722/8117067


    def chr_encode(i):
        """Another method, from Quuxplusone answer here:
        https://codereview.stackexchange.com/a/210789/140921

        Similar to g10guang answer:
        https://stackoverflow.com/a/51558790/8117067
        """
        return chr(i).encode('latin1')


    converters = [bytes_, to_bytes, struct_pack, chr_encode]


    def one_byte_equality_test():
        """Test that results are identical for ints in the range [0, 255]."""
        for i in range(256):
            results = [c(i) for c in converters]
            # Test that all results are equal
            start = results[0]
            if any(start != b for b in results):
                raise ValueError(results)


    def timing_tests(value=None):
        """Test each of the functions with a random int."""
        if value is None:
            # random.randint takes more time than int to byte conversion
            # so it can't be a part of the timeit call
            value = random.randint(0, 255)
        print(f'Testing with {value}:')
        for c in converters:
            print(f'{c.__name__}: ', end='')
            # Uses technique borrowed from https://stackoverflow.com/q/19062202/8117067
            timeit.main(args=shlex.split(
                f"-s 'from int_to_byte import {c.__name__}; value = {value}' " +
                f"'{c.__name__}(value)'"
            ))
+3
Jan 03 '19 at 18:37

Although the earlier answer by brunsgaard is an efficient encoding, it works only for unsigned integers. This one builds on it to work for both signed and unsigned integers.

    def int_to_bytes(i: int, *, signed: bool = False) -> bytes:
        length = ((i + ((i * signed) < 0)).bit_length() + 7 + signed) // 8
        return i.to_bytes(length, byteorder='big', signed=signed)

    def bytes_to_int(b: bytes, *, signed: bool = False) -> int:
        return int.from_bytes(b, byteorder='big', signed=signed)

    # Test unsigned:
    for i in range(1025):
        assert i == bytes_to_int(int_to_bytes(i))

    # Test signed:
    for i in range(-1024, 1025):
        assert i == bytes_to_int(int_to_bytes(i, signed=True), signed=True)

For the encoder, (i + ((i * signed) < 0)).bit_length() is used instead of just i.bit_length() because the latter leads to an inefficient encoding of -128, -32768, etc.

Credit: CervEd for fixing a minor inefficiency.
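The arithmetic behind that adjustment can be checked with a short sketch. The two helper names are hypothetical and only isolate the length computation for the signed case.

```python
def naive_length(i: int) -> int:
    # Signed byte count computed from i.bit_length() directly.
    return (i.bit_length() + 7 + 1) // 8


def adjusted_length(i: int) -> int:
    # Same formula, but negative values are shifted by one first,
    # mirroring (i + ((i * signed) < 0)) with signed=True.
    return ((i + (i < 0)).bit_length() + 7 + 1) // 8


for value in (-128, -32768, 127, 128):
    print(value, naive_length(value), adjusted_length(value))

# Both lengths are accepted by to_bytes; the adjusted one is simply shorter.
short = (-128).to_bytes(adjusted_length(-128), "big", signed=True)
long_form = (-128).to_bytes(naive_length(-128), "big", signed=True)
```

For -128, bit_length() is 8, so the naive formula asks for two bytes, although the value fits in one signed byte; shifting to -127 first gives a bit_length() of 7 and therefore one byte. Non-negative values are unaffected.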

+1
Jan 11 '19 at 6:29

If performance is not important to you, you can convert the int to a str first.

    number = 1024
    str(number).encode()
-1
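A short sketch of what this produces and how to reverse it. The names `encoded` and `decoded` are illustrative.

```python
number = 1024

# encode() defaults to UTF-8; decimal digits are plain ASCII either way.
encoded = str(number).encode()

# int() accepts a bytes object containing ASCII digits.
decoded = int(encoded)

print(encoded)  # b'1024'
print(decoded)  # 1024
```

Note that this yields the decimal text of the number, one byte per digit, not its binary value.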
Jul 27 '18 at 13:16


