NumPy - What is the difference between frombuffer and fromstring?

They seem to give me the same result:

 In [32]: s
 Out[32]: '\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x15\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'

 In [27]: np.frombuffer(s, dtype="int8")
 Out[27]: array([ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 21, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], dtype=int8)

 In [28]: np.fromstring(s, dtype="int8")
 Out[28]: array([ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 21, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], dtype=int8)

 In [33]: b = buffer(s)

 In [34]: b
 Out[34]: <read-only buffer for 0x035F8020, size -1, offset 0 at 0x036F13A0>

 In [35]: np.fromstring(b, dtype="int8")
 Out[35]: array([ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 21, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], dtype=int8)

 In [36]: np.frombuffer(b, dtype="int8")
 Out[36]: array([ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 21, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], dtype=int8)

When should I use one rather than the other?

1 answer

From a practical point of view, the difference is that:

 x = np.fromstring(s, dtype='int8') 

makes a copy of the string in memory, and:

 x = np.frombuffer(s, dtype='int8') 

or

 x = np.frombuffer(buffer(s), dtype='int8') 

will use the string's memory buffer directly and will not use any* extra memory. Using frombuffer will also result in a read-only array if the input to buffer is a string, since strings are immutable in Python.

(* Neglecting the few bytes of memory used for the additional Python ndarray object. The underlying memory for the data will be shared.)
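The sharing described above is easiest to see when the source can be changed after the array is created. The following is a minimal sketch, not the code from the question: it assumes Python 3, uses a `bytearray` only because it is mutable, and uses `.copy()` as a stand-in for the copying behaviour attributed to fromstring.

```python
import numpy as np

# A mutable 64-byte buffer mirroring the data in the question (byte 20 holds 21).
# bytearray is an assumption made here so that changes to the source are observable.
raw = bytearray(64)
raw[20] = 21

shared = np.frombuffer(raw, dtype=np.int8)  # wraps raw's memory, no data copy
copied = shared.copy()                      # independent copy of the same data

# Modify the source after both arrays exist.
raw[0] = 7

print(shared[0])  # 7: the frombuffer array sees the change
print(copied[0])  # 0: the copy does not
```

Only the small ndarray object itself is new in the `shared` case; the 64 data bytes live in `raw`.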


If you are not familiar with buffer objects (memoryview in Python 3.x), they are essentially a way for C-level libraries to expose a block of memory for use in Python. A buffer is basically a Python interface for controlled access to raw memory.
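As an illustration of a C-implemented object exposing its memory, here is a small sketch using the standard-library `array` module. The choice of `array.array` and the variable names are assumptions for the example; any object that supports the buffer protocol should behave similarly.

```python
import array
import numpy as np

# array.array is implemented in C and exposes its storage through the
# buffer protocol. Typecode 'b' means signed 8-bit integers.
native = array.array('b', [0] * 64)
native[20] = 21

# memoryview is the Python 3 way to look at that exposed memory.
view = memoryview(native)
print(view.readonly)  # False: the underlying array is mutable
print(view.nbytes)    # 64

# frombuffer accepts the memoryview and wraps the same memory.
arr = np.frombuffer(view, dtype=np.int8)
arr[0] = 5            # write through NumPy
print(native[0])      # 5: the change is visible in the original object
```

Because the source here is mutable, the resulting array is writable, unlike the read-only array obtained from an immutable string.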

If you are working with something that exposes the buffer interface, then you probably want to use frombuffer. (Python 2.x strings and Python 3.x bytes expose the buffer interface, but you will get a read-only array, since Python strings are immutable.)

Otherwise, use fromstring to create a numpy array from a string. (Unless you know what you are doing and want to tightly control memory use, etc.)
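The read-only behaviour and the copying alternative can be sketched as follows. This assumes Python 3 with a bytes object in place of the Python 2 string. Recent NumPy versions deprecate the binary mode of fromstring in favour of frombuffer, so the sketch uses `np.frombuffer(...).copy()` where an independent, writable array is wanted; that substitution is an assumption, not part of the original answer.

```python
import numpy as np

# 64 bytes laid out like the string in the question (byte 20 is 0x15 == 21).
s = b"\x00" * 20 + b"\x15" + b"\x00" * 43

# Zero-copy view onto immutable bytes: the array is read-only.
view = np.frombuffer(s, dtype=np.int8)
print(view.flags.writeable)  # False

try:
    view[0] = 1
    write_failed = False
except ValueError:
    # NumPy refuses to write into memory owned by an immutable object.
    write_failed = True
print(write_failed)          # True

# Copying gives an array that owns its data and can be modified freely.
writable = np.frombuffer(s, dtype=np.int8).copy()
writable[0] = 1
print(writable[0], s[0])     # 1 0: the original bytes are untouched
```

The copy costs one extra allocation the size of the data, which is the trade-off the answer describes.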
