Using str.join on a NumPy array of strings

I am trying to use join on a NumPy array that contains only strings (each one holding the binary representation of a float) to build a single concatenated string that I can pass to numpy.fromstring, but join does not give the expected result.

Any idea why? What alternative function can I use for this?

Here is a self-contained example that shows the problem:

    import numpy as np

    nb_el = 10
    table = np.arange(nb_el, dtype='float64')
    print table

    binary = table.tostring()
    binary_list = map(''.join, zip(*[iter(binary)] * table.dtype.itemsize))
    print 'len binary list :', len(binary_list)
    # len binary list : 10

    join_binary_list = ''.join(binary_list)
    print np.fromstring(join_binary_list, dtype='float64')
    # [ 0. 1. 2. 3. 4. 5. 6. 7. 8. 9.]

    binary_split_array = np.array(binary_list)
    print 'nb el :', binary_split_array.shape
    # nb el : (10,)
    print 'nb_el * size :', binary_split_array.shape[0] * binary_split_array.dtype.itemsize
    # nb_el * size : 80

    join_binary_split_array = ''.join(binary_split_array)
    print 'len binary array :', len(join_binary_split_array)
    # len binary array : 72

    table_fromstring = np.fromstring(join_binary_split_array, dtype='float64')
    print table_fromstring
    # [ 1. 2. 3. 4. 5. 6. 7. 8. 9.]

As you can see, join works correctly on the list (binary_list), but not on the equivalent NumPy array (binary_split_array): the returned string has only 72 characters instead of 80, and the first value (0.) is missing from the result.

1 answer

The first element of your binary_split_array is an empty string:

    print(repr(binary_split_array[0]))
    # ''

The first item of your list (binary_list[0]), by contrast, is:

 '\x00\x00\x00\x00\x00\x00\x00\x00' 

The empty string has a length of 0:

    print([len("".join(a)) for a in binary_split_array])
    # [0, 8, 8, 8, 8, 8, 8, 8, 8, 8]
    print([len("".join(a)) for a in binary_list])
    # [8, 8, 8, 8, 8, 8, 8, 8, 8, 8]

The string of eight null bytes has a length of 8:

    print(len('\x00\x00\x00\x00\x00\x00\x00\x00'))
    # 8

Calling tobytes on the array copies its raw buffer, so it gives the same output length as joining the list:

    print(len(binary_split_array.tobytes()))
    # 80

    table_fromstring = np.fromstring(binary_split_array.tobytes(), dtype='float64')
    print table_fromstring
    # [ 0. 1. 2. 3. 4. 5. 6. 7. 8. 9.]
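The snippets in this thread are Python 2. For readers on Python 3, here is a minimal sketch of the same round trip. It is not the asker's code: the names `chunks`, `split_array`, `joined`, `buffer_bytes` and `restored` are made up for illustration, the dtype is pinned to little-endian (`'<f8'`) so the byte layout does not depend on the machine, and `np.frombuffer` replaces `np.fromstring`, whose binary mode newer NumPy releases deprecate (likewise `tostring` in favour of `tobytes`).

```python
import numpy as np

# Explicit little-endian float64 so the byte layout is platform independent.
table = np.arange(10, dtype='<f8')
size = table.dtype.itemsize          # 8 bytes per element
raw = table.tobytes()

# Split the raw bytes into one 8-byte chunk per float.
chunks = [raw[i:i + size] for i in range(0, len(raw), size)]
split_array = np.array(chunks)       # fixed-width byte strings, dtype 'S8'

# join reads the array element by element, so the all-null chunk for 0.0
# comes back empty and 8 bytes go missing.
joined = b''.join(split_array)

# tobytes copies the underlying buffer, including the null bytes.
buffer_bytes = split_array.tobytes()
restored = np.frombuffer(buffer_bytes, dtype='<f8')

print(len(joined), len(buffer_bytes))
print(restored)
```

Note that `np.frombuffer` returns a read-only view of the bytes object; call `.copy()` on the result if you need a writable array.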

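NumPy handles null bytes differently from Python: in a fixed-width string array, trailing null bytes are treated as padding and are stripped when an element is read back. The eight null bytes that encode 0.0 therefore come back as an empty string, while the other elements end in a non-null byte and keep their full length.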
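The truncation can be seen in isolation with just two values. The sketch below is illustrative only (the names `zero`, `one`, `fixed` and `as_objects` are not from the question) and uses `struct.pack('<d', ...)` to produce little-endian doubles. It also shows two ways around the problem that I believe are safe: storing the chunks with `dtype=object`, so that NumPy keeps the original Python bytes objects, or skipping the join altogether and reinterpreting the array's buffer with `view`.

```python
import struct
import numpy as np

zero = struct.pack('<d', 0.0)   # eight null bytes
one = struct.pack('<d', 1.0)    # last byte is non-null (0x3f)

# Fixed-width byte strings: trailing nulls are stripped on element access.
fixed = np.array([zero, one])
fixed_lengths = [len(x) for x in fixed]

# Object array: each element is the original Python bytes object.
as_objects = np.array([zero, one], dtype=object)
object_lengths = [len(x) for x in as_objects]

# The data is still in the buffer, so it can be reinterpreted directly.
floats = fixed.view('<f8')

print(fixed.dtype, fixed_lengths)   # the all-null element reads back empty
print(object_lengths)               # both elements keep their 8 bytes
print(floats)
```

With the object array, `b''.join(as_objects)` yields all 16 bytes, at the cost of losing the compact fixed-width storage.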
