NumPy arrays are stored as contiguous blocks of memory. They usually have a single data type (for example integers, floats, or fixed-length strings), and the bits in memory are interpreted as values of that data type.
Creating an array with dtype=object is different. The memory occupied by the array is then filled with pointers to Python objects stored elsewhere in memory (much like a Python list, which is really a list of pointers to objects, not the objects themselves).
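A minimal sketch of the difference, assuming the same two strings used in the example later in this answer (the names fixed and boxed are only illustrative):

```python
import numpy as np

# Fixed-width unicode array: the characters live inside the array buffer.
fixed = np.array(['avinash', 'jay'])
# Object array: the buffer holds pointers to ordinary Python str objects.
boxed = np.array(['avinash', 'jay'], dtype=object)

print(fixed.dtype.kind)    # 'U' (fixed-width unicode)
print(fixed.itemsize)      # 7 characters * 4 bytes each = 28
print(boxed.dtype)         # object
print(boxed.itemsize)      # size of one pointer on this platform
print(type(boxed[0]))      # <class 'str'>, a plain Python string
```

Every element of fixed takes the same number of bytes regardless of its length, while each element of boxed is just a pointer to a string stored elsewhere.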
Arithmetic operators such as * do not work on arrays such as ar1 that have a string_ dtype (there are special functions instead; see below). NumPy just treats the bits in memory as characters, so the * operator has no meaning for them. However, the line
np.array(['avinash','jay'], dtype=object) * 2
works because the array is now an array of (pointers to) Python strings. The * operator is well defined for Python string objects. Python creates the new strings in memory, and a new object array is returned holding references to them.
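The contrast can be checked with a short sketch. It assumes ar1 was created as a plain fixed-width string array (its definition is not shown here), and it catches TypeError because that is what the NumPy versions I know of raise for the fixed-width case:

```python
import numpy as np

# Assumed definition of ar1: a fixed-width string array.
ar1 = np.array(['avinash', 'jay'])

try:
    ar1 * 2
    print("multiplication worked on the fixed-width array")
except TypeError as exc:
    # No multiply loop is defined for fixed-width string dtypes.
    print("fixed-width array failed with:", type(exc).__name__)

# With dtype=object, NumPy calls str.__mul__ on every element.
obj = np.array(['avinash', 'jay'], dtype=object)
doubled = obj * 2
print(doubled)         # the repeated strings
print(doubled.dtype)   # object
```

The object version delegates to Python for every element, so it is flexible but gives up the speed of NumPy's native loops.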
If you have an array with a string_ or unicode_ dtype and you want to repeat each string, you can use np.char.multiply:
In [52]: np.char.multiply(ar1, 2)
Out[52]: array(['avinashavinash', 'jayjay'], dtype='<U14')
NumPy has many other vectorized string functions in the np.char module.
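A few of them, as a rough sketch; ar1 is again assumed to be a fixed-width string array holding the same two names:

```python
import numpy as np

# Assumed definition of ar1, matching the output shown above.
ar1 = np.array(['avinash', 'jay'])

upper = np.char.upper(ar1)              # upper-case every element
lengths = np.char.str_len(ar1)          # length of every element
suffixed = np.char.add(ar1, '!')        # element-wise concatenation
starts = np.char.startswith(ar1, 'a')   # element-wise prefix test

print(upper)
print(lengths)
print(suffixed)
print(starts)
```

These functions work directly on string_ and unicode_ arrays, so there is no need to convert to dtype=object first.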
Alex Riley