I am interested in using numpy arrays of somewhat heterogeneous data types. Since numpy requires the data to be homogeneous, this would be achieved by defining a super-dtype that acts as a union over all the sub-dtypes; access through the sub-dtype fields would then give different interpretations of the same underlying data.
There is already some support for this; for example
dtype(('|S2', [('x', '|i1'), ('y', '|i1')]))
refers to an array of two-byte strings whose first and second bytes can also be interpreted as integers through the field names "x" and "y". However, I cannot figure out how to give the two-byte string itself a field name.
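To make that concrete, here is roughly how I am using such a union dtype (a small sketch; the values are just for illustration):

import numpy as np

dt = np.dtype(('|S2', [('x', '|i1'), ('y', '|i1')]))
a = np.zeros(3, dtype=dt)
a['x'] = [1, 2, 3]   # first byte of each element, written as int8
a['y'] = 4           # second byte of each element, written as int8
# a itself still reads as an array of two-byte strings, while a['x'] and
# a['y'] are int8 views onto the first and second byte of each element.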
Is it possible to make this more general so that we can impose any number of different field specifications on the data?
My first attempt was to specify the field offsets in the dtype, but it failed with a complaint that the offsets must be ordered (i.e. the data must not overlap):
dtype1 = np.dtype(dict( names=['a','b'], formats=['|a2','<i2'], offsets=[0,0]))
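As far as I can tell, the same construction is accepted as soon as the offsets do not overlap, so the complaint really is about the overlap:

dtype2 = np.dtype(dict(names=['a','b'], formats=['|a2','<i2'], offsets=[0,2]))   # fields side by side: accepted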
Another technique works, but it is cumbersome. Here I create several variables as views of the same underlying data and change the dtype of each view, so that the same bytes can be accessed in different formats, i.e.
a = np.zeros(3, dtype='<a2')   # the data, viewed as two-byte strings
b = a[:]                       # a second view of the same buffer
b.dtype = '<i2'                # reinterpret those bytes as little-endian int16
This allows me to access the data in the form of strings or integers, depending on whether I am viewing a or b. But this is a cumbersome way to manipulate data. Ideally, I would like to specify many different fields with arbitrary offsets. Is there any way to do this?
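To be concrete, what I have in mind would look something like the following (purely hypothetical; the overlapping offsets are exactly what gets rejected above):

wanted = np.dtype(dict(
    names=['raw', 'lo', 'hi', 'whole'],
    formats=['|a2', '|i1', '|i1', '<i2'],
    offsets=[0, 0, 1, 0]))   # hypothetical: several overlapping interpretations of the same two bytes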