I think there is a shorter way:
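import dask.array as da
import dask.dataframe as dd

ruta = '...'
df = dd.read_csv(...)
x = df['column you want to transform in array']

def transf(x):
    # One delayed object per partition
    xd = x.to_delayed()
    # Computing each partition supplies the shape and dtype that from_delayed requires
    full = [da.from_delayed(i, i.compute().shape, i.compute().dtype) for i in xd]
    # Join the per-partition arrays into a single Dask array
    return da.concatenate(full)

x_array = transf(x)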
Also, if you want to convert a Dask DataFrame with N columns, so that each element of the array is itself an array holding one row's values:
Array ((x1, x2, x3), (y1, y2, y3), ....)
You must change:
from
i.compute().dtype
to
i.compute().dtypes
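To show why the attribute name changes, here is a small pandas-only sketch. It assumes each computed partition is a regular pandas object. The `partition` frame and its column names are made up for illustration and do not come from the code above.

```python
import pandas as pd

# Hypothetical stand-in for one computed partition (i.compute() above)
partition = pd.DataFrame({"a": [1, 2, 3], "b": [0.5, 1.5, 2.5]})

# A single column is a Series, which carries exactly one dtype
column = partition["a"]
print(column.dtype)

# A DataFrame carries one dtype per column, exposed as .dtypes (a Series)
print(partition.dtypes)

# A DataFrame has no .dtype attribute, hence the switch to .dtypes
has_dtype = hasattr(partition, "dtype")
print(has_dtype)

# The partition's shape is now two-dimensional: (rows, columns)
print(partition.shape)
```

Note that `.dtypes` is a Series with one entry per column, so columns of different types do not share a single dtype.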
thanks
Julio CamPlaz