Edit: use dask.array imread function
As of dask 0.7.0 you do not need to store your images in HDF5. Use the dask.array.image imread function directly:
In [1]: from skimage.io import imread
In [2]: im = imread('foo.1.tiff')
In [3]: im.shape
Out[3]: (5, 5, 3)

In [4]: ls foo.*.tiff
foo.1.tiff  foo.2.tiff  foo.3.tiff  foo.4.tiff

In [5]: from dask.array.image import imread
In [6]: im = imread('foo.*.tiff')
In [7]: im.shape
Out[7]: (4, 5, 5, 3)
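As a rough sketch of what the stacked array gives you, the snippet below builds a lazy stack and reduces across it. To stay free of image-library dependencies it makes two assumptions that are not part of the session above: the "images" are tiny .npy files written to a temporary directory, and np.load is passed through the imread keyword as the reader. With real tiff files and scikit-image installed you would drop that keyword.

```python
import os
import tempfile

import numpy as np
from dask.array.image import imread

# Stand-in data (assumption): four 5x5x3 arrays saved as foo.1.npy ... foo.4.npy
tmpdir = tempfile.mkdtemp()
for i in range(1, 5):
    np.save(os.path.join(tmpdir, 'foo.%d.npy' % i),
            np.full((5, 5, 3), i, dtype='uint8'))

# The glob pattern is expanded and sorted; each file becomes one chunk
# along a new leading axis. np.load replaces the default image reader.
stack = imread(os.path.join(tmpdir, 'foo.*.npy'), imread=np.load)

# Nothing is read in bulk until compute() is called.
brightest = stack.max(axis=0).compute()
```

Here stack has one entry per file along axis 0, and brightest is an ordinary numpy array with the shape of a single image.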
Older answer that stores HDF5 images
Getting data in is often the trickiest part of the problem. Dask.array does not have any automatic integration with image files (although this is doable if there is enough interest). Fortunately, moving data into HDF5 is easy because h5py supports numpy slicing syntax. In the following example we create an empty h5py dataset and then store four tiny tiff files in it with a for loop.
First we get the filenames of our images (please forgive the toy dataset; I don't have anything realistic lying around).
In [1]: from glob import glob
In [2]: filenames = sorted(glob('foo.*.tiff'))
In [3]: filenames
Out[3]: ['foo.1.tiff', 'foo.2.tiff', 'foo.3.tiff', 'foo.4.tiff']
Load and inspect a sample image.
In [4]: from skimage.io import imread
In [5]: im = imread(filenames[0])
Now we will create an HDF5 file and an HDF5 dataset called '/x' in this file.
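In [8]: import h5py
In [9]: f = h5py.File('myfile.hdf5', 'a')  # 'a' creates the file if it does not exist
In [10]: out = f.require_dataset('/x', shape=(len(filenames), 5, 5, 3), dtype=im.dtype)  # one 5x5x3 slot per image (shape assumed from the sample)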
Great, now we can insert our images one at a time into the HDF5 dataset.
In [11]: for i, fn in enumerate(filenames):
   ....:     im = imread(fn)
   ....:     out[i, :, :, :] = im
At this point dask.array can happily wrap the out dataset:
In [12]: import dask.array as da
In [13]: x = da.from_array(out, chunks=(1, 5, 5, 3))
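The HDF5 steps above can be sketched end to end as follows. This is a minimal illustration, not the exact session: it assumes synthetic constant-valued arrays in place of the tiff files, writes to a temporary path, and uses create_dataset on a fresh file. The per-image mean at the end is only there to show that the wrapped dataset behaves like an array.

```python
import os
import tempfile

import numpy as np
import h5py
import dask.array as da

# Stand-in images (assumption): image i is filled with the value i
images = [np.full((5, 5, 3), i, dtype='uint8') for i in range(4)]

path = os.path.join(tempfile.mkdtemp(), 'myfile.hdf5')

with h5py.File(path, 'w') as f:
    # Empty dataset with one slot per image
    out = f.create_dataset('/x', shape=(len(images), 5, 5, 3), dtype='uint8')

    # Insert the images one at a time using numpy-style slicing
    for i, im in enumerate(images):
        out[i, :, :, :] = im

    # Wrap the on-disk dataset; each chunk holds exactly one image
    x = da.from_array(out, chunks=(1, 5, 5, 3))

    # Compute while the file is still open, since chunks are read lazily
    per_image_mean = x.mean(axis=(1, 2, 3)).compute()
```

Choosing one image per chunk keeps each read aligned with how the data was written; larger chunks along the first axis may be a better fit when individual images are small.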
If you would like to see more native support for stacks of images, I encourage you to raise an issue. It would be pretty easy to use dask.array on your stack of tiff files directly, without going through HDF5.