The Falko post is definitely the canonical way to do this. However, if I can offer a more numpy / Pythonic way to do it: let the first dimension be the index of the image you want, the second and third dimensions be the rows and columns of the image, and, if desired, the fourth dimension be the color channel. Therefore, assuming each of your images is M x N and you have K images, you would create an array that is K x M x N, or K x M x N x 3 in the case of color images.
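If it helps to see that layout spelled out, here is a minimal sketch that preallocates such an array; K, M, and N are placeholder values standing in for your actual image count and dimensions, not anything taken from your code:

import numpy as np

K, M, N = 10, 480, 640                             # placeholders: number of images, rows, columns
data = np.empty((K, M, N, 3), dtype=np.float64)    # use (K, M, N) instead for grayscale images

# each data[i] is then one M x N x 3 image that you can fill in or read out
print(data[0].shape)                               # (480, 640, 3)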
Based on your current code, this can be done in a single line of numpy:
data = np.array([mpimg.imread(os.path.join('images/sample_images/', name))
                 for name in os.listdir('images/sample_images/')], dtype=np.float64)
So, if you want to access the i-th image, you simply do data[i]. This is independent of whether the image is RGB or grayscale: data[i] gives you back an RGB image or a grayscale image, depending on what you decided to pack into the array. However, you must make sure that all of the images are consistent... that is, they are all color or all grayscale.
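If some of your files happen to be grayscale while others are RGB, one way to make them consistent before stacking is to replicate the single channel three times. This is just a sketch using the same hypothetical directory as above, and it assumes the grayscale images come back as 2D arrays:

imgs = [mpimg.imread(os.path.join('images/sample_images/', name))
        for name in os.listdir('images/sample_images/')]
# promote any 2D grayscale image to 3 channels so every entry has the same shape
imgs = [img if img.ndim == 3 else np.stack([img] * 3, axis=-1) for img in imgs]
data = np.array(imgs, dtype=np.float64)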
To show you that this works, here is an example with 5 x 5 x 3 "RGB" images, where the i-th image is filled entirely with the value i, for i from 0 to K-1, with K = 10 in this case:
data = np.array([i*np.ones((5,5,3)) for i in range(10)], dtype=np.float64)
Here is an example run (in IPython):
In [26]: data = np.array([i*np.ones((5,5,3)) for i in range(10)], dtype=np.float64)

In [27]: data.shape
Out[27]: (10, 5, 5, 3)

In [28]: img = data[0]

In [29]: img
Out[29]:
array([[[ 0.,  0.,  0.],
        [ 0.,  0.,  0.],
        [ 0.,  0.,  0.],
        [ 0.,  0.,  0.],
        [ 0.,  0.,  0.]],

       [[ 0.,  0.,  0.],
        [ 0.,  0.,  0.],
        [ 0.,  0.,  0.],
        [ 0.,  0.,  0.],
        [ 0.,  0.,  0.]],

       [[ 0.,  0.,  0.],
        [ 0.,  0.,  0.],
        [ 0.,  0.,  0.],
        [ 0.,  0.,  0.],
        [ 0.,  0.,  0.]],

       [[ 0.,  0.,  0.],
        [ 0.,  0.,  0.],
        [ 0.,  0.,  0.],
        [ 0.,  0.,  0.],
        [ 0.,  0.,  0.]],

       [[ 0.,  0.,  0.],
        [ 0.,  0.,  0.],
        [ 0.,  0.,  0.],
        [ 0.,  0.,  0.],
        [ 0.,  0.,  0.]]])

In [30]: img.shape
Out[30]: (5, 5, 3)

In [31]: img = data[7]

In [32]: img
Out[32]:
array([[[ 7.,  7.,  7.],
        [ 7.,  7.,  7.],
        [ 7.,  7.,  7.],
        [ 7.,  7.,  7.],
        [ 7.,  7.,  7.]],

       [[ 7.,  7.,  7.],
        [ 7.,  7.,  7.],
        [ 7.,  7.,  7.],
        [ 7.,  7.,  7.],
        [ 7.,  7.,  7.]],

       [[ 7.,  7.,  7.],
        [ 7.,  7.,  7.],
        [ 7.,  7.,  7.],
        [ 7.,  7.,  7.],
        [ 7.,  7.,  7.]],

       [[ 7.,  7.,  7.],
        [ 7.,  7.,  7.],
        [ 7.,  7.,  7.],
        [ 7.,  7.,  7.],
        [ 7.,  7.,  7.]],

       [[ 7.,  7.,  7.],
        [ 7.,  7.,  7.],
        [ 7.,  7.,  7.],
        [ 7.,  7.,  7.],
        [ 7.,  7.,  7.]]])

In [33]: img.shape
Out[33]: (5, 5, 3)
In the example run above, I created an array of sample data that is 10 x 5 x 5 x 3, as we expected: we have ten 5 x 5 x 3 matrices. I then extract the first "RGB" image, which is all 0s as we expect, with a size of 5 x 5 x 3. I also extract the eighth slice, which is all 7s as we expect, again with a size of 5 x 5 x 3.
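As a quick sanity check on real images, you can also display any slice directly. This assumes you have matplotlib.pyplot imported (commonly as plt alongside mpimg), and note that imshow expects float RGB data in the [0, 1] range, which is what mpimg.imread gives you for PNGs:

import matplotlib.pyplot as plt

plt.imshow(data[0])   # show the first image in the stack
plt.show()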
Obviously, choose whichever answer you think is best, but I would personally go with the route shown above, since indexing into your array to grab the correct image is easier: you let the first dimension do the work for you.