Equal arrays, but not the same visually


I have a 32x32x3 image, for example one of the cifar10 images in keras. I want to do some manipulation on it. First, to make sure I am doing everything right, I tried to copy the image. (Copying is not my actual goal, so please do not tell me how to copy the image without three loops; I need the three loops to manipulate some values.)

    import numpy
    import matplotlib.pyplot as plt
    from keras.datasets import cifar10

    (X_train, Y_train), (X_test, Y_test) = cifar10.load_data()

    # Rearrange the first training image to rows x columns x channels
    im = numpy.reshape(X_train[0], (3, 32, 32))
    im = im.transpose(1, 2, 0)

    # Copy the image element by element
    imC = numpy.zeros((32, 32, 3))
    for k in range(3):
        for row in range(0, 32):
            for col in range(0, 32):
                imC[row][col][k] = im[row][col][k]

Now, if I check whether they are the same, they are: I do see "cool" printed.

    if (im == imC).all():
        print("cool")

But when I try to visualize them, they are different:

    plt.imshow(imC)
    plt.show()
    plt.imshow(im)
    plt.show()

What's happening?

1 answer

Images in the Python CIFAR10 dataset have pixel values of type numpy.uint8. (Presumably they are read from PNG files or something like that.) So X_train.dtype == numpy.uint8, and therefore im.dtype == numpy.uint8.

The array you create with numpy.zeros has the default element type, numpy.float64. In other words, imC.dtype == numpy.float64.
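The dtype difference is easy to check directly. The following is a minimal sketch that uses a random uint8 array as a stand-in for the CIFAR-10 image (an assumption, so that the example runs without downloading the dataset):

```python
import numpy

# Stand-in for a CIFAR-10 image: random uint8 pixels (hypothetical data,
# not the real X_train[0]).
rng = numpy.random.default_rng(0)
im = rng.integers(0, 256, size=(32, 32, 3), dtype=numpy.uint8)

imC = numpy.zeros((32, 32, 3))   # no dtype given, so float64
imC[...] = im                    # copy the values; imC keeps its own dtype

print(im.dtype)                  # uint8
print(imC.dtype)                 # float64
print((im == imC).all())         # True: equal values, different dtypes
```

The equality test compares values only, so it passes even though the two arrays store their elements in different types.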

It so happens that matplotlib.pyplot.imshow treats its input differently depending on its element type. In particular, if you give it an m-by-n-by-3 array of element type uint8, it treats 0 as darkest and 255 as lightest for each of the three colour channels, as you would expect. If you give it an m-by-n-by-3 array of element type float64, however, it wants all values to be in the range 0 (darkest) to 1 (lightest), and the documentation says nothing about what will happen to values outside this range.

I'll hazard a guess at what happens to values outside this range: I think the code probably does something like multiply by 255, round to an integer, and treat the result as uint8. That way 0 becomes 0 and 1 becomes 255.

But if that last step means discarding all but the low 8 bits, it also means that 2 becomes 254, 3 becomes 253, ..., and 255 becomes 1! In other words, if you make the very understandable mistake of giving imshow an image whose pixel values are floats in the range 0..255, those values are effectively negated: 0 -> 0, 1 -> 255, 2 -> 254, ..., 255 -> 1. (This is not quite the same as turning the range exactly upside down, because 0 is preserved.)
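The arithmetic of that guess can be reproduced with plain integers. This sketch only models the "multiply by 255, keep the low 8 bits" hypothesis; it is not matplotlib's actual code, and other matplotlib versions may handle out-of-range floats differently:

```python
import numpy

values = numpy.arange(256)           # pixel values 0..255
wrapped = (values * 255) % 256       # multiply by 255, keep the low 8 bits

print(wrapped[:4].tolist())          # [0, 255, 254, 253]
print(int(wrapped[255]))             # 1

# Every nonzero value v lands on 256 - v, while 0 stays 0.
print(bool((wrapped[1:] == 256 - values[1:]).all()))   # True
```

Since v * 255 = v * 256 - v, reducing modulo 256 leaves -v, which is why the result looks like a negative with 0 left in place.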

And that is what happened to you: each element of imC is numerically equal to the corresponding element of im, but because imC is a floating-point array rather than an array of small unsigned integers, it gets the treatment described above, and you see something close to a photographic negative of the image you expected.
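Two possible ways around this, sketched under the same assumption of a random uint8 stand-in for the real image: either allocate the copy with the source's dtype, or keep floats but rescale them into the 0..1 range that imshow expects for floating-point RGB data.

```python
import numpy

# Hypothetical stand-in for the CIFAR-10 image used in the question.
rng = numpy.random.default_rng(0)
im = rng.integers(0, 256, size=(32, 32, 3), dtype=numpy.uint8)

# Option 1: allocate the copy with the same dtype as the source,
# keeping the three loops from the question.
imC = numpy.zeros((32, 32, 3), dtype=im.dtype)
for k in range(3):
    for row in range(32):
        for col in range(32):
            imC[row, col, k] = im[row, col, k]

# Option 2: work in floats, but rescale to the 0..1 range.
imF = im.astype(numpy.float64) / 255.0

print(imC.dtype)                 # uint8
print(imF.min() >= 0.0, imF.max() <= 1.0)
```

Passing either imC or imF to plt.imshow should then show the same picture as im. Option 1 is the smaller change; option 2 may suit manipulations that produce fractional values, as long as the results stay within 0..1.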
