I am using the MNIST example with 60,000 training images and 10,000 test images. How can I find out which of the 10,000 test images were classified incorrectly?
Just use model.predict_classes() and compare the output with the true labels, e.g.:
import numpy as np
# y_test must hold integer class labels (not one-hot vectors)
incorrects = np.nonzero(model.predict_classes(X_test).reshape((-1,)) != y_test)
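This gives the indices of the incorrect predictions. Note that np.nonzero returns a tuple, so incorrects[0] is the actual array of indices of the misclassified test images.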
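If predict_classes is not available in your version (as far as I know it was removed from Sequential models in TensorFlow 2.6), the same comparison can be done by taking the argmax of the predicted probabilities. Here is a minimal sketch of that idea. The helper name misclassified_indices and the small hard-coded probs array are only illustrative; probs stands in for the output of model.predict(X_test) so the snippet runs without a trained model.

```python
import numpy as np

def misclassified_indices(probs, y_true):
    """Return indices where the argmax prediction differs from the label.

    probs:  array of shape (n_samples, n_classes), e.g. model.predict(X_test)
    y_true: array of shape (n_samples,) with integer class labels
    """
    y_pred = np.argmax(probs, axis=-1)      # predicted class per sample
    return np.nonzero(y_pred != y_true)[0]  # unpack the tuple from np.nonzero

# Stand-in for model.predict(X_test): 4 samples, 3 classes (made-up values)
probs = np.array([
    [0.9, 0.1, 0.0],
    [0.2, 0.7, 0.1],
    [0.3, 0.3, 0.4],
    [0.6, 0.3, 0.1],
])
y_test = np.array([0, 2, 2, 1])

incorrects = misclassified_indices(probs, y_test)
print(incorrects)  # samples 1 and 3 are misclassified
```

With a real model you would pass model.predict(X_test) as probs. If y_test is one-hot encoded, convert it first with np.argmax(y_test, axis=-1). You can then look at the failures with X_test[incorrects].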