Digit recognition with a CNN

I am classifying printed digits (0-9) with a convolutional neural network. It reaches 99% accuracy on the MNIST dataset, but when I tried it with fonts installed on the computer (Arial, Calibri, Cambria, Cambria Math, Times New Roman) and trained on images generated from those fonts (104 images; 25 fonts in total, 4 images per font with slight differences), the training error rate does not fall below 80%, i.e. 20% accuracy. Why?

Here is a sample image of the digit "2":

[sample image of a printed "2"]

I resized each image to 28 x 28.

Details:

Training data size: 28 x 28 images. Network settings: a LeNet-5-like architecture:

Input layer (28x28) | Convolutional layer (ReLU activation) | Pooling layer (tanh activation) | Convolutional layer (ReLU activation) | Fully connected layer (120 neurons, ReLU) | Fully connected layer (softmax activation, 10 outputs)
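The layer list above can be traced as a quick shape calculation. The kernel and pooling window sizes (5x5 convolutions, 2x2 pooling) are assumptions not stated in the question; they are taken from the original LeNet-5 design:

```python
# Sketch: tracing feature-map sizes through the layer stack above.
# 5x5 kernels and 2x2 pooling are assumptions (they match LeNet-5);
# the question does not state them.

def conv_out(size, kernel, stride=1):
    """Output width of a 'valid' convolution (no padding)."""
    return (size - kernel) // stride + 1

def pool_out(size, window=2):
    """Output width of non-overlapping pooling."""
    return size // window

s = 28               # input: 28 x 28 digit image
s = conv_out(s, 5)   # conv1 (ReLU) -> 24 x 24
s = pool_out(s)      # pool (tanh)  -> 12 x 12
s = conv_out(s, 5)   # conv2 (ReLU) -> 8 x 8
print(s)             # spatial size fed into the 120-neuron layer
```

Under these assumptions, the 120-neuron layer sees 8x8 feature maps.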

This setup gives 99+% accuracy on MNIST. Why is it so bad on computer fonts? A CNN can handle many variations in data.

3 answers

I see two possible problems:

Preprocessing: MNIST images are not just 28px x 28px; they are also preprocessed:

The original black and white (bilevel) images from NIST were size-normalized to fit in a 20x20 pixel box while preserving their aspect ratio. The resulting images contain grey levels as a result of the anti-aliasing technique used by the normalization algorithm. The images were centered in a 28x28 image by computing the center of mass of the pixels, and translating the image so as to position this point at the center of the 28x28 field.

Source: MNIST Website
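The centering step quoted above is easy to replicate. This is a minimal sketch in numpy of placing a glyph in a 28x28 field so its center of mass lands in the middle; the resize-to-20x20 step is omitted (any image library can do that part), and the function and variable names are illustrative:

```python
import numpy as np

# Sketch of MNIST-style centering: shift a glyph inside a 28x28
# field so that its pixel center of mass sits at the center.

def center_by_mass(glyph, field=28):
    """glyph: 2D float array, white digit (values >= 0) on black."""
    h, w = glyph.shape
    ys, xs = np.nonzero(glyph)
    total = glyph[ys, xs].sum()
    cy = (ys * glyph[ys, xs]).sum() / total   # center of mass, row
    cx = (xs * glyph[ys, xs]).sum() / total   # center of mass, col
    out = np.zeros((field, field), dtype=glyph.dtype)
    top = int(round(field / 2 - cy))          # shift (cy, cx) -> center
    left = int(round(field / 2 - cx))
    out[top:top + h, left:left + w] = glyph
    return out

digit = np.ones((4, 4))           # a toy 4x4 blob
centered = center_by_mass(digit)  # ends up centered in 28x28
```

If your font images are centered differently (e.g. by bounding box), the mismatch with MNIST's center-of-mass centering alone can hurt accuracy.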

Overfitting

  • MNIST has 60,000 training examples and 10,000 test examples. How many do you have?
  • Have you tried dropout (see the paper)?
  • Have you tried data augmentation techniques? (For example, slightly shifting the images, perhaps slightly changing the aspect ratio; you could also add noise - however, I do not think that will help.)
  • Have you tried smaller networks? (And how big are your filters / how many filters do you have?)
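For the dropout suggestion in the list above, here is a minimal numpy sketch of inverted dropout (the variant where survivors are rescaled at training time, so the network needs no change at test time); the function name is illustrative:

```python
import numpy as np

# Minimal sketch of inverted dropout: during training, zero each
# activation with probability p and scale the survivors by 1/(1-p),
# so expected activations match and test time needs no rescaling.

def dropout(activations, p=0.5, rng=None):
    rng = rng or np.random.default_rng(0)
    mask = rng.random(activations.shape) >= p   # keep with prob 1-p
    return activations * mask / (1.0 - p)       # scale survivors

a = np.ones((4, 4))
d = dropout(a, p=0.5)   # entries are either 0.0 or 2.0
```

In practice you would apply this only at training time, typically after the fully connected layers.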

Notes

Interesting idea! Have you tried simply applying the network trained on MNIST to your data? What are the results?


This may be an overfitting problem. It can happen when your network is too complex for the problem it is solving. Check out this article: http://es.mathworks.com/help/nnet/ug/improve-neural-network-generalization-and-avoid-overfitting.html


This definitely looks like an overfitting issue. I see that you have two convolutional layers, two max pooling layers and two fully connected layers. But how many weights in total? You have only 96 examples per class, which is certainly fewer than the number of weights in your CNN. Remember that you want at least 5 times more instances in your training set than your CNN has weights.
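The weight-versus-example comparison above can be made concrete with a rough count. The filter counts (6 and 16) and 5x5 kernels below are assumptions borrowed from the original LeNet-5, since the question does not state them:

```python
# Rough weight count for a LeNet-5-style network, to compare against
# a training set of roughly a thousand images. Filter counts (6, 16)
# and 5x5 kernels are assumptions taken from LeNet-5.

conv1 = 6 * (5 * 5 * 1) + 6       # 6 filters over 1 input channel
conv2 = 16 * (5 * 5 * 6) + 16     # 16 filters over 6 channels
# conv(5) -> pool(2) -> conv(5) turns 28x28 into 8x8x16 features
fc1 = (8 * 8 * 16) * 120 + 120    # 120-neuron fully connected layer
fc2 = 120 * 10 + 10               # softmax output layer
total = conv1 + conv2 + fc1 + fc2
print(total)                      # ~127k weights vs ~960 images
```

Even with these modest filter counts, the network has on the order of a hundred thousand weights, far more than 96 examples per class can constrain.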

There are two ways to improve your CNN:

  • Jitter each instance in the training set: shift each digit by about 1 pixel in every direction. This alone multiplies your training set by 9.
  • Use a distortion layer. It adds an elastic deformation to every digit in every epoch. This greatly strengthens your training by artificially enlarging your training set. Moreover, the network will be much better at predicting other fonts.
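The 1-pixel jitter suggested above is a one-liner in numpy; the function name is illustrative. Note that np.roll wraps pixels around the edges, which is harmless for digits that have a margin around them:

```python
import numpy as np

# Sketch of 1-pixel jitter: shifting a digit by -1, 0, or +1 pixels
# along each axis turns one training image into 9. np.roll wraps
# around the edges, which is fine for digits with a border margin.

def jitter(img):
    """Return the 9 one-pixel shifts of a 2D image (including itself)."""
    return [np.roll(np.roll(img, dy, axis=0), dx, axis=1)
            for dy in (-1, 0, 1) for dx in (-1, 0, 1)]

digit = np.zeros((28, 28))
digit[10:18, 12:16] = 1.0   # a toy glyph with a margin
augmented = jitter(digit)   # 9 images from 1
```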
