Feature extraction for neural networks

I am doing simple recognition of letters and digits with neural networks. So far I have used every pixel of the letter's image as an input to the network. Needless to say, this approach produces very large networks, so I would like to extract features from my images and use them as the inputs to the NN instead. My first question is which properties of letters are good for recognizing them. My second question is how to represent these features as inputs to a neural network. For example, I could detect all the corners in a letter and have them as a vector of (x, y) points. How do I convert this vector into something suitable for an NN, given that the vector size may differ from letter to letter?

+7
3 answers

Many people have used a variety of features for OCR. The simplest, of course, is to pass the pixel values in directly.

The OpenCV samples include letter recognition data extracted from a UCI dataset. It uses about 16 different features. Check out this SO question: How to create data from an image like the "Letter Image Recognition Dataset" from UCI

One of the answers there also points to a paper that explains these features. You can find it by googling.

You may also be interested in this PPT. It gives a brief explanation of the various feature extraction methods currently in use.

+3

The article "Introduction to Artificial Intelligence: OCR using Artificial Neural Networks" by Kluever (2008) gives an overview of four feature extraction methods for OCR using neural networks. It describes the following methods:

  • Run Length Encoding (RLE): for this you need a binary image (i.e. only white or black). The binary string can then be encoded into a smaller representation.
  • Edge Detection: find the edges. You can be quite coarse with this, so instead of returning the exact (x, y) coordinates, you can reduce the matrix by only recording whether an edge occurs at a few reduced locations (i.e. at 20%, 40%, 60% and 80% of the image).
  • Count 'True Pixels': reduces the dimensionality from width * height of the image matrix to width + height. You use the per-column counts and the per-row counts as separate input vectors.
  • Basic input matrix: you have already tried this. Feeding in the entire matrix gives good results but, as you noticed, can lead to high dimensionality and long training times. You can experiment with reducing the size of your images (for example, from 200x200 to 50x50).
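
As a rough illustration of two of the ideas above, here is a minimal Python sketch. It is not taken from Kluever's article: the function names `projection_counts` and `grid_occupancy`, the zone count `cells`, and the tiny example image are all assumptions made for illustration. The first function counts 'true pixels' per row and per column; the second turns a variable-length list of (x, y) points (corners or edge points, say) into a fixed-length vector by recording which coarse zones of the image contain at least one point.

```python
def projection_counts(image):
    """Return per-row counts followed by per-column counts of 'true' pixels.

    image: list of rows, each a list of 0/1 values (a binary image).
    The output has height + width entries instead of height * width.
    """
    row_counts = [sum(row) for row in image]
    col_counts = [sum(col) for col in zip(*image)]
    return row_counts + col_counts


def grid_occupancy(points, width, height, cells=5):
    """Map a variable-length list of (x, y) points to a fixed-length vector.

    The image is split into cells x cells zones. Each output entry is 1.0
    if at least one point falls into that zone, otherwise 0.0.
    """
    features = [0.0] * (cells * cells)
    for x, y in points:
        # Clamp so points on the far border land in the last zone.
        col = min(int(x * cells / width), cells - 1)
        row = min(int(y * cells / height), cells - 1)
        features[row * cells + col] = 1.0
    return features


# Hypothetical example data: a 4-row, 5-column binary "L" shape.
letter = [
    [1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0],
    [1, 1, 1, 1, 0],
]

print(projection_counts(letter))
print(grid_occupancy([(0, 0), (0, 3), (3, 3)], width=5, height=4, cells=2))
```

Because the output length of `grid_occupancy` depends only on `cells`, the network input size stays the same no matter how many corners a letter has. Counting points per zone instead of storing 0/1 would be a reasonable variation.
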
+4

If your input vector has a very high dimension, I suggest applying Principal Component Analysis (PCA) to remove redundant features and reduce the dimensionality of the feature vector.
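
A minimal sketch of what this could look like with NumPy, assuming each image has already been flattened into one row of a matrix. The function name `pca_reduce`, the random stand-in data, and the choice of 30 components are assumptions for illustration, not values from the question.

```python
import numpy as np


def pca_reduce(X, n_components):
    """Project the rows of X onto the top n_components principal components.

    Returns the reduced data plus the mean and components, which are needed
    to transform new samples in the same way.
    """
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    centered = X - mean
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    return centered @ components.T, mean, components


# Hypothetical stand-in data: 100 images of 20x20 pixels, flattened to 400 values.
rng = np.random.default_rng(0)
X = rng.random((100, 400))

reduced, mean, components = pca_reduce(X, n_components=30)
print(reduced.shape)  # (100, 30)

# A new image must be centered with the training mean before projecting.
new_image = rng.random(400)
new_features = (new_image - mean) @ components.T
```

The reduced vectors, not the raw pixels, would then be fed to the network. How many components to keep is a tuning choice, and the same `mean` and `components` have to be reused at prediction time.
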

+1
