Android: Extract text from image

I am working on an application that needs to convert a JPEG image to text so that I can identify the text recorded in the image. Could you please point me to a guide for doing this?

1 answer

EXTRACT FROM Creating an OCR Application Using Tesseract.

Note: These instructions are for Android SDK r19 and Android NDK r7c. On 64-bit Ubuntu, you may need to install the ia32-libs 32-bit compatibility library. You will also need the appropriate PATH variables set.

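Download the source or clone the git repository. The project contains tools for compiling the Tesseract, Leptonica, and JPEG libraries for use on Android. It includes an Android library project for Eclipse that provides a Java API for accessing the natively compiled Tesseract and Leptonica APIs.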

Build the project with the following commands (here, tess-two is the directory inside tess-two, the one at the same level as tess-two-test):

cd <project-directory>/tess-two
ndk-build
android update project --path .
ant release

Now import the project as a library in Eclipse:

File -> Import -> Existing Projects into Workspace -> tess-two directory. Right click the project, Android Tools -> Fix Project Properties. Right click -> Properties -> Android -> check Is Library.

Configure your own project to use the tess-two project as a library project:

Right click your project name -> Properties -> Android -> Library -> Add, and choose tess-two. 

You are now ready to OCR any image using the library.

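First, you need to obtain the image itself. Once you have the bitmap, you just need to run OCR on it, which is relatively easy. Make sure to correct the rotation and the image type first, with something like: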

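import android.graphics.Bitmap;
import android.graphics.BitmapFactory;
import android.graphics.Matrix;
import android.media.ExifInterface;

// _path = path to the image to be OCRed
// Decode the image here if you do not already have it as a Bitmap
Bitmap bitmap = BitmapFactory.decodeFile(_path);

// Read the orientation the camera stored in the file;
// the ExifInterface constructor can throw IOException
ExifInterface exif = new ExifInterface(_path);
int exifOrientation = exif.getAttributeInt(
        ExifInterface.TAG_ORIENTATION,
        ExifInterface.ORIENTATION_NORMAL);

int rotate = 0;

switch (exifOrientation) {
case ExifInterface.ORIENTATION_ROTATE_90:
    rotate = 90;
    break;
case ExifInterface.ORIENTATION_ROTATE_180:
    rotate = 180;
    break;
case ExifInterface.ORIENTATION_ROTATE_270:
    rotate = 270;
    break;
}

if (rotate != 0) {
    int w = bitmap.getWidth();
    int h = bitmap.getHeight();

    // Set up the rotation
    Matrix mtx = new Matrix();
    mtx.preRotate(rotate);

    // Rotate the bitmap so the text is upright
    bitmap = Bitmap.createBitmap(bitmap, 0, 0, w, h, mtx, false);
}
// Convert to ARGB_8888, required by tess
bitmap = bitmap.copy(Bitmap.Config.ARGB_8888, true);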
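
If you want to check the orientation-to-degrees mapping off the device, here is a minimal sketch of the same switch in plain Java. The class and method names are made up for illustration. The numeric values are the raw EXIF orientation tag values, which is what the ExifInterface.ORIENTATION_* constants hold.

```java
/** Maps raw EXIF orientation tag values to a clockwise rotation in degrees. */
final class ExifRotation {
    // Raw EXIF orientation values (same numbers as ExifInterface.ORIENTATION_*)
    static final int NORMAL = 1;
    static final int ROTATE_180 = 3;
    static final int ROTATE_90 = 6;
    static final int ROTATE_270 = 8;

    private ExifRotation() {}

    /** Returns 90, 180, or 270 for rotated images and 0 for everything else. */
    static int degreesFor(int exifOrientation) {
        switch (exifOrientation) {
            case ROTATE_90:
                return 90;
            case ROTATE_180:
                return 180;
            case ROTATE_270:
                return 270;
            default:
                return 0; // normal, undefined, or mirrored orientations
        }
    }
}
```

Mirrored orientations are treated as "no rotation" here, just like in the snippet above; photos taken with the normal camera app should not need them, but that is an assumption on my part.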

Now that the image is in the bitmap, you can use TessBaseAPI to run the OCR, like this:

TessBaseAPI baseApi = new TessBaseAPI();
// DATA_PATH = path to the directory that contains the "tessdata" subdirectory
// lang = language for which the trained data exists, usually "eng"
baseApi.init(DATA_PATH, lang);
// E.g. baseApi.init("/mnt/sdcard/tesseract/", "eng");
// with the data file at /mnt/sdcard/tesseract/tessdata/eng.traineddata
baseApi.setImage(bitmap);
String recognizedText = baseApi.getUTF8Text();
baseApi.end();
(You can download the language data files and put them in a directory on your device, either manually or by code.)
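
As a rough sketch of where the language files need to end up: as far as I know, tess-two expects the directory passed to init to contain a tessdata subdirectory holding <lang>.traineddata. The helper below only builds that path and checks that the file is there. The class and method names are hypothetical, and the layout is an assumption you should verify against the tess-two version you build.

```java
import java.io.File;

/** Locates Tesseract language data under an assumed <dataPath>/tessdata/ layout. */
final class TessDataLayout {
    private TessDataLayout() {}

    /** Returns <dataPath>/tessdata/<lang>.traineddata (the file may not exist yet). */
    static File trainedDataFile(File dataPath, String lang) {
        return new File(new File(dataPath, "tessdata"), lang + ".traineddata");
    }

    /** True if the language file exists and is not empty. */
    static boolean isInstalled(File dataPath, String lang) {
        File file = trainedDataFile(dataPath, lang);
        return file.isFile() && file.length() > 0;
    }
}
```

Calling TessDataLayout.isInstalled(new File(DATA_PATH), lang) before baseApi.init(...) gives you a chance to show a useful message instead of failing inside the native code.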

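Now that you have the OCRed text in the variable recognizedText, you can do pretty much anything with it: translate it, search it, anything! P.S. You can add support for other languages by offering a preference and then downloading the required language data file. You can even put the files in the assets folder and copy them to the SD card on startup.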
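
The copy-on-startup step could look roughly like this. It is a sketch in plain Java that takes any InputStream; on Android you would probably get the stream from getAssets().open(...), with whatever asset path you chose when packaging the file. The class name, the method name, and the <dataPath>/tessdata/<lang>.traineddata target layout are assumptions for illustration.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

/** Copies a language data stream into an assumed <dataPath>/tessdata/ layout. */
final class TessDataInstaller {
    private TessDataInstaller() {}

    /**
     * Copies source to <dataPath>/tessdata/<lang>.traineddata unless that file exists.
     * Returns true if a copy was made, false if the file was already there.
     * The caller remains responsible for closing source.
     */
    static boolean copyIfMissing(InputStream source, File dataPath, String lang)
            throws IOException {
        File dir = new File(dataPath, "tessdata");
        File target = new File(dir, lang + ".traineddata");
        if (target.isFile()) {
            return false; // already installed, nothing to do
        }
        if (!dir.isDirectory() && !dir.mkdirs()) {
            throw new IOException("Could not create " + dir);
        }
        try (OutputStream out = new FileOutputStream(target)) {
            byte[] buffer = new byte[8192];
            int read;
            while ((read = source.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
        }
        return true;
    }
}
```

Note that a copy interrupted halfway leaves a partial file that the next run will treat as installed; if that matters to you, copy to a temporary name and rename at the end. The language files are fairly large, so you may want to do the copy off the UI thread.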

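  • PATH. Make sure your PATH variables are set correctly. For the Android SDK, add both the SDK tools and platform-tools directories to PATH. For the Android NDK, add the android-ndk directory to PATH.
  • Maven-ising the build is another option, and it also works on Windows.
  • Use Ctrl + F to search the original article's page and its comments for the issue you are having; a solution may already be there.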
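
For the PATH item, a quick way to see whether the build tools are visible is to scan the PATH value yourself. The sketch below does that in plain Java; the class and method names are made up for illustration, and it only looks for an exact file name, so on Windows (where the tools typically carry an extension such as .cmd or .bat) you would have to pass the full file name.

```java
import java.io.File;
import java.util.regex.Pattern;

/** Looks up a tool by file name in a PATH-style list of directories. */
final class PathCheck {
    private PathCheck() {}

    /** Returns the first file named tool found in pathValue, or null if there is none. */
    static File findOnPath(String tool, String pathValue) {
        if (pathValue == null) {
            return null;
        }
        // File.pathSeparator is ":" on Unix-like systems and ";" on Windows
        for (String dir : pathValue.split(Pattern.quote(File.pathSeparator))) {
            if (dir.isEmpty()) {
                continue;
            }
            File candidate = new File(dir, tool);
            if (candidate.isFile()) {
                return candidate;
            }
        }
        return null;
    }
}
```

For example, PathCheck.findOnPath("ndk-build", System.getenv("PATH")) returning null would suggest that the android-ndk directory is not on PATH; the same check can be repeated for android and ant.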