How to stop Tesseract while it is OCRing?

I'm using Tesseract to OCR images in my iPhone app. I want to stop the entire OCR process while it is running.

Here is my code.

In the .h file:

    dispatch_queue_t main;
    tesseract::TessBaseAPI *tesseract;
    uint32_t *pixels;

In the .m file:

    - (void)processOcrAt:(UIImage *)image {
        [self setTesseractImage:image];
        //char* utf8Text = tesseract->GetUTF8Text();
        //[self performSelector:@selector(ocrProcessingFinished:) withObject:[NSString stringWithUTF8String:utf8Text]];
        //dispatch_queue_t queue = dispatch_queue_create("com.awesome", 0);
        main = dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0);
        dispatch_async(main, ^{
            tesseract->Recognize(NULL);
            char* utf8Text = tesseract->GetUTF8Text();
            [self performSelectorOnMainThread:@selector(ocrProcessingFinished:)
                                   withObject:[NSString stringWithUTF8String:utf8Text]
                                waitUntilDone:NO];
            delete [] utf8Text;
        });
    }

    - (IBAction)backPressed:(id)sender {
        dispatch_release(main);
        tesseract->Clear();
        //tesseract->End();
        delete tesseract;
        tesseract = nil;
        delete pixels;
        [self.navigationController popViewControllerAnimated:YES];
    }

When I press the back button while OCR is still running, the app crashes, because recognition is still in progress on the background queue. How can I stop it? I could not find a method for this in Tesseract.

3 answers

Here is the answer from the Tesseract forum: https://groups.google.com/forum/?fromgroups=#!topic/tesseract-ocr/1uLF4BmmmUg

I think the essence of the problem is that you are trying to stop the OCR thread at an arbitrary point in its execution while still expecting the state of the Tesseract instance to be consistent. You are right to want to delete the instance, since otherwise you will leak memory, but it seems you cannot do that after stopping the OCR thread abnormally. In our own iPhone application (ScanBizCards), what we do in this case is let the OCR thread finish its work in the background, even though its results are ignored and never shown to the user. The disadvantage is that if the user starts a new scan immediately after the interruption, we delay the start of the new scan until the previous (interrupted) scan ends.
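The "let it finish, then ignore the result" approach described above can be sketched roughly as follows. This is only an illustration under assumptions: `FakeEngine` and `OcrJob` are hypothetical names, and `FakeEngine` stands in for `tesseract::TessBaseAPI` so the sketch does not depend on the library. The point it shows is that the engine is destroyed only after the worker thread has been joined.

```cpp
#include <atomic>
#include <chrono>
#include <memory>
#include <string>
#include <thread>

// Hypothetical stand-in for the OCR engine (tesseract::TessBaseAPI in the
// real app). It only simulates a slow call that cannot be interrupted.
struct FakeEngine {
    std::string recognize() {
        std::this_thread::sleep_for(std::chrono::milliseconds(50));
        return "recognized text";
    }
};

// Hypothetical owner of the engine and its worker thread. The engine is
// released only after the worker has been joined, never mid-recognition.
class OcrJob {
public:
    OcrJob() : engine_(new FakeEngine()), abandoned_(false), delivered_(false) {}

    ~OcrJob() {
        // Wait for the OCR call to return before the engine is destroyed.
        if (worker_.joinable()) worker_.join();
    }

    void start() {
        worker_ = std::thread([this] {
            std::string text = engine_->recognize();
            if (!abandoned_.load()) {  // skip delivery if the user left
                result_ = text;
                delivered_.store(true);
            }
        });
    }

    // Called when the user presses "back": the work continues, but its
    // result will be dropped.
    void abandon() { abandoned_.store(true); }

    // Blocks until the worker is done.
    void wait() { if (worker_.joinable()) worker_.join(); }

    bool delivered() const { return delivered_.load(); }
    const std::string& result() const { return result_; }

private:
    std::unique_ptr<FakeEngine> engine_;
    std::atomic<bool> abandoned_;
    std::atomic<bool> delivered_;
    std::string result_;
    std::thread worker_;
};
```

A caller would create an `OcrJob`, call `start()`, and call `abandon()` from the back-button handler. In a real app the job should probably be handed to some background owner rather than destroyed on the UI thread, because the destructor blocks until recognition returns.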


What about the ETEXT_DESC argument to the Recognize() function? (I am not sure whether it existed when you wrote your answer, fulberto100.) The monitor uses it to report progress and also to cancel recognition. It is used in TessBaseAPI::ProcessPage. I have not tried this myself, though.

    ETEXT_DESC monitor;
    monitor.cancel = NULL;
    monitor.cancel_this = NULL;
    monitor.set_deadline_msecs(timeout_millisec);
    // Now run the main recognition.
    failed = Recognize(&monitor) < 0;
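The snippet above leaves `cancel` and `cancel_this` set to NULL and relies only on the deadline. To cancel on demand, those two fields would presumably be filled in instead. The sketch below shows the callback pattern without linking against Tesseract: `MonitorLike` and `fake_recognize` are hypothetical stand-ins, and the callback shape `bool (*)(void*, int)` is an assumption based on the `CANCEL_FUNC` typedef in Tesseract's `ocrclass.h`, so check the header of your version before relying on it.

```cpp
#include <atomic>

// Assumed callback shape, mirroring CANCEL_FUNC in Tesseract's ocrclass.h.
typedef bool (*CancelFunc)(void* cancel_this, int words);

// Hypothetical stand-in for the two ETEXT_DESC fields used for cancellation.
struct MonitorLike {
    CancelFunc cancel = nullptr;
    void* cancel_this = nullptr;
};

// Hypothetical stand-in for Recognize(): it polls the callback before each
// word and stops early when asked to. The -1 return value on cancellation is
// an assumption of this sketch, not documented Tesseract behaviour.
int fake_recognize(MonitorLike* monitor, int total_words, int* processed) {
    *processed = 0;
    for (int i = 0; i < total_words; ++i) {
        if (monitor && monitor->cancel &&
            monitor->cancel(monitor->cancel_this, i)) {
            return -1;  // cancelled at a point the engine chose itself
        }
        ++*processed;
    }
    return 0;
}

// Callback: cancel_this points at a flag that the UI thread can set.
bool should_cancel(void* cancel_this, int /*words*/) {
    return static_cast<std::atomic<bool>*>(cancel_this)->load();
}
```

With the real library, the equivalent wiring would be `monitor.cancel = should_cancel; monitor.cancel_this = &flag;` before calling `Recognize(&monitor)`, with the back-button handler setting `flag` to true and deleting the instance only after `Recognize` has returned. Because the engine stops at a point it picks itself, its state should stay consistent, which is what an abrupt thread kill cannot guarantee.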

This program shows how to run Tesseract recognition in one thread while monitoring its progress from a second thread:

    #include <baseapi.h>
    #include <allheaders.h>
    #include <atomic>
    #include <chrono>
    #include <fstream>
    #include <iostream>
    #include <thread>

    using namespace std;
    using namespace tesseract;

    // Prints the progress reported by Tesseract until recognition finishes.
    void monitorProgress();
    // Runs OCR on the image and writes the hOCR output to a file.
    void tesseractProcessing();

    TessBaseAPI *api;
    ETEXT_DESC *monitor = new ETEXT_DESC();
    // Set by the OCR thread when it is done, so the monitor loop can exit.
    atomic<bool> finished(false);

    int main() {
        // Run recognition and progress monitoring in two separate threads.
        thread t1(tesseractProcessing);
        thread t2(monitorProgress);
        std::cout << "The main function execution\n";
        t1.join();
        t2.join();
        return 0;
    }

    void monitorProgress() {
        // Poll instead of spinning forever; otherwise t2.join() never returns.
        while (!finished) {
            cout << "Current Progress : " << monitor[0].progress << endl;
            this_thread::sleep_for(chrono::milliseconds(100));
        }
    }

    void tesseractProcessing() {
        api = new TessBaseAPI();
        Pix *image = pixRead("myimage.jpg");
        api->Init("tessdata", "eng", OEM_DEFAULT);
        api->SetPageSegMode(PSM_AUTO);
        api->SetImage(image);
        api->Recognize(monitor);
        cout << "out from recognition" << endl;
        ofstream myfile("myfile.html");
        if (myfile.is_open()) {
            char *hocr = api->GetHOCRText(0);
            myfile << hocr;
            delete[] hocr;  // the caller owns the returned buffer
        }
        myfile.close();
        finished = true;
    }
