I am trying to set some Tesseract parameters using the python-tesseract wrapper, but I cannot do this for init-only parameters.
I read the Tesseract documentation, and it looks like I should use Init() to set them. The SetVariable documentation says as much:
Only works for non-init variables (init variables must be passed to Init()).
The Init() function takes these parameters:
const char * datapath, const char * language, OcrEngineMode oem, char ** configs, int configs_size, const GenericVector< STRING > * vars_vec, const GenericVector< STRING > * vars_values, bool set_only_non_debug_params
and my code is as follows:
import tesseract

configVec = ['user_words_suffix', 'load_system_dawg', 'load_freq_dawg']
configValues = ['brands', '0', '0']

api = tesseract.TessBaseAPI()
api.Init(".", "eng", tesseract.OEM_TESSERACT_ONLY, None, 0, configVec, configValues, False)
api.SetPageSegMode(tesseract.PSM_AUTO_OSD)
api.SetVariable("tessedit_char_whitelist", "€$0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz,.\"-/+%")
The problem is that I get the following error:
NotImplementedError: Wrong number or type of arguments for overloaded function 'TessBaseAPI_Init'.
Possible C/C++ prototypes are:
tesseract::TessBaseAPI::Init(char const *,char const *,tesseract::OcrEngineMode,char **,int,GenericVector< STRING > const *,GenericVector< STRING > const *,bool)
The problem seems to be the two GenericVector arguments. If I use this line instead:
api.Init(".","eng",tesseract.OEM_TESSERACT_ONLY, None, 0, None, None, False)
it works, so the Python lists passed in place of the GenericVectors are what the wrapper rejects. How can I pass the correct parameters to Init()?
Is there any other way to set init-only parameters from code? Can I load a configuration file containing these parameters from code?
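To make the configuration-file idea concrete, here is a minimal sketch of what such a file could look like, assuming Tesseract's usual config format of one "name value" pair per line. The helper name write_tesseract_config and the file name init_params_example are hypothetical, and the sketch only writes the file; whether this wrapper's Init() accepts a list of config file names in its configs argument is not confirmed here.

```python
import os
import tempfile


def write_tesseract_config(path, params):
    """Write parameters as 'name value' lines, one per parameter.

    Hypothetical helper for illustration; it is not part of python-tesseract.
    """
    with open(path, "w") as f:
        for name, value in params:
            f.write("%s %s\n" % (name, value))
    return path


# The same init-only parameters as in the Init() call above.
init_params = [
    ("user_words_suffix", "brands"),
    ("load_system_dawg", "0"),
    ("load_freq_dawg", "0"),
]

# Assumed location and file name; any writable path would do.
config_path = write_tesseract_config(
    os.path.join(tempfile.gettempdir(), "init_params_example"),
    init_params,
)

# Show the generated file contents.
with open(config_path) as f:
    print(f.read())
```

With the command-line tool, a file like this can be named after the output base (tesseract image.png out init_params_example); where Tesseract searches for the file depends on the installation.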
Thanks for your time, any help is greatly appreciated.