How to use the Link Grammar parser as a grammar checker

AbiWord uses the Link Grammar parser as a simple grammar checking tool. I would like to duplicate this functionality using Python.

Poorly documented Python bindings exist, but I don't know how to use them to mimic AbiWord's grammar checking.

(I am not interested in the actual results of the parsing. I only need to know whether the sentence parses OK with the Link Grammar parser and, if not, which words cannot be linked.)

What would be the best way to achieve this?

1 answer

I can't help you mimic AbiWord's grammar checking abilities using the Python bindings, but I can at least help you build them and test that they work.

Building with MS Visual Studio (32-bit architecture)

I would say that the “best way to achieve this” is to build the Link Grammar library and Python bindings on a Linux machine, following the extensive instructions in their readme file. However, judging by your comment above, Linux may not be an option, and it looks like you want to use Visual Studio rather than, for example, Cygwin.

Dependencies

Regex

As indicated in the readme, the Link Grammar library depends on some form of POSIX-compatible regex library. On Linux this is built in; on Windows, however, you can (or rather must) choose which library implementation to use. Fortunately, version 2.7 of the port provided by GnuWin works well with the Visual Studio solution/project files provided by Link Grammar 5.3.11 (found in %LINK_GRAMMAR%\msvc14).

However, you must make sure that the Visual Studio build macro GNUREGEX_DIR points to the directory into which you unzipped the regex library (for example, D:\Program Files (x86)\GnuWin32). Note that these build macros are not the same as Windows environment variables. Although I had set an environment variable named GNUREGEX_DIR in Windows 10, Visual Studio did not use it until I changed the macro definition in %LINK_GRAMMAR%\msvc14\Local.props from the line:

 <GNUREGEX_DIR>$(HOMEDRIVE)$(HOMEPATH)\Libraries\gnuregex</GNUREGEX_DIR> 

to

 <GNUREGEX_DIR>$(GNUREGEX_DIR)</GNUREGEX_DIR> 

SWIG

To build the Python bindings, you need SWIG on your system. In order for the build defined by the Visual Studio project Python2.vcxproj to find the SWIG executable, you need to add the appropriate directory to the Windows path, for example D:\Program Files (x86)\swigwin-3.0.10.

As with the regex library, you need to set up the VS project so that it can find your Python directory: for example, change <PYTHON2>C:\Python27</PYTHON2> in Local.props to <PYTHON2>$(PYTHON2)</PYTHON2> if you have the corresponding environment variable.

Building

Once Visual Studio can find all of the above libraries, the build process is fairly painless. Just build the Python2 project; if you are using the VS solution file (LinkGrammar.sln), it should automatically build the LinkGrammar and LinkGrammarExe projects on which it depends.
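Before opening Visual Studio, it can save time to confirm that the prerequisites described so far are actually visible. The following is only a minimal sketch, not part of Link Grammar: the function name `missing_prerequisites` is made up for illustration, and it assumes you chose to expose the locations through environment variables named GNUREGEX_DIR and PYTHON2 as described above. It uses `shutil.which`, so it needs to be run with Python 3, independently of the Python 2 installation the bindings are built against.

```python
import os
import shutil


def missing_prerequisites(env=None, which=shutil.which):
    """Return a list of human-readable problems; an empty list means all good.

    `env` defaults to the real environment; `which` is injectable so the
    function can be exercised without touching the real PATH.
    """
    env = os.environ if env is None else env
    problems = []

    # Assumption: these are the variable names referenced from Local.props.
    for name in ("GNUREGEX_DIR", "PYTHON2"):
        value = env.get(name)
        if not value:
            problems.append(name + " is not set")
        elif not os.path.isdir(value):
            problems.append(name + " does not point to a directory: " + value)

    # The Python2 project invokes SWIG, so its executable must be on PATH.
    if which("swig", path=env.get("PATH")) is None:
        problems.append("swig executable not found on PATH")

    return problems


if __name__ == "__main__":
    for problem in missing_prerequisites():
        print(problem)
```

An empty result only means that the variables and the SWIG executable were found; it does not prove that Visual Studio will pick them up, since that still depends on the macro definitions in Local.props.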

Shared Library Resolution

After building the executable, you still need to make sure that the regex shared library (DLL) can be found: the directory containing the required library (in this case, regex2.dll) must be on your path. Perhaps the easiest way is to add the directory to your global path, for example %GNUREGEX_DIR%\bin when using the GnuWin library mentioned above, where GNUREGEX_DIR is an environment variable pointing to its installation directory.

Working with Python

Now that you have tested the Windows executable and built the Python bindings, you can import them into a Python script. To make sure they are imported correctly and that SWIG has put the appropriate DLLs in the right place, the Link Grammar readme tells you to run the make-check.py script to load and run your script using Link Grammar:

 make-check [PYTHON_FLAG] PYTHON_OUTDIR [script.py] [ARGUMENTS] 

where PYTHON_OUTDIR is the directory to which your Python bindings were written, for example Win32\Debug\Python2. Unfortunately, although this file is mentioned in the readme for version 5.3.11, it is not actually present in the "stable" 5.3.11 distribution, even though a version of it exists in the main GitHub repository. You can, however, simply get the file from the Git repository and use it in the msvc14 directory of your 5.3.11 distribution. As stated above, this script requires regex2.dll to be on the Windows path: if you have not added it to the global path, you will have to add it to the path available to the Python executable when you run the script.

C API and Python API

I have not used the Link Grammar parser API myself and therefore cannot help you with it directly, but you can still figure out how to use it by looking at the C code of the LinkGrammarExe project. You can start by looking at the main function in link-parser\link-parser.c:
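If you would rather not change the global path, the directory can be added for the current Python process only, before the bindings are imported. The following is a minimal sketch of that idea; the helper name `prepend_to_path` is hypothetical, and the use of GNUREGEX_DIR in the usage note assumes you defined that environment variable as described earlier.

```python
import os


def prepend_to_path(directory, env=None):
    """Put `directory` at the front of PATH unless it is already listed.

    Only the given mapping (by default the current process environment) is
    changed; the system-wide PATH is left untouched.
    """
    env = os.environ if env is None else env
    parts = [p for p in env.get("PATH", "").split(os.pathsep) if p]
    if directory not in parts:
        parts.insert(0, directory)
    env["PATH"] = os.pathsep.join(parts)
    return env["PATH"]
```

A script would call something like `prepend_to_path(os.path.join(os.environ["GNUREGEX_DIR"], "bin"))` before its first import of the Link Grammar bindings. This matches the Python 2.7 setup discussed here; on Windows with Python 3.8 or later, dependent DLLs of extension modules are no longer looked up via PATH, and `os.add_dll_directory` is the documented mechanism instead.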

 sent = sentence_create(input_string, dict);
 ...
 num_linkages = sentence_parse(sent, opts);

The simple CLI program built by the VS project just checks num_linkages and, if its value is 0, prints No complete linkages found, which the user can interpret as meaning that the sentence is ungrammatical. Of course, this behavior can be changed to lower the standards, find the word(s) that do not fit, and so on, so you may want to explore the functionality using the C API first. Later, if you really want to use the Python bindings, the Python functions are called in the same way as their C counterparts; see the clinkgrammar.py file:

 def sentence_parse(sent, opts):
     return _clinkgrammar.sentence_parse(sent, opts)
 sentence_parse = _clinkgrammar.sentence_parse
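To make the "does it parse, and if not, which words do not fit" idea more concrete, here is an untested sketch in Python. It rests on several assumptions that you should verify against the linkgrammar.py shipped with your build: that the bindings expose a higher-level `linkgrammar` module with `Dictionary`, `ParseOptions` and `Sentence` classes, that `ParseOptions` accepts `min_null_count` and `max_null_count`, and that words the parser could not link are reported in square brackets (as link-parser prints them). The function names `unlinked_words` and `check_sentence` are made up for illustration.

```python
import re

# Assumption: a word left without links is reported as "[word]".
_NULL_WORD = re.compile(r"^\[(.+)\]$")


def unlinked_words(tokens):
    """Return the tokens wrapped in square brackets, without the brackets."""
    found = []
    for token in tokens:
        match = _NULL_WORD.match(token)
        if match:
            found.append(match.group(1))
    return found


def check_sentence(text, lang="en"):
    """Return (ok, bad_words) for `text`.

    ok is True when the best linkage uses every word; otherwise bad_words
    lists the words that were left unlinked (possibly empty if the parser
    produced no linkage at all).
    """
    # Imported lazily so the helper above works without the bindings.
    from linkgrammar import Dictionary, ParseOptions, Sentence

    # Allow null links so a failed parse still reports which words were skipped.
    opts = ParseOptions(min_null_count=0, max_null_count=999)
    sentence = Sentence(text, Dictionary(lang), opts)

    best = next(iter(sentence.parse()), None)
    if best is None:
        return False, []

    bad = unlinked_words(list(best.words()))
    return (not bad), bad
```

With `max_null_count` left at 0, the parser only reports complete linkages, which corresponds to the num_linkages check in the CLI program; allowing null links trades that strictness for information about the offending words.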