Windows NLTK MEGAM Max Ent Algorithms

I have been experimenting with NLTK in Python, but I cannot use the MEGAM MaxEnt algorithm because there is no 64-bit Windows executable for any MEGAM version 0.3 or later (NLTK needs the -nobias option, which was introduced in version 0.3).

http://www.cs.utah.edu/~hal/megam/

The author recommends compiling your own executable, although getting OCaml to work on Win64 is another nightmare.

Does anyone have a compiled Windows MEGAM executable, version 0.4 or higher? I would be eternally grateful!

2 answers

After a bit of work I was able to get MegaM working with Python NLTK on Windows 7; the solution is quite simple (in retrospect). My method is described in detail below, with links included. I hope you find it useful.

High level:

  • Install OCaml Compiler (Special Version: OCamlPro)
  • Download the source code for MegaM
  • Download and install the GNU32Make utility
  • Edit the MegaM Makefile in 2 places
  • Run Gnu32Make to create the megam.exe file
  • Programmatically indicate the location of the megam.exe file to Python NLTK
  • Run the nltk.MaxentClassifier.train command

References:

Gory Details

Parts of this process can easily go south given the lack of documentation, so I would like to draw attention to a few that I found.

Windows OCamlPro

It is very important to get the OCamlPro version for Windows because it is stand-alone, whereas other distributions depend on additional software. The version I listed installs into a single directory of your choice. Be sure to add the path to its bin directory to the Windows system path.

MEGAM

Windows is a problem for this library because of some SNAFU on the developer's side, so you have to download the source and compile it yourself. It is not as difficult as it seems at first glance. The process is the usual one: extract the .tar.gz file into a directory, unpacking it twice to get to the source directory. The two most important tasks are: (a) edit the Makefile correctly and (b) add the directory containing the resulting megam.exe file to the Windows system path.

GNU32Win

This is a straightforward process; just make sure to add the directory containing the Gnu32Make executable to the Windows system path after installation.
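Since several steps depend on the Windows system path being set correctly, a quick check like the following can save some head-scratching before you build. This is only a minimal sketch: the helper name missing_tools is made up for illustration, and it assumes the executables are called ocamlc and make on your system.

```python
import shutil


def missing_tools(names):
    """Return the tool names that cannot be found on the current PATH."""
    # shutil.which also honours PATHEXT on Windows, so "make" finds make.exe.
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    # Assumed executable names; adjust them to match your installation.
    for name in missing_tools(["ocamlc", "make"]):
        print("Not found on PATH:", name)
```

Run it from a freshly opened CMD window, because a shell that was already open does not pick up changes to the system path.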

MEGAM MakeFile

In the directory where you unpacked the MegaM files there is a Makefile. Two of its lines must be edited correctly for the build to succeed.

First: (replace the first line below with the second; the only change is -lstr becoming -lcamlstr)

  • WITHSTR = str.cma -cclib -lstr
  • WITHSTR = str.cma -cclib -lcamlstr

Second: (replace the path in the first line below with the equivalent path on your system, as in the second line)

NOTE: This path should point to the "\lib\caml" directory of your OCamlPro installation.

  • WITHCLIBS = -I /usr/lib/ocaml/3.09.2/caml
  • WITHCLIBS = -I E:\OCamlPro\OCPWin64\lib\caml
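If you would rather not edit the Makefile by hand, the two changes can be applied with a few lines of Python. This is a rough sketch, not part of MegaM: the function name patch_makefile is made up, and it assumes each variable is defined on a single line that starts with WITHSTR or WITHCLIBS.

```python
import re


def patch_makefile(text, caml_dir):
    """Return Makefile text with the WITHSTR and WITHCLIBS lines rewritten."""
    withstr = "WITHSTR = str.cma -cclib -lcamlstr"
    withclibs = "WITHCLIBS = -I " + caml_dir
    # Lambdas are used as replacements so that backslashes in Windows
    # paths are inserted literally instead of being read as escapes.
    text = re.sub(r"^WITHSTR\s*=.*$", lambda m: withstr, text, flags=re.MULTILINE)
    text = re.sub(r"^WITHCLIBS\s*=.*$", lambda m: withclibs, text, flags=re.MULTILINE)
    return text


# Example use (the path is the one from this answer; yours will differ):
#   from pathlib import Path
#   makefile = Path("Makefile")
#   makefile.write_text(patch_makefile(makefile.read_text(),
#                                      r"E:\OCamlPro\OCPWin64\lib\caml"))
```

Keep a backup copy of the original Makefile before overwriting it.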

Run make in the megam directory

At this point, open a Windows CMD shell, cd into the directory where you edited the Makefile, and run make to compile and create the megam.exe executable.

You should see a result similar to:

make
ocamldep *.mli *.ml > .depend
ocamlc -g -custom -o megam str.cma -cclib -lcamlstr bigarray.cma -cclib -lbigarray unix.cma -cclib -lunix -I E:\OCamlPro\OCPWin64\lib\caml fastdot_c.c fastdot.cmo intHashtbl.cmo arry.cmo util.cmo data.cmo bitvec.cmo cg.cmo wsemlm.cmo bfgs.cmo pa.cmo perceptron.cmo radapt.cmo kernelmap.cmo abffs.cmo main.cmo
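The build step can also be driven from Python, which is handy if you want to script the whole setup. The sketch below is an illustration only: build_megam is a made-up helper, it assumes make is on the system path, and it checks for both megam.exe and megam because the exact output name may depend on your toolchain.

```python
import subprocess
from pathlib import Path


def build_megam(src_dir, make_cmd=("make",)):
    """Run make inside src_dir and return the path of the built executable."""
    src = Path(src_dir)
    # check=True raises CalledProcessError if the build command fails.
    subprocess.run(list(make_cmd), cwd=src, check=True)
    for name in ("megam.exe", "megam"):
        candidate = src / name
        if candidate.is_file():
            return candidate
    raise FileNotFoundError("make finished but no megam executable was found in %s" % src)


# Example use (hypothetical directory):
#   exe = build_megam(r"E:\megam")
#   print("Built:", exe)
```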

Programmatically indicate the location of the megam.exe file to Python's NLTK

The last hurdle was telling Python NLTK exactly where my megam.exe file is. In the calling code, I placed a statement pointing to it immediately before the line where I call MaxentClassifier, and it worked just fine; see below.

Note: it took a long time to run on my development workstation, so be patient.

# Raw string so the backslashes in the Windows path are kept literally
nltk.config_megam(r'E:\megam\megam.exe')
self.classifier = nltk.MaxentClassifier.train(train_set, algorithm='megam', trace=0)
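For a version that fails early with a clear message when the path is wrong, something along these lines may help. It is a sketch under assumptions: train_with_megam is a made-up wrapper, and the path in the usage comment is just the one from this answer.

```python
from pathlib import Path


def train_with_megam(train_set, megam_path):
    """Check that the megam executable exists, then train a MaxEnt classifier."""
    exe = Path(megam_path)
    if not exe.is_file():
        raise FileNotFoundError("megam executable not found: %s" % exe)
    # Imported here so the path check above works even without NLTK installed.
    import nltk
    nltk.config_megam(str(exe))
    return nltk.MaxentClassifier.train(train_set, algorithm="megam", trace=0)


# Example use; train_set is a list of (feature dict, label) pairs:
#   train_set = [({"last_letter": "a"}, "female"), ({"last_letter": "k"}, "male")]
#   classifier = train_with_megam(train_set, r"E:\megam\megam.exe")
```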

It can also be compiled using cygwin:

  • download cygwin installer: https://cygwin.com/install.html
  • during installation, check gnu make and ocaml (both the compiler and runtime)
  • change makefile
    • WITHSTR = str.cma -cclib -lstr → WITHSTR = str.cma -cclib -lcamlstr
    • WITHCLIBS = path to your cygwin ocaml dir
  • compile with make. The debug and opt builds may behave differently: with Cygwin I could build opt but not debug, while with native Windows tools I could build debug but not opt.
  • add cygwin bin to PATH
  • point NLTK at the executable with nltk.config_megam(<your path to megam>)

Source: https://habr.com/ru/post/926331/