FST (finite-state transducer) libraries: C++ or Java?

I have a problem that involves FSTs. Basically, I am going to build a morphological parser, and for that I will have to work with large transducers. Performance is a big issue here.

I have recently worked in C++ on other projects where performance mattered, but now I am considering Java because of its benefits and because Java keeps getting better.

I have studied some comparisons between Java and C++, but I cannot decide which language I should use for this particular problem, because the answer depends on the library used.

I cannot find much information about Java libraries, so my question is: are there any open-source Java libraries with good performance, comparable to the RWTH FSA Toolkit, which an article I read describes as the fastest C++ library?

Thanks to everyone.

+6
java c++ performance fsm
5 answers

What are the "benefits" of Java for your purposes? What specific problem do you need the platform to solve? What performance constraints do you have to meet? Were the "comparisons" fair? Java is actually extremely difficult to benchmark. So is C++, but there you can at least get some algorithmic complexity guarantees from the STL.

I suggest you take a look at OpenFst and the AT&T finite-state transducer tools. There are others, but I think your concern about Java puts the cart before the horse - focus on what solves your problem well.

Good luck

+4

http://jautomata.sourceforge.net/ and http://www.cs.duke.edu/csed/jflap/ are Java based, but I have no experience using them, so I cannot comment on their performance.

+2

I am one of the developers of the morfologik-stemming library. It is pure Java and its performance is very good, both when building the automaton and when using it. We use it for morphological analysis in LanguageTool.

+2

The problem is the minimum size of objects in Java. In C++, without virtual methods and run-time type identification, an object occupies exactly the size of its contents. And the time your automata spend on memory management greatly affects performance.

I think this should be the main reason for choosing C ++ over Java.
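
To make the object-size point concrete, here is a minimal sketch of one common way to sidestep per-object overhead in Java: keeping all transitions in flat primitive arrays instead of allocating one object per state or arc. The class name PackedFst, its array layout, and the toy "cat"/"cats" machine are assumptions made up for this illustration; they are not taken from any of the libraries mentioned in this thread, and the sketch handles only deterministic machines with single-character outputs.

```java
import java.util.Arrays;

/** Hypothetical illustration: a deterministic transducer stored in flat arrays. */
final class PackedFst {
    private final int[] offsets;       // transitions of state s are offsets[s] .. offsets[s + 1] - 1
    private final char[] inputs;       // input symbol per transition, sorted within each state
    private final char[] outputs;      // output symbol per transition, 0 means "emit nothing"
    private final int[] targets;       // target state per transition
    private final boolean[] accepting; // accepting[s] is true if s is a final state

    PackedFst(int[] offsets, char[] inputs, char[] outputs, int[] targets, boolean[] accepting) {
        this.offsets = offsets;
        this.inputs = inputs;
        this.outputs = outputs;
        this.targets = targets;
        this.accepting = accepting;
    }

    /** Returns the output string for word, or null if the word is not accepted. */
    String transduce(String word) {
        StringBuilder out = new StringBuilder();
        int state = 0; // state 0 is the start state by convention in this sketch
        for (int i = 0; i < word.length(); i++) {
            // Binary search inside this state's slice of the transition table.
            int t = Arrays.binarySearch(inputs, offsets[state], offsets[state + 1], word.charAt(i));
            if (t < 0) {
                return null; // no transition on this symbol
            }
            if (outputs[t] != 0) {
                out.append(outputs[t]);
            }
            state = targets[t];
        }
        return accepting[state] ? out.toString() : null;
    }

    /** Toy machine: accepts "cat" and "cats", rewriting a final 's' to '+'. */
    static PackedFst toyExample() {
        return new PackedFst(
                new int[] {0, 1, 2, 3, 4, 4},
                new char[] {'c', 'a', 't', 's'},
                new char[] {'c', 'a', 't', '+'},
                new int[] {1, 2, 3, 4},
                new boolean[] {false, false, false, true, true});
    }

    public static void main(String[] args) {
        PackedFst fst = toyExample();
        System.out.println(fst.transduce("cat"));
        System.out.println(fst.transduce("cats"));
        System.out.println(fst.transduce("dog"));
    }
}
```

With this layout the whole machine consists of five arrays regardless of the number of states, so the garbage collector has very little to track. Whether that closes the gap to C++ for a given workload is something only measurement can tell.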

0

OpenFst is a C++ finite-state framework that is truly comprehensive. Some people at CMU have ported it to Java for use in their natural language processing work.

A series of blog posts describing it.
The code is in svn.

Update: I put it in Java here

0
