Is there a library similar to pyparsing in Java?

I need to quickly build a parser for a very simplified HTML-like markup language in Java. In Python, I would use the pyparsing library for this. Is there something similar for Java? Please don't suggest existing HTML-parsing libraries: my application is a school assignment that will demonstrate walking an object tree and serializing it to text using the visitor pattern, so real-world concerns don't apply here. Basically, all I need is tags, attributes, and text nodes.

+7
java python parsing pyparsing
5 answers

Another good parser generator is ANTLR; it may be what you are looking for.

+7

It may be overkill for your use, but JavaCC is a great industrial-strength parser generator. I have used it several times; it is reliable and worth learning, especially if you are going to work with languages and compilers. Here is a description of the program from its website:

Java Compiler Compiler [tm] (JavaCC [tm]) is the most popular parser generator for use with Java [tm] applications. A parser generator is a tool that reads a grammar specification and converts it to a Java program that can recognize matches to the grammar. In addition to the parser generator itself, JavaCC provides other standard capabilities related to parser generation, such as tree building (via a tool called JJTree, included with JavaCC), actions, debugging, etc.

+3

A quick search for parser generators in Java turns up JParsec. I have never used it, but it was inspired by a Haskell library, so by definition it should be good :-)

+3

I like JParsec (which I just discovered thanks to Torsten) because it does not generate code... :-) It is perhaps less efficient, but good enough for small tasks.
I also found a similar library, JTopas.
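For a task as small as the one in the question (tags, attributes, and text nodes), the "no generated code" approach can also be done by hand. The following is only a minimal recursive-descent sketch, not JParsec's or JTopas's API. The class and method names (`TinyMarkup`, `Element`, `Text`, `parse`) are hypothetical, and the grammar is an assumption: every tag has a matching closing tag, attribute values are double-quoted, and there are no entities, comments, or self-closing tags.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical hand-written parser for a tiny HTML-like markup (illustration only).
class TinyMarkup {
    interface Node {}

    static final class Text implements Node {
        final String value;
        Text(String value) { this.value = value; }
    }

    static final class Element implements Node {
        final String name;
        final Map<String, String> attrs = new LinkedHashMap<>();
        final List<Node> children = new ArrayList<>();
        Element(String name) { this.name = name; }
    }

    private final String src;
    private int pos = 0;

    private TinyMarkup(String src) { this.src = src; }

    // Parses a single root element and returns its tree.
    static Element parse(String src) {
        return new TinyMarkup(src).element();
    }

    // element := '<' name (name '="' chars '"')* '>' (element | text)* '</' name '>'
    private Element element() {
        expect('<');
        Element e = new Element(name());
        skipSpaces();
        while (peek() != '>') {
            String key = name();
            expect('=');
            expect('"');
            int start = pos;
            while (peek() != '"') pos++;
            e.attrs.put(key, src.substring(start, pos));
            expect('"');
            skipSpaces();
        }
        expect('>');
        while (!src.startsWith("</", pos)) {
            if (peek() == '<') {
                e.children.add(element());
            } else {
                e.children.add(text());
            }
        }
        pos += 2; // skip "</"
        String close = name();
        if (!close.equals(e.name)) throw error("mismatched closing tag " + close);
        expect('>');
        return e;
    }

    // text := every character up to the next '<'
    private Text text() {
        int start = pos;
        while (peek() != '<') pos++;
        return new Text(src.substring(start, pos));
    }

    private String name() {
        int start = pos;
        while (pos < src.length() && Character.isLetterOrDigit(src.charAt(pos))) pos++;
        if (start == pos) throw error("expected a name");
        return src.substring(start, pos);
    }

    private void skipSpaces() {
        while (pos < src.length() && Character.isWhitespace(src.charAt(pos))) pos++;
    }

    private char peek() {
        if (pos >= src.length()) throw error("unexpected end of input");
        return src.charAt(pos);
    }

    private void expect(char c) {
        if (peek() != c) throw error("expected '" + c + "'");
        pos++;
    }

    private IllegalArgumentException error(String msg) {
        return new IllegalArgumentException(msg + " at position " + pos);
    }

    public static void main(String[] args) {
        Element root = parse("<p class=\"x\">Hi <b>there</b></p>");
        System.out.println(root.name + " " + root.attrs + " children=" + root.children.size());
    }
}
```

The resulting `Element`/`Text` tree is the kind of structure a visitor could then walk to serialize the document back to text.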

There is a good list of parsers (generators or not) at Java Source.

+2

There are many options for handling strings in Java. Perhaps the most basic classes, java.util.Scanner and java.util.StringTokenizer, would be useful to you?
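As a rough illustration of how far the basic classes go, here is a small sketch that uses java.util.StringTokenizer to split markup into angle brackets and the text between them. The class name `MarkupTokens` and the `tokenize` helper are made up for this example; turning the tokens into tags, attributes, and text nodes would still be up to you.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical helper: splits markup into "<", ">" and the text between them.
class MarkupTokens {
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        // returnDelims = true makes "<" and ">" come back as tokens of their own
        StringTokenizer st = new StringTokenizer(input, "<>", true);
        while (st.hasMoreTokens()) {
            tokens.add(st.nextToken());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [<, b, >, hi, <, /b, >]
        System.out.println(tokenize("<b>hi</b>"));
    }
}
```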

Another good choice might be the org.apache.commons.lang.text package from Apache Commons Lang: http://commons.apache.org/lang/apidocs/org/apache/commons/lang/text/package-summary.html

+1
