Parsing HTML from a web page

I need to extract some information from a web page and reformat it for the user.

Since the web page is fairly regular, I currently use HttpClient to fetch the HTML as a string, and then extract the substrings at the known positions that hold the data.

Still, I wonder if there is a better way, perhaps a way to actually parse the HTML. How do you do that?

Greetings

+6
java android html
4 answers

Ideally, you should use a real HTML parser. I have used jsoup on Android in the past:

http://jsoup.org/
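As a rough illustration of what this looks like compared with substring extraction, here is a minimal sketch. It assumes the jsoup library is on the classpath; the class name `JsoupExtractExample`, the `div.item` selector, and the sample markup are made up for the example and would need to be adapted to the real page.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

import java.util.ArrayList;
import java.util.List;

public class JsoupExtractExample {

    // Returns "link text -> href" for every link inside a <div class="item">.
    // The "item" class is a hypothetical stand-in for whatever the real page uses.
    public static List<String> extractLinks(String html) {
        // Parse the HTML string (e.g. the one already fetched with HttpClient).
        Document doc = Jsoup.parse(html);

        List<String> result = new ArrayList<>();
        // CSS selector: anchors with an href attribute inside div.item
        for (Element link : doc.select("div.item a[href]")) {
            result.add(link.text() + " -> " + link.attr("href"));
        }
        return result;
    }

    public static void main(String[] args) {
        // Sample markup invented for this sketch.
        String html = "<html><body>"
                + "<div class=\"item\"><a href=\"/a\">First</a></div>"
                + "<div class=\"item\"><a href=\"/b\">Second</a></div>"
                + "</body></html>";

        for (String line : extractLinks(html)) {
            System.out.println(line);
        }
    }
}
```

Selecting by element and class rather than by character position means the extraction keeps working when whitespace or surrounding markup changes. jsoup can also fetch the page itself with `Jsoup.connect(url).get()`, if you would rather not go through HttpClient.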

+7

I personally like to use the Jericho parser: http://jericho.htmlparser.net/docs/index.html

It is easy to use, has a lot of examples on the project page, and works fine with poorly formed HTML (unclosed tags, etc.).

+3

We have used HttpUnit for this in the past.

+1

jsoup is better, but Cobra has some additional features, such as CSS and JavaScript support.

+1