Example of parsing (X)HTML with libxml2 on iOS

I recently started experimenting with the libxml2 library in an iOS project. I have read some useful resources, for example:

http://laurentparenteau.com/blog/2009/12/parsing-xhtml-in-ca-libxml2-tutorial/

and a very nice post here:

http://bill.dudney.net/roller/objc/entry/libxml2_push_parsing

I managed to fetch the remote HTML (with ASIHTTPRequest) and receive the data (NSData) in the 'didReceiveData' event. The data is passed to a wrapper class that contains the parser, created with htmlCreatePushParserCtxt (SAX style). The startDocument and endDocument callbacks arrive as expected. In the "startElement" and "characters" callbacks, I print the "localname" parameter (const xmlChar *). In the console I see that it finds "html", then "body", then the "p" tag, but after that I get a lot of unrecognizable characters (sometimes they look like Chinese).
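A possible explanation for the garbage, offered as a guess rather than a diagnosis: in libxml2's SAX interface the `characters` callback receives a pointer plus a byte count (`const xmlChar *ch, int len`), and that buffer is not NUL-terminated, so printing it with `%s` reads past the end of the text. The sketch below mirrors that callback shape in plain C. `copy_chunk` and `on_characters` are hypothetical names, and `xmlChar` is redeclared locally only so the snippet compiles without the libxml2 headers (in a real project it comes from the libxml2 headers).

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Local stand-in for libxml2's xmlChar (an unsigned char holding UTF-8 bytes).
   Assumption: in a real project this typedef comes from the libxml2 headers. */
typedef unsigned char xmlChar;

/* Hypothetical helper: copy exactly `len` bytes and append a terminator.
   Returns a heap string the caller must free, or NULL on bad input or
   allocation failure. */
char *copy_chunk(const xmlChar *ch, int len)
{
    if (ch == NULL || len < 0) {
        return NULL;
    }
    char *out = malloc((size_t)len + 1);
    if (out == NULL) {
        return NULL;
    }
    memcpy(out, ch, (size_t)len);
    out[len] = '\0';
    return out;
}

/* Same parameter list as a SAX `characters` callback: the text is the
   first `len` bytes of `ch`, and nothing guarantees a '\0' after them. */
void on_characters(void *ctx, const xmlChar *ch, int len)
{
    (void)ctx; /* unused in this sketch */
    char *text = copy_chunk(ch, len);
    if (text != NULL) {
        printf("characters: %s\n", text);
        free(text);
    }
}
```

In an Objective-C project the same idea can be written as `[[NSString alloc] initWithBytes:ch length:len encoding:NSUTF8StringEncoding]`, which also honors the length instead of scanning for a terminator. Element names passed to "startElement" are NUL-terminated, which would be consistent with "html", "body" and "p" printing correctly while the text content does not.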

Anyway, before going into the details of the code, I want to ask whether anyone has a working example of parsing (X)HTML with libxml2 in an Objective-C project. I tried googling for more than the two links mentioned above, but had no luck.

+4
2 answers

I suggest AQXMLParser by Alan Quatermain: http://blog.alanquatermain.me/2013/01/09/using-aqxmlparser-and-friends/

It is a thin wrapper around libxml2 and is much more efficient than NSXMLParser.

Set the HTMLMode property to YES so that it uses libxml2 in HTML mode. (I have used it many times, and it works pretty well even with invalid HTML.)

0

Why do you want to use libxml2 instead of Apple's built-in NSXMLParser class? If you are creating an iOS application, it makes sense to use a Foundation class for this rather than a C library. You can find the documentation for NSXMLParser on Apple's website.

If you do not want to use NSXMLParser directly, you can try parsing the XML with NSXMLDocument, which has an easy-to-use interface. Use the - (id)initWithData:(NSData *)data options:(NSUInteger)mask error:(NSError **)error method to parse XML data. You can even pass the NSXMLDocumentTidyHTML option to that init method to read HTML data as XHTML. Note, however, that NSXMLDocument is part of the macOS SDK and is not available on iOS.

-3