I am wondering whether there are any existing libraries, written in or accessible from Objective-C, that would let me scrape pages formatted like this. In particular, I want all the dates and the text next to each date. If not, what would be the best way to do this? Regular expressions? I heard that NSString may already have built-in methods for this. Is that true?
I looked around to see if there is an alternative to scraping, such as an XML file or an API. I found an API, but the only clients I can find are written in other languages, and they seem to only be able to put content on pages rather than retrieve it.
EDIT: I found additional information about the API at these links:
And I was able to come up with this query, which returns some HTML-encoded text (well, it is in XML format, but it includes escaped HTML such as &lt;a href=, etc.). I will keep looking through the docs to see if I can do a little better. If not, are there any recommendations for parsing this?
EDIT 2: Thanks to this doc page, the easiest and cleanest way I have found to retrieve the data is this constructed link, which returns the raw data (in wiki markup) of the corresponding section. I think I would still need to parse it, but that should be much simpler than parsing the whole article.
Is there a good way to parse this wiki markup in Objective-C?
==Events==
* [[710]] – [[Saracen]] invasion of [[Sardinia]].
*[[1275]] – Traditional founding of the city of [[Amsterdam]].
*[[1682]] – [[Philadelphia]], [[Pennsylvania]] is founded.
Ideally, I would like to end up with each year and the text next to it in an NSDictionary or something similar. Thanks!
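For lines shaped like the sample under ==Events==, one possible approach is NSRegularExpression from Foundation. The following is only a minimal sketch under some assumptions: the function name EventsByYear is hypothetical, every event sits on its own line in the form "*[[YEAR]] – text", and each year appears at most once (a repeated year would overwrite the earlier entry).

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not part of any library): maps each year string to the
// event text that follows it. Assumes one event per line, one event per year.
static NSDictionary<NSString *, NSString *> *EventsByYear(NSString *wikiText) {
    NSError *error = nil;

    // Start of line, "*", optional whitespace, [[year]], an en dash or hyphen,
    // then the rest of the line as the event text.
    NSRegularExpression *lineRegex =
        [NSRegularExpression regularExpressionWithPattern:@"^\\*\\s*\\[\\[(\\d+)\\]\\]\\s*[\u2013-]\\s*(.+)$"
                                                  options:NSRegularExpressionAnchorsMatchLines
                                                    error:&error];

    // Wiki links: [[target|label]] becomes label, [[target]] becomes target.
    NSRegularExpression *linkRegex =
        [NSRegularExpression regularExpressionWithPattern:@"\\[\\[(?:[^\\]|]*\\|)?([^\\]]*)\\]\\]"
                                                  options:0
                                                    error:&error];
    if (lineRegex == nil || linkRegex == nil) {
        return @{};
    }

    NSMutableDictionary<NSString *, NSString *> *events = [NSMutableDictionary dictionary];
    NSRange whole = NSMakeRange(0, wikiText.length);

    for (NSTextCheckingResult *match in [lineRegex matchesInString:wikiText options:0 range:whole]) {
        NSString *year = [wikiText substringWithRange:[match rangeAtIndex:1]];
        NSString *raw = [wikiText substringWithRange:[match rangeAtIndex:2]];

        // Strip the [[...]] link brackets from the event text.
        NSString *text = [linkRegex stringByReplacingMatchesInString:raw
                                                             options:0
                                                               range:NSMakeRange(0, raw.length)
                                                        withTemplate:@"$1"];
        events[year] = [text stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceCharacterSet]];
    }
    return events;
}
```

Calling EventsByYear on the sample text should give a dictionary keyed by @"710", @"1275" and @"1682". Since NSDictionary is unordered and holds one value per key, an NSArray of year/text pairs may be a better fit if order matters or if a year can have several events.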