Extract paragraphs from Wikipedia API using PHP cURL

I am trying to use the Wikipedia (MediaWiki) API: http://en.wikipedia.org/w/api.php

I'm stuck on #3. The JSON data I get back includes "\n\n" between paragraphs, but for some reason the PHP explode() function does not split on it.

Essentially, I just want to grab the "meat" of each Wikipedia page (no headings or formatting, just the content) and split it into an array of paragraphs.

Any ideas? Thanks!

1 answer

The \n\n in the raw JSON is literally those four characters (backslash, n, backslash, n), not line breaks. Make sure you use single quotes around the delimiter string in explode():

$parts = explode('\n\n', $text);

If you decide to use double quotes, you will have to escape the backslashes, for example:

$parts = explode("\\n\\n", $text);
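
To see why the quoting matters, here is a minimal sketch. The sample string in $text is made up for illustration and only stands in for the raw, undecoded response text:

```php
<?php
// Hypothetical sample: raw text containing the two-character sequence
// backslash + n (twice), as it appears in undecoded JSON.
$text = 'First paragraph.\n\nSecond paragraph.';

// Single quotes: '\n\n' is four literal characters, so it matches the raw text.
$parts = explode('\n\n', $text);

// Double quotes without escaping: "\n\n" is two real line breaks,
// which do not occur in $text, so nothing is split.
$unsplit = explode("\n\n", $text);

var_dump(count($parts));   // int(2)
var_dump(count($unsplit)); // int(1)
```

The second call returns the whole string as a single element, which looks like explode() "not working".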

On another note: why are you extracting the data in two different formats? Why not use just JSON or just XML?
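
If you stick with JSON only, one option is to decode it before splitting. json_decode() turns the \n escapes into real line breaks, so the delimiter then has to be the double-quoted "\n\n". This is only a sketch: the sample JSON and its "extract" key are assumptions made for illustration, not the actual shape of the API response.

```php
<?php
// Hypothetical JSON payload; the "extract" key is assumed for this example.
// Inside single quotes, \n stays as backslash + n, which is a valid JSON escape.
$json = '{"extract":"First paragraph.\n\nSecond paragraph."}';

// Decode into an associative array; the \n escapes become real line breaks.
$data = json_decode($json, true);
if (!is_array($data) || !isset($data['extract'])) {
    die('Could not decode the response');
}

// After decoding, split on two real newlines (note the double quotes).
$paragraphs = explode("\n\n", $data['extract']);

print_r($paragraphs);
```

With this approach there is no need to handle the backslash sequences by hand, since the JSON decoder has already taken care of them.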
