You cannot get "text only" from the Wikipedia API. You can load the page as HTML (if you do this through index.php rather than api.php, use action=render to avoid loading the whole skin) or as wikitext (either via the API or by passing action=raw to index.php); either way, you will have to parse it yourself to strip out the bits you do not want to keep.
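As a rough illustration of the two index.php options mentioned above, the sketch below only builds the URLs; it does not fetch anything. The English Wikipedia endpoint and the helper name `index_php_url` are assumptions made for this example, not something from the answer itself.

```python
from urllib.parse import urlencode

# Assumption: English Wikipedia's index.php entry point; other wikis use a different host/path.
INDEX_ENDPOINT = "https://en.wikipedia.org/w/index.php"


def index_php_url(title: str, action: str) -> str:
    """Build an index.php URL for a page (hypothetical helper).

    action="render" asks for the page HTML without the skin,
    action="raw" asks for the page's wikitext.
    """
    if action not in ("render", "raw"):
        raise ValueError("expected 'render' or 'raw'")
    # MediaWiki page titles use underscores in place of spaces.
    params = {"title": title.replace(" ", "_"), "action": action}
    return INDEX_ENDPOINT + "?" + urlencode(params)


print(index_php_url("Albert Einstein", "render"))
print(index_php_url("Albert Einstein", "raw"))
```

The resulting URLs can be passed to whatever HTTP client you already use.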
In the HTML output, MediaWiki usually does a good job of adding classes to the various interface elements you might want to filter out; user-created templates and the like are less consistent (for example, one hack for sorting tables simply puts some text in a span with display:none rather than giving it a class).
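One way to do that filtering, sketched with only the Python standard library: skip any element that carries one of a chosen set of classes or an inline display:none style, and keep the remaining text. The class list here (`mw-editsection`, `noprint`) is an assumption picked for illustration, and the depth counting assumes reasonably well-formed HTML; adjust both to the pages you actually process.

```python
from html.parser import HTMLParser

# Assumption: classes worth dropping; extend this set for your own pages.
SKIP_CLASSES = {"mw-editsection", "noprint"}
# Elements whose content is never display text.
SKIP_TAGS = {"script", "style"}
# Void elements have no closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class VisibleTextExtractor(HTMLParser):
    """Collect text, skipping hidden or unwanted elements and their children."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.skip_depth = 0  # > 0 while inside a skipped element

    @staticmethod
    def _is_hidden(attrs):
        attr_map = dict(attrs)
        classes = set((attr_map.get("class") or "").split())
        style = (attr_map.get("style") or "").replace(" ", "").lower()
        return bool(classes & SKIP_CLASSES) or "display:none" in style

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.skip_depth or tag in SKIP_TAGS or self._is_hidden(attrs):
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if not self.skip_depth:
            self.parts.append(data)


def visible_text(html: str) -> str:
    """Return the text of `html` with skipped elements removed and whitespace collapsed."""
    parser = VisibleTextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join("".join(parser.parts).split())


sample = ('<p>Population <span style="display:none">0000123</span>123 '
          '<span class="mw-editsection">[edit]</span></p>')
print(visible_text(sample))
```

A real HTML library would cope better with malformed markup; this version is only meant to show the class and display:none checks side by side.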
To get wikitext via the API, use prop=revisions. To get displayable HTML, use action=parse.
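A minimal sketch of what those two API requests might look like as URLs. The endpoint, the helper names, and the extra parameters (`rvprop=content`, `rvslots=main`, `prop=text`, `format=json`) are choices made for this example rather than part of the answer; check them against the API help for the wiki you are querying.

```python
from urllib.parse import urlencode

# Assumption: English Wikipedia's api.php entry point.
API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def wikitext_query_url(title: str) -> str:
    """URL asking for the latest revision's wikitext via prop=revisions (hypothetical helper)."""
    params = {
        "action": "query",
        "prop": "revisions",
        "rvprop": "content",   # include the revision text itself
        "rvslots": "main",     # main content slot of the revision
        "titles": title,
        "format": "json",
    }
    return API_ENDPOINT + "?" + urlencode(params)


def parsed_html_url(title: str) -> str:
    """URL asking for the rendered HTML of a page via action=parse (hypothetical helper)."""
    params = {
        "action": "parse",
        "page": title,
        "prop": "text",        # only the parsed HTML, not links, categories, etc.
        "format": "json",
    }
    return API_ENDPOINT + "?" + urlencode(params)


print(wikitext_query_url("Main Page"))
print(parsed_html_url("Main Page"))
```

Both helpers only build the request URL; fetching it and picking the content out of the JSON response is left to the caller.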
Anomie