I just got access to the Stack Overflow data dump, and I'm disappointed that the message body field is in HTML, not Markdown. I suspect Markdown is stored in the original database, because that is what I see if I try to edit an answer.
I want to recover Markdown from a large set of answers. I will process hundreds of records in batch mode using either command-line tools or a Lua or C library, so an interactive tool like the wmd Markdown editor is not suitable. What tools are available to help me recover Markdown from a Stack Overflow data dump?
(A related question, not a duplicate: Convert HTML back to Markdown in wmd.)
Markdownify converts HTML to Markdown.
See also: MetaSO / Can Markdown be recovered from a SO data dump?
Take a look at pandoc: http://johnmacfarlane.net/pandoc/
There is an html2markdown tool included with pandoc that works very well, and it runs from the command line, which makes batch conversion pretty painless.
Here is the man page: http://johnmacfarlane.net/pandoc/html2markdown.1.html
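For batch use, something like the following might work as a starting point. It is only a sketch, and it makes a few assumptions that are not from the answer above: that the `pandoc` executable is on your PATH, that it is called directly with `--from=html --to=markdown` rather than through the `html2markdown` wrapper, and that you have already pulled the HTML bodies out of the dump as strings. The names `html_to_markdown` and `convert_bodies` are made up for illustration.

```python
import subprocess

# Assumption: the `pandoc` executable is installed and on PATH.
PANDOC_CMD = ["pandoc", "--from=html", "--to=markdown"]


def html_to_markdown(html, run=subprocess.run):
    """Convert one HTML fragment to Markdown by piping it through pandoc.

    `run` is injectable so the function can be exercised without pandoc.
    """
    result = run(
        PANDOC_CMD,
        input=html,            # HTML goes to pandoc's stdin
        capture_output=True,   # Markdown comes back on stdout
        encoding="utf-8",      # pandoc reads and writes UTF-8
        check=True,            # raise CalledProcessError if pandoc fails
    )
    return result.stdout


def convert_bodies(bodies, convert=html_to_markdown):
    """Lazily convert an iterable of HTML bodies to Markdown strings."""
    for body in bodies:
        yield convert(body)
```

You would then loop over `convert_bodies(list_of_html_bodies)` and write each result wherever you need it. This starts one pandoc process per record, which should be acceptable for hundreds of records but may be slow for much larger sets.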