Recursive directory parsing with Pandoc on Mac

I found this question, which answers how to perform batch conversions with Pandoc, but it does not explain how to make the conversion recursive. I'll say up front that I am not a programmer, so I am asking for help here.

The Pandoc documentation is thin on the details of passing files to the executable, and judging by the script, Pandoc itself does not seem able to process more than one file at a time. The script below works fine on Mac OS X, but it only processes files in the local directory and writes the results to the same place.

find . -name \*.md -type f -exec pandoc -o {}.txt {} \;

I used the following code to get close to the result I was hoping for:

find . -name \*.html -type f -exec pandoc -o {}.markdown {} \;

This simple script, using Pandoc installed on Mac OS X 10.7.4, converts all matching files in the directory where I run it to Markdown and saves them in that same directory. For example, a file called apps.html is converted to apps.html.markdown in the same directory as the source file.

While I am glad that it does the conversion, and does it quickly, I need it to process all the files located in one directory and put the Markdown versions into a set of mirrored directories for editing. Ultimately, these directories live in GitHub repositories: one branch is for editing, and the other is for production/publication. In addition, this simple script keeps the original extension and appends the new one to it. If I convert back, it will add the HTML extension after the markdown extension, and the file name will only grow and grow.

Technically, all I need to do is process one branch's directory and sync it with the production one; then, once all the changed, deleted and new files check out, I can commit and publish the changes. It looks like the find command can handle all this, but I have no idea how to set it up correctly, even after reading the Mac OS X and Ubuntu man pages.

Any words of wisdom would be deeply appreciated.

Tf

+4
2 answers

Create the following Makefile :

TXTDIR=sources
HTMLS=$(wildcard *.html)
MDS=$(patsubst %.html,$(TXTDIR)/%.markdown, $(HTMLS))

.PHONY : all

all : $(MDS)

$(TXTDIR) :
	mkdir $(TXTDIR)

$(TXTDIR)/%.markdown : %.html $(TXTDIR)
	pandoc -f html -t markdown -s $< -o $@

(Note: the indented lines must begin with a TAB character. The tabs may not survive in the example above, since Markdown rendering usually strips them.)

Then just type "make" and it will run pandoc on every file with the .html extension in the working directory, creating a Markdown version in "sources". The advantage of this method over using "find" is that it only runs pandoc on files that have changed since the last run.
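The Makefile only looks at the working directory. For the recursive, mirrored layout described in the question, the same only-if-changed idea can be sketched in plain shell. This is a rough sketch under some assumptions, not a tested drop-in: mirror_convert and its two arguments are hypothetical names, file names are assumed not to contain newlines, and the -nt test is a widely supported shell extension rather than something every shell guarantees.

```shell
#!/bin/sh
# Sketch: walk SRC recursively and write a Markdown copy of every
# .html file into the same relative location under DEST.
# mirror_convert, SRC and DEST are hypothetical names for illustration.
mirror_convert() {
    src=${1%/}      # drop one trailing slash so the prefix strip below works
    dest=${2%/}
    find "$src" -type f -name '*.html' | while IFS= read -r input; do
        rel=${input#"$src"/}           # path relative to SRC
        out=$dest/${rel%.html}.md      # replace the extension, do not append
        # Skip the file if its output exists and is newer than the input.
        if [ -e "$out" ] && [ "$out" -nt "$input" ]; then
            continue
        fi
        mkdir -p "$(dirname "$out")"   # create the mirrored subdirectory
        pandoc -f html -t markdown -s "$input" -o "$out"
    done
}
```

It could then be called as, for example, mirror_convert . sources from the top of the editing branch. Because the output name is built by replacing .html rather than appending to it, repeated runs do not pile up extensions.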

+8

Just for the record: here's how I converted a bunch of HTML files to their Markdown equivalents:

 for file in *.html; do pandoc -f html -t markdown "${file}" -o "${file%html}md"; done 

If you look at the -o argument, you will see that it uses string manipulation to strip the trailing html from the file name and put md in its place. (Looping over *.html directly, rather than over the output of ls, keeps file names with spaces intact.)
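To see that string manipulation on its own, here is a small sketch of the suffix-stripping expansion. The md_name helper is a hypothetical name used only for illustration; the ${var%pattern} form itself is standard shell parameter expansion, which removes the shortest match of the pattern from the end of the value.

```shell
#!/bin/sh
# md_name is a hypothetical helper, not part of Pandoc or the answer above.
md_name() {
    # ${1%.html} removes a trailing ".html"; ".md" is then appended.
    printf '%s\n' "${1%.html}.md"
}

md_name apps.html          # prints apps.md
md_name docs/apps.html     # prints docs/apps.md
```

The same trick works in the other direction ("${file%.md}.html"), which avoids the ever-growing apps.html.markdown.html names mentioned in the question.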

+8
