I finally solved this with the help of the git mailing list. As it turns out, this is not a git problem but a problem with my filter and the way it uses pdftk. (Maybe a coding issue? I didn't dig deeper.)
A useful post on the git mailing list is here: http://permalink.gmane.org/gmane.comp.version-control.git/224797
Basically, the filter script I wrote was not idempotent, which means that re-applying the clean filter to an already cleaned file changed the file again.
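A quick way to check a filter for this property is to run it twice and compare the two results. The following is only a minimal sketch: the helper name `is_idempotent` is made up for illustration, and it assumes the filter reads from stdin and writes to stdout (as a git clean filter does) and that `mktemp` can be called without arguments (GNU coreutils).

```shell
#!/bin/bash
# Hypothetical helper (not part of git or pdftk):
# succeeds if applying the filter to its own output leaves that output unchanged.
# Usage: is_idempotent FILE FILTER [ARGS...]
is_idempotent() {
    local input="$1"; shift
    local once twice rc
    once=$(mktemp) || return 2
    twice=$(mktemp) || { rm -f "$once"; return 2; }

    "$@" < "$input" > "$once"    # first pass: clean the original file
    "$@" < "$once" > "$twice"    # second pass: clean the cleaned file

    cmp -s "$once" "$twice"      # byte-for-byte comparison, no output
    rc=$?
    rm -f "$once" "$twice"
    return $rc
}
```

Something like `is_idempotent sample.pdf ./pdf-clean-filter.sh` (both names are placeholders) would then return a non-zero status for a filter that behaves the way mine originally did.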
Background: when pdftk updates a PDF's metadata using exactly the metadata it extracted from that same PDF, it still changes the PDF file, which surprised me.
So I added a sanity check to my filter: if the metadata would not change, the file is passed through untouched. With that, the problem disappeared.
For reference, here is the complete filter:
    #!/bin/bash

    ## use GNU coreutils on OS X explicitly
    ## (install via homebrew, for instance:
    ## > brew install coreutils
    ## > brew install gnu-sed
    ## )
    if [ "${OSTYPE:0:6}" = "darwin" ]; then
        MKTMP=gmktemp
        SED=gsed
    else
        MKTMP=mktemp
        SED=sed
    fi

    ## read the pdf from the file given as argument, or from stdin
    FILEASARG=true
    if [ "$#" -eq 0 ]; then
        FILEASARG=false
    fi

    if $FILEASARG ; then
        FILENAME="$1"
    else
        FILENAME=$($MKTMP)
        cat /dev/stdin > "${FILENAME}"
    fi

    TMPFILE=$($MKTMP)
    TMPFILE2=$($MKTMP)
    TMPFILE3=$($MKTMP)

    ## dump the pdf metadata to a file and replace the dates
    pdftk "$FILENAME" dump_data > "$TMPFILE3"
    $SED -e '/Date/{ N; s/Date\nInfoValue: D:.*/Date\nInfoValue: D:19790101072619/ }' < "$TMPFILE3" > "$TMPFILE"

    ## if the metadata did not change, do nothing
    ## (diff output is discarded so it cannot leak into the filter's stdout)
    if diff "$TMPFILE3" "$TMPFILE" > /dev/null; then
        rm -f "$TMPFILE" "$TMPFILE2" "$TMPFILE3"
        ## the original test [ -n $FILEASARG ] was always true, so always print
        cat "$FILENAME"
        exit 0
    fi

    ## update the pdf metadata
    pdftk "$FILENAME" update_info "$TMPFILE" output "$TMPFILE2"

    ## overwrite the original pdf
    mv -f "$TMPFILE2" "$FILENAME"

    ## clean up
    rm -f "$TMPFILE" "$TMPFILE2" "$TMPFILE3"

    ## write the cleaned pdf to stdout
    cat "$FILENAME"
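For completeness, this is roughly how such a script is registered as a clean filter. It is a sketch under assumptions, not my exact setup: the filter name `pdfmeta` and the script name `pdf-clean-filter.sh` are placeholders, and a scratch repository is used so the commands do not touch a real one. Because the script reads from stdin when called without arguments, it can be used as the clean command directly.

```shell
#!/bin/bash
# Scratch repository, only so that this sketch is self-contained.
repo=$(mktemp -d)
cd "$repo" && git init -q .

# Define a filter driver named "pdfmeta" (placeholder name) whose clean
# command is the filter script (placeholder path; it must be on PATH or
# given as an absolute path in a real setup).
git config filter.pdfmeta.clean "pdf-clean-filter.sh"

# Apply that driver to all pdf files in the repository.
echo '*.pdf filter=pdfmeta' > .gitattributes

# Ask git which filter it would use for a given path.
git check-attr filter -- example.pdf
```

With this in place, git runs the script on every pdf when it is staged, so only the cleaned version ends up in the repository.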