I am trying to download a static mirror of a wiki using wget. I only need the latest version of each article (not the full history or the diffs between versions). It would be easy to just download everything and delete the unneeded pages afterwards, but that would take too much time and put unnecessary load on the server.
There are several pages that I clearly do not need, for example:
WhoIsDoingWhat?action=diff&date=1184177979
Is there a way to tell wget not to download or recurse into URLs that contain "action=diff"? Or, failing that, to exclude URLs matching some regular expression?
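To make the kind of filter I have in mind concrete, here is a minimal sketch in Python of the regular expression test I would like applied to each URL. The host `wiki.example.org` and the names `REJECT` and `should_skip` are made up for illustration; only the `action=diff` query parameter comes from the real wiki.

```python
import re

# Hypothetical pattern: matches URLs whose query string has action=diff
# as a complete parameter (preceded by ? or &, followed by & or end of URL).
REJECT = re.compile(r"[?&]action=diff(&|$)", re.IGNORECASE)


def should_skip(url: str) -> bool:
    """Return True if the URL looks like a diff page that should not be fetched."""
    return REJECT.search(url) is not None


# Example URLs on a made-up host; the second one is a page I do not need.
urls = [
    "http://wiki.example.org/WhoIsDoingWhat",
    "http://wiki.example.org/WhoIsDoingWhat?action=diff&date=1184177979",
]

# Keep only the URLs that the filter does not reject.
wanted = [u for u in urls if not should_skip(u)]
print(wanted)
```

As far as I can tell, wget 1.14 and newer have a `--reject-regex` option that takes a pattern of this kind, so something like `wget --mirror --reject-regex 'action=diff' <start URL>` may already do this, but I have not confirmed that it also stops wget from recursing into those pages.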