Wget -k converts files differently on Windows and Linux

I have GNU Wget 1.10.2 on both Windows and Linux, and the -k option behaves differently on the two.

-k, --convert-links make links in downloaded HTML point to local files.

On Windows it produces:

www.example.com/index.html
www.example.com/index.html@page=about
www.example.com/index.html@page=contact
www.example.com/index.html@page=sitemap

and on Linux it produces:

www.example.com/index.html
www.example.com/index.html?page=about
www.example.com/index.html?page=contact
www.example.com/index.html?page=sitemap

This is a problem on Linux: when I serve the mirror through Apache, it cannot distinguish between the 4 generated pages, because everything after the question mark (?) is treated as the query string for index.html rather than as part of the file name.

Any ideas on how I can control this?

thanks

+5
4 answers

You cannot use a question mark (?) in a file name on NTFS or FAT32. This is why wget uses the at sign (@) on Windows.

On Linux, most file systems forbid only the slash (/) and the NUL byte in file names, so wget keeps the question mark (since it is part of the URI).

You can force either behavior with --restrict-file-names=unix or --restrict-file-names=windows.
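
As a minimal sketch, assuming the goal is to get the Windows-style names on Linux so that Apache can serve them, the option can be combined with a mirror run like this. The function name `mirror_windows_names` is a hypothetical wrapper, not part of wget; the wget options themselves are standard.

```shell
# Hypothetical helper: mirror a site using Windows-style local file names,
# so "index.html?page=about" is saved as "index.html@page=about".
mirror_windows_names() {
    wget --mirror --convert-links --restrict-file-names=windows "$1"
}

# Usage (needs network access):
# mirror_windows_names http://www.example.com/
```

Because the saved names no longer contain a question mark, a web server should treat each one as an ordinary file path.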

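From the wget documentation:

When "unix" is specified, Wget escapes the character '/' and the control characters in the ranges 0-31 and 128-159. This is the default on Unix-like operating systems.

When "windows" is given, Wget escapes the characters '\', '|', '/', ':', '?', '"', '*', '<', '>', and the control characters in the ranges 0-31 and 128-159. In addition to this, Wget in Windows mode uses '+' instead of ':' to separate host and port in local file names, and uses '@' instead of '?' to separate the query portion of the file name from the rest. Therefore, a URL that would be saved as 'www.xemacs.org:4300/search.pl?input=blah' in Unix mode would be saved as 'www.xemacs.org+4300/search.pl@input=blah' in Windows mode. This mode is the default on Windows.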

+11

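"This is a problem in Linux because when I serve the mirror through Apache, it will not distinguish between 4 generated pages, since the part after the question mark (?) is used as the query string."

You can percent-encode the question mark in the URL, like this: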

www.example.com/index.html%3Fpage=about
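
This works because the server decodes %3F in the path back to a literal question mark and looks it up as part of the file name, whereas a raw ? starts the query string. A minimal sketch of the encoding step, where `encode_question_mark` is a hypothetical helper name chosen for illustration:

```shell
# Hypothetical helper: percent-encode every "?" in a local file name
# so the result can be used as a URL path.
encode_question_mark() {
    # In a basic regular expression, "?" is a literal character.
    printf '%s\n' "$1" | sed 's/?/%3F/g'
}

encode_question_mark 'www.example.com/index.html?page=about'
# prints: www.example.com/index.html%3Fpage=about
```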


+4

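"This is a problem in Linux because when I serve the mirror through Apache, it will not distinguish between 4 generated pages, since the part after the question mark (?) is used as the query string."

You can rewrite the links in the downloaded files with sed: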

find . -type f -name "*html*" -exec sed -i -r 's/(src|href)=(["\x27])([^"\x27?]*)\?([^"\x27]*)\2/\1=\2\3%3F\4\2/g' {} +

This replaces the ? in href= and src= attribute values with %3F. (\x27 is the single quote character.)
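
To check the effect before touching a real mirror, here is a small sketch that runs a substitution of this kind on a throwaway file. It assumes GNU sed (for -i, -r and \x27) and uses negated character classes rather than `.*?`, since sed has no non-greedy matching and a greedy `.*` can skip a link when two links share a line. The sample HTML and the temporary directory are made up for illustration.

```shell
# Hypothetical sample page, created in a temporary directory.
tmpdir=$(mktemp -d)
cat > "$tmpdir/index.html" <<'EOF'
<a href="index.html?page=about">About</a> <a href="index.html?page=contact">Contact</a>
<a href='index.html?page=sitemap'>Sitemap</a>
EOF

# Replace the first "?" inside each quoted src/href value with %3F (GNU sed).
sed -i -r 's/(src|href)=(["\x27])([^"\x27?]*)\?([^"\x27]*)\2/\1=\2\3%3F\4\2/g' "$tmpdir/index.html"

# Show the rewritten links.
cat "$tmpdir/index.html"
```

Note that this rewrites every matching href and src value, including links to external sites that carry a query string, so it may be worth narrowing the pattern for mirrors that contain such links.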

0
