What to use for checking html links in a large project, on Linux?

I have a directory with around 1000 .html files and would like to check them all for broken links - preferably from the console. Can you recommend a tool for such a task?

+5
4 answers

You can extract links from HTML files using the Lynx text browser; scripting around it in Bash should not be difficult.
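For instance, a minimal sketch (assuming lynx and curl are installed and the *.html files are in the current directory; relative links and file:// references would need extra handling):

# Dump the reference list from every HTML file, keep the http(s) URLs,
# then probe each one with a HEAD request via curl.
for f in *.html; do
    lynx -dump -listonly "$f"
done | grep -o 'https*://[^ ]*' | sort -u | while read -r url; do
    code=$(curl -s -o /dev/null -w '%{http_code}' -L --head "$url")
    case "$code" in
        000|4*|5*) echo "BROKEN ($code): $url" ;;
    esac
done

Some servers reject HEAD requests; switching --head to a normal GET avoids false positives in that case.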

0

You can use wget, for example:

wget -r --spider -o output.log http://somedomain.com

At the bottom of output.log, wget will indicate whether it found broken links. You can parse the log with awk/grep.
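For example, a rough way to pull the failing URLs out of the log (a sketch only; the exact wording depends on your wget version and locale, and somedomain.com above is a placeholder):

# The URL under test appears a few lines above each "404 Not Found" response.
grep -B 3 '404 Not Found' output.log | grep -o 'https*://[^ ]*' | sort -u

# Recent wget versions also print a closing summary along the lines of "Found N broken links."
grep -i 'broken link' output.log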

+4

checklink (the W3C Link Checker)
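For example, to check a page and everything it links to, reporting only the broken links (a sketch; the long options assume a reasonably recent W3C-LinkChecker, and since checklink takes URIs, local files are usually easiest to check through a local web server):

checklink --summary --broken --recursive http://somedomain.com/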

+2

Try the webgrep command line tools or, if you're comfortable with Perl, HTML::TagReader by the same author.

0
