I have a directory with on the order of 1000 .html files and would like to check them all for bad links, preferably from the console. Any tool you can recommend for such a task?
You can extract links from HTML files using the Lynx text browser. Wrapping that in a small Bash script should not be difficult; a sketch follows.
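A minimal sketch of that approach, assuming lynx and curl are installed; the awk pattern matches lynx's numbered "References" list and only keeps http(s) links, and may need adjusting for your lynx version:

#!/usr/bin/env bash
# Collect the http(s) links from every .html file in the current
# directory with lynx, then probe each unique URL once with curl.
for f in ./*.html; do
    # -dump -listonly prints the page's references as a numbered list
    lynx -dump -listonly "$f" | awk '/^ *[0-9]+\. *https?:/ {print $2}'
done | sort -u | while read -r url; do
    # --head avoids downloading bodies; --fail makes HTTP errors return non-zero
    if ! curl --silent --head --fail --location "$url" > /dev/null; then
        echo "BROKEN: $url"
    fi
done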
You can use wget, for example:

wget -r --spider -o output.log http://somedomain.com

At the bottom of output.log it will indicate whether it found broken links. You can parse that with awk/grep.
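The exact wording of the log messages varies a bit between wget versions, but something along these lines usually pulls out the failure summary:

# show the broken-link summary that wget appends to the log
grep -i -A 10 'broken link' output.log

# or look for the individual error responses directly
grep -B 2 ' 404 ' output.log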
checklink (from the W3C)
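If you have checklink installed (it ships with the W3C::LinkChecker Perl distribution), a basic invocation looks roughly like this; it works on URIs, so point it at the served pages rather than at bare files:

# check a single page; see checklink --help for recursion and summary options
checklink http://somedomain.com/1000.html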
Try the webgrep command-line tools or, if you're comfortable with Perl, HTML::TagReader from the same author.