This is a late answer, but for completeness: it is quite difficult to retrieve even about 90% of the icons for a sample of sites.
Some time ago I wrote a WordPress plugin, http://wordpress.org/extend/plugins/wp-favicons/ , which tries to get closer to that.
a. It starts by querying favicon repositories such as Google favicons, getfavicons, etc.
b. If none of them returns an icon (I check this by matching against the default icon they return), I try to fetch the icon myself.
c. This includes walking the pages, but also following redirects with auto-redirect turned OFF, and walking 404 pages as well, because the icon may also be present on a 404 page. In the end, this means you have to parse redirects in the HTML header as well as JavaScript redirects to get close to 100%.
d. After that I run some checks on the actual image file, because some servers (I tested 300,000+) return files with the wrong MIME type, etc.
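As a rough illustration of the file check in step d, here is a minimal Python sketch that ignores the declared Content-Type and looks at the first bytes of the payload instead. It is not taken from the plugin (which is PHP); the function names and the list of signatures are assumptions chosen for the example, and it covers only a few common formats.

```python
from typing import Optional

# Leading byte signatures of formats commonly served as favicons.
# Illustrative list only; it is not exhaustive (e.g. SVG is not covered).
_SIGNATURES = (
    (b"\x00\x00\x01\x00", "image/x-icon"),  # ICO
    (b"\x89PNG\r\n\x1a\n", "image/png"),    # PNG
    (b"GIF87a", "image/gif"),               # GIF, older variant
    (b"GIF89a", "image/gif"),               # GIF, newer variant
    (b"\xff\xd8\xff", "image/jpeg"),        # JPEG
)


def sniff_image_type(data: bytes) -> Optional[str]:
    """Guess a MIME type from the payload's first bytes, or return None."""
    for magic, mime in _SIGNATURES:
        if data.startswith(magic):
            return mime
    return None


def looks_like_icon(data: bytes) -> bool:
    """Accept a response only if the bytes themselves look like an image.

    The server's declared Content-Type is deliberately not consulted,
    because it may be wrong.
    """
    return sniff_image_type(data) is not None
```

With this, an HTML error page served as `image/x-icon` is rejected, while a PNG served as `text/plain` is still accepted.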
The code is still not perfect, because once you get into the details you run into many strange situations: people have wrongly coded paths (img/favicon.ico, where img is NOT in the root), duplicate headers in the HTML output, different server responses for HEAD and for the body, etc.
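To make the path and redirect problems more concrete, here is a small Python sketch, again not the plugin's code. It assumes that "redirects in the HTML header" refers to `<meta http-equiv="refresh">` tags, and the names `IconLinkParser` and `find_icon_candidates` are hypothetical. It resolves each icon `href` against the URL of the page it was found on, so a relative `img/favicon.ico` is not silently treated as if it were in the root, and it reports a meta-refresh target so the caller can follow it manually. JavaScript redirects are not handled.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin


class IconLinkParser(HTMLParser):
    """Collect <link rel="... icon ..."> hrefs and a meta-refresh target."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.icons = []          # absolute icon URLs, in document order
        self.refresh_url = None  # first meta-refresh target, if any

    def handle_starttag(self, tag, attrs):
        # HTMLParser already lowercases tag and attribute names.
        a = {name: (value or "") for name, value in attrs}
        if tag == "link" and "icon" in a.get("rel", "").lower().split():
            href = a.get("href", "").strip()
            if href:
                # Resolve relative to the page, not to the site root.
                self.icons.append(urljoin(self.base_url, href))
        elif tag == "meta" and a.get("http-equiv", "").lower() == "refresh":
            match = re.search(r"url\s*=\s*['\"]?([^'\";]+)",
                              a.get("content", ""), re.IGNORECASE)
            if match and self.refresh_url is None:
                self.refresh_url = urljoin(self.base_url,
                                           match.group(1).strip())


def find_icon_candidates(html, page_url):
    """Return (candidate icon URLs, meta-refresh URL or None)."""
    parser = IconLinkParser(page_url)
    parser.feed(html)
    # Fall back to the conventional root location as the last candidate.
    candidates = parser.icons + [urljoin(page_url, "/favicon.ico")]
    return candidates, parser.refresh_url
```

Each candidate still has to be fetched and its content verified; the list only says where to look.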
The core of the retrieval part is here: http://plugins.svn.wordpress.org/wp-favicons/trunk/includes/server/class-http.php so you can reuse it, but remember that the response really has to be verified (check the image file, the MIME type, etc.).
edelwater Feb 24 '13 at 17:54