Python - getting all images from html file

Question

Can someone help me parse an HTML file in Python to get the links for all the images in it?

Preferably without a third-party module ...

Thanks!

+7

python image urllib

user377419 Nov 28 '10 at 3:16

3 answers

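Using the Python standard library only:

    from html.parser import HTMLParser

    class MyParse(HTMLParser):
        def handle_starttag(self, tag, attrs):
            # Print the src of every <img> tag; skip tags that have no src
            src = dict(attrs).get("src")
            if tag == "img" and src:
                print(src)

    h = MyParse()
    page = open("index.html").read()
    h.feed(page)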
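
If you want the links as a list rather than printed output, the same idea can be sketched as below. This is only a minimal sketch: the names ImageCollector and image_links, the base_url parameter, and the sample markup are made up for illustration and are not part of the answer above. It assumes Python 3 and uses urljoin to turn relative src values into absolute URLs.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImageCollector(HTMLParser):
    """Collects the src of every <img> tag instead of printing it."""

    def __init__(self, base_url=""):
        super().__init__()
        self.base_url = base_url  # hypothetical: URL the page was loaded from
        self.images = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        src = dict(attrs).get("src")
        if src:  # ignore <img> tags without a src attribute
            # Resolve relative paths against the page URL (no-op if base_url is empty)
            self.images.append(urljoin(self.base_url, src))


def image_links(html, base_url=""):
    """Return the image URLs found in an HTML string, in document order."""
    parser = ImageCollector(base_url)
    parser.feed(html)
    parser.close()
    return parser.images


# Illustrative input only
sample = '<p><img src="/a.png"><img alt="no source"><img src="b.jpg"/></p>'
print(image_links(sample, "http://example.com/dir/index.html"))
```

For a file on disk you could pass open("index.html").read() as the first argument and leave base_url empty, in which case the src values are returned unchanged.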

+10

Kabie Nov 28 '10 at 3:38

It is generally accepted that lxml is faster than Beautiful Soup (ref). Its tutorial can be found here: (link). You can also take a look at https://stackoverflow.com/a/167444/2/ .

+2

Overmind jiang Nov 28 '10 at 4:34

Russell Dias · Accepted Answer · 2010-11-28T03:21:41+0000

You can use Beautiful Soup. I know you said without a third-party module; however, it is an ideal tool for parsing HTML.

    # Python 2 with BeautifulSoup 3
    import urllib2
    from BeautifulSoup import BeautifulSoup

    page = BeautifulSoup(urllib2.urlopen("http://www.url.com"))
    # Returns a list of all <img> tags on the page
    page.findAll('img')
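
On Python 3 the equivalent would probably look something like the sketch below. It assumes the beautifulsoup4 package is installed (imported as bs4); the helper name image_sources and the sample markup are invented for illustration and do not come from the answer itself.

```python
from bs4 import BeautifulSoup


def image_sources(html):
    """Return the src of every <img> tag that has one, in document order."""
    # "html.parser" is the parser bundled with Python, so no extra parser is needed
    soup = BeautifulSoup(html, "html.parser")
    # src=True restricts the match to <img> tags that carry a src attribute
    return [img["src"] for img in soup.find_all("img", src=True)]


# Illustrative input only
sample = '<div><img src="logo.png"><img alt="missing"><img src="photo.jpg"></div>'
print(image_sources(sample))
```

To run this against a live page, you could fetch the HTML first with urllib.request.urlopen(url).read() and pass the result to image_sources.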
