This collects every src value, but only from tags where the attribute is present; any <script> tag without a src is skipped.
    from bs4 import BeautifulSoup
    import urllib2

    url = "http://rediff.com/"
    page = urllib2.urlopen(url)
    soup = BeautifulSoup(page.read())
    sources = soup.findAll('script', {"src": True})
    for source in sources:
        print source['src']
I get the following two src values as a result:
    http://imworld.rediff.com/worldrediff/js_2_5/ws-global_hm_1.js
    http://im.rediff.com/uim/common/realmedia_banner_1_5.js
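The code above is Python 2 (urllib2 and the print statement). On Python 3 the same filter could look roughly like the sketch below. To keep it runnable offline it parses a made-up HTML string instead of fetching rediff.com; `SAMPLE_HTML` and `script_sources` are names I invented for illustration, and I am assuming the standard-library "html.parser" backend is good enough here.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup for illustration; it is not taken from rediff.com.
SAMPLE_HTML = """
<html><head>
<script src="http://example.com/a.js"></script>
<script>var inline = 1;</script>
<script src="/static/b.js"></script>
</head><body></body></html>
"""


def script_sources(html):
    """Return the src value of every <script> tag that has one."""
    # Naming the parser explicitly avoids depending on whichever one is installed.
    soup = BeautifulSoup(html, "html.parser")
    # src=True matches only tags where the attribute is present,
    # so the inline <script> without a src is skipped.
    return [tag["src"] for tag in soup.find_all("script", src=True)]


if __name__ == "__main__":
    for src in script_sources(SAMPLE_HTML):
        print(src)
```

To run it against a live page on Python 3, the HTML could come from `urllib.request.urlopen(url).read()` in place of the sample string.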
I think this is what you want. Hope this is helpful.
Venkateshwaran selvaraj