Extract href with Beautiful Soup

I use this code to access my link:

links = soup.find("span", { "class" : "hsmall" })
links.findNextSiblings('a')
for link in links:
  print link['href']
  print link.string

The link has no id, class, or other identifying attribute; it is just a plain link with an href attribute.

My script outputs:

print link['href']
TypeError: string indices must be integers

Can you help me get the href value? Thanks!

2 answers

Ok, now it works with the following code:

linkSpan = soup.find("span", { "class" : "hsmall" })
link = [tag.attrMap['href'] for tag in linkSpan.findAll('a', {'href': True})]
for lien in link:
  print "LINK = " + lien
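
For anyone on Python 3 with the newer bs4 package, a rough equivalent might look like the sketch below. The HTML string is made up for illustration and assumes the links are nested inside the span; find_all with href=True keeps only the anchors that actually carry an href attribute.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration: the links sit inside the span.
html = (
    '<span class="hsmall">'
    '<a href="/page1">One</a> <a>no href</a> <a href="/page2">Two</a>'
    '</span>'
)

soup = BeautifulSoup(html, "html.parser")
link_span = soup.find("span", {"class": "hsmall"})

# href=True matches only <a> tags that have an href attribute at all.
hrefs = [a["href"] for a in link_span.find_all("a", href=True)]

for href in hrefs:
    print("LINK = " + href)
```

Indexing the tag directly (a["href"]) is the usual way to read an attribute; a.get("href") returns None instead of raising if the attribute is missing.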

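Your links variable still refers to the result of soup.find, because the return value of findNextSiblings is discarded. Looping over that span yields its children, which, judging by the error, are text strings, hence the TypeError. So you can do something like: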

links = soup.find("span", { "class" : "hsmall" }).findNextSiblings('a')
for link in links:
    print link['href']
    print link.string
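
The same idea in Python 3 with bs4 could be sketched as follows. The markup here is invented for the example and assumes the links are siblings that follow the span; find_next_siblings is the bs4 spelling of findNextSiblings.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration: the links follow the span as siblings.
html = (
    '<div><span class="hsmall">Label</span>'
    '<a href="/first">First</a><a href="/second">Second</a></div>'
)

soup = BeautifulSoup(html, "html.parser")
span = soup.find("span", class_="hsmall")

# Keep the return value: this is the list of <a> tags after the span.
links = span.find_next_siblings("a")

# Collect (href, text) pairs; .get() avoids a KeyError if href is absent.
pairs = [(link.get("href"), link.string) for link in links]

for href, text in pairs:
    print(href, text)
```

The key point is that the sibling search returns a new list rather than modifying the span, so its result has to be assigned before looping.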