I am scraping a table from a web page and would like to rebuild the table with all the markup tags removed, keeping only the text. Here is my code.
    import requests
    from bs4 import BeautifulSoup

    response = requests.get(url)
    soup = BeautifulSoup(response.text, 'html.parser')
    table = soup.find('table')
    for row in table.find_all('tr'):
        for col in row.find_all('td'):
            # remove all the different tags (none of these attempts worked)
            # col.replace_with('')
            # col.decompose()
            # col.extract()
            col = col.contents
How can I remove all the tags? Take the following cell as an example, which contains the tags a, br and td.
<td><a href="http://www.irit.fr/SC">Signal et Communication</a> <br/><a href="http://www.irit.fr/IRT">Ingénierie Réseaux et Télécommunications</a> </td>
Expected Result:
Signal et Communication Ingénierie Réseaux et Télécommunications
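For reference, one possible way to get this result is Beautiful Soup's get_text(), which returns only the text nodes of an element and drops the surrounding tags. The following is a minimal sketch, not a definitive solution: the helper names cell_text and table_to_rows are hypothetical, the HTML is passed in as a string instead of being fetched with requests, and collapsing runs of whitespace into single spaces is an assumption about the desired output.

```python
from bs4 import BeautifulSoup


def cell_text(cell):
    """Return the text of one cell with tags dropped and whitespace collapsed."""
    # get_text(" ") joins the text nodes with a space, so text from
    # adjacent tags such as <a>..</a><br/><a>..</a> does not run together.
    raw = cell.get_text(" ")
    # split() with no arguments splits on any whitespace run; re-joining
    # with single spaces also trims leading and trailing whitespace.
    return " ".join(raw.split())


def table_to_rows(html):
    """Hypothetical helper: first <table> in html as a list of rows of plain text."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table")
    rows = []
    for row in table.find_all("tr"):
        rows.append([cell_text(col) for col in row.find_all("td")])
    return rows


# The example cell from the question, wrapped in a one-row table.
sample = (
    '<table><tr><td><a href="http://www.irit.fr/SC">Signal et Communication</a> '
    '<br/><a href="http://www.irit.fr/IRT">Ingénierie Réseaux et '
    'Télécommunications</a> </td></tr></table>'
)

print(table_to_rows(sample))
```

Note that replace_with(''), decompose() and extract() applied to the td itself remove the whole cell, text included, which is probably why those attempts did not give the expected result.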