I want to scrape a block of data from a series of pages where the data is buried in a JSON object inside a script tag. I'm pretty comfortable with BeautifulSoup, but I think I may be barking up the wrong tree by trying to use it to get data out of JavaScript.
The structure of the pages is approximately:
...
<script>
$(document).ready(function(){
var data = $.data(graph_selector, [
{ data: charts.createData("Stuff I want")}
]);
});
</script>
The head and body have a million scripts each, but there is only one var data per page. I'm not sure how I would identify this specific <script> for BeautifulSoup, other than by the presence of var data.
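To make the idea concrete, here is a rough sketch of what I imagine, not something I know to be the right approach. It assumes the script body is available as plain text in the fetched HTML, that matching on the literal text var data is enough to single out the tag, and that the payload is the quoted argument of charts.createData(...). SAMPLE_HTML, find_data_script, and extract_payload are names I made up for illustration.

```python
import re

from bs4 import BeautifulSoup

# Hypothetical sample page mirroring the structure described above.
SAMPLE_HTML = """
<html><head><script>var other = 1;</script></head>
<body>
<script>
$(document).ready(function(){
var data = $.data(graph_selector, [
{ data: charts.createData("Stuff I want")}
]);
});
</script>
</body></html>
"""

# Assumed pattern: the payload is the quoted argument of charts.createData(...).
PAYLOAD_RE = re.compile(r'charts\.createData\("(.*?)"\)', re.DOTALL)


def find_data_script(html):
    """Return the text of the first <script> containing 'var data', or None."""
    soup = BeautifulSoup(html, "html.parser")
    # Passing a compiled regex as `string` matches against the tag's text.
    script = soup.find("script", string=re.compile(r"\bvar\s+data\b"))
    return script.string if script else None


def extract_payload(script_text):
    """Return the string passed to charts.createData(...), or None."""
    match = PAYLOAD_RE.search(script_text)
    return match.group(1) if match else None


script_text = find_data_script(SAMPLE_HTML)
payload = extract_payload(script_text) if script_text else None
print(payload)
```

BeautifulSoup only locates the tag here; the JavaScript itself is treated as opaque text and picked apart with a regular expression, which is the part I am least sure about.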
Can I do it? Or do I need another tool?