Google Apps Script: parsing a table from messy HTML

I want to create a script that loads an HTML page, parses a table from it, and saves the result to a spreadsheet. I am stuck at the loading and parsing step.

XPath for the table:

    /html/body/table/tbody/tr[5]/td/table/tbody/tr/td[2]/table

I am currently stuck on following this XPath in the parsed document.

    function fetchIt() {
      var fetchString = "http://www.zbranebrymova.com/index.php?s_lev=22&type=nabku*signa";
      var response = UrlFetchApp.fetch(fetchString);
      var xmlDoc = Xml.parse(response.getBlob().getDataAsString(), true);
      var b = xmlDoc.getElement().getElement("body").getElement("table");
      Logger.log(b);
    }
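One way to read the XPath above is as a list of (tag, position) steps that can be followed one `getElements` call at a time. The sketch below is plain JavaScript and makes a few assumptions: the names `parseXPathSteps`, `walkSteps`, and `getChildren` are hypothetical, a step without an index is treated as position 1 (real XPath would select every matching child), and the first step is taken to name the root element itself, as in `fetchIt()`.

```javascript
// Split an absolute XPath such as "/html/body/table/tbody/tr[5]/td" into steps.
// Each step gets a lower-cased tag name and a 1-based position (default 1,
// which is a simplifying assumption, not full XPath semantics).
function parseXPathSteps(xpath) {
  return xpath
    .split('/')
    .map(function (part) { return part.replace(/\s+/g, ''); })
    .filter(function (part) { return part.length > 0; })
    .map(function (part) {
      var match = /^([A-Za-z][\w-]*)(?:\[(\d+)\])?$/.exec(part);
      if (!match) {
        throw new Error('Unsupported XPath step: ' + part);
      }
      return {
        tag: match[1].toLowerCase(),
        position: match[2] ? parseInt(match[2], 10) : 1
      };
    });
}

// Follow the steps through a tree. `getChildren(node, tag)` is a hypothetical
// callback that returns the child elements of `node` with the given tag name.
function walkSteps(root, steps, getChildren) {
  var node = root;
  // The first step is assumed to name the root itself, so start at the second.
  for (var i = 1; i < steps.length; i++) {
    var matches = getChildren(node, steps[i].tag);
    node = matches[steps[i].position - 1];
    if (!node) {
      return null; // the path does not exist in this document
    }
  }
  return node;
}

var steps = parseXPathSteps('/html/body/table/tbody/tr[5]/td/table/tbody/tr/td[2]/table');
// steps[4] is { tag: 'tr', position: 5 }
```

Inside Apps Script, `getChildren` could presumably be `function (node, tag) { return node.getElements(tag); }`, with `xmlDoc.getElement()` passed as `root`. Returning `null` on a missing step makes it easier to see where a messy page diverges from the expected path.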
html parsing xpath google-apps-script
1 answer

I don't know if this will be useful, but here is a snippet of my table parsing.

The HTML file, foo.htm:

    <html>
      <head>
      </head>
      <body style="margin-left:10px">
        <table title="">
          <tbody>
            <tr>
              <th align="center" abbr="Sunday">Sun</th>
              <th align="center" abbr="Monday">Mon</th>
            </tr>
            <tr>
              <td align="left"><a title="January 01">1</a>
                <div>Joe,Doe</div>
                <div>Murphy,Jack</div>
              </td>
              <td align="left"><a title="January 02">2</a>
                <div>Carlson,Carl</div>
                <div>Guy,Girl</div>
                <div>Lenin,Vladimir</div>
              </td>
            </tr>
          </tbody>
        </table>
      </body>
    </html>

and this is how I take it apart:

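    function foo() {
      // 'foo.htm' is a placeholder; UrlFetchApp.fetch needs a full URL.
      var page = UrlFetchApp.fetch('foo.htm');
      // Pass the response body as text; "true" turns on lenient parsing.
      var rows = Xml.parse(page.getContentText(), true).getElement()
          .getElement("html")
          .getElement("body")
          .getElement("table")
          .getElement("tbody")
          .getElements("tr");
      for (var ii = 0; ii < rows.length; ii++) {
        // The header row contains only <th> cells, so cols is empty for it.
        var cols = rows[ii].getElements("td");
        for (var jj = 0; jj < cols.length; jj++) {
          var divs = cols[jj].getElements("div");
          for (var kk = 0; kk < divs.length; kk++) {
            var div = divs[kk]; // process each <div> here
          }
        }
      }
    }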
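Since the goal in the question is to save the result to a spreadsheet, the values collected in the loops above would need to end up in a rectangular two-dimensional array, because `Range.setValues` expects every row to have the same length. The following is only a sketch in plain JavaScript: `toRectangular` is a hypothetical helper name, and the sample data assumes one output row per `<td>` of foo.htm with one column per `<div>`.

```javascript
// Pad every row with empty strings so that all rows have the same length.
function toRectangular(rows) {
  var width = rows.reduce(function (max, row) {
    return Math.max(max, row.length);
  }, 0);
  return rows.map(function (row) {
    var padded = row.slice(); // copy, so the input rows are left untouched
    while (padded.length < width) {
      padded.push('');
    }
    return padded;
  });
}

// Sample data: the <div> texts of the two <td> cells in foo.htm.
var values = toRectangular([
  ['Joe,Doe', 'Murphy,Jack'],
  ['Carlson,Carl', 'Guy,Girl', 'Lenin,Vladimir']
]);
// values[0] is now ['Joe,Doe', 'Murphy,Jack', '']

// Inside Apps Script the array could then be written in one call, e.g.:
// SpreadsheetApp.getActiveSheet()
//     .getRange(1, 1, values.length, values[0].length)
//     .setValues(values);
```

Writing the whole array with a single `setValues` call is generally faster than setting cells one by one, which is why the padding step is worth doing first.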

greetings, sean
