Once you have a node list, you can apply a function to it to extract the content of each node, typically a function such as xmlValue or xmlGetAttr. For example:
    x <- xpathApply(y, "//table/tr")  ## x is a list of nodes
    sapply(x, xmlValue)
    [1] " Test1.1 Test1.2 " " Test1.3 Test1.4 "
This is equivalent to:

    xpathSApply(y, "//table/tr", xmlValue)
    [1] " Test1.1 Test1.2 " " Test1.3 Test1.4 "
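The same pattern works with xmlGetAttr, which is mentioned above but not shown. Here is a minimal sketch, assuming a small sample document: the original `y` is not shown in this answer, so `doc` and the `class` attributes below are made up for illustration.

```r
library(XML)

# Hypothetical sample document (the original `y` is not shown in the answer).
html <- '<table>
  <tr class="odd"><td>Test1.1</td><td>Test1.2</td></tr>
  <tr class="even"><td>Test1.3</td><td>Test1.4</td></tr>
</table>'
doc <- htmlParse(html, asText = TRUE)

# Extra arguments to xpathSApply are passed on to the function,
# so "class" ends up as the attribute name given to xmlGetAttr.
row_classes <- xpathSApply(doc, "//table/tr", xmlGetAttr, "class")
row_classes
```

Each tr node contributes one attribute value, in document order.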
EDIT
I am sure your question can be solved with the right XPath expression. XPath is worth learning if you work with XML files, much as SQL is if you work with a database. An XPath expression works like an SQL query: it is fast, and many browsers can help you build the correct expression.
For instance:
    xpathSApply(y, "//table/tr[2]/td[1]", xmlValue)  # second tr and first td
    [1] " Test1.3 "
    xpathSApply(y, "//table/tr[2]/td[3]", xmlValue)  # second tr and third td
EDIT
It looks like the OP wants to replicate the XML structure (get the tr and td values in the same order).
Here is one way to do it, though probably not the most efficient:
    ## count the rows, then query the td values of each row by position
    nn.trs <- length(xpathSApply(y, "//table/tr", I))
    lapply(seq_len(nn.trs), function(i) {
      xpathSApply(y, paste("//table/tr[", i, "]/td", sep = ""), xmlValue)
    })
    [[1]]
    [1] " Test1.1 " " Test1.2 "

    [[2]]
    [1] " Test1.3 " " Test1.4 "
If the number of td elements is the same for each tr, you can replace lapply with sapply, and you get a matrix with one column per tr:

         [,1]        [,2]
    [1,] " Test1.1 " " Test1.3 "
    [2,] " Test1.2 " " Test1.4 "
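A variant that avoids building one XPath string per row is to fetch the tr nodes once and then run a relative XPath on each node. This is only a sketch under the assumption of a small sample document; `doc` below is hypothetical and stands in for the `y` that is not shown here.

```r
library(XML)

# Hypothetical sample document (stands in for the original `y`).
html <- '<table>
  <tr><td>Test1.1</td><td>Test1.2</td></tr>
  <tr><td>Test1.3</td><td>Test1.4</td></tr>
</table>'
doc <- htmlParse(html, asText = TRUE)

# Get every row once, then evaluate "./td" relative to each row node.
rows <- getNodeSet(doc, "//table/tr")
cells <- lapply(rows, function(tr) xpathSApply(tr, "./td", xmlValue))
cells
```

The result keeps the same nesting as the document: one list element per tr, each holding that row's td values in order.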
But I think readHTMLTable is better in this case.
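For reference, a minimal readHTMLTable sketch on an assumed sample document (again, `doc` is hypothetical, since the original `y` is not shown). `header = FALSE` is used here because the sample table has no header row.

```r
library(XML)

# Hypothetical sample document (stands in for the original `y`).
html <- '<table>
  <tr><td>Test1.1</td><td>Test1.2</td></tr>
  <tr><td>Test1.3</td><td>Test1.4</td></tr>
</table>'
doc <- htmlParse(html, asText = TRUE)

# readHTMLTable on a parsed document returns a list with one entry per table.
tables <- readHTMLTable(doc, header = FALSE, stringsAsFactors = FALSE)
tbl <- tables[[1]]
tbl
```

This returns a data frame with one row per tr and one column per td, so the row and column order of the table is preserved without any manual looping.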