I am trying to find nodes in an HTML document using XPath in R. In the code below, I would like to know how to return NULL or NA when a node is missing:
library(XML)
b <- '
<bookstore specialty="novel">
<book style="autobiography">
<author>
<first-name>Joe</first-name>
<last-name>Bob</last-name>
</author>
</book>
<book style="textbook">
<author>
<first-name>Mary</first-name>
<last-name>Bob</last-name>
</author>
<author>
<first-name>Britney</first-name>
<last-name>Bob</last-name>
</author>
<price>55</price>
</book>
<book style="novel" id="myfave">
<author>
<first-name>Toni</first-name>
<last-name>Bob</last-name>
</author>
</book>
</bookstore>
'
doc2 <- htmlTreeParse(b, useInternalNodes = TRUE)
xpathApply(doc2, "//author/first-name", xmlValue)
For example, when I run xpathApply() on the author nodes, I get 4 results. But if I were to delete one of the <first-name> nodes, I want xpathApply() to return NULL or something else in its place; I do not want it to skip that author. I want the result to look like this if I removed <first-name>Mary</first-name>:
Joe
NA
Britney
Toni
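One way this might be approached, as a rough sketch rather than a definitive answer: select every <author> node first, then look up <first-name> relative to each author, substituting NA when nothing matches. The shortened sample document and the names `doc`, `authors`, and `first_names` are made up for illustration; `asText = TRUE` is passed explicitly on the assumption that the input is always a string rather than a file path.

```r
library(XML)

# Shortened sample with Mary's <first-name> removed (illustrative data)
b <- '
<bookstore>
<book><author><first-name>Joe</first-name><last-name>Bob</last-name></author></book>
<book>
<author><last-name>Bob</last-name></author>
<author><first-name>Britney</first-name><last-name>Bob</last-name></author>
</book>
<book><author><first-name>Toni</first-name><last-name>Bob</last-name></author></book>
</bookstore>
'
doc <- htmlTreeParse(b, asText = TRUE, useInternalNodes = TRUE)

# Step 1: one entry per <author>, whether or not it has a <first-name>
authors <- getNodeSet(doc, "//author")

# Step 2: query relative to each author; an empty result becomes NA
first_names <- sapply(authors, function(author) {
  hit <- xpathSApply(author, "./first-name", xmlValue)
  if (length(hit) == 0) NA_character_ else hit[[1]]
})

first_names
```

Because the outer query is over //author rather than //author/first-name, the result has one element per author, so a missing child shows up as NA instead of silently shortening the output.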