How to extract XML data from CrossRef using R?

If you put the following URL in your browser's address bar, you will get an XML file:

"http://www.crossref.org/openurl?title=Science&aulast=Fernández&date=2009&multihit=true&pid=your.crossref.email" 

An example file is available here:

crossref.xml

I want to extract the list of DOIs (digital object identifiers) into a data.frame in R. I want to do this using one of the common R XML packages:

 library(XML) or library(tm) 

I tried

 doc <- xmlTreeParse(file)
 top <- xmlRoot(doc)

but I can't figure out how to proceed from here.

 top[[1]]["doi"] 

does not work.

3 answers

Try the following:

 library(XML)
 doc <- xmlTreeParse("crossref.xml", useInternalNodes = TRUE)
 root <- xmlRoot(doc)
 xpathSApply(root, "//x:doi", xmlValue, namespaces = "x")
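
Since the question asks for a data.frame, here is a minimal sketch of that last step. The inline XML is a made-up stand-in for crossref.xml (the element layout, namespace URI and DOIs are assumptions for illustration, not the real file), and matching on local-name() is just one alternative way to sidestep the default namespace:

```r
library(XML)

# Hypothetical, trimmed-down stand-in for a Crossref response.
# The real crossref.xml has more fields; these DOIs are invented.
xml_text <- '<crossref_result xmlns="http://www.crossref.org/qrschema/2.0">
  <query_result>
    <body>
      <query status="resolved">
        <doi type="journal_article">10.1000/example.1</doi>
      </query>
      <query status="resolved">
        <doi type="journal_article">10.1000/example.2</doi>
      </query>
    </body>
  </query_result>
</crossref_result>'

# Parse the string into an internal document so XPath can be used
doc <- xmlParse(xml_text, asText = TRUE)

# local-name() matches every <doi> element whatever its namespace
dois <- xpathSApply(doc, "//*[local-name() = 'doi']", xmlValue)

# One row per DOI, kept as character rather than factor
doi_df <- data.frame(doi = dois, stringsAsFactors = FALSE)
print(doi_df)
```

To run it on the actual file, replace the xmlParse() call with xmlParse("crossref.xml").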

As part of rOpenSci, others and I have written some functions for querying the Crossref API: crossref and crossref_r.

I had the same problem. I spent a day and a half looking and finally stumbled upon this post.

Thanks!!!
