How to extract XML data from CrossRef using R?

Question

If you enter the following URL in your browser's address bar, you get an XML file back:

"http://www.crossref.org/openurl?title=Science&aulast=Fernández&date=2009&multihit=true&pid=your.crossref.email"

An example file is available here:

crossref.xml

I want to extract the list of DOIs (digital object identifiers) into a data.frame in R. I would like to do this using one of the common R XML packages:

 library(XML) or library(tm)

I tried

 doc <- xmlTreeParse(file)
 top <- xmlRoot(doc)

but I can't figure out how to proceed from here. For example,

 top[[1]]["doi"]

does not work.

+3

xml r metadata

Etienne Low-Décarie Mar 30 '12 at 23:55


3 answers

As part of rOpenSci, I and others have written some functions for accessing the Crossref API; the functions crossref and crossref_r are here.

+2

sckott Jun 30 '12 at 19:36


I had the same problem. I spent a day and a half looking and finally stumbled upon this post.

Thanks!!!

0

Wyatt Nov 28 '13 at 20:38


G. grothendieck · Accepted Answer · 2012-03-31T02:56:00+0000

Try the following:

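 library(XML)
 # Parse into an internal document so that XPath queries can be run on it
 doc <- xmlTreeParse("crossref.xml", useInternalNodes = TRUE)
 root <- xmlRoot(doc)
 # "x" stands for the document's default namespace; get the text of every doi node
 xpathSApply(root, "//x:doi", xmlValue, namespaces = "x")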
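Since the question asks for a data.frame, here is a minimal sketch of how the same XPath query could feed one. The inline XML is a made-up stand-in for crossref.xml: its element names, namespace URI and DOI values are assumptions for illustration only, and the real file may be laid out differently.

```r
library(XML)

# Hypothetical stand-in for crossref.xml (structure, namespace URI and DOIs are assumed).
xml_text <- '<crossref_result xmlns="http://www.crossref.org/qrschema/2.0">
  <query_result><body>
    <query><doi type="journal_article">10.1000/example.1</doi></query>
    <query><doi type="journal_article">10.1000/example.2</doi></query>
  </body></query_result>
</crossref_result>'

# Parse the string into an internal document that supports XPath.
doc <- xmlParse(xml_text, asText = TRUE)

# An unnamed "x" is taken as a prefix for the default namespace declared on the root node.
dois <- xpathSApply(doc, "//x:doi", xmlValue, namespaces = "x")

# One row per DOI, kept as character rather than factor.
doi_df <- data.frame(doi = dois, stringsAsFactors = FALSE)
print(doi_df)
```

For the real file, replacing the xmlParse call with xmlParse("crossref.xml") should be enough, provided the doi elements live in the document's default namespace.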
