I am using fileName from example(xmlEventParse) as a reproducible example. It has record tags with an id attribute and text that we would like to extract. Instead of the handlers argument, I will use the branches argument. A branch function looks like a handler, but it has access to the full node, not just the element. The idea is to write a closure that holds a place to store the data we accumulate, together with functions for processing each branch of the XML document we are interested in. So let's start by defining a closure: for our purposes, a function that returns a list of functions.
ourBranches <- function() {
We need a place to store the results we accumulate. We choose an environment so that insertion time is constant (rather than a list, which we would have to append to and which would be inefficient).
store <- new.env()
The event parser expects a list of functions that will be called when a corresponding tag is detected. We are interested in the record tag. The function we write will receive the node of the XML document. We want to extract the id attribute, which we use as the key under which to store the node's (text) value. We add both to our store.
record <- function(x, ...) {
  key <- xmlAttrs(x)[["id"]]
  value <- xmlValue(x)
  store[[key]] <- value
}
Once the document is processed, we need a convenient way to get our results, so we add a function for our own use, independent of the nodes in the document.
getStore <- function() as.list(store)
and then finish the closure by returning a list of functions
list(record=record, getStore=getStore)
}
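To illustrate how an environment works as a keyed store, here is a minimal sketch in base R. The keys and values are made up for illustration (they only mirror the id/text pairs of the record nodes), and no XML parsing is involved.

```r
# Hypothetical keys and values, standing in for the id/text pairs of record nodes.
store <- new.env()
store[["Hornet Sportabout"]] <- "18.7 8 360.0"
store[["Toyota Corolla"]] <- "33.9 4 71.1"

# Assigning to an existing key overwrites the previous value.
store[["Toyota Corolla"]] <- "33.9 4 71.1 65"

# ls() lists the keys (sorted by default); as.list() copies the contents out.
keys <- ls(store)
results <- as.list(store)
```

Note that as.list() on an environment does not promise any particular order of elements, so look values up by name rather than by position.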
The tricky concept is that the environment in which a function is defined is part of the function, so every time we call ourBranches() we get a list of functions and a new store environment to hold our results. To use it, call xmlEventParse on our file with an empty list of event handlers, then access our accumulated store.
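The point that each call gets its own store can be checked without any XML at all. This is only a sketch: makeStore and add are hypothetical names that mimic the shape of ourBranches, with add playing the role of record.

```r
# makeStore is a hypothetical stand-in with the same structure as ourBranches.
makeStore <- function() {
  store <- new.env()                                  # created anew on every call
  add <- function(key, value) store[[key]] <- value   # plays the role of record
  getStore <- function() as.list(store)
  list(add = add, getStore = getStore)
}

a <- makeStore()
b <- makeStore()
a$add("x", 1)

# a and b each captured a different environment, so b is still empty.
nA <- length(a$getStore())
nB <- length(b$getStore())
```

Because of this, calling ourBranches() again before each parse should give a clean store, with nothing left over from an earlier run.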
> branches <- ourBranches()
> xmlEventParse(fileName, list(), branches=branches)
list()
> head(branches$getStore(), 2)
$`Hornet Sportabout`
[1] "18.7 8 360.0 175 3.15 3.440 17.02 0 0 3 "

$`Toyota Corolla`
[1] "33.9 4 71.1 65 4.22 1.835 19.90 1 1 4 "
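The stored values are character strings. If you want numbers, one possible follow-up is to split each string on whitespace. This is a sketch under the assumption that every record's text is a whitespace-separated run of numbers, as in the output above; parseRecord is a hypothetical helper name.

```r
# parseRecord is a hypothetical helper: split a record's text on whitespace
# and convert the pieces to numeric.
parseRecord <- function(value) {
  as.numeric(strsplit(trimws(value), "[[:space:]]+")[[1]])
}

# One of the values shown in the output above.
value <- "18.7 8 360.0 175 3.15 3.440 17.02 0 0 3 "
nums <- parseRecord(value)
```

To convert the whole store you could then use lapply(branches$getStore(), parseRecord).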