How to create a data.frame of user-defined S4 classes in R

I want to create a data.frame from various variables, including S4 classes. For a built-in class such as "POSIXlt" (for dates), this works fine:

 as.data.frame(list(id=c(1,2), date=c(as.POSIXlt('2013-01-01'),as.POSIXlt('2013-01-02'))))

But now I have a custom class, say a class "person" with a name and an age:

 setClass("person", representation(name="character", age="numeric")) 

But the following fails:

 as.data.frame(list(id=c(1,2), pers=c(new("person", name="John", age=20), new("person", name="Tom", age=30)))) 

I also tried to overload the "[" operator for the person class using

 setMethod(
     f = "[",
     signature = "person",
     definition = function(x, i, j, ..., drop=TRUE) {
         initialize(x, name=x@name[i], age=x@age[i])
     }
 )

This allows vector-like behavior:

 persons = new("person", name=c("John","Tom"), age=c(20,30))
 p1 = persons[1]

But the following still fails:

 as.data.frame(list(id=c(1,2), pers=persons)) 
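A minimal sketch of the situation described so far, assuming a fresh R session with no as.data.frame method defined for "person". It shows that the "[" method makes subsetting work, while the as.data.frame call still raises an error. The names err and p1 are only illustrative, and the exact error text may differ between R versions.

```r
library(methods)

setClass("person", representation(name = "character", age = "numeric"))

# Subsetting method, as in the question
setMethod("[", "person", function(x, i, j, ..., drop = TRUE) {
    initialize(x, name = x@name[i], age = x@age[i])
})

persons <- new("person", name = c("John", "Tom"), age = c(20, 30))
p1 <- persons[1]   # a "person" holding only John

# There is no as.data.frame method for "person" yet, so this call fails.
# tryCatch() returns the condition object instead of stopping the script.
err <- tryCatch(
    as.data.frame(list(id = c(1, 2), pers = persons)),
    error = function(e) e
)
print(conditionMessage(err))
```

Capturing the condition this way makes it easier to read the message, which is often the best hint about which method is missing.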

Perhaps I have to overload more operators to get a user-defined class in a dataframe? I am sure there must be a way to do this, since POSIXlt is an S4 class and it works! Any solution using the new R5 reference classes would be great!

I don't want to put all my data into the person class (you might ask why "id" is not a member of person and why we don't simply avoid data frames)! The idea is that my data.frame represents a table from a database with many columns of different types, such as strings, numbers, ... but also dates, intervals, geo objects, etc. For dates I already have a solution (POSIXlt), but for intervals, geo objects, etc. I probably need to specify my own S4 / R5 classes.

Thank you very much in advance.

+6
2 answers

Here is your class, with a "column"-wise interpretation of its definition rather than a row-wise one; this will be important for performance. There is also a date, for reference:

 setClass("person", representation(name="character", age="numeric"))
 pers <- new("person", name=c("John", "Tom"), age=c(20, 30))
 date <- as.POSIXct(c('2013-01-01', '2013-01-02'))

Some experimentation, including looking at methods(class="POSIXct") and paying attention to error messages, led me to implement as.data.frame.person and format.person (the latter is used for display in a data.frame) as

 as.data.frame.person <-
     function(x, row.names=NULL, optional=FALSE, ...)
 {
     if (is.null(row.names))
         row.names <- x@name
     value <- list(x)
     attr(value, "row.names") <- row.names
     class(value) <- "data.frame"
     value
 }

 format.person <- function(x, ...) paste0(x@name, ", ", x@age)

This gets my objects into the data.frame:

 > lst <- list(id=1:2, date=date, pers=pers)
 > as.data.frame(lst)
      id       date     pers
 John  1 2013-01-01 John, 20
 Tom   2 2013-01-02  Tom, 30
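The pieces above can be put together as one script. This is a sketch rather than a definitive recipe: the two registerS3method() calls are an addition not present in the answer, included on the assumption that the code might be run somewhere other than the top-level prompt, where the plain function definitions alone may not be found by S3 dispatch.

```r
library(methods)

setClass("person", representation(name = "character", age = "numeric"))
pers <- new("person", name = c("John", "Tom"), age = c(20, 30))
date <- as.POSIXct(c("2013-01-01", "2013-01-02"))

# Wrap the whole object as a single data.frame column
as.data.frame.person <- function(x, row.names = NULL, optional = FALSE, ...) {
    if (is.null(row.names))
        row.names <- x@name
    value <- list(x)
    attr(value, "row.names") <- row.names
    class(value) <- "data.frame"
    value
}

# One string per element; this is what printing a data.frame would call
format.person <- function(x, ...) paste0(x@name, ", ", x@age)

# Assumption: explicit registration, so that dispatch also works when this
# code is not evaluated in the global environment.
registerS3method("as.data.frame", "person", as.data.frame.person)
registerS3method("format", "person", format.person)

df <- as.data.frame(list(id = 1:2, date = date, pers = pers))

print(names(df))
print(rownames(df))
print(format(df$pers))
```

Note that the column still holds the original S4 object, so df$pers@age and similar slot access keep working.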

If I want to subset, then I need

 setMethod("[", "person", function(x, i, j, ..., drop=TRUE) {
     initialize(x, name=x@name[i], age=x@age[i])
 })

I'm not sure what other methods might be required as more data.frame operations are encountered; there is no "data.frame interface".
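As a hedged illustration of why the "[" method matters, the sketch below selects one row of a data.frame that holds a "person" column. It assumes that row subsetting of a data.frame applies "[" to each column in turn, which is how base R appears to behave. The registerS3method() call and the variable names df and sub are illustrative additions.

```r
library(methods)

setClass("person", representation(name = "character", age = "numeric"))

# Element-wise subsetting of the vectorized class
setMethod("[", "person", function(x, i, j, ..., drop = TRUE) {
    initialize(x, name = x@name[i], age = x@age[i])
})

as.data.frame.person <- function(x, row.names = NULL, optional = FALSE, ...) {
    if (is.null(row.names))
        row.names <- x@name
    value <- list(x)
    attr(value, "row.names") <- row.names
    class(value) <- "data.frame"
    value
}
# Assumption: registered explicitly so dispatch does not depend on where
# this code is evaluated.
registerS3method("as.data.frame", "person", as.data.frame.person)

pers <- new("person", name = c("John", "Tom"), age = c(20, 30))
df <- as.data.frame(list(id = 1:2, pers = pers))

# Row subsetting is expected to call the "[" method on the pers column
sub <- df[2, ]
print(sub$id)
print(sub$pers@name)
```

Other operations, such as rbind or merge, may need further methods that this sketch does not cover.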

Using the vectorized class in data.table seems to require a length method for construction.

 > library(data.table)
 > data.table(id=1:2, pers=pers)
 Error in data.table(id = 1:2, pers = pers) :
   problem recycling column 2, try a simpler type
 > setMethod(length, "person", function(x) length(x@name))
 [1] "length"
 > data.table(id=1:2, pers=pers)
    id     pers
 1:  1 John, 20
 2:  2  Tom, 30
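The length method can be checked on its own in base R, without data.table installed. A small sketch, assuming the same "person" class as above:

```r
library(methods)

setClass("person", representation(name = "character", age = "numeric"))
pers <- new("person", name = c("John", "Tom"), age = c(20, 30))

# Report the number of people held in the object, so that code which
# recycles or checks column lengths sees one entry per person
setMethod("length", "person", function(x) length(x@name))

print(length(pers))
```

Whether a given data.table version accepts such a column after this is not something the sketch verifies.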

Maybe there is a data.table interface?

+6

Judging by this thread on the mailing list:

http://tolstoy.newcastle.edu.au/R/e2/devel/06/11/1013.html

... John Chambers was thinking about this back in 2006. And we still cannot put S4 objects in data frame columns. We cannot put complex S3 classes in data frame columns either.

There are other tabular data structures that may be able to do this - data.table can:

 require(data.table)
 setClass("geezer", representation(name="character", age="numeric"))
 tom=new("geezer",name="Tom",age=20)
 dick=new("geezer",name="Dick",age=23)
 harry=new("geezer",name="Harry",age=25)
 gt = data.table(geezers=c(tom,dick,harry),weapons=c("Gun","Gun","Knife"))
 gt
     geezers weapons
 1: <geezer>     Gun
 2: <geezer>     Gun
 3: <geezer>   Knife
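The <geezer> entries in that output suggest the column is stored as a list with one S4 object per row. A base R sketch of what c() does with the three objects, independent of data.table (the variable name geezers is illustrative):

```r
library(methods)

setClass("geezer", representation(name = "character", age = "numeric"))
tom   <- new("geezer", name = "Tom",   age = 20)
dick  <- new("geezer", name = "Dick",  age = 23)
harry <- new("geezer", name = "Harry", age = 25)

# With no c() method defined for the class, c() falls back to building
# a plain list that holds the three objects
geezers <- c(tom, dick, harry)

print(is.list(geezers))
print(length(geezers))
# Pull a slot out of each element
print(vapply(geezers, function(g) g@name, character(1)))
```

This is a row-wise layout (one object per row), in contrast to the column-wise "person" class in the other answer, where a single object holds all the names and ages.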

data.table has slightly different semantics from data.frame, so don't expect to be able to plug a data.table into any code that uses data.frame and have it work (for example, I suspect lm and glm will choke). But it seems the authors of data.table have allowed for compound classes in columns ...

+2