Is there a general inverse function for table()?

I know that a little programming allows converting frequency tables of a fixed dimension, as returned, for example, by table(), back to the observational data. So the goal is to convert a frequency table such as this ...

    (flower.freqs <- with(iris, table(Petal = cut(Petal.Width, 2), Species)))
                  Species
    Petal          setosa versicolor virginica
      (0.0976,1.3]     50         28         0
      (1.3,2.5]         0         22        50

... back to a data.frame() whose number of rows equals the sum of the counts in the input table, and whose cell values are taken from the input's dimension labels:

             Petal Species
    1 (0.0976,1.3]  setosa
    2 (0.0976,1.3]  setosa
    3 (0.0976,1.3]  setosa
    # ... (150 rows) ...

With some tinkering, I built a rough prototype that should also digest higher-dimensional inputs:

    tableinv <- untable <- function(x) {
        stopifnot(is.table(x))
        obs <- as.data.frame(x)[rep(1:prod(dim(x)), c(x)), -length(dim(x)) - 1]
        rownames(obs) <- NULL
        obs
    }

    > head(tableinv(flower.freqs)); dim(tableinv(flower.freqs))
             Petal Species
    1 (0.0976,1.3]  setosa
    2 (0.0976,1.3]  setosa
    3 (0.0976,1.3]  setosa
    4 (0.0976,1.3]  setosa
    5 (0.0976,1.3]  setosa
    6 (0.0976,1.3]  setosa
    [1] 150   2

    > head(tableinv(Titanic)); nrow(tableinv(Titanic)) == sum(Titanic)
      Class  Sex   Age Survived
    1   3rd Male Child       No
    2   3rd Male Child       No
    3   3rd Male Child       No
    4   3rd Male Child       No
    5   3rd Male Child       No
    6   3rd Male Child       No
    [1] TRUE
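One way to check that such a function really behaves as an inverse is to tabulate its output again and compare with the input. The following is only a sketch under the assumption that all dimensions carry names, as they do for Titanic; the names `obs` and `roundtrip` are made up for illustration, and `untable()` simply repeats the prototype above.

```r
# Prototype from the question, repeated so the example is self-contained
untable <- function(x) {
    stopifnot(is.table(x))
    obs <- as.data.frame(x)[rep(1:prod(dim(x)), c(x)), -length(dim(x)) - 1]
    rownames(obs) <- NULL
    obs
}

obs <- untable(Titanic)      # one row per observation
roundtrip <- table(obs)      # tabulate the expanded rows again

# The re-tabulated object should match the original in shape, labels and counts
stopifnot(identical(dim(roundtrip), dim(Titanic)))
stopifnot(isTRUE(all.equal(dimnames(roundtrip), dimnames(Titanic))))
stopifnot(all(roundtrip == Titanic))
```

Cells with a count of zero survive the round trip because the expanded columns are factors that keep all of their levels.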

I am obviously proud that this bricolage reconstructs multi-attribute data.frame()s from high-dimensional frequency tables such as Titanic, but is there an established (built-in, battle-tested) general inverse of table()? Ideally it would not depend on a specific library, would know how to handle unlabeled dimensions, would be optimized so that it does not choke on bulky inputs, and would deal reasonably with table inputs that correspond to factor as well as non-factor observations.

+7
2 answers

I think your solution is pretty good. In any case, the way I would solve this is very similar:

    tableinv <- function(x) {
        y <- x[rep(rownames(x), x$Freq), 1:(ncol(x) - 1)]
        rownames(y) <- c(1:nrow(y))
        return(y)
    }
    survivors <- as.data.frame(Titanic)
    surv.invtab <- tableinv(survivors)

which gives

    > head(surv.invtab)
      Class  Sex   Age Survived
    1   3rd Male Child       No
    2   3rd Male Child       No
    3   3rd Male Child       No
    4   3rd Male Child       No
    5   3rd Male Child       No
    6   3rd Male Child       No

As for the flower example, to use the tableinv() function as defined above you first need to convert the table into a data frame:

    flower.freqs <- with(iris, table(Petal = cut(Petal.Width, 2), Species))
    flower.freqs <- as.data.frame(flower.freqs)
    flower.invtab <- tableinv(flower.freqs)

The result in this case is

    > head(flower.invtab)
             Petal Species
    1 (0.0976,1.3]  setosa
    2 (0.0976,1.3]  setosa
    3 (0.0976,1.3]  setosa
    4 (0.0976,1.3]  setosa
    5 (0.0976,1.3]  setosa
    6 (0.0976,1.3]  setosa

Hope this helps.
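If the extra as.data.frame() step is a nuisance, the two variants can probably be folded into one function. The sketch below is not a built-in and not taken from any package: the name `expand_counts` and its `freq` argument are made up for illustration, and it assumes the counts sit in a single column (called "Freq" by default).

```r
# Hypothetical helper: accepts a table, or a data frame with a count column
expand_counts <- function(x, freq = "Freq") {
    if (is.table(x)) x <- as.data.frame(x, responseName = freq)
    stopifnot(is.data.frame(x), freq %in% names(x))
    keep <- setdiff(names(x), freq)
    # drop = FALSE keeps a data frame even when only one column remains
    out <- x[rep(seq_len(nrow(x)), x[[freq]]), keep, drop = FALSE]
    rownames(out) <- NULL
    out
}

from_table <- expand_counts(Titanic)                  # table input
from_frame <- expand_counts(as.data.frame(Titanic))   # data frame input
one_dim    <- expand_counts(table(cyl = mtcars$cyl))  # one-dimensional table
```

Selecting columns by name with `drop = FALSE` is what lets the one-dimensional case come back as a one-column data frame instead of collapsing to a bare vector.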

+1

In the specific case of one-dimensional frequency data, there is a simple way. Let's take an example:

    mytable = table(mtcars$cyl)
    ####  4  6  8
    #### 11  7 14

A simple function to get the expanded data:

    InvTable = function(tb, random = TRUE) {
        output = rep(names(tb), tb)
        if (random) {
            output <- base::sample(output, replace = FALSE)
        }
        return(output)
    }
    InvTable(mytable, T)
    #### [1] "4" "8" "8" "4" "4" "6" "6" ...

This may not be exactly what the asker needs, but I think it can be useful in many similar cases. Just keep in mind that the result is a character vector, which is not always what we need (so add as.numeric() if necessary).
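To illustrate the type issue, here is a small sketch of the conversion back to numbers, without the shuffling step. The variable names `expanded` and `cyl_num` are made up for the example, and as.vector() is used only to pass plain counts to rep().

```r
mytable <- table(mtcars$cyl)

# Expand the labels; the result is character because names() are strings
expanded <- rep(names(mytable), as.vector(mytable))

# Convert back to numbers so the values can be used in arithmetic again
cyl_num <- as.numeric(expanded)

# Apart from the order, this reproduces the original column
stopifnot(identical(sort(cyl_num), sort(mtcars$cyl)))
```

This only makes sense when the labels really are numbers; for labels such as intervals produced by cut(), keeping the result as character or factor is the safer choice.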

0