I know that, with a little programming, frequency tables of a fixed dimension, as returned by table() for example, can be converted back into observational data. So the goal is to convert a frequency table such as this one ...
    (flower.freqs <- with(iris,table(Petal=cut(Petal.Width,2),Species)))
                  Species
    Petal          setosa versicolor virginica
      (0.0976,1.3]     50         28         0
      (1.3,2.5]         0         22        50
... back into a data.frame() whose number of rows equals the sum of the counts in the input table, while the cell values are taken from the input's dimensions:
             Petal Species
    1 (0.0976,1.3]  setosa
    2 (0.0976,1.3]  setosa
    3 (0.0976,1.3]  setosa
    # ... (150 rows) ...
With some tinkering, I built a rough prototype that is also meant to digest higher-dimensional inputs:
    tableinv <- untable <- function(x) {
        stopifnot(is.table(x))
        obs <- as.data.frame(x)[rep(1:prod(dim(x)),c(x)),-length(dim(x))-1]
        rownames(obs) <- NULL; obs
    }

    > head(tableinv(flower.freqs)); dim(tableinv(flower.freqs))
             Petal Species
    1 (0.0976,1.3]  setosa
    2 (0.0976,1.3]  setosa
    3 (0.0976,1.3]  setosa
    4 (0.0976,1.3]  setosa
    5 (0.0976,1.3]  setosa
    6 (0.0976,1.3]  setosa
    [1] 150   2
    > head(tableinv(Titanic)); nrow(tableinv(Titanic))==sum(Titanic)
      Class  Sex   Age Survived
    1   3rd Male Child       No
    2   3rd Male Child       No
    3   3rd Male Child       No
    4   3rd Male Child       No
    5   3rd Male Child       No
    6   3rd Male Child       No
    [1] TRUE
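For readers unfamiliar with the trick, the core of the prototype is the rep() call, which repeats each row index of the long-format table as many times as its count. A tiny illustration with made-up counts (the vector `counts` is purely hypothetical):

```r
# Hypothetical cell counts for three table cells
counts <- c(2, 0, 1)

# Repeat each cell index by its count; cells with count 0 vanish
idx <- rep(seq_along(counts), counts)
idx
# [1] 1 1 3
```

Indexing the rows of as.data.frame(x) with such a vector yields one row per original observation.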
I am obviously proud that this bricolage reconstructs multi-attribute data.frame()s from high-dimensional frequency tables such as Titanic, but is there an established (built-in, battle-tested) generic inverse to table()? Ideally it would be one that does not depend on a specific library, that knows how to handle unlabeled dimensions, that is optimized so that it will not choke on bulky inputs, and that deals reasonably with table inputs corresponding to factor as well as non-factor observations.
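To make those edge cases concrete, here is a minimal base-R sketch of a slightly more defensive variant. The name `untable2` and its `convert` argument are hypothetical, not an existing API. It assumes non-negative whole-number counts, selects the dimension columns by position so that a dimension that happens to be called Freq is not lost, and uses `drop = FALSE` so that one-dimensional tables still give a data frame. Recovering non-factor columns through type.convert() is only a best guess at the original types, not a true inverse.

```r
# Hypothetical helper (not part of base R): expand a frequency table
# back into one row per observation.
untable2 <- function(x, convert = FALSE) {
  stopifnot(is.table(x))
  counts <- c(x)
  # rep() needs non-negative whole-number counts
  stopifnot(all(counts >= 0), all(counts == round(counts)))

  # Long format: one row per cell; unnamed dimensions become Var1, Var2, ...
  df <- as.data.frame(x)

  # Keep only the dimension columns (the count column comes last);
  # drop = FALSE keeps a data frame even for one-dimensional tables
  obs <- df[rep(seq_len(nrow(df)), counts), seq_along(dim(x)), drop = FALSE]
  rownames(obs) <- NULL

  if (convert) {
    # Best-effort guess at non-factor types (e.g. "1" becomes integer 1)
    obs[] <- lapply(obs, function(col) {
      type.convert(as.character(col), as.is = TRUE)
    })
  }
  obs
}

# Usage: round trip on a built-in table, plus a one-dimensional input
nrow(untable2(Titanic)) == sum(Titanic)
str(untable2(table(c(1, 2, 2)), convert = TRUE))
```

This still materializes the full long-format table before expanding it, so it does nothing for the concern about bulky inputs.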