Extending R's summary() function (or creating a new function with similar output) to display factors as a percentage of the total

Is there an easy way to extend R's summary() function (or to create a new function with similar output) so that factors are displayed as a percentage of the total?

    summary(chickwts)
    #      weight             feed
    #  Min.   :108.0   casein   :12
    #  1st Qu.:204.5   horsebean:10
    #  Median :258.0   linseed  :12
    #  Mean   :261.3   meatmeal :11
    #  3rd Qu.:323.5   soybean  :14
    #  Max.   :423.0   sunflower:12

Desired output:

    pct_summary(chickwts)
    #      weight             feed
    #  Min.   :108.0   casein   :17%
    #  1st Qu.:204.5   horsebean:14%
    #  Median :258.0   linseed  :17%
    #  Mean   :261.3   meatmeal :15%
    #  3rd Qu.:323.5   soybean  :20%
    #  Max.   :423.0   sunflower:17%

    # Or even this...
    #      weight             feed
    #  Min.   :108.0   casein   :12 17%
    #  1st Qu.:204.5   horsebean:10 14%
    #  Median :258.0   linseed  :12 17%
    #  Mean   :261.3   meatmeal :11 15%
    #  3rd Qu.:323.5   soybean  :14 20%
    #  Max.   :423.0   sunflower:12 17%

The closest thing I have found is Hmisc::describe().

3 answers

We can borrow from the existing summary routines and make this a little less invasive by giving the factors an additional, transient class attribute.

    summary.my.factor <- function(object, ...) {
      x <- prop.table(table(object))
      setNames(sprintf("%1.2f%%", 100 * x), names(x))
    }

    my.summary <- function(object, ...) {
      f <- function(x)
        if (inherits(x, "factor")) structure(x, class = c("my.factor", class(x))) else x
      summary(as.data.frame(lapply(object, f)), ...)
    }

    my.summary(chickwts)
    #      weight             feed
    #  Min.   :108.0   casein   :16.90%
    #  1st Qu.:204.5   horsebean:14.08%
    #  Median :258.0   linseed  :16.90%
    #  Mean   :261.3   meatmeal :15.49%
    #  3rd Qu.:323.5   soybean  :19.72%
    #  Max.   :423.0   sunflower:16.90%

I did not bother to respect any options, such as digits, when formatting the percentages in summary.my.factor.
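
If the number of decimals should be adjustable, one possible approach is sketched below. It is only an illustration: the option name `my.summary.pct.digits` is hypothetical (it is not a base R option), and the two functions are otherwise the same as those above. An option is used so that the sketch does not depend on how arguments are forwarded to the per-column summary() calls.

```r
# Variant of summary.my.factor that reads the number of decimals from a
# hypothetical option, "my.summary.pct.digits" (assumed name, default 2).
summary.my.factor <- function(object, ...) {
  digits <- getOption("my.summary.pct.digits", 2L)
  x <- prop.table(table(object))
  pct <- formatC(100 * as.numeric(x), format = "f", digits = digits)
  setNames(paste0(pct, "%"), names(x))
}

# Same wrapper as before: tag factor columns with the extra class,
# then let the ordinary data-frame summary do the layout.
my.summary <- function(object, ...) {
  f <- function(x)
    if (inherits(x, "factor")) structure(x, class = c("my.factor", class(x))) else x
  summary(as.data.frame(lapply(object, f)), ...)
}

# Whole-number percentages for one call, then restore the previous setting.
old <- options(my.summary.pct.digits = 0L)
my.summary(chickwts)
options(old)
```

Setting the option back afterwards keeps the change local to a single call.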


You can recode parts of a function's body, strange as that may sound.

    ## Rework a piece of the body
    mysummary <- summary.factor
    body(mysummary)[[5]] <- quote(
      tbl <- round(table(object) / sum(table(object)) * 100)
    )

    summary.factor(chickwts$feed)
    #    casein horsebean   linseed  meatmeal   soybean sunflower
    #        12        10        12        11        14        12

    mysummary(chickwts$feed)
    #    casein horsebean   linseed  meatmeal   soybean sunflower
    #        17        14        17        15        20        17
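
To see the body-replacement mechanism in isolation, here is a small sketch on a toy function. The functions `f` and `g` are made up for illustration. Element 1 of a braced body is the `{` call itself, so the first statement is element 2. The last line only prints the statement that index 5 refers to in summary.factor; the internal layout of base functions can differ between R versions, so it is worth checking before overwriting.

```r
# A toy function with a two-statement body.
f <- function(x) {
  y <- x + 1
  y * 2
}

# Work on a copy, as with `mysummary <- summary.factor`; f stays untouched.
g <- f

# body(g)[[1]] is the `{` call, so [[2]] is the first statement.
body(g)[[2]] <- quote(y <- x + 10)

f(1)  # 4
g(1)  # 22

# Inspect the statement that would be replaced in summary.factor.
# No output is shown here because it depends on the installed R version.
print(body(summary.factor)[[5]])
```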

This may be a more complicated solution than you are looking for, but you can do the same thing with summary.data.frame and tell it to use the modified summary.factor for your example.

It would look like this:

    mysumm <- summary.data.frame
    body(mysumm)[[3]] <- quote(
      z <- lapply(X = as.list(object),
                  FUN = function(x) if (is.factor(x)) mysummary(x) else summary(x))
    )

    mysumm(chickwts)
    #      weight             feed
    #  Min.   :108.0   casein   :17
    #  1st Qu.:204.5   horsebean:14
    #  Median :258.0   linseed  :17
    #  Mean   :261.3   meatmeal :15
    #  3rd Qu.:323.5   soybean  :20
    #  Max.   :423.0   sunflower:17

Note: I ignored the other arguments to summary to keep the code short, but you can add them back so that they are passed on by the generic summary method.


A bad and dangerous way:

    # backup original summary.factor
    original_summary_factor = base::summary.factor

    # our new summary.factor
    summary.factor = function(object, maxsum = 100, ...){
        res = original_summary_factor(object = object, maxsum = maxsum, ...)
        pct = round(res / length(object) * 100)
        setNames(paste0(res, " ", pct, "%"), names(res))
    }

    # DANGEROUS CODE. USE IT AT YOUR OWN RISK.
    # Here we replace original summary.factor with the new one
    unlockBinding("summary.factor", as.environment("package:base"))
    assignInNamespace("summary.factor", summary.factor, ns = "base",
                      envir = as.environment("package:base"))
    assign("summary.factor", summary.factor, as.environment("package:base"))
    lockBinding("summary.factor", as.environment("package:base"))

    summary(chickwts)
    #      weight             feed
    #  Min.   :108.0   casein   :12 17%
    #  1st Qu.:204.5   horsebean:10 14%
    #  Median :258.0   linseed  :12 17%
    #  Mean   :261.3   meatmeal :11 15%
    #  3rd Qu.:323.5   soybean  :14 20%
    #  Max.   :423.0   sunflower:12 17%
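
The count-plus-percentage formatting used above can also be tried on its own, without touching the base namespace. The following is a minimal sketch of that step, not a replacement for summary(); the helper name `count_pct` is hypothetical. As in the code above, the denominator is length(f), so any NA values would count towards the total.

```r
# Hypothetical helper: format each level of a factor as "count pct%".
count_pct <- function(f) {
  counts <- table(f)                                   # per-level counts, NAs excluded
  res <- setNames(as.integer(counts), names(counts))   # plain named integer vector
  pct <- round(res / length(f) * 100)                  # share of all elements, NAs included
  setNames(paste0(res, " ", pct, "%"), names(res))
}

count_pct(chickwts$feed)
#    casein horsebean   linseed  meatmeal   soybean sunflower
#  "12 17%"  "10 14%"  "12 17%"  "11 15%"  "14 20%"  "12 17%"
```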