I want to subset a data frame by a factor column, keeping only the factor levels that occur at least a certain number of times.
df <- data.frame(factor = c(rep("a",5),rep("b",5),rep("c",2)), variable = rnorm(12))
This code creates a data frame:
   factor    variable
1       a -1.55902013
2       a  0.22355431
3       a -1.52195456
4       a -0.32842689
5       a  0.85650212
6       b  0.00962240
7       b -0.06621508
8       b -1.41347823
9       b  0.08969098
10      b  1.31565582
11      c -1.26141417
12      c -0.33364069
I want to drop the factor levels that occur fewer than 5 times. I wrote a for loop, and it works:
df.new <- df
counts <- table(df$factor)
for (i in seq_along(counts)) {
  if (counts[i] < 5) {  # level occurs fewer than 5 times: drop its rows
    df.new <- df.new[df.new$factor != names(counts)[i], ]
  }
}
But is there a faster, more elegant solution?
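For reference, here is a minimal sketch of what a vectorized version might look like, using only base R. The name min_count is a hypothetical placeholder for the frequency threshold, and the two options are just possible approaches, not necessarily the fastest ones.

```r
# Same toy data as in the question (the values of `variable` are random).
df <- data.frame(factor = c(rep("a", 5), rep("b", 5), rep("c", 2)),
                 variable = rnorm(12))

min_count <- 5  # hypothetical name for the frequency threshold

# Option 1: ave() returns, for every row, the size of that row's group.
group_size <- ave(seq_along(df$factor), df$factor, FUN = length)
df.new <- df[group_size >= min_count, ]

# Option 2: use table() to find the levels that are frequent enough.
counts <- table(df$factor)
df.new2 <- df[df$factor %in% names(counts)[counts >= min_count], ]

# If `factor` is stored as a factor, droplevels() removes the unused levels.
df.new <- droplevels(df.new)
```

Both options keep the rows for "a" and "b" and drop the two rows for "c", without looping over the levels.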
r subset
Bixic