How to select all factor variables in R

I have a data frame called "insurance" with both numeric and factor variables. How can I select all the factor variables so that I can check the levels of the categorical variables?

I tried sapply(insurance, class) to get the class of every variable. But I cannot then build a logical condition of the form class(var) == "factor", because the result of sapply() also includes the variable names.

Thanks,

+9
r
4 answers

Some data:

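    # stringsAsFactors = TRUE is needed in R >= 4.0.0, where strings are
    # no longer converted to factors by default
    insurance <- data.frame(
      int   = 1:5,
      fact1 = letters[1:5],
      fact2 = factor(1:5),
      fact3 = LETTERS[3:7],
      stringsAsFactors = TRUE
    )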

I would use sapply, like you, but combined with is.factor so that it returns a logical vector:

    is.fact <- sapply(insurance, is.factor)
    #   int fact1 fact2 fact3
    # FALSE  TRUE  TRUE  TRUE

Then use [ to extract these columns:

    # drop = FALSE keeps a data frame even if only one column is a factor
    factors.df <- insurance[, is.fact, drop = FALSE]
    #   fact1 fact2 fact3
    # 1     a     1     C
    # 2     b     2     D
    # 3     c     3     E
    # 4     d     4     F
    # 5     e     5     G

Finally, to get the levels, use lapply:

    lapply(factors.df, levels)
    # $fact1
    # [1] "a" "b" "c" "d" "e"
    #
    # $fact2
    # [1] "1" "2" "3" "4" "5"
    #
    # $fact3
    # [1] "C" "D" "E" "F" "G"

You may also find str(insurance) useful as a short summary.
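If you need this more than once, the three steps above can be folded into a small helper. This is only a sketch: the name factor_levels is made up for illustration, and the example data sets stringsAsFactors = TRUE explicitly because R 4.0.0 and later no longer turn strings into factors by default. Note that the names carried by the logical vector are harmless; [ simply uses the TRUE/FALSE values.

```r
# Example data; stringsAsFactors = TRUE makes the character columns factors
# (this was the default before R 4.0.0).
insurance <- data.frame(
  int   = 1:5,
  fact1 = letters[1:5],
  fact2 = factor(1:5),
  fact3 = LETTERS[3:7],
  stringsAsFactors = TRUE
)

# Hypothetical helper: return the levels of every factor column of a data frame.
factor_levels <- function(df) {
  # named logical vector, one entry per column
  is.fact <- vapply(df, is.factor, logical(1))
  # drop = FALSE keeps a data frame even when a single column is selected
  lapply(df[, is.fact, drop = FALSE], levels)
}

factor_levels(insurance)
```

vapply is used instead of sapply here only because it guarantees a logical vector, even for a data frame with no columns.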

+14

This seems like (almost) the perfect occasion to use the rarely used rapply function.

    # the argument is named classes; class only worked through partial matching
    rapply(insurance, f = levels, classes = "factor", how = "list")

Or, to remove the NULL elements (returned for the columns that are not factors):

    Filter(Negate(is.null), rapply(insurance, f = levels, classes = "factor", how = "list"))

Or simply

    lapply(Filter(is.factor, insurance), levels)
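To make the difference between the variants concrete, here is a small sketch on a made-up data frame (df and its column names are invented for illustration). With how = "list", rapply keeps one entry per column and fills the non-factor ones with NULL, while Filter drops those columns before levels is applied.

```r
# Toy data: one numeric, one factor and one character column
df <- data.frame(
  n = 1:3,
  f = factor(c("x", "y", "x")),
  s = c("a", "b", "c"),
  stringsAsFactors = FALSE
)

# rapply keeps every column; non-factor columns get the default value NULL
all_cols <- rapply(df, f = levels, classes = "factor", how = "list")
names(all_cols)
is.null(all_cols$n)

# Filter keeps only the factor columns, so no NULL entries appear
only_factors <- lapply(Filter(is.factor, df), levels)
names(only_factors)
```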
+1
    library(dplyr)
    insurance %>% select_if(is.factor)  # is.factor also handles ordered factors
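In current dplyr, select_if is superseded by select combined with the where() selection helper. A sketch of the same idea, assuming dplyr 1.0.0 or later is installed; the data frame is rebuilt here with stringsAsFactors = TRUE so that the example stands on its own.

```r
# Assumes dplyr >= 1.0.0; where() is a selection helper usable inside select()
library(dplyr)

insurance <- data.frame(
  int   = 1:5,
  fact1 = letters[1:5],
  fact2 = factor(1:5),
  fact3 = LETTERS[3:7],
  stringsAsFactors = TRUE
)

# Keep only the columns for which is.factor() returns TRUE
factors_df <- insurance %>% select(where(is.factor))
names(factors_df)

# Levels of each selected column
lapply(factors_df, levels)
```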
0

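Using the "insurance" data frame from flodel's answer, you can look at all the columns in one go with apply. Note that apply first coerces the data frame to a matrix, so the result is a character matrix holding the values of every column (including int), not a selection of the factor columns: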

    apply(insurance, 2, factor)
    #      int fact1 fact2 fact3
    # [1,] "1" "a"   "1"   "C"
    # [2,] "2" "b"   "2"   "D"
    # [3,] "3" "c"   "3"   "E"
    # [4,] "4" "d"   "4"   "F"
    # [5,] "5" "e"   "5"   "G"
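A quick way to check what apply actually returns here, as a sketch: the object names m and as_factors are made up, and lapply is shown only as one alternative that keeps each column as a separate factor instead of collapsing everything into a matrix.

```r
insurance <- data.frame(
  int   = 1:5,
  fact1 = letters[1:5],
  fact2 = factor(1:5),
  fact3 = LETTERS[3:7],
  stringsAsFactors = TRUE
)

# apply() works on as.matrix(insurance), so the result is a character matrix
m <- apply(insurance, 2, factor)
is.matrix(m)
is.character(m)

# lapply() works column by column and returns a list of real factors
as_factors <- lapply(insurance, factor)
sapply(as_factors, is.factor)
```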

If you are only interested in the levels of one factor, you can do the following:

    factor(insurance$fact1)
    # [1] a b c d e
    # Levels: a b c d e
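If only the level labels are wanted, without printing the values as well, levels() and nlevels() can be called on the column directly. A minimal sketch with a made-up vector (fact1 here is a standalone example, not the column from the data frame):

```r
# A small factor with repeated values
fact1 <- factor(c("a", "b", "a", "c"))

levels(fact1)   # the distinct level labels
nlevels(fact1)  # how many levels there are
table(fact1)    # number of observations per level
```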
-2