Creating a matrix of indicator variables

Question

I would like to create a matrix of indicator variables. My initial thought was to use model.matrix, which was also suggested here: Automatically expand an R factor into a set of 1/0 indicator variables for each factor level

However, model.matrix does not seem to work if the factor has only one level.
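For illustration, here is a minimal sketch of that failure. The variable names one.region and result are made up for this example, and the call is wrapped in tryCatch() so the script keeps running; the exact error text may vary between R versions.

```r
# A factor with a single level: every observation is in region 1
one.region <- factor(rep(1, 5))

# model.matrix() sets contrasts on each factor, and contrasts need at
# least two levels, so this call is expected to raise an error.
# tryCatch() turns the error into its message instead of stopping.
result <- tryCatch(
  model.matrix(~ -1 + one.region),
  error = function(e) conditionMessage(e)
)
result
```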

Here is an example data set with three levels in the factor "region":

    dat = read.table(text = "
    reg1 reg2 reg3
    1 0 0
    1 0 0
    1 0 0
    1 0 0
    1 0 0
    1 0 0
    0 1 0
    0 1 0
    0 1 0
    0 0 1
    0 0 1
    0 0 1
    0 0 1
    ", sep = "", header = TRUE)

    # model.matrix works if there are multiple regions:

    region <- c(1,1,1,1,1,1,2,2,2,3,3,3,3)
    df.region <- as.data.frame(region)
    df.region$region <- as.factor(df.region$region)
    my.matrix <- as.data.frame(model.matrix(~ -1 + df.region$region, df.region))
    my.matrix

    # The following for-loop works even if there is only one level to the factor
    # (one region):

    # region <- c(1,1,1,1,1,1,1,1,1,1,1,1,1)

    my.matrix <- matrix(0, nrow=length(region), ncol=length(unique(region)))
    for(i in 1:length(region)) {my.matrix[i,region[i]]=1}
    my.matrix

The for-loop is effective and seems simple enough. However, I have been struggling to come up with a solution that does not involve loops. I can use the loop above, but I have been trying to wean myself off them. Is there a better way?

matrix r

Mark Miller Dec 22

2 answers

I came up with this solution by modifying an answer to a similar question:

Rearrange a column from a data frame to multiple columns using R

    # Three regions:
    region <- c(1,1,1,1,1,1,2,2,2,3,3,3,3)
    site <- seq(1:length(region))
    df <- cbind(site, region)
    ind <- xtabs( ~ site + region, df)
    ind

    # One region:
    region <- c(1,1,1,1,1,1,1,1,1,1,1,1,1)
    site <- seq(1:length(region))
    df <- cbind(site, region)
    ind <- xtabs( ~ site + region, df)
    ind

EDIT:

The line below will extract the data frame of indicator variables from ind:

 ind.matrix <- as.data.frame.matrix(ind)

Mark Miller Dec 25

flodel · Accepted Answer · 2012-12-22 02:35

I would use matrix indexing. From ?"[" :

The third form of indexing is through a numerical matrix with one column for each dimension: each row of the index matrix selects one element of the array, and the result is a vector.
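As a small illustration of that indexing form (the names m, idx and picked are chosen only for this sketch), each row of the index matrix below is a (row, column) pair:

```r
# 2 x 3 integer matrix, filled column-wise:
#      [,1] [,2] [,3]
# [1,]    1    3    5
# [2,]    2    4    6
m <- matrix(1:6, nrow = 2)

# Each row of idx names one cell: (row 1, col 3) and (row 2, col 1)
idx <- cbind(c(1, 2), c(3, 1))

# Indexing with a two-column matrix returns those cells as a vector
picked <- m[idx]
picked
```

The same form works on the left-hand side of an assignment, so a whole set of cells can be set in one step.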

Using this nice feature:

    my.matrix <- matrix(0, nrow=length(region), ncol=length(unique(region)))
    my.matrix[cbind(seq_along(region), region)] <- 1

    #       [,1] [,2] [,3]
    #  [1,]    1    0    0
    #  [2,]    1    0    0
    #  [3,]    1    0    0
    #  [4,]    1    0    0
    #  [5,]    1    0    0
    #  [6,]    1    0    0
    #  [7,]    0    1    0
    #  [8,]    0    1    0
    #  [9,]    0    1    0
    # [10,]    0    0    1
    # [11,]    0    0    1
    # [12,]    0    0    1
    # [13,]    0    0    1
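The code above relies on region holding the consecutive integers 1, 2, ..., k, because the values are used directly as column positions. If the codes might have gaps or be labels, one possible variation is to map them to column positions with match() first. This is only a sketch: indicator_matrix, region2 and ind2 are hypothetical names introduced here, not part of the answer.

```r
# Hypothetical helper: build a 0/1 indicator matrix from a vector of
# arbitrary codes (numbers with gaps, or character labels)
indicator_matrix <- function(x) {
  lev <- sort(unique(x))   # distinct codes, one per column
  col <- match(x, lev)     # column position for each observation
  out <- matrix(0, nrow = length(x), ncol = length(lev),
                dimnames = list(NULL, as.character(lev)))
  # Matrix indexing: set cell (i, col[i]) to 1 for every row i
  out[cbind(seq_along(x), col)] <- 1
  out
}

# Non-consecutive region codes
region2 <- c(10, 10, 30, 20, 30)
ind2 <- indicator_matrix(region2)
ind2

# A single-level input still yields a one-column matrix
indicator_matrix(rep(1, 4))
```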
