A fairly simple (and fast!) alternative is to use a matrix to index into your matrix:
# Your data
d <- data.frame(color=c('red','blue','blue','green'), shape=c('circle','square','circle','sphere'))
m <- matrix(1:9, 3,3, dimnames=list(c('red','blue','green'), c('circle','square','sphere')))
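The match function is used to find the corresponding numeric index for a particular string. The row and column positions it returns are combined with cbind into a two-column index matrix, where each row is one (row, column) pair.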
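As a minimal sketch of that step on the toy data above: match gives the row and column positions, cbind stacks them into the index matrix, and m[i] picks one cell per row of i. The names ids and d2 are only illustrative.

```r
# Same toy data as above
d <- data.frame(color = c('red', 'blue', 'blue', 'green'),
                shape = c('circle', 'square', 'circle', 'sphere'))
m <- matrix(1:9, 3, 3,
            dimnames = list(c('red', 'blue', 'green'),
                            c('circle', 'square', 'sphere')))

# match() returns the position of each value in the lookup vector
row_idx <- match(d$color, rownames(m))  # 1 2 2 3
col_idx <- match(d$shape, colnames(m))  # 1 2 1 3

# Index matrix: each row is a (row, column) pair
i <- cbind(row_idx, col_idx)

# Indexing with a two-column matrix picks one cell per row of i
ids <- m[i]                             # 1 5 2 9

# Attach the looked-up values as an id column (illustrative name)
d2 <- cbind(id = ids, d)
```

A value that is missing from the dimnames makes match return NA, and the corresponding id then comes out as NA too.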
Note that in newer versions of R (2.13 and newer, I think) you can use character strings in the index matrix. Unfortunately, the color and shape columns are usually factors, and cbind doesn't like this (it uses the integer codes), so you need to coerce them using as.character:
i <- cbind(as.character(d$color), as.character(d$shape))
... but I suspect that using match is more efficient.
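For completeness, here is a small sketch of how the character index matrix would be used, assuming an R version that supports character matrix indexing. The names i_chr and ids_chr are made up for illustration.

```r
# Same toy data as above
d <- data.frame(color = c('red', 'blue', 'blue', 'green'),
                shape = c('circle', 'square', 'circle', 'sphere'))
m <- matrix(1:9, 3, 3,
            dimnames = list(c('red', 'blue', 'green'),
                            c('circle', 'square', 'sphere')))

# Two-column character matrix: row name in column 1, column name in column 2.
# as.character() is harmless if the columns are already character vectors.
i_chr <- cbind(as.character(d$color), as.character(d$shape))

# The strings are matched against the dimnames of m
ids_chr <- m[i_chr]   # 1 5 2 9
```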
EDIT: I measured, and using match is apparently about 20% faster:
# Make 1 million rows
d <- d[sample.int(nrow(d), 1e6, TRUE), ]
system.time({
  i <- cbind(match(d$color, rownames(m)), match(d$shape, colnames(m)))
  d2 <- cbind(id=m[i], d)
})
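To repeat the comparison yourself, something along these lines should work. It is only a sketch: the seed, the smaller sample size (1e5, to keep it quick) and the variable names are arbitrary choices, and the timings will vary by machine and R version.

```r
set.seed(1)  # arbitrary seed, only so the sample is reproducible

d <- data.frame(color = c('red', 'blue', 'blue', 'green'),
                shape = c('circle', 'square', 'circle', 'sphere'))
m <- matrix(1:9, 3, 3,
            dimnames = list(c('red', 'blue', 'green'),
                            c('circle', 'square', 'sphere')))

# Resample the toy rows into a larger data frame
big <- d[sample.int(nrow(d), 1e5, TRUE), ]

# Variant 1: integer index matrix built with match()
t_match <- system.time({
  i_int <- cbind(match(big$color, rownames(m)), match(big$shape, colnames(m)))
  id_int <- m[i_int]
})

# Variant 2: character index matrix
t_chr <- system.time({
  i_chr <- cbind(as.character(big$color), as.character(big$shape))
  id_chr <- m[i_chr]
})

# Both variants should look up exactly the same values
stopifnot(all(id_int == id_chr))

# Elapsed seconds for each variant
print(c(match = t_match[["elapsed"]], character = t_chr[["elapsed"]]))
```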
Tommy