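Assuming dat contains your data and every string in dat$V3 has 13 characters, we can split the strings into single characters with strsplit() and arrange them in a matrix: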
tt <- matrix(unlist(strsplit(dat$V3, split = "")), ncol = 13, byrow = TRUE)
giving:
> tt
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13]
[1,] "a"  "a"  "a"  "a"  "a"  "a"  "b"  "b"  "b"  "a"   "b"   "a"   "b"
[2,] "a"  "b"  "a"  "b"  "a"  "a"  "a"  "b"  "a"  "a"   "a"   "b"   "b"
[3,] "b"  "a"  "b"  "b"  "b"  "a"  "b"  "a"  "a"  "b"   "b"   "b"   "a"
We can get the desired counts by setting the factor levels explicitly, so that a letter missing from a column (such as "b" in column 6) is still counted as zero:
apply(tt, 2, function(x) c(table(factor(x, levels = c("a","b")))))
which gives:
> apply(tt, 2, function(x) c(table(factor(x, levels = c("a","b")))))
  [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13]
a    2    2    2    1    2    3    1    1    2     2     1     1     1
b    1    1    1    2    1    0    2    2    1     1     2     2     2
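To see why the levels matter, here is a minimal sketch using a copy of column 6, which contains only "a". The names col6, without_levels and with_levels are hypothetical and used only for illustration:

```r
# Hypothetical stand-in for column 6 of tt, which holds only "a"
col6 <- c("a", "a", "a")

# Without explicit levels, table() only reports the values it sees
without_levels <- table(col6)

# With explicit levels, the unseen "b" is kept with a count of zero
with_levels <- table(factor(col6, levels = c("a", "b")))

without_levels
with_levels
```

Without fixed levels the per-column tables can differ in length, in which case apply() cannot simplify them to a matrix and returns a list instead.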
To automate the selection of suitable levels, we could do something like:
> lev <- levels(factor(tt))
> apply(tt, 2, function(x, levels) c(table(factor(x, levels = levels))),
+       levels = lev)
  [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13]
a    2    2    2    1    2    3    1    1    2     2     1     1     1
b    1    1    1    2    1    0    2    2    1     1     2     2     2
where in the first line we treat tt as a vector and extract the levels after temporarily converting it to a factor. We then pass these levels (lev) to the apply() step instead of specifying them explicitly.
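Putting the steps together, the following is a self-contained sketch rather than a definitive version. The vector v3 is a hypothetical stand-in for dat$V3, rebuilt from the rows of tt printed above, and do.call(rbind, ...) is swapped in for matrix(..., ncol = 13) so the string length does not have to be hard-coded. It still assumes all strings have the same length.

```r
# Hypothetical stand-in for dat$V3, rebuilt from the rows of tt shown above
v3 <- c("aaaaaabbbabab", "ababaaabaaabb", "babbbabaabbba")

# One row per string, one column per character position
# (assumes every string has the same number of characters)
chars <- do.call(rbind, strsplit(v3, split = ""))

# Take the levels from the data itself
lev <- levels(factor(chars))

# Count each level in every column; fixed levels keep zero counts
counts <- apply(chars, 2,
                function(x, levels) c(table(factor(x, levels = levels))),
                levels = lev)
counts
```

If dat$V3 was read in as a factor rather than a character vector, wrap it in as.character() before calling strsplit(), which only accepts character input.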