Read the data:
    Lines <- "vial response explanatory
    Xm1.1 0 4
    Xm2.1 0 4
    Xm3.1 0 4
    Xm4.1 0 4
    "
    day.df <- read.table(text = Lines, header = TRUE, as.is = TRUE)
1) Once the data is read in, process it with strapplyc from the gsubfn package. (We used as.is = TRUE so that day.df$vial is character; if it is a factor in your data frame, replace day.df$vial with as.character(day.df$vial).) This approach does the parsing in just one short line of code:
    library(gsubfn)
    s <- strapplyc(day.df$vial, "(.)(.)(\\d+)[.](.)", simplify = rbind)

    # we can now cbind it to the original data frame
    colnames(s) <- c("treatment", "gender", "line", "block")
    cbind(day.df, s)
which gives:
       vial response explanatory treatment gender line block
    1 Xm1.1        0           4         X      m    1     1
    2 Xm2.1        0           4         X      m    2     1
    3 Xm3.1        0           4         X      m    3     1
    4 Xm4.1        0           4         X      m    4     1
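If installing gsubfn is not an option, roughly the same capture-group parsing can be sketched in base R with regexec and regmatches. This is an alternative sketch rather than part of the approach above; the names vial and parts are made up for illustration, and vial stands in for day.df$vial.

```r
# Hypothetical stand-in for day.df$vial
vial <- c("Xm1.1", "Xm2.1", "Xm3.1", "Xm4.1")

# regexec() locates the full match plus each capture group;
# regmatches() extracts them as one character vector per element
m <- regmatches(vial, regexec("(.)(.)(\\d+)[.](.)", vial))

# Drop the first entry (the full match) and stack the groups row-wise
parts <- do.call(rbind, lapply(m, function(x) x[-1]))
colnames(parts) <- c("treatment", "gender", "line", "block")
parts
```

The resulting character matrix can be passed to cbind in the same way as s above.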
2) Here is a different approach. It uses no packages, is relatively simple (no regular expressions at all), and consists of a single R statement, including the cbind step:
    transform(day.df,
      treatment = substring(vial, 1, 1),             # 1st char
      gender = substring(vial, 2, 2),                # 2nd char
      line = substring(vial, 3, nchar(vial) - 2),    # 3rd through 2 prior to last char
      block = substring(vial, nchar(vial)))          # last char
The result is the same as before.
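The nchar(vial) - 2 end point is what lets the line field be more than one digit wide, on the assumption that the block is always a single trailing character preceded by a dot. A small check with made-up vial codes (these do not come from the data above):

```r
# Hypothetical vial codes with one-, two- and three-digit line numbers
vial <- c("Xm1.1", "Xf12.2", "Ym123.3")

# Characters 3 through (length - 2) skip the treatment and gender
# letters at the front and the ".block" suffix at the end
line <- substring(vial, 3, nchar(vial) - 2)

# With no end position given, substring() runs to the end of the string,
# so starting at nchar(vial) returns just the last character
block <- substring(vial, nchar(vial))

data.frame(vial, line, block)
```

If the block could ever be longer than one character, this positional approach would no longer hold and the regular-expression version in 1) would be the safer choice.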
UPDATE: Added the second approach.
UPDATE: Some simplifications.
G. Grothendieck