I have a column as shown below. Each column has two pairs with the suffixes "a" and "b" - for example, col1a, col1b, colNa, colNb, etc. To the end of the file (> 50,000).
mydataf <- data.frame (Ind = 1:5, col1a = sample (c(1:3), 5, replace = T), col1b = sample (c(1:3), 5, replace = T), colNa = sample (c(1:3), 5, replace = T), colNb = sample (c(1:3),5, replace = T), K_a = sample (c("A", "B"),5, replace = T), K_b = sample (c("A", "B"),5, replace = T)) mydataf Ind col1a col1b colNa colNb K_a K_b 1 1 1 1 2 3 BA 2 2 1 3 2 2 BB 3 3 2 1 1 1 BB 4 4 3 1 1 3 AB 5 5 1 1 3 2 BA
With the exception of the first column (Ind), I want to collapse a couple of lines to make the dataframe look like the following: at the time, the suffix "a" and "b" will be deleted. Also, the combined characters or number must be ordered 1 first, that 2, A first, than B
Ind col1 colN K_ 1 11 23 AB 2 13 22 BB 3 12 11 BB 4 13 13 AB 5 11 23 AB
Edit: the grep function (possibly) in the answer has a problem if the column name is similar.
mydataf <- data.frame (col_1_a = sample (c(1:3), 5, replace = T), col_1_b = sample (c(1:3), 5, replace = T), col_1_Na = sample (c(1:3), 5, replace = T), col_1_Nb = sample (c(1:3),5, replace = T), K_a = sample (c("A", "B"),5, replace = T), K_b = sample (c("A", "B"),5, replace = T)) n <- names(mydataf) nm <- c(unique(substr(n, 1, nchar(n)-1))) df <- data.frame(sapply(nm, function(x){ idx <- grep(x, n) cols <- mydataf[idx] x <- apply(cols, 1, function(z) paste(sort(z), collapse = "")) return(x) })) names(df) <- nm df col_1_ col_1_N K_ 1 2233 23 BB 2 2233 22 BB 3 1123 13 AB 4 1223 12 AB 5 2333 33 AB