I still find the ifelse structure in R a bit confusing. I have the following data frame:
df <- structure(list(snp = structure(1:11, .Label = c("AL0009", "AL00014", "AL0021", "AL00046", "AL0047", "AS0005", "AS0014", "AS00021", "AS0047", "AS0071", "DR0001" ), class = "factor"), CHROMOSOME = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), COUNT_ALLELE = structure(c(1L, 1L, 1L, 3L, 1L, 1L, 1L, 2L, 3L, 3L, 1L), .Label = c("A", "C", "G"), class = "factor"), OTHER_ALLELE = structure(c(3L, 3L, 2L, 1L, 3L, 2L, 2L, 1L, 1L, 1L, 3L), .Label = c("A", "C", "G"), class = "factor"), `116601888` = c(0L, 0L, 0L, 2L, 2L, 0L, 0L, 0L, 0L, 0L, 2L ), `116621563` = c(0L, 0L, 0L, 1L, 1L, 0L, 0L, 1L, 0L, 0L, 1L), `117253533` = c(0L, 0L, 0L, 2L, 2L, 0L, 0L, 0L, 1L, 0L, 2L), `117423827` = c(1L, 1L, 1L, 1L, 1L, 0L, 0L, 0L, 1L, 1L, 2L)), .Names = c("snp", "CHROMOSOME", "COUNT_ALLELE", "OTHER_ALLELE", "11688", "11663", "11533", "13827" ), row.names = c(NA, 11L), class = "data.frame")
Using the function TranslateAllele, I want to replace the numbers in the columns starting at column 5 with the corresponding two-letter codes:
TranslateAllele <- function(COUNT_ALLELE, OTHER_ALLELE, genotype) {
  if (genotype == 0) {
    print(paste(OTHER_ALLELE, OTHER_ALLELE, sep = ""))
  } else if (genotype == 1) {
    print(paste(COUNT_ALLELE, OTHER_ALLELE, sep = ""))
  } else if (genotype == 2) {
    print(paste(COUNT_ALLELE, COUNT_ALLELE, sep = ""))
  }
}
Thus, the desired result will be as follows:
In the end, I need to do this for 1.6M rows by 1M columns, so I can't just use a for loop :(
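For reference, here is a rough sketch of what a vectorized version might look like. It uses nested ifelse() calls so that a whole genotype column is translated at once, and lapply() to walk over the genotype columns. The helper name translate_allele_vec, the toy data frame and its column names s1 and s2 are made up for illustration; they are not part of the data above.

```r
# Toy data for illustration only (not the data frame from the question).
toy <- data.frame(
  COUNT_ALLELE = c("A", "G", "C"),
  OTHER_ALLELE = c("G", "A", "A"),
  s1 = c(0L, 2L, 1L),
  s2 = c(1L, 1L, 0L),
  stringsAsFactors = FALSE
)

# Hypothetical helper: same mapping as TranslateAllele, but it takes whole
# vectors and returns the codes instead of printing them.
translate_allele_vec <- function(count_allele, other_allele, genotype) {
  # The allele columns may be factors; work on their labels explicitly.
  count_allele <- as.character(count_allele)
  other_allele <- as.character(other_allele)
  ifelse(genotype == 0, paste0(other_allele, other_allele),
  ifelse(genotype == 1, paste0(count_allele, other_allele),
  ifelse(genotype == 2, paste0(count_allele, count_allele),
         NA_character_)))  # anything other than 0, 1, 2 becomes NA
}

# Assumed layout: the genotype columns are known by name (or position).
geno_cols <- c("s1", "s2")
toy[geno_cols] <- lapply(toy[geno_cols], function(g) {
  translate_allele_vec(toy$COUNT_ALLELE, toy$OTHER_ALLELE, g)
})

print(toy)
```

For the data frame in the question, the genotype columns would presumably be selected by position, e.g. 5:ncol(df), instead of geno_cols. At 1.6M rows by 1M columns the full table is unlikely to fit in memory at once, so this would probably have to be applied to blocks of columns at a time.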