I was wondering what the best way is to avoid row-wise processing in R, since most built-in operations are carried out in internal C routines. For example, I have a data frame a:
      chromosome_name start_position end_position strand
    1              15       35574797     35575181      1
    2              15       35590448     35591641     -1
    3              15       35688422     35688645      1
    4              13       75402690     75404217      1
    5              15       35692892     35693969      1
I want startOFgene to be either start_position or end_position, depending on whether strand is positive or negative. One way to avoid a for loop is to split the data.frame into the +1 strand rows and the -1 strand rows and perform the selection on each part. What other ways are there to speed this up? The splitting approach does not scale if each row needs some other, more complicated processing.
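To make the splitting approach concrete, here is a minimal sketch. It assumes that + strand rows take start_position and - strand rows take end_position (the question does not fix this mapping), and the names plus, minus and result are only illustrative.

```r
# Rebuild the example data frame from the question
a <- data.frame(
  chromosome_name = c(15, 15, 15, 13, 15),
  start_position  = c(35574797, 35590448, 35688422, 75402690, 35692892),
  end_position    = c(35575181, 35591641, 35688645, 75404217, 35693969),
  strand          = c(1, -1, 1, 1, 1)
)

# Split the rows by strand
plus  <- a[a$strand == 1, ]
minus <- a[a$strand == -1, ]

# Assumption: + strand uses start_position, - strand uses end_position
plus$startOFgene  <- plus$start_position
minus$startOFgene <- minus$end_position

# Recombine and restore the original row order via the kept row names
result <- rbind(plus, minus)
result <- result[order(as.integer(rownames(result))), ]
```

This avoids an explicit for loop, but every extra per-row rule means another split and recombine step, which is the scaling concern raised above.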