I have a data frame, illustrated by the following:

    dist <- c(1.1, 1.0, 10.0, 5.0, 2.1, 12.2, 3.3, 3.4)
    id <- rep("A", length(dist))
    df <- cbind.data.frame(id, dist)
    df

      id dist
    1  A  1.1
    2  A  1.0
    3  A 10.0
    4  A  5.0
    5  A  2.1
    6  A 12.2
    7  A  3.3
    8  A  3.4
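I need to clean it so that any row whose value in the dist column is more than 2 times the value of the next row is removed. Removing a row can create a new violation, so the check has to be repeated until no such row remains. The cleaned data frame should look like this:

      id dist
    1  A  1.1
    2  A  1.0
    5  A  2.1
    7  A  3.3
    8  A  3.4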
I tried writing a function with a for loop and an if statement to clean it:

    cleaner <- function(df, dist, times_larger) {
      for (i in 1:(nrow(df) - 1)) {
        if (df$dist[i] > df$dist[i + 1] * times_larger) {
          df <- df[-i, ]
          break
        }
      }
      df
    }
If I do not break out of the loop, it throws an error, because the number of rows in df changes while the loop is running. If I manually run the function on df several times:

    df <- cleaner(df, "dist", 2)
it ends up cleaned the way I want.
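To make the "run it several times by hand" step concrete, here is a minimal sketch of what I am after, assuming there are no NA values in the column. The names `remove_first_offender` and `clean_until_stable` are made up for illustration, and I am not claiming this is the best or most idiomatic way to do it:

```r
# Example data from the question.
dist <- c(1.1, 1.0, 10.0, 5.0, 2.1, 12.2, 3.3, 3.4)
df <- data.frame(id = rep("A", length(dist)), dist = dist)

# One pass: drop the first row whose value exceeds
# times_larger * the next row's value (hypothetical helper).
remove_first_offender <- function(df, col, times_larger) {
  if (nrow(df) < 2) return(df)  # nothing to compare against
  for (i in seq_len(nrow(df) - 1)) {
    if (df[[col]][i] > df[[col]][i + 1] * times_larger) {
      return(df[-i, , drop = FALSE])
    }
  }
  df  # no offending row found
}

# Repeat the pass until the row count stops changing (hypothetical helper).
clean_until_stable <- function(df, col, times_larger) {
  repeat {
    cleaned <- remove_first_offender(df, col, times_larger)
    if (nrow(cleaned) == nrow(df)) return(cleaned)
    df <- cleaned
  }
}

result <- clean_until_stable(df, "dist", 2)
result
```

On the sample data this should keep rows 1, 2, 5, 7 and 8. It rescans from the top after every removal, so it is probably slow on large data frames.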
I also tried various function constructs and applying them to the data frame, but without luck.
Does anyone have a good suggestion for how to repeat a function on a data frame until it no longer changes, a better function structure, or maybe a better way to clean the data altogether?
Any suggestions are most appreciated.
r dataframe data-manipulation data-cleaning
Kristian