I am trying to count, for each column of a data frame, the observations that satisfy a certain condition, looking only at the rows after the row in which that column's maximum occurs.
Here is a very simplified example:
fake.dat<-data.frame(samp1=c(5,6,7,5,4,5,10,5,6,7), samp2=c(2,3,4,6,7,9,2,3,7,8), samp3=c(2,3,4,11,7,9,2,3,7,8),samp4=c(5,6,7,5,4,12,10,5,6,7))
   samp1 samp2 samp3 samp4
1      5     2     2     5
2      6     3     3     6
3      7     4     4     7
4      5     6    11     5
5      4     7     7     4
6      5     9     9    12
7     10     2     2    10
8      5     3     3     5
9      6     7     7     6
10     7     8     8     7
So, let's say I'm trying to find the number of observations per column exceeding 5, after excluding all the observations in the column before and including the row in which the maximum for the column occurs.
Expected Result:
samp1 samp2 samp3 samp4
    2     2     4     3
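To make the rule concrete, here is a minimal sketch of the per-column calculation that reproduces the expected result above. The helper name `count_after_max` and its `threshold` argument are made up for illustration, and the sketch assumes that when a maximum is tied, the first occurrence is the one that counts (which is what `which.max` returns).

```r
# Data from the example above
fake.dat <- data.frame(
  samp1 = c(5, 6, 7, 5, 4, 5, 10, 5, 6, 7),
  samp2 = c(2, 3, 4, 6, 7, 9, 2, 3, 7, 8),
  samp3 = c(2, 3, 4, 11, 7, 9, 2, 3, 7, 8),
  samp4 = c(5, 6, 7, 5, 4, 12, 10, 5, 6, 7)
)

# Hypothetical helper (not from the original post): count the values above
# `threshold` that occur strictly after the first occurrence of the maximum.
count_after_max <- function(x, threshold = 5) {
  after_max <- seq_along(x) > which.max(x)    # TRUE only for rows after the max
  sum(x[after_max] > threshold, na.rm = TRUE)
}

# Apply the helper to every column; the result is a named vector
result <- sapply(fake.dat, count_after_max)
print(result)
```

Because the helper only ever sees one column at a time, the position of the maximum is always a single number, so no intermediate data frame is needed.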
I can get the answer that I want by using nested for loops to exclude the observations that I do not want.
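# row index of the maximum in each column
max.row <- sapply(fake.dat, which.max)
newfake.dat <- data.frame()
for (j in 1:length(fake.dat)) {
  for (i in 1:nrow(fake.dat)) {
    # copy only values after the column maximum; skipped cells stay NA
    if (i > max.row[j]) newfake.dat[i, j] <- fake.dat[i, j]
  }
}
print(newfake.dat)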
This creates a new data frame on which I can then run apply.
colcount<-apply(newfake.dat,2,function(x) (sum(x>5,na.rm=TRUE)))
   V1 V2 V3 V4
1  NA NA NA NA
2  NA NA NA NA
3  NA NA NA NA
4  NA NA NA NA
5  NA NA  7 NA
6  NA NA  9 NA
7  NA  2  2 10
8   5  3  3  5
9   6  7  7  6
10  7  8  8  7
V1 V2 V3 V4
 2  2  4  3
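However, my real data is much larger (2000 x 2000), so I would rather work on the original dataframe directly, for example with apply, than build a new one in nested loops.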
Is there a way to do this? Here is my attempt, using apply together with seq.
maxrow<-apply(fake.dat,2,function(x) which.max(x))
print(maxrow)
seq.att<-apply(fake.dat,2,function(x) {
sum(x[which(seq(1,nrow(fake.dat))==(maxrow)):nrow(fake.dat)]>5,na.rm=TRUE)})
This gives the following warning:
1: In seq(1, nrow(fake.dat)) == (maxrow) :
longer object length is not a multiple of shorter object length
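As far as I can tell, the warning comes from comparing a sequence of 10 row numbers against all four entries of `maxrow` at once, so R recycles the shorter vector. The sketch below reproduces this in isolation; the `maxrow` values are typed in by hand from the example data, and the names `msg` and `single` are only for illustration.

```r
# Row of the maximum in each column of fake.dat, written out by hand
maxrow <- c(samp1 = 7, samp2 = 6, samp3 = 4, samp4 = 6)

# Comparing 10 row numbers with a length-4 vector triggers recycling;
# capture the warning text instead of letting it print
msg <- tryCatch(
  {
    seq(1, 10) == maxrow
    "no warning"
  },
  warning = function(w) conditionMessage(w)
)
print(msg)

# Comparing against the entry for a single column is well defined
single <- seq(1, 10) == maxrow[["samp1"]]
print(which(single))
```

Inside the function passed to `apply`, `x` is a single column but `maxrow` is still the whole vector, which would explain why the comparison does not pick out one row per column.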
It still returns a result, but not the one I expect:
samp1 samp2 samp3 samp4
2 3 3 3
I also considered a while loop. I would like to avoid for loops if possible. I am new to R, so any help would be appreciated. Thanks!