I am using R with the data.table package, and I would like to group a data table by time intervals, i.e. overlapping bins. For each of these time intervals, I want to count the occurrences of equal data pairs. Moreover, these “equal data pairs” need not be exactly equal, only equal within some range.
A simple version of the question is this:
DT[, sum(counts), by = list(Time, X, Y)]
findInterval() will give me bins with hard boundaries rather than overlapping ones.
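To illustrate the limitation, here is a small base R sketch. The vector `tm` and the break points are made up for the illustration and are not part of my data. Each value lands in exactly one bin, so values 4 and 5 are separated although they are only one time unit apart:

```r
# Hypothetical time stamps and break points, for illustration only
tm     <- c(1, 4, 5, 6, 12)
breaks <- seq(0, 15, by = 5)        # bins [0,5), [5,10), [10,15)

# findInterval() assigns every value to exactly one bin
bin <- findInterval(tm, breaks)
data.frame(tm, bin)
```

With a window of plus or minus 5 around each value, 4 and 5 would share several bins instead.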
Here is the problem in more detail. Say I have this data table:
library(data.table)
Time <- c(1,1,2,4,5,5,6,7,8,8,8,8,12,13)
X <- c(6,6,7,10,5,7,6,3,9,10,6,3,3,6)
Y <- c(2,6,10,3,4,6,6,9,4,9,6,6,9,9)
DT <- data.table(Time, X, Y)
    Time  X  Y
 1:    1  6  2
 2:    1  6  6
 3:    2  7 10
 4:    4 10  3
 5:    5  5  4
 6:    5  7  6
 7:    6  6  6
 8:    7  3  9
 9:    8  9  4
10:    8 10  9
11:    8  6  6
12:    8  3  6
13:   12  3  9
14:   13  6  9
And some predefined interval sizes:
Timeinterval <- 5
# for a Time value of 10, this means looking from 10 - 5 to 10 + 5
RangeX.percentage <- 0.5
RangeY.percentage <- 0.5
The result should give me an extra column, call it “count”, holding the number of occurrences of equal data pairs X and Y, given the ranges for X and Y.
The overlapping time bins should look like this, for example:
c(1, 1, 2, 4, 5, 5, 6)
c(1, 1, 2, 4, 5, 5, 6, 7)
c(1, 1, 2, 4, 5, 5, 6, 7, 8, 8, 8, 8)
c(8, 8, 8, 8, 12, 13)
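As a rough sketch, bins like the four listed above can be generated in base R by taking, for each distinct Time value, all time stamps within plus or minus Timeinterval. The inclusive bounds (`<=`) are an assumption chosen because they reproduce the listed bins; the variable name `bins` is only illustrative.

```r
Time <- c(1,1,2,4,5,5,6,7,8,8,8,8,12,13)
Timeinterval <- 5

# One overlapping bin per distinct Time value (inclusive bounds are an assumption)
centers <- unique(Time)
bins <- lapply(centers, function(t) Time[abs(Time - t) <= Timeinterval])
names(bins) <- centers

bins[["1"]]    # bin centred on Time 1
bins[["13"]]   # bin centred on Time 13
```

Note that a strict comparison (`<`) would drop the boundary values, e.g. Time 6 from the bin centred on Time 1.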
EDIT: Here is a brute-force version with a for loop that produces the result I am after:
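# Upper (R1) and lower (R2) bounds of the X and Y ranges for every row
Ranges <- DT[, list(
  X * (1 + RangeX.percentage), X * (1 - RangeX.percentage),
  Y * (1 + RangeY.percentage), Y * (1 - RangeY.percentage))]
DT2 <- cbind(DT, Ranges, count = rep(1L, nrow(DT)))
setnames(DT2, c("Time", "X", "Y", "XR1", "XR2", "YR1", "YR2", "count"))
for (i in seq_len(nrow(DT2))) {
  # rows whose Time is less than Timeinterval away from row i
  DT2.subset <- DT2[abs(Time - DT2[i]$Time) < Timeinterval]
  # count the rows in that window whose X and Y lie inside row i's ranges
  n <- sum(DT2.subset$X < DT2[i]$XR1 &
           DT2.subset$X > DT2[i]$XR2 &
           DT2.subset$Y < DT2[i]$YR1 &
           DT2.subset$Y > DT2[i]$YR2)
  # write the result into DT2 (the original assigned to DT and to "Count")
  set(DT2, i = i, j = "count", value = n)
}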
DT2
    Time  X  Y  XR1 XR2  YR1 YR2 count
 1:    1  6  2  9.0 3.0  3.0 1.0     1
 2:    1  6  6  9.0 3.0  9.0 3.0     3
 3:    2  7 10 10.5 3.5 15.0 5.0     4
 4:    4 10  3 15.0 5.0  4.5 1.5     3
 5:    5  5  4  7.5 2.5  6.0 2.0     1
 6:    5  7  6 10.5 3.5  9.0 3.0     6
 7:    6  6  6  9.0 3.0  9.0 3.0     4
 8:    7  3  9  4.5 1.5 13.5 4.5     2
 9:    8  9  4 13.5 4.5  6.0 2.0     3
10:    8 10  9 15.0 5.0 13.5 4.5     4
11:    8  6  6  9.0 3.0  9.0 3.0     4
12:    8  3  6  4.5 1.5  9.0 3.0     1
13:   12  3  9  4.5 1.5 13.5 4.5     2
14:   13  6  9  9.0 3.0 13.5 4.5     1
Is there a more idiomatic data.table way to do this, without the loop and the repeated DT$ subsetting?
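One direction that might work, offered only as a sketch: a non-equi self-join, which recent versions of data.table support through `on =` together with `by = .EACHI`. The helper table `W` and its column names (`Tlo`, `Thi`, `Xlo`, `Xhi`, `Ylo`, `Yhi`) are hypothetical names introduced for this example. The strict inequalities mirror the comparisons in the loop, so the counts should agree with the `count` column shown above.

```r
library(data.table)

DT <- data.table(
  Time = c(1,1,2,4,5,5,6,7,8,8,8,8,12,13),
  X    = c(6,6,7,10,5,7,6,3,9,10,6,3,3,6),
  Y    = c(2,6,10,3,4,6,6,9,4,9,6,6,9,9))
Timeinterval <- 5
RangeX.percentage <- 0.5
RangeY.percentage <- 0.5

# One row of lower/upper bounds per row of DT (hypothetical column names)
W <- DT[, .(Tlo = Time - Timeinterval,
            Thi = Time + Timeinterval,
            Xlo = X * (1 - RangeX.percentage),
            Xhi = X * (1 + RangeX.percentage),
            Ylo = Y * (1 - RangeY.percentage),
            Yhi = Y * (1 + RangeY.percentage))]

# For each row of W, count the rows of DT lying strictly inside all three ranges
hits <- DT[W,
           on = .(Time > Tlo, Time < Thi, X > Xlo, X < Xhi, Y > Ylo, Y < Yhi),
           .N, by = .EACHI]

# by = .EACHI keeps the row order of W, so the counts line up with DT
DT[, count := hits$N]
DT
```

This avoids the explicit loop, at the cost of building the bounds table first. Whether it is faster on large data would need to be measured.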