I am working with eye tracking data, so I have a huge data set (millions of rows, I think) and would like this task to run quickly. Here is a simplified version.
The data tells you where the eye is looking at each moment in time, for each file being viewed. X1, Y1 are the coordinates of the point being looked at. Each file has several time points (the eyes move to different places in the file over time).
Filename  Time  X1  Y1
1         1     10  10
1         2     12  10
I also have a second data frame that lists where the objects are located in each file. Each file contains (in this simplified case) two objects. X1, Y1 are the lower-left coordinates, and X2, Y2 are the upper-right coordinates. You can think of this as a bounding box showing where the object is in each file. For example:
Filename  Item  X1  Y1  X2  Y2
1         Dog   11  10  20  20
What I would like to do is add another column to the first data frame that tells me which object the person is looking at, at every time point, for each file. If they are not looking at any of the objects, I would like the column to say "none". Points on the border of a box count as looking at the object. For example:
Filename  Time  X1  Y1  LookingAt
1         1     10  10  none
1         2     12  11  Dog
I know how to do this with a for loop, but it takes forever (and crashed my RStudio). I am wondering if there is a faster and more efficient way that I am missing.
Here's the dput for the first data frame (it contains more rows than I showed above):
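To make the rule concrete, here is a minimal loop-free sketch in base R on the toy rows from the expected-output example: join the gaze samples to the boxes by Filename, then keep the pairs where the point falls inside the box (borders included). The names `label_gaze`, `boxes`, `row_id` and the `bx1`/`by1`/`bx2`/`by2` columns are made up for illustration. It assumes the coordinates are numeric and that, if boxes overlap, the first match is acceptable. I have not checked how it behaves on millions of rows.

```r
# Toy data with the values from the expected-output example above.
gaze <- data.frame(Filename = c(1, 1), Time = c(1, 2),
                   X1 = c(10, 12), Y1 = c(10, 11))
items <- data.frame(Filename = 1, Item = "Dog",
                    X1 = 11, Y1 = 10, X2 = 20, Y2 = 20,
                    stringsAsFactors = FALSE)

# Hypothetical helper: label each gaze sample with the box it falls in.
label_gaze <- function(gaze, items) {
  gaze$row_id <- seq_len(nrow(gaze))
  # Rename the box corners so they do not clash with the gaze X1/Y1.
  boxes <- setNames(items[c("Filename", "Item", "X1", "Y1", "X2", "Y2")],
                    c("Filename", "Item", "bx1", "by1", "bx2", "by2"))
  # Pair every gaze sample with every box from the same file.
  pairs <- merge(gaze, boxes, by = "Filename")
  # Keep pairs where the point is inside the box, borders included.
  hit <- pairs$X1 >= pairs$bx1 & pairs$X1 <= pairs$bx2 &
         pairs$Y1 >= pairs$by1 & pairs$Y1 <= pairs$by2
  pairs <- pairs[hit, , drop = FALSE]
  # Default to "none"; assumption: the first matching box wins on overlap.
  gaze$LookingAt <- "none"
  idx <- match(gaze$row_id, pairs$row_id)
  found <- !is.na(idx)
  gaze$LookingAt[found] <- as.character(pairs$Item[idx[found]])
  gaze$row_id <- NULL
  gaze
}

result <- label_gaze(gaze, items)
print(result)
```

Files that have no boxes at all drop out of the join and therefore keep the default "none".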
structure(list(Filename = structure(c(1L, 1L, 1L, 2L, 2L, 3L, 3L, 3L, 3L), .Label = c("1", "2", "3"), class = "factor"), Time = structure(c(1L, 2L, 3L, 1L, 2L, 1L, 2L, 4L, 5L), .Label = c("1", "2", "3", "5", "6"), class = "factor"), X1 = structure(c(1L, 4L, 3L, 2L, 1L, 4L, 6L, 5L, 1L), .Label = c("10", "11", "12", "15", "20", "25" ), class = "factor"), Y1 = structure(c(1L, 5L, 6L, 4L, 1L, 2L, 3L, 4L, 1L), .Label = c("10", "11", "12", "15", "20", "25"), class = "factor")), .Names = c("Filename", "Time", "X1", "Y1"), row.names = c(NA, -9L), class = "data.frame")
And here is the dput for the second:
structure(list(Filename = structure(c(1L, 1L, 2L, 2L), .Label = c("1", "3"), class = "factor"), Item = structure(1:4, .Label = c("Cat", "Dog", "House", "Mouse"), class = "factor"), X1 = structure(c(2L, 4L, 3L, 1L), .Label = c("10", "11", "20", "35"), class = "factor"), Y1 = structure(c(2L, 4L, 3L, 1L), .Label = c("10", "11", "13", "35"), class = "factor"), X2 = structure(c(1L, 3L, 4L, 2L), .Label = c("10", "11", "20", "35"), class = "factor"), Y2 = structure(c(1L, 3L, 4L, 2L), .Label = c("10", "11", "13", "35"), class = "factor")), .Names = c("Filename", "Item", "X1", "Y1", "X2", "Y2"), row.names = c(NA, -4L), class = "data.frame")
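One thing to note about the dput output: every column, including the coordinates, is stored as a factor. Comparisons such as `>=` are not meaningful on factors, so the coordinates probably need converting to numbers first. A small sketch of the usual conversion (the helper name `factor_to_numeric` is made up for illustration):

```r
# Hypothetical helper: turn a factor whose labels are digits into numbers.
factor_to_numeric <- function(x) as.numeric(as.character(x))

f <- factor(c("10", "15", "12"), levels = c("10", "11", "12", "15"))
codes  <- as.numeric(f)        # internal level codes, not the values
values <- factor_to_numeric(f) # the actual coordinates
print(codes)
print(values)
```

Calling `as.numeric()` directly on the factor returns the level codes rather than the coordinates, which is why the detour through `as.character()` is needed.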