I have a database with > 300,000 animal observation records. Each row represents one location of an animal. Each animal has a unique identifier (id1), and there are several columns with attributes related to that location, including the date of observation and the x and y coordinates.
Can someone help me write code that will let me do the following:
1) Subset by BOTH date and id1
2) Measure the distance (the coordinates are in UTM, so the distance will be in meters) between the FIRST and LAST location entries for each date within each id1
Example data is as follows:
mydata <- read.table(text = "
id1 date x y
1 11/02/2014 478776.4332 7922167.59
1 11/02/2014 478776.4333 7922170.59
1 11/02/2014 478776.4334 7922180.59
1 12/02/2014 478776.4335 7922190.59
1 12/02/2014 478776.4350 7922192.59
1 12/02/2014 478776.4360 7922195.59
2 11/02/2014 478776.4338 7922167.59
2 11/02/2014 478776.4339 7922183.59
2 11/02/2014 478776.4340 7922185.59
2 12/02/2014 478776.4350 7922188.30
2 12/02/2014 478776.4360 7922190.59
2 12/02/2014 478776.4390 7922198.59
3 11/02/2014 478776.4338 7922167.59
3 11/02/2014 478776.4345 7922175.59
3 11/02/2014 478776.4355 7922178.85
3 12/02/2014 478776.4368 7922180.59
3 12/02/2014 478776.4398 7922183.59
3 12/02/2014 478776.4399 7922185.59
4 11/02/2014 478776.4338 7922167.59
4 11/02/2014 478776.4340 7922172.59
4 11/02/2014 478776.4345 7922178.59
3 11/02/2014 478776.4350 7922179.59
3 12/02/2014 478776.4355 7922184.59
3 12/02/2014 478776.4360 7922187.59
3 12/02/2014 478776.4365 7922198.59
", header = TRUE)
A less efficient alternative would be to select the first and last records for each date and id1, and then measure the distance between each pair of points. I found code that selects the LAST entry for each individual animal, but I still need to add the subset by date:
myid.uni <- unique(mydata$id1)
a <- length(myid.uni)
last <- c()
for (i in 1:a) {
  temp <- subset(mydata, id1 == myid.uni[i])
  if (dim(temp)[1] > 1) {
    last.temp <- temp[dim(temp)[1], ]
  } else {
    last.temp <- temp
  }
  last <- rbind(last, last.temp)
}
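For reference, the first-and-last idea described above could be sketched in base R roughly as follows. This is only a sketch under some assumptions: the rows are already in chronological order within each id1 and date (the example has no time-of-day column), and a straight-line (Euclidean) distance on the UTM coordinates is what is wanted. The names obs, first_last_dist, n_obs and dist_m are made up for illustration, and obs is a small stand-in for the first rows of the example data.

```r
# Small stand-in for the example data (columns: id1, date, x, y)
obs <- data.frame(
  id1  = c(1, 1, 1, 1, 1, 1, 2, 2, 2),
  date = c("11/02/2014", "11/02/2014", "11/02/2014",
           "12/02/2014", "12/02/2014", "12/02/2014",
           "11/02/2014", "11/02/2014", "11/02/2014"),
  x = c(478776.4332, 478776.4333, 478776.4334,
        478776.4335, 478776.4350, 478776.4360,
        478776.4338, 478776.4339, 478776.4340),
  y = c(7922167.59, 7922170.59, 7922180.59,
        7922190.59, 7922192.59, 7922195.59,
        7922167.59, 7922183.59, 7922185.59),
  stringsAsFactors = FALSE
)

# One piece per id1/date combination; drop = TRUE skips combinations with no rows.
# split() keeps the original row order inside each piece.
groups <- split(obs, list(obs$id1, obs$date), drop = TRUE)

# For each piece, compare its first and last rows
# (assumption: row order within a piece is the observation order)
first_last_dist <- do.call(rbind, lapply(groups, function(g) {
  n <- nrow(g)
  data.frame(
    id1    = g$id1[1],
    date   = g$date[1],
    n_obs  = n,
    # Euclidean distance in the UTM plane, in meters
    dist_m = sqrt((g$x[n] - g$x[1])^2 + (g$y[n] - g$y[1])^2),
    stringsAsFactors = FALSE
  )
}))
rownames(first_last_dist) <- NULL

print(first_last_dist)
```

A group with a single record gets a distance of 0, because its first and last rows are the same row. With > 300,000 rows, a grouped operation in a package such as data.table or dplyr may well be faster than split() plus lapply(), but that has not been checked here.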
Can someone suggest a strategy, preferably the simplest way to do this?
Thanks!
Annk