Calculate total miles traveled from lat / lon vectors

I have a data frame with data about drivers and the routes they followed, and I am trying to work out the total mileage. I am using the geosphere package, but I can’t work out the correct way to use it and get the answer in miles.

    > head(df1)
      id       routeDateTime driverId      lat       lon
    1  1 2012-11-12 02:08:41      123 76.57169 -110.8070
    2  2 2012-11-12 02:09:41      123 76.44325 -110.7525
    3  3 2012-11-12 02:10:41      123 76.90897 -110.8613
    4  4 2012-11-12 03:18:41      123 76.11152 -110.2037
    5  5 2012-11-12 03:19:41      123 76.29013 -110.3838
    6  6 2012-11-12 03:20:41      123 76.15544 -110.4506

So far I have tried

 spDists(cbind(df1$lon,df1$lat)) 

and a few other functions, but I cannot seem to get a reasonable answer.

Any suggestions?

 > dput(df1) structure(list(id = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40), routeDateTime = c("2012-11-12 02:08:41", "2012-11-12 02:09:41", "2012-11-12 02:10:41", "2012-11-12 03:18:41", "2012-11-12 03:19:41", "2012-11-12 03:20:41", "2012-11-12 03:21:41", "2012-11-12 12:08:41", "2012-11-12 12:09:41", "2012-11-12 12:10:41", "2012-11-12 02:08:41", "2012-11-12 02:09:41", "2012-11-12 02:10:41", "2012-11-12 03:18:41", "2012-11-12 03:19:41", "2012-11-12 03:20:41", "2012-11-12 03:21:41", "2012-11-12 12:08:41", "2012-11-12 12:09:41", "2012-11-12 12:10:41", "2012-11-12 02:08:41", "2012-11-12 02:09:41", "2012-11-12 02:10:41", "2012-11-12 03:18:41", "2012-11-12 03:19:41", "2012-11-12 03:20:41", "2012-11-12 03:21:41", "2012-11-12 12:08:41", "2012-11-12 12:09:41", "2012-11-12 12:10:41", "2012-11-12 02:08:41", "2012-11-12 02:09:41", "2012-11-12 02:10:41", "2012-11-12 03:18:41", "2012-11-12 03:19:41", "2012-11-12 03:20:41", "2012-11-12 03:21:41", "2012-11-12 12:08:41", "2012-11-12 12:09:41", "2012-11-12 12:10:41" ), driverId = c(123, 123, 123, 123, 123, 123, 123, 123, 123, 123, 456, 456, 456, 456, 456, 456, 456, 456, 456, 456, 789, 789, 789, 789, 789, 789, 789, 789, 789, 789, 246, 246, 246, 246, 246, 246, 246, 246, 246, 246), lat = c(76.5716897079255, 76.4432530414779, 76.9089707506355, 76.1115217276383, 76.2901271982118, 76.155437662499, 76.4115052509587, 76.8397977722343, 76.3357809444424, 76.032417796785, 76.5716897079255, 76.4432530414779, 76.9089707506355, 76.1115217276383, 76.2901271982118, 76.155437662499, 76.4115052509587, 76.8397977722343, 76.3357809444424, 76.032417796785, 76.5716897079255, 76.4432530414779, 76.9089707506355, 76.1115217276383, 76.2901271982118, 76.155437662499, 76.4115052509587, 76.8397977722343, 76.3357809444424, 76.032417796785, 76.5716897079255, 76.4432530414779, 76.9089707506355, 76.1115217276383, 76.2901271982118, 76.155437662499, 
76.4115052509587, 76.8397977722343, 76.3357809444424, 76.032417796785), lon = c(-110.80701574916, -110.75247172825, -110.861284852726, -110.203674311982, -110.383751512505, -110.450569844106, -110.22185564111, -110.556956546381, -110.24483308522, -110.217355202651, -110.80701574916, -110.75247172825, -110.861284852726, -110.203674311982, -110.383751512505, -110.450569844106, -110.22185564111, -110.556956546381, -110.24483308522, -110.217355202651, -110.80701574916, -110.75247172825, -110.861284852726, -110.203674311982, -110.383751512505, -110.450569844106, -110.22185564111, -110.556956546381, -110.24483308522, -110.217355202651, -110.80701574916, -110.75247172825, -110.861284852726, -110.203674311982, -110.383751512505, -110.450569844106, -110.22185564111, -110.556956546381, -110.24483308522, -110.217355202651)), .Names = c("id", "routeDateTime", "driverId", "lat", "lon"), row.names = c(NA, -40L), class = "data.frame") 
3 answers

How about this?

    ## Setup
    library(geosphere)
    metersPerMile <- 1609.34
    pts <- df1[c("lon", "lat")]

    ## Pass in two derived data.frames that are lagged by one point
    segDists <- distVincentyEllipsoid(p1 = pts[-nrow(pts), ], p2 = pts[-1, ])
    sum(segDists) / metersPerMile
    # [1] 1013.919

(To use one of the faster distance algorithms, simply replace distVincentyEllipsoid with distCosine, distVincentySphere, or distHaversine in the call above.)
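As a rough illustration of the lagged-by-one-point idea without any package, here is a minimal base R sketch using the haversine formula. The function name `haversine_m`, the radius default (the WGS84 equatorial radius), and the choice of the first three rows of df1 as sample points are assumptions made for this example, not part of geosphere. A spherical formula like this will differ slightly from the ellipsoidal result.

```r
# Haversine great-circle distance in metres between paired lon/lat values.
# haversine_m is a hypothetical helper written for this sketch.
haversine_m <- function(lon1, lat1, lon2, lat2, r = 6378137) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  # pmin() guards against rounding pushing sqrt(a) just above 1
  2 * r * asin(pmin(1, sqrt(a)))
}

# First three points of df1 from the question
lon <- c(-110.80701574916, -110.75247172825, -110.861284852726)
lat <- c(76.5716897079255, 76.4432530414779, 76.9089707506355)

n <- length(lon)
# Pair every point except the last with its successor (every point except the first)
seg_m <- haversine_m(lon[-n], lat[-n], lon[-1], lat[-1])

# 1 international mile is exactly 1609.344 metres
total_miles <- sum(seg_m) / 1609.344
total_miles
```

The same pattern applies to the full data: n points give n - 1 segments, and only the summed total is converted to miles.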


Be VERY careful with missing data: distVincentyEllipsoid() returns 0 for the distance between two points whose coordinates are both missing, i.e. c(NA, NA) and c(NA, NA).
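One way to guard against this is to drop incomplete rows before computing any distances. The sketch below assumes that rows with a missing coordinate should simply be discarded, so the neighbouring points end up joined by one longer segment; whether that is appropriate depends on your data. The small `pts` data frame is made up for illustration.

```r
# Made-up points with one fully missing row
pts <- data.frame(
  lon = c(-110.81, NA, -110.86, -110.20),
  lat = c(76.57, NA, 76.91, 76.11)
)

# Keep only rows where both lon and lat are present
ok <- complete.cases(pts)
clean <- pts[ok, ]

# Report how many rows were dropped so the gap is not silent
n_dropped <- sum(!ok)
n_dropped
```

Checking `n_dropped` before summing segment distances makes it obvious when a total was computed over fewer points than expected.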

    library(geodist)
    geodist(df1, sequential = TRUE, measure = "geodesic") # sequence of distance increments
    sum(geodist(df1, sequential = TRUE, measure = "geodesic")) # total distance in metres
    sum(geodist(df1, sequential = TRUE, measure = "geodesic")) * 0.00062137 # total distance in miles

Geodesic distances are necessary because of the large distances involved. The result is 1013.915, slightly different from the less accurate Vincenty distances from geosphere. Distances through a street network can also be calculated using

    library(dodgr)
    dodgr_dists(from = df1)

... but that requires a street network, and there is none here (lat = 76, lon = -110). Where a street network does exist, this gives by default all pairwise distances routed through the network, and the successive increments are the entries just off the diagonal.
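To show what picking the successive increments out of a pairwise distance matrix looks like, here is a small base R sketch. It uses dist() on made-up one-dimensional positions as a stand-in for a real pairwise matrix; it is not dodgr output, and the variable names are only illustrative.

```r
# Made-up positions along a line, standing in for route points
x <- c(0, 3, 7, 12)

# Full pairwise distance matrix (4 x 4, zeros on the diagonal)
d <- as.matrix(dist(x))

n <- nrow(d)
# Entries (1,2), (2,3), (3,4): distance from each point to the next one
increments <- d[cbind(seq_len(n - 1), seq_len(n - 1) + 1)]
increments       # 3 4 5
sum(increments)  # 12
```

Only these n - 1 entries describe the route as travelled; the rest of the matrix holds distances between points that are not consecutive.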
