What are the design decisions behind the Google Maps polyline encoding algorithm?

Several Google Maps products have the concept of a polyline, which in terms of raw data is simply a sequence of lat/lng points that can, for example, be drawn as a line on a map. The Google Maps developer libraries use an encoded polyline format that produces an ASCII string representing the points making up the polyline. This encoded format is then typically decoded either with a built-in function of the Google libraries or with a third-party function that implements the decoding algorithm.

The polyline encoding algorithm is described in the Encoded Polyline Algorithm Format document. What is not described there is the rationale for implementing the algorithm this way, or the purpose of each individual step. I would like to know whether the thinking behind the implementation is publicly documented anywhere. Two examples:

  • Some steps have a quantitative effect on compression; how does this effect vary with the delta between points?
  • Is adding the value of ASCII character 63 to each chunk just an ugly hack?
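For what it's worth, here is my own back-of-the-envelope sketch (assuming the documented 1e-5 precision, the left-shift sign step, and 5-bit chunking) of how the number of output characters for a single delta grows with its magnitude:

```python
def encoded_length(delta_degrees):
    """Number of output characters one delta value needs.

    Follows the documented steps: scale to 1e-5 units, left-shift by
    one (inverting negatives), then count 5-bit chunks.
    """
    v = round(delta_degrees * 1e5)
    v = ~(v << 1) if v < 0 else (v << 1)
    chars = 1
    while v >= 0x20:  # 0x20 would be set as the continuation flag
        v >>= 5
        chars += 1
    return chars

# Small deltas between consecutive points need few characters:
print(encoded_length(0.00001))  # 1
print(encoded_length(0.001))    # 2
print(encoded_length(1.0))      # 4
print(encoded_length(90.0))     # 5
```

So the compression pays off most when consecutive points are close together, which is exactly the common case for a polyline.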

Normally, I would expect a description of an algorithm to come with an explanation of why the algorithm is implemented the way it is.

+7
algorithm google-maps google-polyline
1 answer

Update: This blog post from James Snook also makes the valid-ASCII-range argument and gives a logical reading of the other steps I had wondered about, e.g. the left shift before storing, which puts the sign bit in the lowest bit position.

Here are some explanations I came up with; I'm not sure everything is 100% correct.

  • A single double value is stored in several 5-bit chunks, and 0x20 (binary '0010 0000') is used as a flag indicating that the next 5-bit chunk still belongs to the current value.
  • 0x1f (binary '0001 1111') is used as a bitmask to strip out all other bits.
  • I expect 5 bits were chosen because typical lat or lng deltas fall in that range, so each value takes only about 5 bits on average across many examples (though I haven't verified this).
  • The compression relies on consecutive values being very close together, so their difference is almost 0 and the result fits in a few bits. The result is then stored dynamically: store 5 bits, and if the value is larger, set the 0x20 flag and store the next 5 bits, and so on. So you could presumably tune the compression by trying 6 or 4 bits instead, but I think 5 is a sensible choice.
  • Now about the magic 63: it is 0x3f, binary '0011 1111'. I'm not sure why it is added. I thought adding 63 would yield "nicer" ASCII characters (for example, ones allowed in XML or in a URL), but we skip e.g. 62, which is '>'; is 63, which is '?', really better? At least the first ASCII characters are non-printable and should be avoided. Note that if 64 were used instead, the maximum chunk value of 31 would map to ASCII character 127 (31 + 64 + 32), which is not defined in HTML4. Or is it related to signed chars, which range from -128 to 127, where we need the stored values to stay positive?
  • Just for reference: here is a link to the official Java implementation, under the Apache license.
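Putting the steps above together, here is a compact Python sketch of an encoder; it is my own reconstruction from the documented algorithm, not the official Java source:

```python
def encode_polyline(points):
    """Encode (lat, lng) pairs per the Encoded Polyline Algorithm Format."""
    out = []
    prev_lat = prev_lng = 0
    for lat, lng in points:
        ilat, ilng = round(lat * 1e5), round(lng * 1e5)  # 1e-5 precision
        for delta in (ilat - prev_lat, ilng - prev_lng):
            # Left-shift by one, inverting negatives (sign in low bit).
            v = ~(delta << 1) if delta < 0 else (delta << 1)
            while v >= 0x20:
                # 0x20 flags that another 5-bit chunk follows; 0x1f
                # masks out the current chunk; +63 maps into printable ASCII.
                out.append(chr((0x20 | (v & 0x1f)) + 63))
                v >>= 5
            out.append(chr(v + 63))
        prev_lat, prev_lng = ilat, ilng
    return ''.join(out)

# The documented example point (38.5, -120.2) encodes to '_p~iF~ps|U':
print(encode_polyline([(38.5, -120.2)]))
```

The output for the single documented example point matches the worked example in the Encoded Polyline Algorithm Format document, which is a reasonable sanity check for the chunk order (least-significant chunk first) and the +63 offset.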
+3
