Geohash Implementation Explained

Do you really need to import a library?

Yat Man, Wong
7 min readOct 1, 2021

I assume you already understand what is geohash. This post focus on its algorithm and implementation. If you are evaluating the existing libraries versus coding it up yourself, this post is for you.

For general information see:

Geohash Implementation

Lets do an example, convert latitude and longitude (39.92324, 116.3906) into geohash.

Latitude 39.92324 will break into 1011 1000 1100 0111 1001
Below is a step by step break down of how we get to that number.

https://blog.csdn.net/geng333abc/article/details/45169129?locationNum=7&fps=1&ops_request_misc=&request_id=&biz_id=102&utm_term=geohash%20algorithm&utm_medium=distribute.pc_search_result.none-task-blog-2~all~sobaiduweb~default-1-.first_rank_v2_pc_rank_v29&spm=1018.2226.3001.4187

Do you see the pattern? Latitude range starts from -90 to 90, if our latitude falls in the first half which is -90 to 0 (exclude 0), we append a 0, otherwise a 1. Since 39.92324 fall into 1 in the first step, in second step the range shrink to 0 to 90. Again break that range into 2 half, 0 to 45 and 45 to 90.

Almost like a binary search right? We just keep repeating this step until we get our desired precision.

Do the same for longitude 116.3906, and get another pair of code 1101 0010 1100 0100 0100

https://blog.csdn.net/geng333abc/article/details/45169129?locationNum=7&fps=1&ops_request_misc=&request_id=&biz_id=102&utm_term=geohash%20algorithm&utm_medium=distribute.pc_search_result.none-task-blog-2~all~sobaiduweb~default-1-.first_rank_v2_pc_rank_v29&spm=1018.2226.3001.4187

Interleave them, so all the odd bit are latitude, even bit are longitude (0 bit is even)

11100 11101 00100 01111 00000 01101 01011 00001 (64 bit)

The benefit to interleave them instead of just concatenate them:

  1. Ensure nearby area still have the common prefix
  2. Easily reduce precision by removing character from the end of the code

Sample Implementation

Below is a naive implementation taken from Factual. Knowing the above “binary” pattern, this code is pretty self-explanatory.

Optimized Implementation

Now the optimized version is interesting. Again taken from Factual. This post wouldn’t have exist if Factual explains how this 4 lines mange to produce the same output as the big chunk of code.

It is Java.

To understand this, we need to recap how we turn double into binary.

If we take double 2.33 as example, the fractional part is

Notice if the number >= 0.5, get 1

Here is my understanding:

First we scale the latitude or longitude with its range. Basically turning it to positive. if Latitude is 0 for example, right in the middle of (-90, 90), after +90 will become 90/180 = 0.5
Which is right, we turned the latitude into a scaled double.

After that the value is between 0 and 1. Then we turn it into binary. When we turn the fractional part into binary, this is equivalent to checking if the value is in the first half or second half.

In Java, multiple by 0x80000000L just mean turning the ratio into 64 bit binary.

& with 0x7fffffffL means get rid of the sign bit to ensure it is 0

Bit Interleave algorithm

The last piece of the geohash puzzle. After getting 2 double from latitude and longitude, we just need to interleave them into 1 double.

I am sure you can understand this part on your own.

Base 32 Encode to String

After we interleave the latitude and longitude, we get the following:

Red is latitude, in even place

Black is longitude, in odd place

Notice the first 2 0s in the high bit are unused. This is because there are 31 latitude bits, 31 longitude bits, after interleave them there are only 62 of them, so the first 2 bit in a 64 bit long is unused.

When we base32 encode this number to string, we group the bits into group of 5 (base 32 use 5 bits to encode into a character). So remember do not encode the first 2 bit

If we want character precision 5, then we only need the first 5 groups. We just need the number of groups up till the character precision. Notice we can have maximum 12 character precision stored in 64 bit long.

Finding Neighbor

Given a geohash (representing an area), how to find its neighbor?

Image from this helpful website, allow you to check if your implementation is correct.
https://www.movable-type.co.uk/scripts/geohash.html

Find a geohash’s north neighbor

Given geohash 9qh0cwbg, find its north neighbor

If we want to find the north neighbor, we just need to update the latitude bits and keep the longitude bits untouched.

To update the latitude bits means +1 to the correct bit of that latitude bits, not to the interleaved bits to prevent longitude bits get modified.

However, if we just simply +1 to the last bit of the latitude, the increase could be too small for the current precision level. Therefore:

The latitude bit to update is dependent on the character precision

To find the character to update, found out the last group of 5 bits that will used in the base32 encode. Then find the last latitude bit in that group, that’s the bit that will be updated.

More example,

So the relationship between the character precision and the xth bit of latitude to update is:

How many latitude bits are in first 5 blocks? 3 block has 2, 2 block has 3 = 12, 32–12 = 20How many latitude bits are in first 6 blocks? 3 block has 2, 3 block has 3 = 15, 32–15 = 17
How many latitude bits are in first 7 blocks? 4 block has 2, 3 block has 3 = 17, 32 -17 = 15
How many latitude bits are in first 8 blocks? 4 block has 2, 4 block has 3 = 20, 32 -20 = 12How many latitude bits are in first 9 blocks? 5 block has 2, 4 block has 3 = 22, 32 -22 = 10To +1 to the 20th bit, we just need to shift 1 << 19 bit left, so it is always -1 of the xth bit of latitude to update

Find a geohash’s east neighbor

Given geohash 9qh0cwbg, find its east neighbor

So the relationship between the character precision and the xth bit of longitude to update is:

How many longitude bits are in first 5 blocks? 2 block has 2, 3 block has 3 = 13, 32-13 = 19How many longitude bits are in first 6 blocks? 3 block has 2, 3 block has 3 = 15, 32-15 = 17
How many longitude bits are in first 7 blocks? 3 block has 2, 4 block has 3 = 18, 32 -18 = 14
How many longitude bits are in first 8 blocks? 4 block has 2, 4 block has 3 = 20, 32 -20 = 12How many longitude bits are in first 9 blocks? 4 block has 2, 5 block has 3 = 23, 32 -23 = 9To +1 to the 19th bit, we just need to shift 1 << 18 bit left, so it is always -1 of the xth bit of longitude to update

Summary

If all you need is some simple feature with geohash, then may be there is no need to import a huge library. These libraries usually have many utility classes compare to the 4 lines of optimized code we just studied.

Now that you understand how it works, I am sure you can code it up efficiently. Hope this helped you.

--

--

Yat Man, Wong
Yat Man, Wong

Written by Yat Man, Wong

Android developer, problem solver, real man in training

Responses (1)