Introduction
Location-based services (proximity systems) underpin many modern applications, including:
- Ride-sharing (Uber, Lyft)
- Food delivery (DoorDash)
- Navigation (Google Maps)
- Local business discovery (Yelp)
- Dating applications (Tinder)
These systems face two core technical challenges:
- Efficiently identifying relevant nearby entities (users, businesses or locations) within specified geographic boundaries
- Managing and processing real-time location updates at scale
In this module, we'll first explore the core technologies that power these location-based services and then walk through three practical examples to help you master the principles of proximity system design.
Related Concepts
Geo Point or Geo Location
A Geo point precisely defines a location on Earth using two coordinates:
- Latitude (lat): Angular distance north or south of the equator, from -90° to +90°
- Longitude (lng/lon): Angular distance east or west of the prime meridian, from -180° to +180°
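To make the definition concrete, here is a minimal sketch of a geo point type together with a great-circle (haversine) distance function, which is the basic measurement behind most "how far apart are these two points" checks. The names `GeoPoint` and `haversine_km` are illustrative, and the Earth radius used is the common mean-radius approximation.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (an approximation)


@dataclass(frozen=True)
class GeoPoint:
    lat: float  # degrees, -90..90
    lng: float  # degrees, -180..180

    def __post_init__(self):
        # Reject coordinates outside the valid ranges.
        if not -90.0 <= self.lat <= 90.0:
            raise ValueError(f"latitude out of range: {self.lat}")
        if not -180.0 <= self.lng <= 180.0:
            raise ValueError(f"longitude out of range: {self.lng}")


def haversine_km(a: GeoPoint, b: GeoPoint) -> float:
    """Great-circle distance between two points, in kilometers."""
    phi1, phi2 = math.radians(a.lat), math.radians(b.lat)
    dphi = phi2 - phi1
    dlambda = math.radians(b.lng - a.lng)
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


if __name__ == "__main__":
    san_francisco = GeoPoint(37.7749, -122.4194)
    los_angeles = GeoPoint(34.0522, -118.2437)
    print(round(haversine_km(san_francisco, los_angeles), 1), "km")
```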
QuadTree vs GeoHash
QuadTree
A QuadTree is a spatial data structure that recursively partitions 2D space into four equal quadrants. Each internal node has exactly four children, representing northwest (NW), northeast (NE), southwest (SW), and southeast (SE) regions. The division process continues until each leaf node contains no more than a predefined number of points.
Key advantages:
- Highly memory-efficient for non-uniform data distribution (e.g., dense urban centers vs. sparse rural areas)
- Excels at spatial relationship queries (containment, intersection)
- Efficient for complex geometric queries (e.g., finding points within irregular polygons)
- Adaptive to data density variations
Trade-off: Adding or removing points can trigger restructuring (splitting or merging nodes, often loosely called rebalancing), which can hurt performance in highly dynamic scenarios.
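The recursive partitioning described above can be sketched as a small point QuadTree. This is a simplified illustration, not a production structure: it treats longitude as x and latitude as y on a flat plane, supports only insertion and rectangular range queries (no deletion or node merging), and the `capacity` and `max_depth` parameters are assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    min_x: float
    min_y: float
    max_x: float
    max_y: float

    def contains(self, x, y):
        return self.min_x <= x <= self.max_x and self.min_y <= y <= self.max_y

    def intersects(self, other):
        return not (other.min_x > self.max_x or other.max_x < self.min_x
                    or other.min_y > self.max_y or other.max_y < self.min_y)


class QuadTree:
    def __init__(self, box, capacity=4, depth=0, max_depth=16):
        self.box = box
        self.capacity = capacity    # max points per leaf before it splits
        self.depth = depth
        self.max_depth = max_depth  # stops endless splitting on (near-)duplicate points
        self.points = []
        self.children = None        # None while this node is a leaf

    def insert(self, x, y):
        if not self.box.contains(x, y):
            return False
        if self.children is None:
            if len(self.points) < self.capacity or self.depth >= self.max_depth:
                self.points.append((x, y))
                return True
            self._split()
        return self._insert_into_child(x, y)

    def _insert_into_child(self, x, y):
        # The first child whose box contains the point takes it.
        for child in self.children:
            if child.insert(x, y):
                return True
        return False

    def _split(self):
        b = self.box
        mx, my = (b.min_x + b.max_x) / 2, (b.min_y + b.max_y) / 2
        quadrants = [
            Box(b.min_x, my, mx, b.max_y),  # NW
            Box(mx, my, b.max_x, b.max_y),  # NE
            Box(b.min_x, b.min_y, mx, my),  # SW
            Box(mx, b.min_y, b.max_x, my),  # SE
        ]
        self.children = [QuadTree(q, self.capacity, self.depth + 1, self.max_depth)
                         for q in quadrants]
        for px, py in self.points:          # push existing points down one level
            self._insert_into_child(px, py)
        self.points = []

    def query(self, area):
        """Return all points inside the rectangular `area`."""
        if not self.box.intersects(area):
            return []                        # prune whole subtrees that cannot match
        if self.children is None:
            return [p for p in self.points if area.contains(*p)]
        found = []
        for child in self.children:
            found.extend(child.query(area))
        return found
```

Because leaves split only where points accumulate, a dense cluster produces a deep subtree while sparse regions remain a single shallow leaf, which is the density adaptivity listed above.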
GeoHash
GeoHash is a hierarchical spatial encoding that maps geographic coordinates to short alphanumeric (base-32) strings. It defines a grid system in which longer strings represent smaller, more precise areas: each additional character divides the previous cell into 32 subcells.
For example:
- "9q9" represents a large cell (on the order of 100 km across) in the San Francisco Bay Area
- "9q9hvu" narrows that down to a much smaller cell (roughly 1 km across) inside the "9q9" area
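The encoding itself is short enough to sketch. The function below follows the standard GeoHash scheme as commonly described: repeatedly halve the longitude and latitude ranges, emit one bit per halving (alternating, starting with longitude), and map every 5 bits to one base-32 character. The function name `geohash_encode` is illustrative.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # GeoHash alphabet (no a, i, l, o)


def geohash_encode(lat: float, lng: float, precision: int = 6) -> str:
    """Encode a latitude/longitude pair as a GeoHash of `precision` characters."""
    lat_range = [-90.0, 90.0]
    lng_range = [-180.0, 180.0]
    chars = []
    value, nbits = 0, 0
    use_lng = True  # bits alternate between longitude and latitude, longitude first
    while len(chars) < precision:
        rng, coord = (lng_range, lng) if use_lng else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if coord >= mid:
            value = (value << 1) | 1  # upper half of the range: bit 1
            rng[0] = mid
        else:
            value <<= 1               # lower half of the range: bit 0
            rng[1] = mid
        use_lng = not use_lng
        nbits += 1
        if nbits == 5:                # every 5 bits become one base-32 character
            chars.append(BASE32[value])
            value, nbits = 0, 0
    return "".join(chars)


if __name__ == "__main__":
    coarse = geohash_encode(37.7749, -122.4194, precision=3)
    fine = geohash_encode(37.7749, -122.4194, precision=6)
    print(coarse, fine)
    print(fine.startswith(coarse))  # a longer hash always extends the shorter one
```

Because each extra character only refines the previous cell, all points inside a cell share its hash as a prefix, which is what makes prefix matching usable for proximity lookups.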
Key advantages:
- Optimal for real-time location updates due to simple encoding/decoding
- Efficient for proximity searches using prefix matching
- Easy to shard/partition data by prefix
- Excellent for write-heavy workloads (whereas a QuadTree may need restructuring on writes)
- Easy to cache popular regions
Trade-offs:
- Grid boundaries are a tricky edge case: two points that are very close together can fall into different cells and have very different GeoHash prefixes, so a proximity search usually has to check neighboring cells as well.
- Rectangular vs. circular search: GeoHash cells are rectangular, but most "find nearby" queries want results within a radius (e.g., within 1 mile of my current location). As a result, GeoHash can incur extra processing overhead (fetching additional cells and filtering their contents) when supporting radius or other complex geometric queries.
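One common way to handle both issues is to fetch the cell containing the query point plus its 8 neighbors, then filter the candidates by true distance. The sketch below is an assumption-laden illustration: the index is a plain in-memory dict, neighbors are found by re-encoding points offset by one cell size, and the 3x3 block only covers the search circle when the radius is no larger than a cell. All function names are hypothetical.

```python
import math

BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def encode(lat, lng, precision):
    """Standard GeoHash encoding: alternate longitude/latitude bits, 5 bits per character."""
    lat_range, lng_range = [-90.0, 90.0], [-180.0, 180.0]
    chars, value, nbits, use_lng = [], 0, 0, True
    while len(chars) < precision:
        rng, coord = (lng_range, lng) if use_lng else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        bit = 1 if coord >= mid else 0
        rng[1 - bit] = mid  # keep the half of the range that contains the coordinate
        value, nbits, use_lng = (value << 1) | bit, nbits + 1, not use_lng
        if nbits == 5:
            chars.append(BASE32[value])
            value, nbits = 0, 0
    return "".join(chars)


def cell_size_deg(precision):
    """(height, width) of a cell in degrees; longitude gets the extra bit when odd."""
    bits = 5 * precision
    return 180.0 / (1 << (bits // 2)), 360.0 / (1 << ((bits + 1) // 2))


def covering_cells(lat, lng, precision):
    """The cell containing the point plus its 8 neighbors."""
    dlat, dlng = cell_size_deg(precision)
    cells = set()
    for i in (-1, 0, 1):
        for j in (-1, 0, 1):
            nlat = max(-90.0, min(90.0, lat + i * dlat))          # clamp at the poles
            nlng = ((lng + j * dlng + 180.0) % 360.0) - 180.0     # wrap at the antimeridian
            cells.add(encode(nlat, nlng, precision))
    return cells


def haversine_km(lat1, lng1, lat2, lng2):
    p1, p2 = math.radians(lat1), math.radians(lat2)
    h = (math.sin((p2 - p1) / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(math.radians(lng2 - lng1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def build_index(places, precision):
    """Group (name, lat, lng) entries by the GeoHash cell they fall into."""
    index = {}
    for name, (lat, lng) in places.items():
        index.setdefault(encode(lat, lng, precision), []).append((name, lat, lng))
    return index


def nearby(center, radius_km, index, precision):
    lat, lng = center
    hits = []
    for cell in covering_cells(lat, lng, precision):      # rectangular candidate set
        for name, plat, plng in index.get(cell, []):
            if haversine_km(lat, lng, plat, plng) <= radius_km:  # circular filter
                hits.append(name)
    return sorted(hits)


if __name__ == "__main__":
    # "a" and "b" are about 50 m apart but sit on opposite sides of a cell boundary.
    places = {"a": (37.77, -122.3440), "b": (37.77, -122.3434), "c": (37.79, -122.3434)}
    index = build_index(places, precision=5)
    print(encode(*places["a"], 5), encode(*places["b"], 5))
    print(nearby(places["a"], 1.6, index, precision=5))
```

Looking only at the cell of "a" would miss "b" entirely; the neighbor lookup recovers it, and the distance filter then drops candidates such as "c" that lie in a fetched cell but outside the radius.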
Building Blocks
- Geocoding
  - Geocoding is the process (usually provided as a service) of “translating” an address or place name into geo coordinates. Well-known geocoding providers include the Google Geocoding API, Mapbox, and Amazon Location Service.
  - Why is this needed? In a location-based search, users are unlikely to provide or search by raw coordinates ("lat": 37.7749, "lng": -122.4194). They are far more likely to search by an address or place (e.g., find all Thai restaurants within 1 mile of my current location, or on Mission Street). So the backend first needs to “translate” the user-provided location into a geo point, and then run its processing and database queries against that geo point.
- SQL Database (Postgres + PostGIS extension)
  - PostGIS extends PostgreSQL with geographic objects and functions
  - Supports complex spatial queries, geometric operations, and spatial indexing
  - Example: storing static delivery zones, service areas, or other complex geo boundaries
- NoSQL Database (MongoDB)
  - Native support for the GeoJSON format and 2dsphere indexes
  - Good for simple proximity queries and high write throughput
  - Example: storing user location data
- In-Memory (Redis)
  - Redis supports geospatial indexing through the GEOADD and GEORADIUS commands (Redis uses a 52-bit GeoHash implementation). Since Redis 6.2, GEORADIUS is deprecated in favor of GEOSEARCH.
  - Extremely fast for simple radius queries
  - Example: real-time proximity searches, caching frequent location queries
- Elasticsearch + Geo Support
  - Supports multiple field types: geo_point, geo_shape
  - Efficient for large-scale text search combined with location filtering
  - (Extra) Provides flexible scoring based on distance
  - Best for: location-aware search, combining full-text search with proximity
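To give a feel for what proximity queries look like in two of these stores, the sketch below builds the query bodies as plain Python dicts without connecting to any server. The field names (`location`, `name`) and helper function names are hypothetical; the query operators themselves (`$near` with `$geometry` and `$maxDistance` in MongoDB, `geo_distance` inside a `bool` filter in Elasticsearch) are the documented ones, and they assume a 2dsphere index and a geo_point mapping respectively.

```python
def geojson_point(lat, lng):
    # GeoJSON orders coordinates as [longitude, latitude], the reverse of "lat, lng".
    return {"type": "Point", "coordinates": [lng, lat]}


def mongo_near_filter(lat, lng, max_meters, field="location"):
    """MongoDB filter: documents within `max_meters` of the point, nearest first.

    Assumes `field` holds GeoJSON points and has a 2dsphere index.
    """
    return {
        field: {
            "$near": {
                "$geometry": geojson_point(lat, lng),
                "$maxDistance": max_meters,
            }
        }
    }


def es_text_and_distance_query(text, lat, lng, distance="1mi",
                               text_field="name", geo_field="location"):
    """Elasticsearch body: full-text match combined with a radius filter.

    Assumes `geo_field` is mapped as geo_point.
    """
    return {
        "query": {
            "bool": {
                "must": [{"match": {text_field: text}}],
                "filter": [
                    {"geo_distance": {"distance": distance,
                                      geo_field: {"lat": lat, "lon": lng}}}
                ],
            }
        }
    }


if __name__ == "__main__":
    print(mongo_near_filter(37.7749, -122.4194, max_meters=1609))
    print(es_text_and_distance_query("thai restaurant", 37.7749, -122.4194))
```

With a real client, these dicts would be passed to the driver's find or search call. Putting the radius condition in the `filter` clause keeps it from influencing the text relevance score.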