When working with spatial databases, it often helps to create a spatial index for your datasets. The spatial index makes it more efficient to process spatial queries by helping narrow down the number of features that be tested against. For instance, if you want to know which polygon a point falls into you can check for intersection with EVERY possible polygon, or you can use a spatial index to limit the number of point in polygon operations you have to do. A common spatial index is based on the bounding box of the polygon. Checking if a point is within the bounding box is much quicker than testing for spatial intersection with a complex polygon. By limiting to just the polygons where the point is inside the bounding box, we can limit our need to do the ‘more expensive’ calculation and just calculate against the smaller set of likely polygons.
More information on spatial indexes:
How do you create a spatial index?
Easy. Here is an example of creating an index on a table:
create index <table name>_
on <table name>
using <geometry field>)
Note that some of the functionality will use a spatial index by default (e.g., ). That is super cool. Also note that you can easily force it to ignore the spatial index by using the _ function instead.
Now, let’s do a test to see how much of a difference the spatial index makes!
- Points – a dataset of 1000 random point locations, all within the conterminous US
(gratuitous ugly map of random point data!)
- Polygons – three datasets of different sizes covering entire US (including AK and HI). The datasets are from the US Census cartographic boundary files
- US States (51 polygons) – includes DC
- US Counties (3233 polygons)
- US Census tracts (74002 polygons)
How long does it take for a query to identify the polygon that each point falls inside using the following query?
from random1000 as point
join tl_2018_us_state as poly
on (, )
Or, if you want the actual numbers:
Without spatial index
With spatial index
For reference, the () function uses a spatial index by default, but you can force it to ignore the index by using _() instead (note the underscore ( _ ) before the in the non-indexed query. That is what I used in the tests above.
Seems worthwhile to use a spatial index!