Order of multicolumn gist index - Mailing list pgsql-general
From | Paul van der Linden |
---|---|
Subject | Order of multicolumn gist index |
Date | |
Msg-id | PA4PR04MB9320F0DDD87B68F2696738E5FA6A2@PA4PR04MB9320.eurprd04.prod.outlook.com Whole thread Raw |
List | pgsql-general |
I have the following query:
SELECT *
FROM polygons
WHERE zoomlevel <= {zoom} AND st_intersects(way,{tileboundary})
For any given tile according to the openstreetmap tiles.
So zoomlevel is from 0..14 and the number of polygons in each level is roughly exponential.
Postgres doc (https://www.postgresql.org/docs/current/indexes-multicolumn.html) states that
“A GiST index will be relatively ineffective if its first column has only a few distinct values, even if there are many distinct values in additional columns.”
So I am trying to figure out the difference between the column order in the index.
Created following indices:
CREATE INDEX polygon_minzoom_geo ON polygons USING gist (minzoom,way)
CREATE INDEX polygon_minzoom_geo2 ON polygons USING gist (way,minzoom)
And did a test for a zoom=9 tile:
SELECT *
FROM polygons
WHERE zoomlevel <= 9 AND st_intersects(way,'SRID=3857;POLYGON((547900 6653078,547900 6574807,626172 6574807,626172 6653078,547900 6653078))')
After running that query with either one of the indices disabled (BEGIN; DROP INDEX etc) I get the following explain results:
Using (minzoom,way):
Result (cost=0.42..228992.47 rows=229000 width=113) (actual time=149.483..1471.819 rows=42463 loops=1)
Buffers: shared hit=352653
-> ProjectSet (cost=0.42..4572.47 rows=229000 width=88) (actual time=149.461..1234.048 rows=42463 loops=1)
Buffers: shared hit=274281
-> Index Scan using polygon_minzoom_geo on polygons (cost=0.42..3283.20 rows=229 width=264) (actual time=149.409..955.849 rows=42463 loops=1)
Index Cond: ((minzoom <= 9) AND (way && '0103000020110F00000100000005000000BE5C7A45F81B23412C8346BC25615941BE5C7A45F81B2341B99930DBB51459412611CB3C79B82041B99930DBB51459412611CB3C79B820412C8346BC25615941BE5C7A45F81B23412C8346BC25615941'::geometry))
Filter: ((minzoom <= 9) AND (way && '0103000020110F00000100000005000000BE5C7A45F81B23412C8346BC25615941BE5C7A45F81B2341B99930DBB51459412611CB3C79B82041B99930DBB51459412611CB3C79B820412C8346BC25615941BE5C7A45F81B23412C8346BC25615941'::geometry))
Rows Removed by Filter: 345
Buffers: shared hit=274025
Planning Time: 7.220 ms
Execution Time: 1494.267 ms
Using (way,minzoom):
Result (cost=0.42..228992.47 rows=229000 width=113) (actual time=178.747..1715.135 rows=42463 loops=1)
Buffers: shared hit=350570
-> ProjectSet (cost=0.42..4572.47 rows=229000 width=88) (actual time=178.731..1436.764 rows=42463 loops=1)
Buffers: shared hit=272198
-> Index Scan using polygon_minzoom_geo2 on polygons (cost=0.42..3283.20 rows=229 width=264) (actual time=178.683..1118.691 rows=42463 loops=1)
Index Cond: ((way && '0103000020110F00000100000005000000BE5C7A45F81B23412C8346BC25615941BE5C7A45F81B2341B99930DBB51459412611CB3C79B82041B99930DBB51459412611CB3C79B820412C8346BC25615941BE5C7A45F81B23412C8346BC25615941'::geometry) AND (minzoom <= 9))
Filter: ((way && '0103000020110F00000100000005000000BE5C7A45F81B23412C8346BC25615941BE5C7A45F81B2341B99930DBB51459412611CB3C79B82041B99930DBB51459412611CB3C79B820412C8346BC25615941BE5C7A45F81B23412C8346BC25615941'::geometry) AND (minzoom <= 9))
Rows Removed by Filter: 345
Buffers: shared hit=271942
Planning Time: 9.427 ms
Execution Time: 1742.729 ms
So all in all not really a big difference.
Is this situation somehow special and thus the remark in the documentation not applicable here, or am I missing something in the analysis that would show up the difference?
Paul
P.S. when replying, please include me too
pgsql-general by date: