Tom Lane wrote:
> You may need to bite the bullet and try to devise some real selectivity
> estimation techniques for your geometric operators. The stuff in
> src/backend/utils/adt/geo_selfuncs.c at the moment is all just stubs :-(
I've already done this - it actually gives pretty accurate estimates.
Basically, I do a two-stage pass through the geometry table. The first
pass just calculates the bounding box of all the geometries. I make a
2D histogram structure (much like a really really simple quad tree) that
fits this bounding box. The 2nd pass looks at each individual geometry
in the table - it updates one (or more) cells in the 2D histogram.
When the planner asks for the selectivity of a '&&' query, I pull in the
2D Histogram (its in the geometry_column metadata table) and look at the
query's 'query bounding box' and see which cells of the 2D histogram it
overlaps. I make an estimate of the number of rows the query will
return by looking at the % overlap of the query window and the number of
geometries in that 2D histogram cells. You'd think such a crude test
wouldnt give you good results, but in fact its amazingly accurate.
If for some reason I cannot calculate an estimate, I fall back to the
geo_selfuncs.c method and return some really small number.
The PostGIS mailing list (postgis.refractions.net) has more technical
details.
The only real problem is that the user has to manual keep stats
up-to-date. Is there anyway to attach something to VACUUM ANALYSE?
As an aside, PostGIS is now passing the Open GIS Consortium's "Simple
Features For SQL" conformance test. We're hoping to submit it for the
OGC stamp of approval "very soon."
Thanks for the info on start-up cost being more heavily weighted. The
user who reported this problem didnt have the index-selectivity-stats
package turned on in their database. Once they turned it on, everything
worked correctly. I was just confused as to why the DECLARE and SELECT
were making different plans.
dave