Re: rtree indexes aren't being used with 7.0 - Mailing list pgsql-general

From Jeff Hoffmann
Subject Re: rtree indexes aren't being used with 7.0
Date
Msg-id 39200BEC.5A09AAC6@propertykey.com
Whole thread Raw
List pgsql-general
Tom Lane wrote:

> Jeff, I've applied the attached patch for 7.0.1.  As you'll see if you
> read the comments in that file, the selectivity estimation code for
> r-tree operators is completely bogus.  The idea was to force indexes to
> be used no matter what, but it seems that the numbers I used before were
> actually pretty close to the crossover point.  These should be better.
>
> Perhaps someday someone will try to write selectivity estimators that
> actually mean something for r-trees.  In the meantime, feel free to
> twiddle the numbers in geo_selfuncs.c, and let us know which kluge
> seems to work best ;-)

close, but no cigar.  i kept on dropping that constant until it worked
for all of my tables.  i ended up at 0.0002, but i still haven't tried
it on my biggest tables.  they'll probably be indexing most of the day
so it will be a while before i can check if it works.  i assumed a
fairly linear relationship between # of records and the value of that
constant so that value should work into the low 1 million record range
at least.  i've been trying to avoid mucking around in the source as
much as possible because it seems like it'd be one heckuva learning
curve.  on the other hand it seems silly for me to have to keep on
bumping down that value and recompiling every time my table grows past
some bogus threshold.  how low can that number go?  why did it change so
much from 6.5.3?  IIRC, it was somewhere around 0.25 in 6.5.3.

without understanding how selectivity functions work, would it even be
possible to come up with meaningful functions for geometric types &
rtrees?  my guess is that for ordinary btree type indexes, it would be
based on things like the range of values, which are statistics that i'm
sure are kept (maybe just for this case).  any sort of selectivity on
r-trees would be based on the area you're trying to select vs. the total
area covered by the table.  are these stats kept anywhere?  if they're
not, is there a facility to do that?

jeff

pgsql-general by date:

Previous
From: Tom Lane
Date:
Subject: Re: rtree indexes aren't being used with 7.0
Next
From: Martijn van Oosterhout
Date:
Subject: Re: Dumping and reloading stuff in 6.5.3