Thread: has anybody else used r-tree indexes in 6.5?

has anybody else used r-tree indexes in 6.5?

From

Jeff Hoffmann

Date:

18 June 1999, 15:50:13

has something changed with r-tree indexes in 6.5?  i had a database in
6.4.2 that had a box type and would do queries on it with the box
overlap operator ("&&").  the field i'm querying is indexed with an
r-tree index.  all this worked fine in 6.4.2.  then i did a dump &
restore from 6.4.2 to 6.5 and now the queries that were running ok in
6.4.2 aren't in 6.5.  if i drop the index, everything is fine (albeit
slow).  this is the error i'm getting:

ERROR:  Operator 500 must have a restriction selectivity estimator to be
used in a btree index

which doesn't make sense because it's an r-tree index, first of all. 
and second, i don't know what to look for (especially since this was
working great with 6.4.2).

i'm running on a linux system, compiled with a pretty recent egcs.  the
diffs in the regression tests didn't seem to be anything major.  the
only configuration option i selected was --with-perl when i compiled it,
so so everything should be pretty vanilla.  (i haven't even added in any
of my custom types yet, which have been known to mess with operators in
the past.)

it's really easy to reproduce the error:

create table test_table (area1 box);
insert into test_table values ( '(100,100,200,200)'::box );
create index test_table_index on test_table using rtree ( area1 box_ops
);
select * from test_table where area1 && '(0,0),(100,100)'::box;
drop index test_table_index;
select * from test_table where area1 && '(0,0),(100,100)'::box;

any ideas?

jeff

ps, is it just me, or do things seem a little dead around here today?

Re: [HACKERS] has anybody else used r-tree indexes in 6.5?

From

Tom Lane

Date:

18 June 1999, 18:58:39

Jeff Hoffmann <jeff@remapcorp.com> writes:
> ERROR:  Operator 500 must have a restriction selectivity estimator to be
> used in a btree index

I put in that error check, so I must be guilty :-(.  I didn't think it
should affect r-trees either.  I'll take a look.
        regards, tom lane

Re: [HACKERS] has anybody else used r-tree indexes in 6.5?

From

Tom Lane

Date:

18 June 1999, 19:36:56

Jeff Hoffmann <jeff@remapcorp.com> writes:
> has something changed with r-tree indexes in 6.5?
> ERROR:  Operator 500 must have a restriction selectivity estimator to be
> used in a btree index

What we have here is a big OOOPS.

The reference to "btree" is coming from btreesel, which it turns out is
called by rtreesel as well --- I missed that the first time around.
But that's just a cosmetic problem in the error message.  The real
problem is that none of the operators that are used for rtree indexes
have restriction selectivity estimators.

They *used* to have selectivity estimators in 6.4.2 --- they all pointed
at intltsel, which is pretty much completely inappropriate for an area-
based comparison.  I believe I must have removed those links from the
pg_operator table during one of my cleanup-and-make-consistent passes.

The right fix would be to put in an appropriate selectivity estimator,
but we can't do that as a 6.5.* patch because changing pg_operator
requires an initdb.  It will have to wait for 6.6.  (One of my to-do
items for 6.6 was to rewrite the selectivity estimators anyway, so I'll
see what I can do.)  In the meantime, I think the only possible patch is
to disable the error check in btreesel and have it return a default
selectivity estimate instead of complaining.  Drat.

Apparently, none of the regression tests exercise rtree indexes at all,
else we'd have known there was a problem.  Adding an rtree regression test
seems to be strongly indicated as well...
        regards, tom lane

RE: [HACKERS] has anybody else used r-tree indexes in 6.5?

From

"Stupor Genius"

Date:

18 June 1999, 20:47:19

> Jeff Hoffmann <jeff@remapcorp.com> writes:
> > has something changed with r-tree indexes in 6.5?
> > ERROR:  Operator 500 must have a restriction selectivity estimator to be
> > used in a btree index
>
> ...
>
> The right fix would be to put in an appropriate selectivity estimator,
> but we can't do that as a 6.5.* patch because changing pg_operator
> requires an initdb.  It will have to wait for 6.6.

Would it possible to write an sql script to put in the changes?

Seems it should be possible to do this for any non-code changes
that affect the system tables.

Darren

Re: [HACKERS] has anybody else used r-tree indexes in 6.5?

From

Bruce Momjian

Date:

18 June 1999, 21:00:53

> The right fix would be to put in an appropriate selectivity estimator,
> but we can't do that as a 6.5.* patch because changing pg_operator
> requires an initdb.  It will have to wait for 6.6.  (One of my to-do
> items for 6.6 was to rewrite the selectivity estimators anyway, so I'll
> see what I can do.)  In the meantime, I think the only possible patch is
> to disable the error check in btreesel and have it return a default
> selectivity estimate instead of complaining.  Drat.
> 
> Apparently, none of the regression tests exercise rtree indexes at all,
> else we'd have known there was a problem.  Adding an rtree regression test
> seems to be strongly indicated as well...

Sounds like a good fix.  Bypass the system tables, since we can't change
them,  and hard-wire a selectivity.


--  Bruce Momjian                        |  http://www.op.net/~candle maillist@candle.pha.pa.us            |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026

Re: [HACKERS] has anybody else used r-tree indexes in 6.5?

From

Bruce Momjian

Date:

18 June 1999, 21:22:08

[Charset iso-8859-1 unsupported, filtering to ASCII...]
> > Jeff Hoffmann <jeff@remapcorp.com> writes:
> > > has something changed with r-tree indexes in 6.5?
> > > ERROR:  Operator 500 must have a restriction selectivity estimator to be
> > > used in a btree index
> >
> > ...
> >
> > The right fix would be to put in an appropriate selectivity estimator,
> > but we can't do that as a 6.5.* patch because changing pg_operator
> > requires an initdb.  It will have to wait for 6.6.
> 
> Would it possible to write an sql script to put in the changes?
> 
> Seems it should be possible to do this for any non-code changes
> that affect the system tables.

Yes, we could, and have in the past had SQL scripts that are run as part
of the upgrade, but this fix is easier to do in C because it doesn't
require that difficult step for upgraders.

--  Bruce Momjian                        |  http://www.op.net/~candle maillist@candle.pha.pa.us            |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026

Re: [HACKERS] has anybody else used r-tree indexes in 6.5?

From

Thomas Lockhart

Date:

18 June 1999, 22:10:51

> has something changed with r-tree indexes in 6.5?
> any ideas?

revision 1.29
date: 1999/05/31 19:32:47;  author: tgl;  state: Exp;  lines: +61 -5
Generate a more specific error message when an operator used
in an index doesn't have a restriction selectivity estimator.

Tom, was there anything more here than the new elog error exit itself?
It used to ignore the missing estimator, or fail farther in to the
code?

> ps, is it just me, or do things seem a little dead around here today?

At your office, or on the list? Most of us were waiting for your
message ;)
                  - Thomas

-- 
Thomas Lockhart                lockhart@alumni.caltech.edu
South Pasadena, California

Re: [HACKERS] has anybody else used r-tree indexes in 6.5?

From

Thomas Lockhart

Date:

18 June 1999, 22:19:11

> What we have here is a big OOOPS.
> The right fix would be to put in an appropriate selectivity estimator,
> but we can't do that as a 6.5.* patch because changing pg_operator
> requires an initdb.  It will have to wait for 6.6.  (One of my to-do
> items for 6.6 was to rewrite the selectivity estimators anyway, so I'll
> see what I can do.)

Uh, I think we *should* do it as a patch, just not one applied to the
cvs tree for the v.6.5.x branch. Let's apply it to the main cvs branch
once we do the split, and Jeff can use a snapshot at that time (since
it will strongly resemble v6.5 and since he wants the capability).

In the meantime, can you/we develop a set of patches for Jeff to use?
Once we have them, we can post them into
ftp://postgresql.org/pub/patches, which probably needs to be cleaned
out from the v6.4.x period.

Let me know if I can help with any of this...

>  In the meantime, I think the only possible patch is
> to disable the error check in btreesel and have it return a default
> selectivity estimate instead of complaining.  Drat.

... and let's use this solution for the v6.5.x branch, once it comes
into being.
                   - Thomas

-- 
Thomas Lockhart                lockhart@alumni.caltech.edu
South Pasadena, California

Re: [HACKERS] has anybody else used r-tree indexes in 6.5?

From

Tom Lane

Date:

18 June 1999, 22:37:27

Thomas Lockhart <lockhart@alumni.caltech.edu> writes:
>> In the meantime, I think the only possible patch is
>> to disable the error check in btreesel and have it return a default
>> selectivity estimate instead of complaining.  Drat.

> ... and let's use this solution for the v6.5.x branch, once it comes
> into being.

I've already done that, committed it, and posted the patch on
pgsql-patches.  We can reverse out the patch after something's
been done to provide reasonable selectivity estimates for rtrees.

Applying intltsel, as 6.4 did, was so bogus that it's difficult
to argue that the resulting numbers were better than the 0.5
default estimate I just put into btreesel ;-) ... so I feel no
special desire to return to the status quo ante.  I have a to-do
list item to look at the whole selectivity estimation business,
and I will try to figure out something reasonable for rtrees
while I'm at it.  It may be a while before that gets to the top
of the to-do list (unless someone else gets to it before I do),
but I think this patch will do fine until then.

Mostly I'm embarrassed that we didn't notice the problem during
beta testing :-(.  No regression test, and no users of rtrees
in the beta population either, it would seem.
        regards, tom lane

Re: [HACKERS] has anybody else used r-tree indexes in 6.5?

From

Tom Lane

Date:

18 June 1999, 22:59:02

Thomas Lockhart <lockhart@alumni.caltech.edu> writes:
> date: 1999/05/31 19:32:47;  author: tgl;  state: Exp;  lines: +61 -5
> Generate a more specific error message when an operator used
> in an index doesn't have a restriction selectivity estimator.

> Tom, was there anything more here than the new elog error exit itself?
> It used to ignore the missing estimator, or fail farther in to the
> code?

That code useta look something like
fmgr(get_oprrest(operatorOID), ...)

so that if get_oprrest returned 0 you'd get an error message along the
lines of "fmgr: no function cache entry for OID 0".  This was pretty
unhelpful, of course, and someone complained about it a few weeks ago;
so I added a test for missing oprrest.  That wasn't what broke things
... what broke things was my removal of seemingly bogus oprrest links
from pg_operator, which I think I did on 4/10:

revision 1.56
date: 1999/04/10 23:53:00;  author: tgl;  state: Exp;  lines: +99 -99
Fix another batch of bogosities in pg_operator table.
These were bogus selectivity-estimator links, like a '>' operator
pointing to intltsel when it should use intgtsel.
        regards, tom lane

Re: [HACKERS] has anybody else used r-tree indexes in 6.5?

From

Bruce Momjian

Date:

18 June 1999, 23:01:13

> >  In the meantime, I think the only possible patch is
> > to disable the error check in btreesel and have it return a default
> > selectivity estimate instead of complaining.  Drat.
> 
> ... and let's use this solution for the v6.5.x branch, once it comes
> into being.

He already did this, Thomas.  The 6.5.x branch is currently our only
active branch.  Any patches now applied appear in 6.5.x and 6.6.  We
will split it when someone needs to start on 6.6-only features.

--  Bruce Momjian                        |  http://www.op.net/~candle maillist@candle.pha.pa.us            |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026

Re: [HACKERS] has anybody else used r-tree indexes in 6.5?

From

Bruce Momjian

Date:

18 June 1999, 23:14:57

> 
> Applying intltsel, as 6.4 did, was so bogus that it's difficult
> to argue that the resulting numbers were better than the 0.5
> default estimate I just put into btreesel ;-) ... so I feel no
> special desire to return to the status quo ante.  I have a to-do
> list item to look at the whole selectivity estimation business,
> and I will try to figure out something reasonable for rtrees
> while I'm at it.  It may be a while before that gets to the top
> of the to-do list (unless someone else gets to it before I do),
> but I think this patch will do fine until then.
> 
> Mostly I'm embarrassed that we didn't notice the problem during
> beta testing :-(.  No regression test, and no users of rtrees
> in the beta population either, it would seem.
> 

No reason to be emarrassed.  This 6.5 release is our smoothest yet. 
Sometimes, we had some pretty major initial problems, and at this stage,
we would be figuring out when we needed to get the next subrelease out.

We are sitting around at this point, just tweeking things, and have no
major need to rush into a minor release to fix problems because we don't
have a flood of identical bug reports that have users screaming.

--  Bruce Momjian                        |  http://www.op.net/~candle maillist@candle.pha.pa.us            |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026

Re: [HACKERS] has anybody else used r-tree indexes in 6.5?

From

Don Baccus

Date:

19 June 1999, 01:06:18

At 11:11 PM 6/18/99 -0400, Bruce Momjian wrote:

>We are sitting around at this point, just tweeking things, and have no
>major need to rush into a minor release to fix problems because we don't
>have a flood of identical bug reports that have users screaming.

Actually, I noticed this also.  I don't have experience with your
earlier release attempts, but there were enough "wink, wink - we'll
be busy chasing initial release bugs" type comments to make me wonder.

This release is ... somewhat mature.  And you seem to be capturing
areas that regression testing doesn't touch.  I don't have to tell
you that the writing of tests is of crucial importance (though of
course entirely inadequate!). 

I'm really impressed with this release...

- Don Baccus, Portland OR <dhogaza@pacifier.com> Nature photos, on-line guides, and other goodies at
http://donb.photo.net

Re: [HACKERS] has anybody else used r-tree indexes in 6.5?

From

Bernard Frankpitt

Date:

19 June 1999, 16:53:22

I read through some of the papers about R-trees and GIST about a year
ago,
and it seems that estimating costs for R-tree searches (and GIST
searches) is
not so straightforward as B-Trees. 

Hellerstein et al. 1995 write "...currently such estimates are reasonably accurate for B+ trees
and       less so for R-Trees. Recently, some work on R-tree cost
estimation         has been done by [FK94], but more work is required to bring
this to         bear on GISTs in general...." 

The reference that they give is 

[FK94] Christos Faloutsos and Ibrahim Kamel. "Beyond Uniformity and
Independence: Analysis of R-trees using the concept of fractal
dimension.
Proc. 13th ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database
Systems, pp 4--13, Minneapolis, May 1994


I don't have the Faloustos paper.  The R-tree code authors, and the GIST
authors just used the B-Tree code as an expedient solution. 

Bernie Frankpitt

Re: [HACKERS] has anybody else used r-tree indexes in 6.5?

From

"Ross J. Reedstrom"

Date:

21 June 1999, 01:07:33

On Sat, Jun 19, 1999 at 09:12:11PM +0000, Bernard Frankpitt wrote:
> I read through some of the papers about R-trees and GIST about a year
> ago,
> and it seems that estimating costs for R-tree searches (and GIST
> searches) is
> not so straightforward as B-Trees. 
> 
> Hellerstein et al. 1995 write 
>     "...currently such estimates are reasonably accurate for B+ trees
> and       less so for R-Trees. Recently, some work on R-tree cost
> estimation         has been done by [FK94], but more work is required to bring
> this to         bear on GISTs in general...." 
> 
> The reference that they give is 
> 
> [FK94] Christos Faloutsos and Ibrahim Kamel. "Beyond Uniformity and
> Independence: Analysis of R-trees using the concept of fractal
> dimension.
> Proc. 13th ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database
> Systems, pp 4--13, Minneapolis, May 1994
> 
Hmm, a quick Google search for these two authors hit on a great index server
in Germany:
http://www.informatik.uni-trier.de/~ley/db/index.html
http://www.informatik.uni-trier.de/~ley/db/indices/a-tree/f/Faloutsos:Christos.htm

And that paper in particular:
http://www.informatik.uni-trier.de/~ley/db/conf/pods/pods94-4.html

Which gives an abstract, access to an electronic version (ACS membership
required) and a cite for a more recent (1997) journal paper.

HTH,
Ross

-- 
Ross J. Reedstrom, Ph.D., <reedstrm@rice.edu> 
NSBRI Research Scientist/Programmer
Computer and Information Technology Institute
Rice University, 6100 S. Main St.,  Houston, TX 77005

Re: [HACKERS] has anybody else used r-tree indexes in 6.5?

From

Jeff Hoffmann

Date:

21 June 1999, 11:42:42

Tom Lane wrote:
> 
> Jeff Hoffmann <jeff@remapcorp.com> writes:
> > has something changed with r-tree indexes in 6.5?
> > ERROR:  Operator 500 must have a restriction selectivity estimator to be
> > used in a btree index
> 
> What we have here is a big OOOPS.

i guess so.  the patch works fine, though, so no big deal.  i thought it
was weird that i hadn't heard it come up before since it didn't seem
like something i could have caused, but you never know.

> Apparently, none of the regression tests exercise rtree indexes at all,
> else we'd have known there was a problem.  Adding an rtree regression test
> seems to be strongly indicated as well...

i noticed this when i ran the regression tests and everything came out
ok, but forgot to mention it.  if i recall correctly, what's actually in
the geometry regression test is pretty weak.  i think it only really
tests some of the common cases, not all of the functions.  it's probably
not a high priority item, though, since, judging by how long it took for
this bug to surface, there aren't a lot of people using the geometry
functions/types.

Re: [HACKERS] has anybody else used r-tree indexes in 6.5?

From

Tom Lane

Date:

21 June 1999, 17:41:53

Jeff Hoffmann <jeff@remapcorp.com> writes:
> Tom Lane wrote:
>> Apparently, none of the regression tests exercise rtree indexes at all,
>> else we'd have known there was a problem.  Adding an rtree regression test
>> seems to be strongly indicated as well...

> i noticed this when i ran the regression tests and everything came out
> ok, but forgot to mention it.  if i recall correctly, what's actually in
> the geometry regression test is pretty weak.  i think it only really
> tests some of the common cases, not all of the functions.  it's probably
> not a high priority item, though, since, judging by how long it took for
> this bug to surface, there aren't a lot of people using the geometry
> functions/types.

That's exactly why we need a more thorough regression test.  The core
developers aren't doing much with the geometry operations, and evidently
neither are any of the frontline beta testers.  So, if the regression
tests don't cover the material either, we stand a good chance of
breaking things and not even knowing it --- which is exactly what
happened here.

It seems that you do make use of the geometry operations; perhaps
you would be willing to work up some more-thorough regression tests?
You're certainly better qualified to do it than I am...
        regards, tom lane

Re: [HACKERS] has anybody else used r-tree indexes in 6.5?

From

Jeff Hoffmann

Date:

22 June 1999, 10:37:44

Tom Lane wrote:

> That's exactly why we need a more thorough regression test.  The core
> developers aren't doing much with the geometry operations, and evidently
> neither are any of the frontline beta testers.  So, if the regression
> tests don't cover the material either, we stand a good chance of
> breaking things and not even knowing it --- which is exactly what
> happened here.
> 
> It seems that you do make use of the geometry operations; perhaps
> you would be willing to work up some more-thorough regression tests?
> You're certainly better qualified to do it than I am...
> 

well, depending on how complete you want the regression tests, this
could be fairly easy.  after a quick look at the tests, it seems like
the only type that is really left out is line (which i don't know if
there are any native operators for it anyway, all i know about are the
ones for lsegs).  just a simple select for the forgotten operators in
with the test of other operators for each type would be an improvement. 
i think all of the functions are covered in at least one place.  again,
though, i think everything that i use on a regular basis is covered in
the regression test.  so overall it's really not that bad.  except, of
course, for the bug i uncovered.  there actually is a place where an
rtree index is created, but nothing is every selected against it, which
is what caused this error to go unnoticed.  i haven't looked closely
enough at parts of the other regression tests to see if there are any
selects where indexes come in to play, but it'd be a good idea to make
sure indexes are actually used in the tests for all access methods (and
op classes? - i can't really imagine when this would be a problem, but
who knows). 

i'll try updating some of the dedicated tests (box.sql, circle.sql,
geometry.sql, lseg.sql, path.sql, polygon.sql), but i'm not sure where
testing the rtree indexes should go.  i think other index types are
tested in select.sql, but i'd probably put them in geometry.sql.  does
anybody care?  is there someone that oversees the methods and
organization of the regression tests or do people just throw in new
tests when there's something new?

Re: [HACKERS] has anybody else used r-tree indexes in 6.5?

From

Thomas Lockhart

Date:

22 June 1999, 11:17:33

> i'll try updating some of the dedicated tests (box.sql, circle.sql,
> geometry.sql, lseg.sql, path.sql, polygon.sql), but i'm not sure where
> testing the rtree indexes should go.  i think other index types are
> tested in select.sql, but i'd probably put them in geometry.sql.  does
> anybody care?  is there someone that oversees the methods and
> organization of the regression tests or do people just throw in new
> tests when there's something new?

Well, Marc and I had reorganized the regression tests a couple of
years ago, and most of the organizational changes since then have been
done by us too (Marc handling the platform-specific stuff, and I the
tests themselves). But new test areas have been added by others, and
we certainly could use more contributions to existing tests,
reorganizing them if that seems advisable.

I agree with your suggestion to put rtree testing in geometry.sql, at
least until the size of the new tests would suggest separating it into
a new "rtree.sql" test.

Go ahead and do something. We'll apply it to the tree, and if there is
something which provokes someone else into modifying it, we'll do it
then. But I'm sure whatever you do will be fine, since you have
clearly already given some thought to it.

Thanks...
                    - Thomas

-- 
Thomas Lockhart                lockhart@alumni.caltech.edu
South Pasadena, California

Re: [HACKERS] has anybody else used r-tree indexes in 6.5?

From

Tom Lane

Date:

22 June 1999, 11:18:23

Jeff Hoffmann <jeff@remapcorp.com> writes:
> i'll try updating some of the dedicated tests (box.sql, circle.sql,
> geometry.sql, lseg.sql, path.sql, polygon.sql), but i'm not sure where
> testing the rtree indexes should go.  i think other index types are
> tested in select.sql, but i'd probably put them in geometry.sql.  does
> anybody care?  is there someone that oversees the methods and
> organization of the regression tests or do people just throw in new
> tests when there's something new?

AFAIK we have no regression-test-meister (though we should).  Do what
seems reasonable.

About the only stylistic thing I'd suggest is to try to avoid machine
dependent results.  For example, the existing geometry.sql test causes
a lot of uninteresting comparison failures on many machines because of
small variations in roundoff error.  So, if you can exercise a feature
using only exact-integral inputs and results, do it that way rather than
making up "realistic" test data.  (A lot of people would be very happy
if you could revise this problem away in geometry.sql ... but if that
seems like more work than you bargained for, don't worry about it.
Extending the test coverage is the high-priority task, I think.)

I'd be inclined to agree that rtree indexes should be tested in one
of the existing geometry-related tests, or perhaps in a brand new
regression test, rather than sticking them into the generic select.sql
test.

Thanks for taking a shot at it!
        regards, tom lane