Thread: GiST rtree logic is not right
It occurred to me to wonder whether contrib/rtree_gist fixes the rtree bug documented here: http://archives.postgresql.org/pgsql-general/2004-03/msg01143.php The answer is unfortunately "no". In the regression database, install rtree_gist and do: regression=# create table gist_emp4000 as select * from slow_emp4000; SELECT regression=# create index grect2ind ON gist_emp4000 USING gist (home_base); CREATE INDEX regression=# select count(*) from gist_emp4000 where home_base << '(35565,5404),(35546,5360)';count ------- 2144 (1 row) The correct answer is regression=# select count(*) from slow_emp4000 where home_base << '(35565,5404),(35546,5360)';count ------- 2214 (1 row) Now this is noticeably better than the rtree implementation, which finds only 1363 rows, but broken is still broken :-( This is doubtless not as high priority as the concurrency stuff you are working on, but it'd be good to fix anyway. I was thinking of proposing that we move rtree_gist into the core --- but the case for it would be stronger if it worked correctly ... regards, tom lane
I'll look at problem after GiST concurrency. Fixing rtree_gist is bug a fix, not a new feature, so I'm not limited by 1 July. > This is doubtless not as high priority as the concurrency stuff you > are working on, but it'd be good to fix anyway. I was thinking of > proposing that we move rtree_gist into the core --- but the case for > it would be stronger if it worked correctly ... > -- Teodor Sigaev E-mail: teodor@sigaev.ru WWW: http://www.sigaev.ru/
> I'll look at problem after GiST concurrency. Fixing > rtree_gist is bug a fix, not a new feature, so I'm not > limited by 1 July. Wont fixing rtree(_gist) require initdb, since the behaviour of the operators will change? ... John
Hmm, something is broken in HEAD. I recall it worked in STABLE, here is what I have Welcome to psql 8.0.3, the PostgreSQL interactive terminal. Type: \copyright for distribution terms \h for help with SQL commands \? for help with psql commands \gor terminate with semicolon to execute query \q to quit regression=# create index grect2ind ON gist_emp4000 USING gist (home_base); CREATE INDEX regression=# select count(*) from gist_emp4000 where home_base << '(35565,5404) ,(35546,5360)'; count ------- 2214 (1 row) btw, Teodor, I noticed gist_print from gevel package doesn't works in HEAD regression=# select * from gist_print('grect2ind') as t(level int, a box) where level =1; ERROR: function return row and query-specified return row do not match DETAIL: Returned row contains 3 attributes, but query expects 2. Oleg On Wed, 22 Jun 2005, Tom Lane wrote: > It occurred to me to wonder whether contrib/rtree_gist fixes the rtree > bug documented here: > http://archives.postgresql.org/pgsql-general/2004-03/msg01143.php > > The answer is unfortunately "no". In the regression database, > install rtree_gist and do: > > regression=# create table gist_emp4000 as select * from slow_emp4000; > SELECT > regression=# create index grect2ind ON gist_emp4000 USING gist (home_base); > CREATE INDEX > regression=# select count(*) from gist_emp4000 where home_base << '(35565,5404),(35546,5360)'; > count > ------- > 2144 > (1 row) > > The correct answer is > > regression=# select count(*) from slow_emp4000 where home_base << '(35565,5404),(35546,5360)'; > count > ------- > 2214 > (1 row) > > Now this is noticeably better than the rtree implementation, which finds > only 1363 rows, but broken is still broken :-( > > This is doubtless not as high priority as the concurrency stuff you > are working on, but it'd be good to fix anyway. I was thinking of > proposing that we move rtree_gist into the core --- but the case for > it would be stronger if it worked correctly ... > > regards, tom lane > Regards, Oleg _____________________________________________________________ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83
Oleg Bartunov <oleg@sai.msu.su> writes: > something is broken in HEAD. I recall it worked in STABLE, here is what I have No, actually it is broken in 8.0 too. The difference is that 8.0 defaults to not using an indexscan for this query. In 8.0 I see: regression=# set enable_seqscan TO 1; SET regression=# explain select count(*) from gist_emp4000 where home_base << '(35565,5404),(35546,5360)'; QUERY PLAN ---------------------------------------------------------------------Aggregate (cost=65.53..65.53 rows=1 width=0) -> SeqScan on gist_emp4000 (cost=0.00..64.75 rows=310 width=0) Filter: (home_base << '(35565,5404),(35546,5360)'::box) (3 rows) regression=# select count(*) from gist_emp4000 where home_base << '(35565,5404),(35546,5360)';count ------- 2214 (1 row) regression=# set enable_seqscan TO 0; SET regression=# explain select count(*) from gist_emp4000 where home_base << '(35565,5404),(35546,5360)'; QUERY PLAN ----------------------------------------------------------------------------------------Aggregate (cost=112.96..112.96 rows=1width=0) -> Index Scan using grect2ind on gist_emp4000 (cost=0.00..112.18 rows=310 width=0) Index Cond: (home_base<< '(35565,5404),(35546,5360)'::box) (3 rows) regression=# select count(*) from gist_emp4000 where home_base << '(35565,5404),(35546,5360)';count ------- 2144 (1 row) In CVS tip the default plan is: regression=# explain select count(*) from gist_emp4000 where home_base << '(35565,5404),(35546,5360)'; QUERY PLAN --------------------------------------------------------------------------------Aggregate (cost=35.74..35.74 rows=1 width=0) -> Bitmap Heap Scan on gist_emp4000 (cost=5.08..34.96 rows=310 width=0) Recheck Cond: (home_base << '(35565,5404),(35546,5360)'::box) -> Bitmap Index Scan on grect2ind (cost=0.00..5.08 rows=310 width=0) Index Cond: (home_base << '(35565,5404),(35546,5360)'::box) (5 rows) regression=# select count(*) from gist_emp4000 where home_base << '(35565,5404),(35546,5360)';count ------- 2144 (1 row) regards, tom lane
"John Hansen" <john@geeknet.com.au> writes: >> I'll look at problem after GiST concurrency. Fixing >> rtree_gist is bug a fix, not a new feature, so I'm not >> limited by 1 July. > Wont fixing rtree(_gist) require initdb, since the behaviour of the > operators will change? Possibly, but we never guarantee no initdb until final release anyway. Teodor is right that this is a bug fix and so can be postponed on its own terms. But moving rtree_gist into the core looks like a feature change to me, so if that's going to happen it has to happen before 1 July. It would be a lot easier to sell that if it gave the right answers ;-) regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> writes: > It would be a lot easier to sell that if it gave the right answers ;-) Of course given that the real rtree index gives the wrong answers perhaps moving rtree_gist into core (after fixing this) is just a bug fix :) -- greg