Re: Question about rtrees (overleft replacing left in nodes) - Mailing list pgsql-general

From William White
Subject Re: Question about rtrees (overleft replacing left in nodes)
Date
Msg-id 406B0003.8030902@frognet.net
Whole thread Raw
In response to Re: Question about rtrees (overleft replacing left in nodes)  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-general
Tom Lane wrote:

> It'd need to be
>
> S &< T iff inf(S) <= sup(T)
>
> to satisfy the geometrical intuition.  (You could quibble about the
> equality case, but box_overlap seems to consider touching boxes to
> overlap, so for consistency this should too.)
>
> However, if this is indeed wrong, why have we not heard bug reports
> stating that rtree indexes don't work?

Because the positionsel/positionjoinsel estimator functions make it all
but impossible to invoke index search on r-trees for position operators;
see below.


 > Can you generate a test case
> in which it fails?

Yes, actually; I was able to demonstrate there is a problem after I'd
already sent off the email.  But you have to "cheat", because the
default cost estimator functions left, overleft, right, and overright
all return a large fixed value (0.1, IIRC two orders of magnitude over
areasel/areajoinsel) which all but guarantees sequential scan for any
relvar with a reasonable number of tuples (I tried 100,000 tuples and
still got sequential scan).

These are, of course, constant cost estimator functions.

The following Perl script temporarily replaces oprrest with "areasel"
and oprjoin with "areajoinsel" (instead of positionsel and
positionjoinsel rsp.).  WARNING: This assumes oids 493 and 494
correspond to box_ops << and &< respectively, that happens to be true in
the pgsql version I'm using at the moment.  Could be replaced with test
for operator name and data type, of course.

Suggested remediation, if this is reproducible:

1. Modify overleft and overright to actually mean "overlap or left" and
"overlap or right" respectively, which might break existing code, or

2. Add two new functions, e.g., "overlap_or_left" and
"overlap_or_right", and two operators (e.g., "&&||<<" and "&&||>>" ...
yes, I'm kidding about the ugly operator names), and modify the operator
class definition to use the new operators instead of the old.  Existing
code would continue to work (since &< and &> would now never invoke
index scan), new code could take advantage of index scan on position
operators (assuming the cost estimator were modified to permit this).

Code follows.  Apologies for Perl rather than straight SQL, this was
adapted from part of a test suite for an interval (as in "interval in R"
not the IMNSHO misnamed SQL "interval" type) extension I'm writing,
which is why I ran across the problem.

---- cut here ----

#!/usr/bin/perl
open(P,"|psql >/dev/null 2>&1");
print P <<"EOF";
DROP INDEX i_test2;
DROP TABLE test2;
CREATE TABLE test2
(
     i box NOT NULL
);

CREATE INDEX i_test2 ON test2 using rtree (i);
EOF

$t0 = time;
for ($i=1;$i<=1000;$i++)
{
     printf(P "INSERT INTO test2 (i) VALUES ('(%d,%d),(%d,%d)');\n",
             $i+1,1,$i,0);
}
print P "VACUUM ANALYZE;\n";
close(P);


open(P,"|psql");
print P <<"EOF";

-- Temporarily use cost estimators for areasel/areajoinsel on << and &<
update pg_operator set oprrest = 'areasel', oprjoin = 'areajoinsel'
     where oid = 494 or oid = 493;

-- This works (at least for 10 results) since it'll be towards the right
-- half of leaves

SELECT * FROM test2 WHERE i &< ('(999.5,1),(998.5,0)') LIMIT 1;

-- This fails since it's to the left half of leaves

SELECT * FROM test2 WHERE i &< ('(11.5,1),(10.5,0)') LIMIT 1;

-- Restore original cost estimators
update pg_operator set oprrest = 'positionsel', oprjoin = 'positionjoinsel'
     where oid = 494 or oid = 493;

-- Now they both work:
SELECT * FROM test2 WHERE i &< ('(999.5,1),(998.5,0)') LIMIT 1;
SELECT * FROM test2 WHERE i &< ('(11.5,1),(10.5,0)') LIMIT 1;

EOF

---- cut here ----


pgsql-general by date:

Previous
From: John DeSoi
Date:
Subject: row-level security model
Next
From: Tom Lane
Date:
Subject: Re: select distinct w/order by