Re: [HACKERS] "ExecInitIndexScan: both left and right..." meaning? - Mailing list pgsql-hackers

From Tom Lane
Subject Re: [HACKERS] "ExecInitIndexScan: both left and right..." meaning?
Date
Msg-id 8764.945633824@sss.pgh.pa.us
Whole thread Raw
In response to [HACKERS] "ExecInitIndexScan: both left and right..." meaning?  (Ed Loehr <ELOEHR@austin.rr.com>)
List pgsql-hackers
Well, the good news is that I understand this problem, and in fact fixed
it in current sources several months ago.  The bad news is that the fix
is part of some rather extensive planner revisions, and I don't think
it's going to be safe/practical to back-patch it into REL6_5.

It's possible to duplicate the bug in 6.5.* with this test case:

create table t1 (f1 int, f2 int);
create index t1i on t1 (f1,f2);
create table t2 (f1 int, f2 int);
create index t2i on t2 (f1,f2);
create table t3 (f1 int, f2 int);
create index t3i on t3 (f1,f2);

select t1.f1 from t1,t2,t3 where
t1.f1=1 and t2.f1=1 and t3.f1=1 and
t1.f2=t2.f2 and t1.f2=t3.f2 and t2.f2=t3.f2;
ERROR:  ExecInitIndexScan: both left and right op's are rel-vars

The generated query plan is

Nested Loop  (cost=6.05 rows=1 width=16) ->  Nested Loop  (cost=4.05 rows=1 width=12)       ->  Index Scan using t3i on
t3 (cost=2.05 rows=1 width=4)       ->  Index Scan using t1i on t1  (cost=2.00 rows=1 width=8) ->  Index Scan using t2i
ont2  (cost=2.00 rows=1 width=4)
 

and the source of the problem is that in the innermost indexscan on t1,
the planner is trying to use the WHERE clause "t1.f2=t2.f2" as an index
qualification.  But it can't do that because in this query plan, t2
hasn't been joined to t1 yet.  (It should have used "t1.f2=t3.f2"
instead; the value of t3.f2 is available since t3 is the outer side of
the nestloop join, meaning that in any one scan of t1, a fixed tuple
from t3 is being considered.)

The reason it makes this mistake is an ill-chosen data structure for
representing which WHERE clauses require which sets of outer relations
in order to be used as indexquals on the inside of a nestloop.  In 6.5,
that info was attached to the first WHERE clause in the list of clauses
that might possibly be used with the index --- which in this example is
the t1.f1=1 clause.  Trouble is, that clause can *also* be used in
joining t1 to t3, and there's only one of it, which means its
outer-join-relation field gets overwritten with the info for the join
considered last.  So by the time we actually get around to generating
the plan, the clause list is marked as "OK to use in joining t1 to t3".
Oops.

I have fixed this in current sources by removing the field in question
from RestrictInfo nodes and storing the information in separate lists.
But it's a pretty major change and I don't want to try to back-patch it.

I would suggest, instead, that you work around the problem until 7.0
comes out.  I think you could do this by removing your two-column
indexes in favor of single-column indexes, or even just switching the
order of the indexes (in the above test case, no bug is seen if the
indexes are declared on (f2,f1)).  However switching the order would be
a bit fragile since it'd depend on which fields you compare to constants
and which ones you use as join keys in your queries.  If that doesn't
work, a brute-force solution is to run your application with environment
variable PGOPTIONS="-fn" (forbid nestloop joins), which discourages the
planner from considering nestloop joins at all.  The bug will not arise
if a merge or hash join plan is used.
        regards, tom lane


pgsql-hackers by date:

Previous
From: Keith Parks
Date:
Subject: RE: [HACKERS] NOTICE: LockRelease: locktable lookup failed, no lock
Next
From: Ed Loehr
Date:
Subject: Re: [HACKERS] "ExecInitIndexScan: both left and right..." meaning?