Re: Project proposal/comments please - query optimization - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Project proposal/comments please - query optimization
Date
Msg-id 22543.1123773443@sss.pgh.pa.us
Whole thread Raw
In response to Project proposal/comments please - query optimization  (Kim Bisgaard <kib+pg@dmi.dk>)
Responses Re: Project proposal/comments please - query optimization  (Kim Bisgaard <kib+pg@dmi.dk>)
List pgsql-hackers
Kim Bisgaard <kib+pg@dmi.dk> writes:
> I have noticed a deficiency in the current query optimizer related to 
> "full outer joins". Tom Lane has confirmed to me that it will not be 8.1 
> material.

The particular case you are complaining of is fixed in CVS tip.  There
are related issues involving N-way joins that we're still not very
good at.

regression=# create table at (x int, y timestamp, av text);
CREATE TABLE
regression=# create table bt (x int, y timestamp, bv text);
CREATE TABLE
regression=# create index ati on at(x,y);
CREATE INDEX
regression=# create index bti on bt(x,y);
CREATE INDEX
regression=# explain SELECT x, y, av, bv FROM at a FULL OUTER JOIN bt b USING (x, y) WHERE x = 52981 AND y = '2004-1-1
0:0:0';                                         QUERY PLAN
 
------------------------------------------------------------------------------------------------Merge Full Join
(cost=0.00..9.66rows=1 width=88)  ->  Index Scan using ati on "at" a  (cost=0.00..4.83 rows=1 width=44)        Index
Cond:((x = 52981) AND (y = '2004-01-01 00:00:00'::timestamp without time zone))  ->  Index Scan using bti on bt b
(cost=0.00..4.83rows=1 width=44)        Index Cond: ((x = 52981) AND (y = '2004-01-01 00:00:00'::timestamp without time
zone))
(5 rows)

regression=#

This only works for WHERE clauses that equate join alias variables to
pseudoconstants.  I have this in my notes:

Consider this version of Kim Bisgaard's example:SELECT FROM a join (b full join c using (id)) using (id)
If A is small and B,C have indexes on ID then it is interesting to consider
a plan likeNest Loop    Scan A    Merge Full Join        Indexscan B using id = outer.id        Indexscan C using id =
outer.id
We are fairly far from being able to do this. generate_outer_join_implications
could easily be modified to generate derived equalities (I think it works to
allow a deduction against any clause not overlapping the outerjoin itself)
but the planner would want to evaluate them at the wrong level, and the
executor doesn't have support for passing the outer variable down more than
one level of join.  This is why the existing hack works only for equalities
to pseudoconstants.  We could maybe mark join RestrictInfos as "valid only
below xxx" and ignore them when processing a join that includes all of the
indicated rels?  Still not clear how you get the planner to recognize the
above as an inner indexscan situation though.
        regards, tom lane


pgsql-hackers by date:

Previous
From: Matt Miller
Date:
Subject: Re: [GENERAL] Testing of MVCC
Next
From: Martijn van Oosterhout
Date:
Subject: Determining return type of polymorphic function