Re: Proposed Query Planner TODO items - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Proposed Query Planner TODO items
Date
Msg-id 29947.1070985059@sss.pgh.pa.us
Whole thread Raw
In response to Proposed Query Planner TODO items  (Josh Berkus <josh@agliodbs.com>)
Responses Re: Proposed Query Planner TODO items
Re: Proposed Query Planner TODO items
List pgsql-hackers
Josh Berkus <josh@agliodbs.com> writes:
> Summary: Currently, queries with complex "or group" criteria get devolved by 
> the planner into canonical and-or filters resulting in very poor execution on
> large data sets.   We should find better ways of dealing with these queries, 
> for example UNIONing.

Could we see the actual present query plans for both the TPC-R query
and the UNION version?  (I'll settle for "explain" on the slow
version, but "explain analyze" on the other, please.)

In general I am suspicious of proposals to rewrite queries into UNION
"equivalents", because the "equivalent" usually isn't exactly
equivalent, at least not without conditions that the planner can't
easily prove.  This proposal looks a lot like the KSQO optimization that
we put in and then took out again several years ago --- it also rewrote
queries into a UNION form, only the UNION didn't necessarily produce
identical results.

I am thinking that the guys who do this query fast are probably
extracting single-relation subsets of the big OR/AND clause, so that
they can do some filtering of the input tables before the join.  Our
existing planner would think that the OR/AND clause is only usable at
the join step, which is why it's seqscanning.  But if we pulled out
subsets, we could have for instance

WHERE t1.a = t2.a
AND (( t1.c = x  AND t1.f IN (m, n, o)  AND t2.d = v  AND t2.e BETWEEN j AND k)OR( t1.c = y  AND t1.f IN (n, o, p)  AND
t2.d= v  AND t2.e BETWEEN k AND h)OR ( t1.c = z  AND t1.f IN (p, q)  AND t2.d = w  AND t2.e BETWEEN k AND h))
 
AND ( t1.c = x OR t1.c = y OR t1.c = z )

which is redundant, but that last clause could enable an indexscan on t1.c.

However ... the planner has code in it already that should do something
close to that, so there may be something I am missing.  Again, could we
see EXPLAIN results?
        regards, tom lane


pgsql-hackers by date:

Previous
From: Doug McNaught
Date:
Subject: Re: PostgreSQL port to pure Java?
Next
From: "Keith Bottner"
Date:
Subject: Re: PostgreSQL port to pure Java?