Re: pushing order by + limit to union subqueries - Mailing list pgsql-performance

From: Tom Lane
Subject: Re: pushing order by + limit to union subqueries
Date:
Msg-id: 28519.1425140641@sss.pgh.pa.us
In response to pushing order by + limit to union subqueries  (Paolo Losi <paolo.losi@gmail.com>)
List pgsql-performance
Paolo Losi <paolo.losi@gmail.com> writes:
>  I've noticed that order by / limit are not distributed to union subqueries
> by the planner:

> Example:

> q1: (select * from t1) union all (select * from t2) order by x limit 10;
> q2: (select * from t1 order by x limit 10)
>       union all
>     (select * from t2 order by x limit 10)
>       order by x limit 10;

> both queries should be equivalent, but the planner provides hugely different
> plans. I was expecting that the planner could rewrite the first to the
> second.
> Am I overlooking something? If this is the case, can anyone explain why this
> optimization is not performed?

There would be cases where that would be a win, and there would be cases
where it wouldn't be, so I'd not be in favor of making the transformation
blindly.  Unfortunately, given the current state of the planner that's
all we could do really, because the subqueries are planned at arm's
length and then we just mechanically combine them.  Doing it "right" would
entail fully planning each subquery twice, which would be very expensive.
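
To make the equivalence of q1 and q2 concrete, here is a minimal Python
sketch of the query semantics only. It is not planner code: the lists t1
and t2 stand in for column x of the two tables, and the helper names
top_k, q1 and q2 are made up for illustration. It also prints how many
rows reach the outer sort in each form, which is where q2 can win when the
per-branch sort is cheap (say, via an index on x) and can lose when it is
not.

```python
import random


def top_k(rows, k):
    """Model of ORDER BY x LIMIT k over a list of integers (column x)."""
    return sorted(rows)[:k]


def q1(t1, t2, k=10):
    # (select * from t1) union all (select * from t2) order by x limit k
    return top_k(t1 + t2, k)


def q2(t1, t2, k=10):
    # Each branch is sorted and limited first; the outer sort and limit
    # then run over at most 2 * k rows.
    return top_k(top_k(t1, k) + top_k(t2, k), k)


# Hypothetical sample data: two "tables" of 500 random integers each.
rng = random.Random(0)
t1 = [rng.randrange(1000) for _ in range(500)]
t2 = [rng.randrange(1000) for _ in range(500)]

# Both forms return the same x values.
assert q1(t1, t2) == q2(t1, t2)

# Rows fed to the outer sort: every row for q1, at most 2 * k for q2.
print(len(t1) + len(t2), len(top_k(t1, 10)) + len(top_k(t2, 10)))
```

The sketch compares x values only. With ties on x, the two queries may
return different rows that share the same x, but q1 on its own is already
free to do that.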

I have a longstanding desire to rewrite the upper levels of the planner to
use path generation and comparison, which should make it more practical
for the planner to compare alternative implementations of UNION and other
top-level constructs.  But I've been saying I would do that for several
years now, so don't hold your breath :-(

            regards, tom lane

