Thread: Pushing limit into subqueries of a union

Pushing limit into subqueries of a union

From
Phil Endecott
Date:
Dear Experts,

Here is another "how can I rewrite this to go faster" idea.

I have two tables T1 and T2 and a view V that is the UNION ALL of T1 and
T2.  The tables have an editdate field, and I want to get the n most
recently changed rows:

   select * from V order by editdate desc limit 40;

This seems to unconditionally read the whole of T1 and T2, so it is slow.

T1 and T2 both have indexes on the editdate attribute, so if I write

(select * from T1 order by editdate desc limit 40)
union all (select * from T2 order by editdate desc limit 40)
order by editdate desc limit 40;

I get the same results, about 1000 times faster.

I presume that PostgreSQL doesn't try to push the limit clause into the
subqueries of a UNION ALL in this way.  I believe it is safe, isn't it?


Cheers,  Phil.




Re: Pushing limit into subqueries of a union

From
Tom Lane
Date:
Phil Endecott <spam_from_postgresql_general@chezphil.org> writes:
> I presume that PostgreSQL doesn't try to push the limit clause into the
> subqueries of a UNION ALL in this way.  I believe it is safe, isn't it?

Hmm.  You don't actually want to push the LIMIT as such into the subplan
--- that would create an extra level of plan node that would simply
waste time at runtime.  What you want is for the subquery to be planned
on the assumption that only a small number of tuples will be fetched,
so that "fast start" plans are preferred.

We have all the mechanism for this, but prepunion.c isn't exploiting it.
I'll see what I can do.

            regards, tom lane