
Re: Overhead of union versus union all - Mailing list pgsql-general

From	Simon Riggs
Subject	Re: Overhead of union versus union all
Date	July 10, 2009 06:21:24
Msg-id	1247217646.11347.559.camel@ebony.2ndQuadrant
In response to	Re: Overhead of union versus union all (Scott Marlowe <scott.marlowe@gmail.com>)
Responses	Re: Overhead of union versus union all
List	pgsql-general


On Thu, 2009-07-09 at 20:41 -0600, Scott Marlowe wrote:
> On Thu, Jul 9, 2009 at 7:58 PM, Bruce Momjian<bruce@momjian.us> wrote:
> > Scott Bailey wrote:
> >> Alvaro Herrera wrote:
> >> > Tim Keitt wrote:
> >> >> I am combining query results that I know are disjoint. I'm wondering
> >> >> how much overhead there is in calling union versus union all. (Just
> >> >> curious really; I can't see a reason not to use union all.)
> >> >
> >> > UNION needs to uniquify the output, for which it plasters an additional
> >> > sort step, whereas UNION ALL does not need to uniquify its output and
> >> > thus it can avoid the sort step.  Using UNION ALL is recommended
> >> > wherever possible.
> >> >
> >> I think I read somewhere that as of 8.4 it no longer required the sort
> >> step, due to the improvements in hashing. Here it is
> >>
> >> http://wiki.postgresql.org/wiki/WhatsNew84#Performance
> >
> > Oh, yea, hashing is used in some cases rather than sort.  I assume sort
> > is still used if the hash exceeds workmem size.
>
> The important point being that it's still more expensive than a plain
> union all though, right?

I think it should be possible to use predtest theorem proving to discard
the sort/hash step in cases where we can prove the sets are disjoint.
Often there are top-level quals that can be compared in the WHERE
clauses of the sub-queries, so a shallow search could be quite
profitable in allowing us to rewrite a UNION into a UNION ALL.
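To make the idea concrete, here is a minimal sketch of such a shallow check, written in Python against an in-memory SQLite database rather than inside the PostgreSQL planner. It is a toy stand-in for the proof step, not predtest itself: the `RangeQual` class, the `provably_disjoint` and `combine` helpers, and the table `t` are all hypothetical names invented for illustration. Note one assumption the rewrite depends on: UNION also removes duplicates within each branch, so proving the branches disjoint is only sufficient when each branch is itself duplicate-free (here, because it selects a primary key).

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class RangeQual:
    """Hypothetical top-level qual of the form: lo <= column < hi."""
    column: str
    lo: int
    hi: int


def provably_disjoint(a: RangeQual, b: RangeQual) -> bool:
    """Shallow proof: same column and non-overlapping half-open ranges.

    Returning False means "could not prove it", not "the sets overlap".
    """
    if a.column != b.column:
        return False
    return a.hi <= b.lo or b.hi <= a.lo


def branch(q: RangeQual) -> str:
    # Each branch selects the primary key, so it cannot contain duplicates.
    return f"SELECT id, v FROM t WHERE {q.column} >= {q.lo} AND {q.column} < {q.hi}"


def combine(a: RangeQual, b: RangeQual) -> str:
    # Use UNION ALL only when disjointness was proven; otherwise keep UNION.
    op = "UNION ALL" if provably_disjoint(a, b) else "UNION"
    return f"{branch(a)} {op} {branch(b)}"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, v TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?)", [(i, f"row{i}") for i in range(10)])

low = RangeQual("id", 0, 5)
high = RangeQual("id", 5, 10)
overlapping = RangeQual("id", 3, 8)

rewritten = combine(low, high)         # proven disjoint, so UNION ALL
fallback = combine(low, overlapping)   # not provable, so plain UNION

# For the disjoint pair, the rewritten query returns the same rows as UNION.
plain_union = f"{branch(low)} UNION {branch(high)}"
rows_union = sorted(conn.execute(plain_union).fetchall())
rows_union_all = sorted(conn.execute(rewritten).fetchall())
```

The check is deliberately conservative: anything it cannot prove falls back to UNION, so a failed proof costs only the deduplication step that would have run anyway.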

--
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Training, Services and Support
