Re: Overhead of union versus union all - Mailing list pgsql-general

From Simon Riggs
Subject Re: Overhead of union versus union all
Date
Msg-id 1247233247.11347.587.camel@ebony.2ndQuadrant
Whole thread Raw
In response to Re: Overhead of union versus union all  (Bruce Momjian <bruce@momjian.us>)
Responses Re: Overhead of union versus union all
List pgsql-general
On Fri, 2009-07-10 at 09:28 -0400, Bruce Momjian wrote:
> Simon Riggs wrote:
> >
> > On Fri, 2009-07-10 at 08:59 -0400, Bruce Momjian wrote:
> >
> > > > I think it should be possible to use predtest theorem proving to
> > > discard
> > > > the sort/hash step in cases where we can prove the sets are
> > > disjoint.
> > > > Often there are top-level quals that can be compared in the WHERE
> > > > clauses of the sub-queries, so a shallow search could be quite
> > > > profitable in allowing us to rewrite a UNION into a UNION ALL.
> > >
> > > I assume we would still need the distinct removal step;  we just avoid
> > > the sort/hash.
> >
> > I mean it seems possible to prove that the distinct removal step is not
> > necessary, by proving that the various sub-queries are already disjoint.
> > It's a common manual optimization, so automating it seems a reasonable
> > future goal.
>
> I am confused what sub-queries produce _distinct_ output.  I know there
> are some that produce _ordered_ output.

None, that was not my point.

If you have a query like this

 Select ..., status, ...
 ...
 where status = '1'
 union
 Select ..., status, ...
 ...
 where status = '2';

or a query like this

 Select '1', ...
 ...
 union
 Select '2', ...
 ...
 ;

or a query like this

 Select '1', ...
 ...
 union
 Select status, ...
 ...
 where status != '1';
 ;

then it is clear that we could automatically prove that the the distinct
step is redundant and so we could either hash or sort. This is the same
as replacing the UNION with UNION ALL.

--
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


pgsql-general by date:

Previous
From: Bruce Momjian
Date:
Subject: Re: Overhead of union versus union all
Next
From: Bruce Momjian
Date:
Subject: Re: Overhead of union versus union all