Home > mailing lists

Re: Increase performance of a UNION query that thakes 655.07 msec to be runned ? - Mailing list pgsql-performance

From	Tom Lane
Subject	Re: Increase performance of a UNION query that thakes 655.07 msec to be runned ?
Date	February 6, 2004 14:29:47
Msg-id	29924.1076081304@sss.pgh.pa.us Whole thread Raw
In response to	Increase performance of a UNION query that thakes 655.07 msec to be runned ? ("Bruno BAGUETTE" <pgsql-ml@baguette.net>)
Responses	RE : Increase performance of a UNION query that thakes 655.07 msec to be runned ?
List	pgsql-performance

Tree view

"Bruno BAGUETTE" <pgsql-ml@baguette.net> writes:
> Do you see a way to get better performances with this query which takes
> currently 655.07 msec to be done.

> levure=> explain analyze SELECT distinct lower(substr(l_name, 1, 1)) AS
> initiale FROM people
> levure-> UNION
> levure-> SELECT distinct lower(substr(org_name, 1, 1)) AS initiale FROM
> organizations
> levure-> ORDER BY initiale;

This is inherently a bit inefficient since the UNION implies a DISTINCT
step, thus partially repeating the DISTINCT work done inside each SELECT.
It would likely be a tad faster to drop the DISTINCTs from the
subselects and rely on UNION to do the filtering.  However, you're still
gonna have a big SORT/UNIQUE step.

As of PG 7.4 you could probably get a performance win by converting the
thing to use GROUP BY instead of DISTINCT or UNION:

select initiale from (
  select lower(substr(l_name,1,1)) as initiale from people
  union all
  select lower(substr(org_name,1,1)) as initiale from organizations
) ss
group by initiale order by initiale;

This should use a HashAggregate to do the unique-ification.  I think
that will be faster than Sort/Unique.

            regards, tom lane

pgsql-performance by date:

From: Tom Lane
Date: 06 February 2004, 13:55:36
Subject: Re: Seq scan on zero-parameters function

From: Stephan Szabo
Date: 06 February 2004, 14:52:52
Subject: Re: Increase performance of a UNION query that thakes

Re: Increase performance of a UNION query that thakes 655.07 msec to be runned ? - Mailing list pgsql-performance

Previous

Next