Thread: UNION and bad performance

UNION and bad performance

From
Olivier Pala
Date:
Hi,

I have a performance trouble with UNION query


First I have this view :

    SELECT a,b,c FROM table1, table2 WHERE jointure AND condition

    Result :
150 000 records in 1~2 s



Then, I adding an UNION into the same view :

    SELECT a,b,c FROM table1, table2 WHERE jointure AND condition
    UNION
    SELECT a,b,c FROM table3

    Result :
150 200 records in 6~7 s


Why, do I have bad performance only for 200 adding records ?

Thanks

SGBD : Postgres 8.3 et 8.4

Re: UNION and bad performance

From
"Marc Mamin"
Date:

Hello,

 

UNION will remove all duplicates, so that the result additionally requires to be sorted.

Anyway,  for performance issues, you should always start investigation with explain analyze .

regards,

 

Marc Mamin

 

From: pgsql-performance-owner@postgresql.org [mailto:pgsql-performance-owner@postgresql.org] On Behalf Of Olivier Pala
Sent: Donnerstag, 9. Dezember 2010 11:52
To: pgsql-performance@postgresql.org
Cc: Olivier Pala
Subject: [PERFORM] UNION and bad performance

 

Hi,

I have a performance trouble with UNION query


First I have this view :

    SELECT a,b,c FROM table1, table2 WHERE jointure AND condition

    Result : 150 000 records in 1~2 s



Then, I adding an UNION into the same view :

    SELECT a,b,c FROM table1, table2 WHERE jointure AND condition
    UNION
    SELECT a,b,c FROM table3

    Result : 150 200 records in 6~7 s



Why, do I have bad performance only for 200 adding records ?

Thanks

SGBD : Postgres 8.3 et 8.4

Re: UNION and bad performance

From
Andreas Kretschmer
Date:
Marc Mamin <M.Mamin@intershop.de> wrote:

> Hello,
>
>
>
> UNION will remove all duplicates, so that the result additionally requires to
> be sorted.

Right, to avoid the SORT and UNIQUE - operation you can use UNION ALL


>
> Anyway,  for performance issues, you should always start investigation with
> explain analyze .

ACK.

    Arguments to support bottom-posting...

    A: Because we read from top to bottom, left to right.
    Q: Why should I start my reply below the quoted text?

    A: Because it messes up the order in which people normally read text.
    Q: Why is top-posting such a bad thing?

    A: The lost context.
    Q: What makes top-posted replies harder to read than bottom-posted?

    A: Yes.
    Q: Should I trim down the quoted part of an email to which I'm replying?




Andreas
--
Really, I'm not out to destroy Microsoft. That will just be a completely
unintentional side effect.                              (Linus Torvalds)
"If I was god, I would recompile penguin with --enable-fly."   (unknown)
Kaufbach, Saxony, Germany, Europe.              N 51.05082°, E 13.56889°

Re: UNION and bad performance

From
pasman pasmański
Date:
> UNION will remove all duplicates, so that the result additionally requires to
> be sorted.

>Right, to avoid the SORT and UNIQUE - operation you can use UNION ALL


by the way maybe apply hashing to calculate UNION be better ?


------------
pasman

Re: UNION and bad performance

From
Robert Haas
Date:
2010/12/12 pasman pasmański <pasman.p@gmail.com>:
>> UNION will remove all duplicates, so that the result additionally requires to
>> be sorted.
>
>>Right, to avoid the SORT and UNIQUE - operation you can use UNION ALL
>
>
> by the way maybe apply hashing to calculate UNION be better ?

The planner already considers such plans.

rhaas=# explain select a from generate_series(1,100) a union select a
from generate_series(1,100) a;
                                      QUERY PLAN
--------------------------------------------------------------------------------------
 HashAggregate  (cost=45.00..65.00 rows=2000 width=4)
   ->  Append  (cost=0.00..40.00 rows=2000 width=4)
         ->  Function Scan on generate_series a  (cost=0.00..10.00
rows=1000 width=4)
         ->  Function Scan on generate_series a  (cost=0.00..10.00
rows=1000 width=4)
(4 rows)

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: UNION and bad performance

From
David G Johnston
Date:
pinker wrote
>>> rhaas=# explain select a from generate_series(1,100) a union select a
>>> from generate_series(1,100) a;
>>>                                      QUERY PLAN
>>> --------------------------------------------------------------------------------------
>>> HashAggregate  (cost=45.00..65.00 rows=2000 width=4)
>>>   ->  Append  (cost=0.00..40.00 rows=2000 width=4)
>
>
> Why in this case the estimated number of rows is 2000? Is it standard
> planner behavior?

http://www.postgresql.org/docs/9.1/static/sql-createfunction.html

Note the "ROWS" property.

Functions are black-boxes to the planner so it has no means of estimating a
row count.  So a set returning function uses 1,000 and all others use 1.

Determining "COST" is similarly problematic.

David J.







--
View this message in context: http://postgresql.1045698.n5.nabble.com/UNION-and-bad-performance-tp3301375p5806450.html
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.


Re: UNION and bad performance

From
pinker
Date:
>> rhaas=# explain select a from generate_series(1,100) a union select a
>> from generate_series(1,100) a;
>>                                      QUERY PLAN
>> --------------------------------------------------------------------------------------
>> HashAggregate  (cost=45.00..65.00 rows=2000 width=4)
>>   ->  Append  (cost=0.00..40.00 rows=2000 width=4)


Why in this case the estimated number of rows is 2000? Is it standard
planner behavior?





--
View this message in context: http://postgresql.1045698.n5.nabble.com/UNION-and-bad-performance-tp3301375p5806445.html
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.