Thread: UNION and bad performance
I have a performance trouble with UNION query
First I have this view :
SELECT a,b,c FROM table1, table2 WHERE jointure AND condition
Result : 150 000 records in 1~2 s
Then, I adding an UNION into the same view :
SELECT a,b,c FROM table1, table2 WHERE jointure AND condition
UNION
SELECT a,b,c FROM table3
Result : 150 200 records in 6~7 s
Why, do I have bad performance only for 200 adding records ?
Thanks
SGBD : Postgres 8.3 et 8.4
Hello,
UNION will remove all duplicates, so that the result additionally requires to be sorted.
Anyway, for performance issues, you should always start investigation with explain analyze .
regards,
Marc Mamin
From: pgsql-performance-owner@postgresql.org [mailto:pgsql-performance-owner@postgresql.org] On Behalf Of Olivier Pala
Sent: Donnerstag, 9. Dezember 2010 11:52
To: pgsql-performance@postgresql.org
Cc: Olivier Pala
Subject: [PERFORM] UNION and bad performance
Hi,
I have a performance trouble with UNION query
First I have this view :
SELECT a,b,c FROM table1, table2 WHERE jointure AND condition
Result : 150 000 records in 1~2 s
Then, I adding an UNION into the same view :
SELECT a,b,c FROM table1, table2 WHERE jointure AND condition
UNION
SELECT a,b,c FROM table3
Result : 150 200 records in 6~7 s
Why, do I have bad performance only for 200 adding records ?
Thanks
SGBD : Postgres 8.3 et 8.4
Marc Mamin <M.Mamin@intershop.de> wrote: > Hello, > > > > UNION will remove all duplicates, so that the result additionally requires to > be sorted. Right, to avoid the SORT and UNIQUE - operation you can use UNION ALL > > Anyway, for performance issues, you should always start investigation with > explain analyze . ACK. Arguments to support bottom-posting... A: Because we read from top to bottom, left to right. Q: Why should I start my reply below the quoted text? A: Because it messes up the order in which people normally read text. Q: Why is top-posting such a bad thing? A: The lost context. Q: What makes top-posted replies harder to read than bottom-posted? A: Yes. Q: Should I trim down the quoted part of an email to which I'm replying? Andreas -- Really, I'm not out to destroy Microsoft. That will just be a completely unintentional side effect. (Linus Torvalds) "If I was god, I would recompile penguin with --enable-fly." (unknown) Kaufbach, Saxony, Germany, Europe. N 51.05082°, E 13.56889°
> UNION will remove all duplicates, so that the result additionally requires to > be sorted. >Right, to avoid the SORT and UNIQUE - operation you can use UNION ALL by the way maybe apply hashing to calculate UNION be better ? ------------ pasman
2010/12/12 pasman pasmański <pasman.p@gmail.com>: >> UNION will remove all duplicates, so that the result additionally requires to >> be sorted. > >>Right, to avoid the SORT and UNIQUE - operation you can use UNION ALL > > > by the way maybe apply hashing to calculate UNION be better ? The planner already considers such plans. rhaas=# explain select a from generate_series(1,100) a union select a from generate_series(1,100) a; QUERY PLAN -------------------------------------------------------------------------------------- HashAggregate (cost=45.00..65.00 rows=2000 width=4) -> Append (cost=0.00..40.00 rows=2000 width=4) -> Function Scan on generate_series a (cost=0.00..10.00 rows=1000 width=4) -> Function Scan on generate_series a (cost=0.00..10.00 rows=1000 width=4) (4 rows) -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
pinker wrote >>> rhaas=# explain select a from generate_series(1,100) a union select a >>> from generate_series(1,100) a; >>> QUERY PLAN >>> -------------------------------------------------------------------------------------- >>> HashAggregate (cost=45.00..65.00 rows=2000 width=4) >>> -> Append (cost=0.00..40.00 rows=2000 width=4) > > > Why in this case the estimated number of rows is 2000? Is it standard > planner behavior? http://www.postgresql.org/docs/9.1/static/sql-createfunction.html Note the "ROWS" property. Functions are black-boxes to the planner so it has no means of estimating a row count. So a set returning function uses 1,000 and all others use 1. Determining "COST" is similarly problematic. David J. -- View this message in context: http://postgresql.1045698.n5.nabble.com/UNION-and-bad-performance-tp3301375p5806450.html Sent from the PostgreSQL - performance mailing list archive at Nabble.com.
>> rhaas=# explain select a from generate_series(1,100) a union select a >> from generate_series(1,100) a; >> QUERY PLAN >> -------------------------------------------------------------------------------------- >> HashAggregate (cost=45.00..65.00 rows=2000 width=4) >> -> Append (cost=0.00..40.00 rows=2000 width=4) Why in this case the estimated number of rows is 2000? Is it standard planner behavior? -- View this message in context: http://postgresql.1045698.n5.nabble.com/UNION-and-bad-performance-tp3301375p5806445.html Sent from the PostgreSQL - performance mailing list archive at Nabble.com.