Re: Extremely slow HashAggregate in simple UNION query - Mailing list pgsql-performance
From | Pavel Stehule |
---|---|
Subject | Re: Extremely slow HashAggregate in simple UNION query |
Date | |
Msg-id | CAFj8pRBsp+8FNEqVz9b76qMGn8P3-O5kFNiLkvu55Q+d+ntx5A@mail.gmail.com |
In response to | Re: Extremely slow HashAggregate in simple UNION query (Jeff Janes <jeff.janes@gmail.com>) |
Responses | Re: Extremely slow HashAggregate in simple UNION query |
List | pgsql-performance |
On Thu, 22 Aug 2019 at 3:11, Jeff Janes <jeff.janes@gmail.com> wrote:
On Tue, Aug 20, 2019 at 11:12 AM Felix Geisendörfer <felix@felixge.de> wrote:

...
[1] My actual query had bad estimates for other reasons (GIN Index), but that's another story. The query above was of course deliberately designed to have bad estimates.

As noted elsewhere, v12 thwarts your attempts to deliberately design the bad estimates. You can still get them, you just have to work a bit harder at it:

CREATE FUNCTION j (bigint, bigint) returns setof bigint as $$ select generate_series($1,$2) $$ rows 1000 language sql;

EXPLAIN ANALYZE
SELECT * FROM j(1, 1) a, j(1, 1) b
UNION
SELECT * FROM j(1, 1) a, j(1, 1) b;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
HashAggregate (cost=80021.00..100021.00 rows=2000000 width=16) (actual time=11.332..13.241 rows=1 loops=1)
Group Key: a.a, b.b
-> Append (cost=0.50..70021.00 rows=2000000 width=16) (actual time=0.118..0.163 rows=2 loops=1)
-> Nested Loop (cost=0.50..20010.50 rows=1000000 width=16) (actual time=0.117..0.118 rows=1 loops=1)
-> Function Scan on j a (cost=0.25..10.25 rows=1000 width=8) (actual time=0.087..0.088 rows=1 loops=1)
-> Function Scan on j b (cost=0.25..10.25 rows=1000 width=8) (actual time=0.027..0.027 rows=1 loops=1)
-> Nested Loop (cost=0.50..20010.50 rows=1000000 width=16) (actual time=0.044..0.044 rows=1 loops=1)
-> Function Scan on j a_1 (cost=0.25..10.25 rows=1000 width=8) (actual time=0.022..0.022 rows=1 loops=1)
-> Function Scan on j b_1 (cost=0.25..10.25 rows=1000 width=8) (actual time=0.020..0.021 rows=1 loops=1)
Planning Time: 0.085 ms
Execution Time: 69.277 ms
(11 rows)

But the same advance in v12 which makes it harder to fool with your test case also opens the possibility of fixing your real case.
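For context, the v12 advance can be seen by calling generate_series directly: with constant arguments its planner support function produces an exact row estimate, so the HashAggregate is sized for the real number of groups. A minimal sketch of that check (my example, not from the original report; timings will of course differ):

-- v12+: generate_series(constant, constant) is estimated exactly via its
-- planner support function, so no inflated group count reaches HashAggregate.
EXPLAIN ANALYZE
SELECT * FROM generate_series(1, 1) a, generate_series(1, 1) b
UNION
SELECT * FROM generate_series(1, 1) a, generate_series(1, 1) b;
-- Expect "Function Scan on generate_series a ... rows=1" in the estimates
-- and a HashAggregate planned for only a handful of groups.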
I think the more interesting question is the long time spent after the query processing itself: the last row was produced at about 13 ms, but the Execution Time is 69 ms, so some cleanup takes roughly 56 ms. That is pretty long.
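My guess is that most of those ~56 ms go to tearing down a hash table that was sized for the estimated 2,000,000 groups rather than the single group that actually arrives. A sketch of one way to check that guess (not run in this thread), assuming CREATE OR REPLACE FUNCTION is used to correct the ROWS attribute; the gap should then disappear, much like the 0.281 ms result below:

-- Hypothetical re-check: give j() an accurate row estimate so the
-- HashAggregate no longer pre-sizes its hash table for 2M groups.
CREATE OR REPLACE FUNCTION j (bigint, bigint) returns setof bigint
  as $$ select generate_series($1,$2) $$ rows 1 language sql;

EXPLAIN ANALYZE
SELECT * FROM j(1, 1) a, j(1, 1) b
UNION
SELECT * FROM j(1, 1) a, j(1, 1) b;
-- Expectation: the last row still appears after a few ms, and Execution Time
-- now stays close to it, because there is no oversized hash table to free.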
I've made an extension which has a function which always returns true, but lies about how often it is expected to return true. See the attachment. With that, you can fine-tune the planner.

CREATE EXTENSION pg_selectivities;

EXPLAIN ANALYZE
SELECT * FROM j(1, 1) a, j(1, 1) b where pg_always(0.00001)
UNION
SELECT * FROM j(1, 1) a, j(1, 1) b where pg_always(0.00001);
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
HashAggregate (cost=45021.40..45021.60 rows=20 width=16) (actual time=0.226..0.227 rows=1 loops=1)
Group Key: a.a, b.b
-> Append (cost=0.50..45021.30 rows=20 width=16) (actual time=0.105..0.220 rows=2 loops=1)
-> Nested Loop (cost=0.50..22510.50 rows=10 width=16) (actual time=0.104..0.105 rows=1 loops=1)
Join Filter: pg_always('1e-05'::double precision)
-> Function Scan on j a (cost=0.25..10.25 rows=1000 width=8) (actual time=0.066..0.066 rows=1 loops=1)
-> Function Scan on j b (cost=0.25..10.25 rows=1000 width=8) (actual time=0.035..0.035 rows=1 loops=1)
-> Nested Loop (cost=0.50..22510.50 rows=10 width=16) (actual time=0.112..0.113 rows=1 loops=1)
Join Filter: pg_always('1e-05'::double precision)
-> Function Scan on j a_1 (cost=0.25..10.25 rows=1000 width=8) (actual time=0.077..0.077 rows=1 loops=1)
-> Function Scan on j b_1 (cost=0.25..10.25 rows=1000 width=8) (actual time=0.034..0.034 rows=1 loops=1)
Planning Time: 0.139 ms
Execution Time: 0.281 ms

Cheers,

Jeff
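I have not looked inside the attachment, but at the SQL level I would expect such an extension to be wired up with the v12 SUPPORT clause, roughly like this (the function names are my guesses, and the C parts would live in the attached module):

-- Hypothetical install script for a pg_always()-style function: a C function
-- that always returns true, plus a planner support function that reports the
-- first argument as the selectivity.  Names here are illustrative guesses.
CREATE FUNCTION pg_always_support(internal) RETURNS internal
    AS 'MODULE_PATHNAME' LANGUAGE C STRICT;

CREATE FUNCTION pg_always(float8) RETURNS bool
    AS 'MODULE_PATHNAME' LANGUAGE C STRICT IMMUTABLE PARALLEL SAFE
    SUPPORT pg_always_support;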