Re: Query with high planning time at version 11.1 compared versions10.5 and 11.0 - Mailing list pgsql-performance

From Amit Langote
Subject Re: Query with high planning time at version 11.1 compared versions10.5 and 11.0
Date
Msg-id 8db4eca8-6e76-70ff-f934-e0228a1cc298@lab.ntt.co.jp
Whole thread Raw
In response to Re: Query with high planning time at version 11.1 compared versions10.5 and 11.0  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Responses Re: Query with high planning time at version 11.1 compared versions10.5 and 11.0  (Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>)
Re: Query with high planning time at version 11.1 compared versions10.5 and 11.0  (Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>)
Re: Query with high planning time at version 11.1 compared versions10.5 and 11.0  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Re: Query with high planning time at version 11.1 compared versions10.5 and 11.0  (Alvaro Herrera <alvherre@2ndquadrant.com>)
List pgsql-performance
Hi,

On 2018/12/05 6:55, Alvaro Herrera wrote:
> On 2018-Dec-04, Alvaro Herrera wrote:
> 
>> CREATE TABLE precio(fecha timestamp, pluid int, loccd int, plusalesprice int) PARTITION BY RANGE (fecha); 
>> SELECT format('CREATE TABLE public.precio_%s PARTITION OF public.precio (PRIMARY KEY (fecha, pluid, loccd) ) FOR
VALUESFROM (''%s'')TO(''%s'')', i, a, b) FROM (SELECT '1990-01-01'::timestam p+(i||'days')::interval a,
'1990-01-02'::timestamp+(i||'days')::intervalb, i FROM generate_series(1,999) i)x \gexec
 
> 
> Actually, the primary keys are not needed; it's just as slow without
> them.

I ran the original unmodified query at [1] (the one that produces an empty
plan due to all children being pruned) against the server built with
patches I posted on the "speeding up planning with partitions" [2] thread
and it finished in a jiffy.

explain SELECT l_variacao.fecha, l_variacao.loccd , l_variacao.pant ,
l_variacao.patual , max_variacao.var_max FROM (SELECT p.fecha, p.loccd,
p.plusalesprice patual, da.plusalesprice pant, abs(p.plusalesprice -
da.plusalesprice) as var from precio p, (SELECT p.fecha, p.plusalesprice,
p.loccd from precio p WHERE p.fecha between '2017-03-01' and '2017-03-02'
and p.pluid = 2) da WHERE p.fecha between '2017-03-01' and '2017-03-02'
and p.pluid = 2 and p.loccd = da.loccd and p.fecha = da.fecha) l_variacao,
(SELECT max(abs(p.plusalesprice - da.plusalesprice)) as var_max from
precio p, (SELECT p.fecha, p.plusalesprice, p.loccd from precio p WHERE
p.fecha between '2017-03-01' and '2017-03-02' and p.pluid = 2) da WHERE
p.fecha between '2017-03-01' and '2017-03-02' and p.pluid = 2 and p.loccd
= da.loccd and p.fecha = da.fecha) max_variacao WHERE max_variacao.var_max
= l_variacao.var;
QUERY PLAN
───────────────────────────────────────────
 Result  (cost=0.00..0.00 rows=0 width=24)
   One-Time Filter: false
(2 rows)

Time: 50.792 ms

That's because one of the things changed by one of the patches is that
child EC members are added only for the non-dummy children.  In this case,
since all the children are pruned, there should be zero child EC members,
which is what would happen in PG 10 too.  The partitionwise join related
changes in PG 11 moved the add_child_rel_equivalences call in
set_append_rel_size such that child EC members would be added even before
checking if the child rel is dummy, but for a reason named in the comment
above the call:

   ... Even if this child is
 * deemed dummy, it may fall on nullable side in a child-join, which
 * in turn may participate in a MergeAppend, where we will need the
 * EquivalenceClass data structures.

However, I think we can skip adding the dummy child EC members here  and
instead make it a responsibility of partitionwise join code in joinrels.c
to add the needed EC members.  Attached a patch to show what I mean, which
passes the tests and gives this planning time:

                            QUERY PLAN
───────────────────────────────────────────────────────────────────
 Result  (cost=0.00..0.00 rows=0 width=24) (actual rows=0 loops=1)
   One-Time Filter: false
 Planning Time: 512.788 ms
 Execution Time: 0.162 ms

which is not as low as with the patches at [2] for obvious reasons, but as
low as we can hope to get with PG 11.  Sadly, planning time is less with
PG 10.6:

                            QUERY PLAN
───────────────────────────────────────────────────────────────────
 Result  (cost=0.00..0.00 rows=0 width=24) (actual rows=0 loops=1)
   One-Time Filter: false
 Planning time: 254.533 ms
 Execution time: 0.080 ms
(4 rows)

But I haven't looked closely at what else in PG 11 makes the planning time
twice that of 10.

> I noticed another interesting thing, which is that if I modify the query
> to actually reference some partition that I do have (as opposed to the
> above, which just takes 30s to prune everything) the plan is mighty
> curious ... if only because in one of the Append nodes, partitions have
> not been pruned as they should.
>
> So, at least two bugs here,
> 1. the equivalence-class related slowness,
> 2. the lack of pruning

I haven't reproduced 2 yet.  Can you share the modified query?

Thanks,
Amit

[1]
https://www.postgresql.org/message-id/20181128004402.GC30707%40telsasoft.com

Attachment

pgsql-performance by date:

Previous
From: Alvaro Herrera
Date:
Subject: Re: Query with high planning time at version 11.1 compared versions10.5 and 11.0
Next
From: Amit Langote
Date:
Subject: Re: Query with high planning time at version 11.1 compared versions10.5 and 11.0