Re: partitioning performance tests after recent patches - Mailing list pgsql-hackers

From Floris Van Nee
Subject Re: partitioning performance tests after recent patches
Date
Msg-id 1555313609039.42734@Optiver.com
Whole thread Raw
In response to Re: partitioning performance tests after recent patches  (David Rowley <david.rowley@2ndquadrant.com>)
Responses Re: partitioning performance tests after recent patches
List pgsql-hackers
Hi David,

Thanks for your reply. I really appreciate your work on run-time pruning!
Here's the output of explain/analyze for HEAD. At run-time, technically all partitions could be pruned directly.
However,one partition remains in the output of explain/analyze because of other difficulties with removing all of them,
ifI remember correctly? Still, that partition is never executed.  The only difference I can see is the Limit node on
top,as well as apparently another  partition appearing in the analyze output (4096_4096, last partition, remains in the
firstplan. 4096_1, the first partition, remains the second plan). 

-- select_now.sql
explain(analyze, verbose, buffers on)
select * from :tbl where a='abc' and updated_at between now() and now()+interval '1d';

Append  (cost=0.16..8949.61 rows=4096 width=112) (actual time=0.000..0.000 rows=0 loops=1)
  Subplans Removed: 4095
  ->  Index Scan using p4096_4096_a_updated_at_idx on public.p4096_4096  (cost=0.16..2.18 rows=1 width=112) (never
executed)
        Output: p4096_4096.a, p4096_4096.b, p4096_4096.c, p4096_4096.d, p4096_4096.updated_at
        Index Cond: ((p4096_4096.a = 'abc'::text) AND (p4096_4096.updated_at >= now()) AND (p4096_4096.updated_at <=
(now()+ '1 day'::interval))) 
Planning Time: 237.603 ms
Execution Time: 0.475 ms

-- select_now_limit.sql
explain(analyze, verbose, buffers on)
select * from :tbl where a='abc' and updated_at between now() and now()+interval '1d'
order by a, updated_at desc limit 1;

Limit  (cost=645.53..647.56 rows=1 width=112) (actual time=0.002..0.002 rows=0 loops=1)
  Output: p4096_1.a, p4096_1.b, p4096_1.c, p4096_1.d, p4096_1.updated_at
  ->  Append  (cost=645.53..8949.61 rows=4096 width=112) (actual time=0.000..0.000 rows=0 loops=1)
        Subplans Removed: 4095
        ->  Index Scan using p4096_1_a_updated_at_idx on public.p4096_1  (cost=0.57..2.03 rows=1 width=54) (never
executed)
              Output: p4096_1.a, p4096_1.b, p4096_1.c, p4096_1.d, p4096_1.updated_at
              Index Cond: ((p4096_1.a = 'abc'::text) AND (p4096_1.updated_at >= now()) AND (p4096_1.updated_at <=
(now()+ '1 day'::interval))) 
Planning Time: 3897.687 ms
Execution Time: 0.491 ms

Regarding the nested loops - thanks for your explanation. I can see this is more complicated than I initially thought.
Itmay be doable to determine if your set of pruned partitions is still valid, but it's more difficult to determine if,
ontop of that, extra partitions must be included due to widening of the range.  

-Floris

________________________________________
From: David Rowley <david.rowley@2ndquadrant.com>
Sent: Monday, April 15, 2019 1:25 AM
To: Floris Van Nee
Cc: Pg Hackers
Subject: Re: partitioning performance tests after recent patches [External]

On Mon, 15 Apr 2019 at 07:19, Floris Van Nee <florisvannee@optiver.com> wrote:
> 3) What could be causing the big performance difference between case 7 (simple SELECT) and 8 (simple SELECT with
ORDERBY <index> LIMIT 1)? For 4096 partitions, TPS of 7) is around 5, while adding the ORDER BY <index> LIMIT 1 makes
TPSdrop well below 1. In theory, run-time pruning of the right chunk should take exactly the same amount of time in
bothcases, because both are pruning timestamp now() on the same number of partitions. The resulting plans are also
identicalwith the exception of the top LIMIT-node (in PG11 they differ slightly as a MergeAppend is chosen for the
ORDERBY instead of an Append, in HEAD with ordered append this is not necessary anymore). Am I missing something here? 

With the information provided, I don't really see any reason why the
ORDER BY LIMIT would slow it down if the plan is the same apart from
the LIMIT node. Please share the EXPLAIN ANALYZE output of each.

> 4) A more general question about run-time pruning in nested loops, like the one for case 14. I believe I read in one
ofthe previous threads that run-time pruning only reoccurs if it determines that the value that determines which
partitionsmust be excluded has changed in between iterations. How is this defined? Eg. let's say partitions are 1-day
wideand the first iteration of the loop filters on the partitioned table for timestamp between 14-04-2019 12:00 and
14-04-201920:00 (dynamically determined). Then the second iteration comes along and now filters on values between
14-04-201912:00 and 14-04-2019 19:00. The partition that should be scanned hasn't changed, because both timestamps fall
intothe same partition. Is the full process of run-time pruning applied again, or is there some kind of shortcut that
firstchecks if the previous pruning result is still valid even if the value has changed slightly? If not, would this be
apossible optimization, as I think it's a case that occurs very often? I don't know the run-time pruning code very well
though,so it may just be a crazy idea that can't be practically achieved. 

Currently, there's no shortcut. It knows which parameters partition
pruning depends on and it reprunes whenever the value of ones of these
changes.

I'm not really sure how rechecking would work exactly. There are cases
where it wouldn't be possible, say the condition was: partkey >= $1
and there was no partition for $1 since it was beyond the range of the
defined range partitions.  How could we tell if we can perform the
shortcut if the next param value falls off the lower bound of the
defined partitions?  The first would include no partitions and the
second includes all partitions, but the actual value of $1 belongs to
no partition in either case so we can't check to see if it matches the
same partition.  Perhaps it could work for equality operators when
just a single partition is matched in the first place, it might then
be possible to do a shortcircuit recheck to see if the same partition
matches the next set of values.  The problem with that is that
run-time pruning code in the executor does not care about which
operators are used. It just passes those details off to the pruning
code to deal with it.  Perhaps something can be decided in the planner
in analyze_partkey_exprs() to have it set a "do_recheck" flag to tell
the executor to check before pruning again...  Or maybe it's okay to
just try a recheck when we match to just a single partition and just
recheck the new values are allowed in that partition when re-pruning.
However, that might be just too overly dumb since for inequality
operators the original values may never even have falling inside the
partition's bounds in the first place.

--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



pgsql-hackers by date:

Previous
From: Amit Langote
Date:
Subject: Re: Mailing list not working
Next
From: Peter Eisentraut
Date:
Subject: Re: cache lookup failed for collation 0