Re: Delay locking partitions during query execution - Mailing list pgsql-hackers

From David Rowley
Subject Re: Delay locking partitions during query execution
Date
Msg-id CAKJS1f_iVpfwYwKj3QrJxHBomDifY4_b6-yo89wMy=n9D6skyQ@mail.gmail.com
Whole thread Raw
In response to Re: Delay locking partitions during query execution  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Responses Re: Delay locking partitions during query execution  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Re: Delay locking partitions during query execution  (Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>)
List pgsql-hackers
On Fri, 4 Jan 2019 at 13:01, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
> On 1/3/19 11:57 PM, David Rowley wrote:
> > You'll know you're getting a generic plan when you see "Filter (a =
> > $1)" and see "Subplans Removed: 9999" below the Append.
> >
>
> Indeed, with prepared statements I now see some improvements:
>
>     partitions    0      100     1000    10000
>     --------------------------------------------
>     master       19     1590     2090      128
>     patched      18     1780     6820     1130
>
> So, that's nice. I wonder why the throughput drops so fast between 1k
> and 10k partitions, but I'll look into that later.

Those look strange. Why is it so slow with the non-partitioned case?
I'd have expected that to be the fastest result.

> Does this mean this optimization can only ever work with prepared
> statements, or can it be made to work with regular plans too?

That's a good question.  I confirm it's only any use of run-time
pruning occurs during init plan. For this patch to be any use, you'll
need to see a "Subplans Removed: <N>" in the query's EXPLAIN ANALYZE
output.  If you don't see this then all Append/MergeAppend subplans
were initialised and the relation lock would have been obtained
regardless of if delaylock is set for the relation. The effect of the
patch here would just have been to obtain the lock during the first
call to ExecGetRangeTableRelation() for that relation instead of
during AcquireExecutorLocks().  There may actually be a tiny overhead
in this case since AcquireExecutorLocks() must skip the delaylock
relations, but they'll get locked later anyway. I doubt you could
measure that though.

When run-time pruning is able to prune partitions before execution
starts then the optimisation is useful since AcquireExecutorLocks()
won't obtain the lock and ExecGetRangeTableRelation() won't be called
for all pruned partition's rels as we don't bother to init the
Append/MergeAppend subplan for those.

I'm a little unsure if there are any cases where this type of run-time
pruning can occur when PREPAREd statements are not in use.  Initplan
parameters can't prune before executor run since we need to run the
executor to obtain the values of those. Likewise for evaluation of
volatile functions. So I think run-time pruning before initplan is
only ever going to happen for PARAM_EXTERN type parameters, i.e. with
PREPAREd statements (REF: analyze_partkey_exprs() partprune.c).
Without PREPAREd statements, if the planner itself was unable to prune
the partitions it would already have obtained the lock during
planning, so AcquireExecutorLocks(), in this case, would bump into the
local lock hash table entry and forego trying to obtain the lock
itself.  That's not free, but it's significantly faster than obtaining
a lock.

Or in short... it only good for prepared statements where the
statement's parameters allow for run-time pruning. However, that's a
pretty large case since the planner is still very slow at planning for
large numbers of partitions, meaning it's common (or at least it will
be) for people to use PREPAREd statement and plan_cache_mode =
force_generic_plan;

> >> Furthermore, I've repeatedly ran into this issue:
> >>
> >> test=# \d hashp
> >> ERROR:  unrecognized token: "false"
> >> LINE 2: ...catalog.array_to_string(array(select rolname from pg_catalog...
> >>                                                              ^
> >> I have no idea why it breaks like this, and it's somewhat random (i.e.
> >> not readily reproducible). But I've only ever seen it with this patch
> >> applied.
> >
> > You'll probably need to initdb with the patch applied as there's a new
> > field in RangeTblEntry. If there's a serialised one of these stored in
> > the in the catalogue somewhere then the new read function will have
> > issues reading the old serialised format.
> >
>
> D'oh! That explains it, because switching from/to patched binaries might
> have easily been triggering the error. I've checked that there are no
> changes to catalogs, but it did not occur to me adding a new RTE field
> could have such consequences ...

schema-wise, no changes, but data-wise, there are changes.

$ pg_dump --schema=pg_catalog --data-only postgres | grep ":rellockmode" | wc -l
    121

All of which are inside the pg_rewrite table:

$ pg_dump --schema=pg_catalog --data-only --table=pg_rewrite postgres
| grep ":rellockmode" | wc -l
    121

I just used ":rellockmode" here as it's a field that exists in RangeTblEntry.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: inconsistency and inefficiency in setup_conversion()
Next
From: Mithun Cy
Date:
Subject: Re: WIP: Avoid creation of the free space map for small tables