Re: Parallel Append can break run-time partition pruning - Mailing list pgsql-hackers

From Amit Langote
Subject Re: Parallel Append can break run-time partition pruning
Date
Msg-id CA+HiwqG7mSYLhCkGLMyAfx_mWAAyDzV_DyN5K2CdTgzXXgofAw@mail.gmail.com
Whole thread Raw
In response to Parallel Append can break run-time partition pruning  (David Rowley <dgrowleyml@gmail.com>)
Responses Re: Parallel Append can break run-time partition pruning
Re: Parallel Append can break run-time partition pruning
List pgsql-hackers
On Wed, Apr 15, 2020 at 4:18 PM David Rowley <dgrowleyml@gmail.com> wrote:
> I'm a bit divided on what the correct fix is.  If I blame Parallel
> Append for not trying hard enough to pull up the lower Append in
> accumulate_append_subpath(), then clearly the parallel append code is
> to blame.

I spent some time trying to understand how Append parallelism works
and I am tempted to agree with you that there might be problems with
how accumulate_append_subpath()'s interacts with parallelism. Maybe it
would be better to disregard a non-parallel-aware partial Append if it
requires us to fail on flattening a child Append.  I have as attached
a PoC fix to show that.  While a nested Append is not really a problem
in general, it appears to me that our run-time code is not in position
to work correctly with them, or at least not with how things stand
today...

> However, perhaps run-time pruning should be tagging on
> PartitionPruneInfo to more than top-level Appends. Fixing the latter
> case, code-wise is about as simple as removing the "rel->reloptkind ==
> RELOPT_BASEREL &&" line from create_append_plan(). Certainly, if the
> outer Append hadn't been a single subpath Append, then we wouldn't
> have pulled up the lower-level Append, so perhaps we should be
> run-time pruning lower-level ones too.

While looking at this, I observed that the PartitionPruneInfo of the
top-level Append (the one that later gets thrown out) contains bogus
information:

   {PARTITIONPRUNEINFO
   :prune_infos ((
      {PARTITIONEDRELPRUNEINFO
      :rtindex 1
      :present_parts (b 0)
      :nparts 1
      :subplan_map  0
      :subpart_map  1

One of these should be -1.

      {PARTITIONEDRELPRUNEINFO
      :rtindex 2
      :present_parts (b)
      :nparts 2
      :subplan_map  -1 -1
      :subpart_map  -1 -1

subplan_map values are not correct, because subpaths list that would
have been passed would not include paths of lower-level partitions as
the flattening didn't occur.

   ))
   :other_subplans (b)
   }

I guess the problem is that we let an Append be nested, but don't
account for that in how partitioned_rels list it parent Append is
constructed.  The top-level Append's partitioned_rels should not have
contained sub-partitioned table's RTI if it got its own Append.  Maybe
if we want to make run-time pruning work with nested Appends, we need
to fix how partitioned_rels are gathered.

--
Amit Langote
EnterpriseDB: http://www.enterprisedb.com

Attachment

pgsql-hackers by date:

Previous
From: Fujii Masao
Date:
Subject: Re: Race condition in SyncRepGetSyncStandbysPriority
Next
From: Kashif Zeeshan
Date:
Subject: Re: WIP/PoC for parallel backup