Re: initial pruning in parallel append - Mailing list pgsql-hackers

From Robert Haas
Subject Re: initial pruning in parallel append
Date
Msg-id CA+Tgmob9gZZa6BVcxYAGxRjoV55+8oVLWTUbB1fE+qv48YML2A@mail.gmail.com
Whole thread Raw
In response to Re: initial pruning in parallel append  (Amit Langote <amitlangote09@gmail.com>)
Responses Re: initial pruning in parallel append
List pgsql-hackers
On Tue, Aug 8, 2023 at 2:58 AM Amit Langote <amitlangote09@gmail.com> wrote:
> > That doesn't seem like a big problem because there aren't many node
> > types that do pruning, right? I think we're just talking about Append
> > and MergeAppend, or something like that, right?
>
> MergeAppend can't be parallel-aware atm, so only Append.

Well, the question isn't whether it's parallel-aware, but whether
startup-time pruning happens there.

> So change the ordering of the following code in ParallelQueryMain():

Yeah, that would be a reasonable thing to do.

> Or we could consider something like the patch I mentioned in my 1st
> email.  The idea there was to pass the pruning result via a separate
> channel, not the DSM chunk linked into the PlanState tree.  To wit, on
> the leader side, ExecInitParallelPlan() puts the serialized
> List-of-Bitmapset into the shm_toc with a dedicated PARALLEL_KEY,
> alongside PlannedStmt, ParamListInfo, etc.  The List-of-Bitmpaset is
> initialized during the leader's ExecInitNode().  On the worker side,
> ExecParallelGetQueryDesc() reads the List-of-Bitmapset string and puts
> the resulting node into the QueryDesc, that ParallelQueryMain() then
> uses to do ExecutorStart() which copies the pointer to
> EState.es_part_prune_results.  ExecInitAppend() consults
> EState.es_part_prune_results and uses the Bitmapset from there, if
> present, instead of performing initial pruning.

This also seems reasonable.

> I'm assuming it's not
> too ugly if ExecInitAppend() uses IsParallelWorker() to decide whether
> it should be writing to EState.es_part_prune_results or reading from
> it -- the former if in the leader and the latter in a worker.

I don't think that's too ugly. I mean you have to have an if statement
someplace.

> If we
> are to go with this approach we will need to un-revert ec386948948c,
> which moved PartitionPruneInfo nodes out of Append/MergeAppend nodes
> to a List in PlannedStmt (copied into EState.es_part_prune_infos),
> such that es_part_prune_results mirrors es_part_prune_infos.

The comment for the revert (which was
5472743d9e8583638a897b47558066167cc14583) points to
https://www.postgresql.org/message-id/4191508.1674157166@sss.pgh.pa.us
as the reason, but it's not very clear to me why that email led to
this being reverted. In any event, I agree that if we go with your
idea to pass this via a separate PARALLEL_KEY, unreverting that patch
seems to make sense, because otherwise I think we don't have a fast
way to find the nodes that contain the state that we care about.

--
Robert Haas
EDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: Ashutosh Bapat
Date:
Subject: Re: Reducing memory consumed by RestrictInfo list translations in partitionwise join planning
Next
From: Peter Eisentraut
Date:
Subject: Re: Cirrus-ci is lowering free CI cycles - what to do with cfbot, etc?