Re: Parallel leader process info in EXPLAIN - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Parallel leader process info in EXPLAIN
Msg-id 18323.1580078950@sss.pgh.pa.us
In response to Re: Parallel leader process info in EXPLAIN  (Thomas Munro <thomas.munro@gmail.com>)
Responses Re: Parallel leader process info in EXPLAIN  (Thomas Munro <thomas.munro@gmail.com>)
List pgsql-hackers
Thomas Munro <thomas.munro@gmail.com> writes:
> I think I'm going to abandon 0002 for now, because that stuff is being
> refactored independently over here, so rebasing would be futile:
> https://www.postgresql.org/message-id/flat/CAOtHd0AvAA8CLB9Xz0wnxu1U%3DzJCKrr1r4QwwXi_kcQsHDVU%3DQ%40mail.gmail.com

Yeah, your 0002 needs some rethinking.  I kind of like the proposed
change in the text-format output:

          Workers Launched: 4
          ->  Sort (actual rows=2000 loops=15)
                Sort Key: tenk1.ten
-               Sort Method: quicksort  Memory: xxx
+               Leader:  Sort Method: quicksort  Memory: xxx
                Worker 0:  Sort Method: quicksort  Memory: xxx
                Worker 1:  Sort Method: quicksort  Memory: xxx
                Worker 2:  Sort Method: quicksort  Memory: xxx

but it's quite unclear to me how that translates into non-text
formats, especially if we're not to break invariants about which
fields are present in a non-text output structure (cf [1]).

I've occasionally wondered whether we'd be better off presenting
this info as if the leader were "worker 0" and then the N workers
are workers 1 to N.  I've not worked out the implications of that
in any detail though.  It's fairly easy to see what to do for
fields that can be aggregated (the numbers printed for the node
as a whole are totals), but it doesn't help us any with something
like Sort Method.
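
To make the "leader as worker 0" idea concrete, here is a minimal sketch,
assuming a hypothetical per-process record (ProcSortInfo and
format_proc_line are invented for illustration and are not PostgreSQL
code).  Index 0 stands for the leader and indexes 1 to N for the launched
workers.  A field like Sort Method has no meaningful total, so it can only
be printed once per process:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical per-process sort info; not an actual PostgreSQL struct. */
typedef struct ProcSortInfo
{
    const char *sort_method;    /* e.g. "quicksort"; cannot be summed */
    long        memory_kb;      /* illustrative value only */
} ProcSortInfo;

/*
 * Format one per-process line.  Under the numbering assumed here,
 * idx 0 is the leader and idx 1..N are the N launched workers.
 * Returns what snprintf returns.
 */
static int
format_proc_line(char *buf, size_t buflen,
                 const ProcSortInfo *procs, size_t nprocs, size_t idx)
{
    assert(idx < nprocs);
    return snprintf(buf, buflen,
                    "Worker %zu:  Sort Method: %s  Memory: %ldkB",
                    idx, procs[idx].sort_method, procs[idx].memory_kb);
}
```

With this numbering the leader's line needs no separate label, but every
existing "Worker n" label shifts by one, which is one of the implications
that would have to be worked out.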

On a narrower note, I'm not at all happy with the fact that 0001
adds yet another field to *every* PlanState.  I think this is
doubling down on a fundamentally wrong decision to have
ExecParallelRetrieveInstrumentation do some aggregation immediately.
I think we should abandon that and just say that it returns the raw
leader and per-worker data, and then explain.c can aggregate as it
wishes.
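
As a rough sketch of that split, assuming hypothetical types (RawInstr,
RawNodeInstr and aggregate_for_display are invented here and do not match
the real Instrumentation structs or any explain.c function), the retrieval
step would only hand back the raw leader and per-worker counters, and the
display side would compute totals when it wants them:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical raw counters for one process; illustration only. */
typedef struct RawInstr
{
    double      ntuples;
    double      nloops;
} RawInstr;

/* Hypothetical result of retrieval: leader and workers kept separate. */
typedef struct RawNodeInstr
{
    RawInstr        leader;
    size_t          nworkers;
    const RawInstr *workers;    /* array of nworkers entries */
} RawNodeInstr;

/*
 * Display-side aggregation: sum leader and worker counters on demand,
 * leaving the raw per-process data untouched.
 */
static RawInstr
aggregate_for_display(const RawNodeInstr *raw)
{
    RawInstr    total = raw->leader;
    size_t      i;

    assert(raw->nworkers == 0 || raw->workers != NULL);
    for (i = 0; i < raw->nworkers; i++)
    {
        total.ntuples += raw->workers[i].ntuples;
        total.nloops += raw->workers[i].nloops;
    }
    return total;
}
```

Because the leader's numbers stay separate in the raw data, no extra
per-node field is needed to remember them after aggregation.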

            regards, tom lane

[1] https://www.postgresql.org/message-id/19416.1580069629%40sss.pgh.pa.us


