On Thu, Apr 13, 2023 at 10:00 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Richard Guo <guofenglinux@gmail.com> writes:
> * For the diff in standard_planner, I was wondering why not move the
> initPlans up to the Gather node, just as we did before.  So I tried that
> way but did not notice the breakage of regression tests as stated in the
> comments.  Would you please confirm that?
Try it with debug_parallel_query = regress.
Ah, I see. With DEBUG_PARALLEL_REGRESS the initPlans that move to the Gather would become invisible along with the Gather node.
As I tried this, I found that the breakage caused by moving the initPlans to the Gather node might be more than cosmetic: it can sometimes produce wrong results. As an example, consider
create table a (i int, j int);
insert into a values (1, 1);
create index on a(i, j);

set enable_seqscan to off;
set debug_parallel_query to on;

# select min(i) from a;
 min
-----
   0
(1 row)
As we can see, the result is not correct. And the plan looks like
# explain (verbose, costs off) select min(i) from a;
                        QUERY PLAN
-----------------------------------------------------------
 Gather
   Output: ($0)
   Workers Planned: 1
   Single Copy: true
   InitPlan 1 (returns $0)
     ->  Limit
           Output: a.i
           ->  Index Only Scan using a_i_j_idx on public.a
                 Output: a.i
                 Index Cond: (a.i IS NOT NULL)
   ->  Result
         Output: $0
(12 rows)
The initPlan has been moved from the Result node to the Gather node. As a result, when doing tuple projection for the Result node, we get a ParamExecData entry whose execPlan is NULL, so the initPlan never gets a chance to be executed. The output is then whatever default value the ParamExecData entry holds, which is zero as shown.
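To make the mechanism concrete, here is a toy model in C of the lazy parameter fetch as I understand it. It is only a sketch, not the executor code: ToyParam, toy_min_initplan, toy_eval_param and the two driver functions are hypothetical names invented for this illustration, and the real ParamExecData and its evaluation path carry more state than this. The only assumption it relies on is the behavior described above: a parameter is computed on demand only if its entry still has a plan attached, and otherwise the stored value is returned as is.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a ParamExecData entry; NOT the real PostgreSQL struct. */
typedef struct ToyParam ToyParam;
typedef void (*ToyInitPlan) (ToyParam *prm);

struct ToyParam
{
    ToyInitPlan exec_plan;  /* non-NULL while the initPlan still has to run */
    long        value;      /* parameter value once it has been computed */
    bool        isnull;
};

/* Hypothetical initPlan: pretend the index scan found min(i) = 1. */
void
toy_min_initplan(ToyParam *prm)
{
    prm->value = 1;
    prm->isnull = false;
    prm->exec_plan = NULL;  /* mark the parameter as evaluated */
}

/* Lazy fetch: run the initPlan only if one is still attached to the entry. */
long
toy_eval_param(ToyParam *prm)
{
    if (prm->exec_plan != NULL)
    {
        prm->exec_plan(prm);
        assert(prm->exec_plan == NULL);
    }
    return prm->value;
}

/* Entry has its initPlan attached: the plan runs on first use. */
long
toy_with_initplan(void)
{
    ToyParam prm = {toy_min_initplan, 0, false};

    return toy_eval_param(&prm);
}

/* Entry is zero-initialized with no plan attached: the default leaks out. */
long
toy_without_initplan(void)
{
    ToyParam prm = {NULL, 0, false};

    return toy_eval_param(&prm);
}
```

In this model toy_with_initplan() yields 1, while toy_without_initplan() yields 0, which mirrors the wrong min(i) result in the example.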
So now I wonder whether this wrong-result issue can also arise in other places where we move initPlans. I haven't tried hard to verify that, though.