Thread: Duplicate Workers entries in some EXPLAIN plans

Duplicate Workers entries in some EXPLAIN plans

From
Maciek Sakrejda
Date:
Hello,

I ran into an odd behavior with some EXPLAIN results in Postgres 11.5. I noticed this with JSON format first, but similar issues exist with the other formats as well for this query. I think I can follow up with the query and full plan if needed, but essentially, the issue is that the Sort node has two different entries for the "Workers" key (something that technically JSON does allow, but such JSON structures are very difficult to work with, and JSON library support for them is poor). The node looks like this (some details elided):

{
  "Node Type": "Sort",
  ...
  "Workers": [
    {
      "Worker Number": 0,
      "Sort Method": "external merge",
      "Sort Space Used": 20128,
      "Sort Space Type": "Disk"
    },
    {
      "Worker Number": 1,
      "Sort Method": "external merge",
      "Sort Space Used": 20128,
      "Sort Space Type": "Disk"
    }
  ],
  ...
  "Workers": [
    {
      "Worker Number": 0,
      "Actual Startup Time": 309.726,
      "Actual Total Time": 310.179,
      "Actual Rows": 4128,
      "Actual Loops": 1,
      "Shared Hit Blocks": 2872,
      "Shared Read Blocks": 7584,
      "Shared Dirtied Blocks": 0,
      "Shared Written Blocks": 0,
      "Local Hit Blocks": 0,
      "Local Read Blocks": 0,
      "Local Dirtied Blocks": 0,
      "Local Written Blocks": 0,
      "Temp Read Blocks": 490,
      "Temp Written Blocks": 2529
    },
    {
      "Worker Number": 1,
      "Actual Startup Time": 306.523,
      "Actual Total Time": 307.001,
      "Actual Rows": 4128,
      "Actual Loops": 1,
      "Shared Hit Blocks": 3356,
      "Shared Read Blocks": 7100,
      "Shared Dirtied Blocks": 0,
      "Shared Written Blocks": 0,
      "Local Hit Blocks": 0,
      "Local Read Blocks": 0,
      "Local Dirtied Blocks": 0,
      "Local Written Blocks": 0,
      "Temp Read Blocks": 490,
      "Temp Written Blocks": 2529
    }
  ],
  "Plans": ...
}
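
For illustration, here is a minimal Python sketch of how a consumer might notice the repeated key. It is not part of the original report. PLAN_NODE is a trimmed-down stand-in for the node above, and find_duplicate_keys is a hypothetical helper name; the only library feature relied on is the standard json module's object_pairs_hook parameter.

```python
import json

# Trimmed-down stand-in for the Sort node above (most fields omitted for brevity).
PLAN_NODE = """
{
  "Node Type": "Sort",
  "Workers": [
    {"Worker Number": 0, "Sort Method": "external merge"},
    {"Worker Number": 1, "Sort Method": "external merge"}
  ],
  "Workers": [
    {"Worker Number": 0, "Actual Rows": 4128},
    {"Worker Number": 1, "Actual Rows": 4128}
  ]
}
"""


def find_duplicate_keys(text):
    """Return the sorted keys that repeat within any single JSON object."""
    duplicates = set()

    def hook(pairs):
        # pairs is the list of (key, value) tuples for one JSON object,
        # in document order, before duplicates are collapsed.
        seen = set()
        for key, _ in pairs:
            if key in seen:
                duplicates.add(key)
            seen.add(key)
        return dict(pairs)

    json.loads(text, object_pairs_hook=hook)
    return sorted(duplicates)


# Default parsing keeps only the last "Workers" array, so the sort
# details from the first array are silently dropped.
default_parse = json.loads(PLAN_NODE)
print(default_parse["Workers"][0])
print(find_duplicate_keys(PLAN_NODE))
```

With the default settings the first "Workers" array is lost without any warning, which is why a check like this has to be opted into explicitly.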

YAML and XML formats both have parallel issues. TEXT format is a little different but also seems odd, with multiple lines in the plan node for each worker:

  Sort Method: external merge  Disk: 4920kB
  Worker 0:  Sort Method: external merge  Disk: 5880kB
  Worker 1:  Sort Method: external merge  Disk: 5920kB
  Buffers: shared hit=682 read=10188, temp read=1415 written=2101
  Worker 0: actual time=130.058..130.324 rows=1324 loops=1
    Buffers: shared hit=337 read=3489, temp read=505 written=739
  Worker 1: actual time=130.273..130.512 rows=1297 loops=1
    Buffers: shared hit=345 read=3507, temp read=505 written=744

Is this a bug?

Re: Duplicate Workers entries in some EXPLAIN plans

From
Tom Lane
Date:
Maciek Sakrejda <maciek@pganalyze.com> writes:
> I ran into an odd behavior with some EXPLAIN results in Postgres 11.5. I
> noticed this with JSON format first, but similar issues exist with the
> other formats as well for this query. I think I can follow up with the
> query and full plan if needed, but essentially, the issue is that the Sort
> node has two different entries for the "Workers" key (something that
> technically JSON does allow, but such JSON structures are very difficult to
> work with, and JSON library support for them is poor).

Yeah, this was already complained of here:

https://www.postgresql.org/message-id/flat/41ee53a5-a36e-cc8f-1bee-63f6565bb1ee%40dalibo.com

I think the text-mode output is intentional, but the other formats
need more work.  We also need to think about whether we can change
this without big backwards-compatibility problems.

            regards, tom lane



Re: Duplicate Workers entries in some EXPLAIN plans

From
Maciek Sakrejda
Date:
Thanks, I searched for previous reports of this, but I did not see that one. In that thread, Andrew Dunstan suggested

> Maybe a simpler fix would be to rename one set of nodes to "Sort-Workers" or some such.

Is that feasible? Maybe as "Workers (Sort)"?

> We also need to think about whether we can change
> this without big backwards-compatibility problems.

As in, due to users relying on this idiosyncratic output and working around parsing issues (the built-in parsers in Ruby, Python, and Node all seem to just keep the last entry when keys repeat by default), or because merging the nodes would introduce new entries in the Workers nodes that users may not expect?
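
To make the "merging" option concrete, here is a rough client-side sketch in Python of what a merged Workers entry could look like. This is only an assumption about one possible shape, not what EXPLAIN emits or what was decided in this thread. merge_duplicate_workers is a hypothetical helper, and it assumes every worker entry carries a "Worker Number" field, as in the plan shown earlier.

```python
import json


def merge_duplicate_workers(pairs):
    """object_pairs_hook that merges repeated "Workers" arrays.

    Entries are matched on "Worker Number" (assumed present on every
    worker entry). Any other repeated key keeps the last value, which
    matches the default behavior of json.loads.
    """
    result = {}
    for key, value in pairs:
        if key == "Workers" and key in result:
            # Index the workers seen so far, then fold the new fields in.
            by_number = {w["Worker Number"]: dict(w) for w in result[key]}
            for worker in value:
                by_number.setdefault(worker["Worker Number"], {}).update(worker)
            result[key] = [by_number[n] for n in sorted(by_number)]
        else:
            result[key] = value
    return result


# Shortened stand-in for a Sort node with two "Workers" keys.
sample = """
{
  "Node Type": "Sort",
  "Workers": [
    {"Worker Number": 0, "Sort Method": "external merge"},
    {"Worker Number": 1, "Sort Method": "external merge"}
  ],
  "Workers": [
    {"Worker Number": 0, "Actual Rows": 4128},
    {"Worker Number": 1, "Actual Rows": 4128}
  ]
}
"""

merged = json.loads(sample, object_pairs_hook=merge_duplicate_workers)
for worker in merged["Workers"]:
    print(worker)
```

Each merged worker entry then carries both the sort fields and the timing fields, which is the kind of "new entries in the Workers nodes" that existing consumers might not expect.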

Re: Duplicate Workers entries in some EXPLAIN plans

From
Maciek Sakrejda
Date:
Should I move this to a pgsql-hackers discussion? I noticed that jsonb also appears to keep the last JSON entry in the face of multiple keys, so it'd be nice to have something more usable. I'm not much of a C programmer, but I think I see how to rename the second fields to Sort Workers if this solution is acceptable. Looking at the code in explain.c, there do not appear to be any other EXPLAIN node fields in a similar situation (I grepped for ExplainOpenGroup and "Workers" is the only one that occurs twice).

Re: Duplicate Workers entries in some EXPLAIN plans

From
Tom Lane
Date:
Maciek Sakrejda <maciek@pganalyze.com> writes:
> Should I move this to a pgsql-hackers discussion? I noticed that jsonb also
> appears to keep the last JSON entry in the face of multiple keys, so it'd
> be nice to have something more usable. I'm not much of a C programmer, but
> I think I see how to rename the second fields to Sort Workers if this
> solution is acceptable.

Yeah, the actual code change should be pretty trivial --- the hard part
here is to get consensus on what behavior change we want.  It's not
unreasonable to decide that on pgsql-bugs ... but since there hasn't
been much commentary yet, maybe moving to -hackers is what to do
to seek consensus.

            regards, tom lane