Re: JIT performance bug/regression & JIT EXPLAIN - Mailing list pgsql-hackers

From Andres Freund
Subject Re: JIT performance bug/regression & JIT EXPLAIN
Date
Msg-id 20191029000229.fkjmuld3g7f2jq7i@alap3.anarazel.de
Whole thread Raw
In response to Re: JIT performance bug/regression & JIT EXPLAIN  (Maciek Sakrejda <m.sakrejda@gmail.com>)
Responses Re: JIT performance bug/regression & JIT EXPLAIN
List pgsql-hackers
Hi,

On 2019-10-28 11:27:02 -0700, Maciek Sakrejda wrote:
> >But that's pretty crappy, because it means that the 'shape' of the output depends on the jit_details option.
> 
> Yeah, that would be hard to work with. What about adding it as a sibling group?
> 
> "Filter": "(lineitem.l_shipdate <= '1998-09-18 00:00:00'::timestamp
> without time zone)",
> "Filter JIT": {
>   "Expr": "evalexpr_0_2",
>   "Deform Scan": "deform_0_3",
>   "Deform Outer": null,
>   "Deform Inner": null
> }
> 
> Also not that pretty, but at least it's easier to work with

What I dislike about that is that it basically again is introducing
something that requires either pattern matching on key names (i.e. a key
of '(.*) JIT' is one that has information about JIT, and the associated
expresssion is in key $1), or knowing all the potential keys an
expression could be in.


> (I also
> changed the dashes to spaces since that's what the rest of EXPLAIN is
> doing as a matter of style).

That makes sense.


> >But the compat break due to that change is not small- perhaps we could instead mark that in another way?
> 
> We could add a "Projects" boolean key instead? Of course that's more
> awkward in text mode. Maybe compat break is less of an issue in text
> mode and we can treat this differently?

Yea, I think projects as a key for each node makes sense. For text mode
I guess we could just display the key on the same line when es->verbose
is set?  Still not sure if not just changing the output is the better
approach.

Another alternative would be to just remove the 'Output' line when a
node doesn't project - it can't really carry meaning in those cases
anyway?


> >For HashJoin/Hash I've added 'Outer Hash Key' and 'Hash Key' for each key, but only in verbose mode.
> 
> That reads pretty well to me. What does the structured output look
> like?

Just a new "Outer Hash Key" for the HashJoin node, and "Hash Key" for
the Hash node. Perhaps the latter should be 'Inner Hash Key' - while
that's currently a bit confusing because of Hash's subtree being the
outer tree, it'd reduce changes when merging Hash into HashJoin [1], and
it's clearer when looking at the HashJoin node itself.

Here's an example query:

EXPLAIN (VERBOSE, FORMAT JSON, COSTS OFF) SELECT pc.oid::regclass, pc.relkind, pc.relfilenode, pc_t.oid::regclass as
toast_rel,pc_t.relfilenode as toast_relfilenode FROM pg_class pc LEFT OUTER JOIN pg_class pc_t ON (pc.reltoastrelid =
pc_t.oid);
[
  {
    "Plan": {
      "Node Type": "Hash Join",
      "Parallel Aware": false,
      "Join Type": "Left",
      "Project": ["(pc.oid)::regclass", "pc.relkind", "pc.relfilenode", "(pc_t.oid)::regclass", "pc_t.relfilenode"],
      "Inner Unique": true,
      "Hash Cond": "(pc.reltoastrelid = pc_t.oid)",
      "Outer Hash Key": "pc.reltoastrelid",
      "Plans": [
        {
          "Node Type": "Seq Scan",
          "Parent Relationship": "Outer",
          "Parallel Aware": false,
          "Relation Name": "pg_class",
          "Schema": "pg_catalog",
          "Alias": "pc",
          "Output": ["pc.oid", "pc.relname", "pc.relnamespace", "pc.reltype", "pc.reloftype", "pc.relowner",
"pc.relam","pc.relfilenode", "pc.reltablespace", "pc.relpages", "pc.reltuples", "pc.relallvisible", "pc.reltoastrelid",
"pc.relhasindex","pc.relisshared", "pc.relpersistence", "pc.relkind", "pc.relnatts", "pc.relchecks", "pc.relhasrules",
"pc.relhastriggers","pc.relhassubclass", "pc.relrowsecurity", "pc.relforcerowsecurity", "pc.relispopulated",
"pc.relreplident","pc.relispartition", "pc.relrewrite", "pc.relfrozenxid", "pc.relminmxid", "pc.relacl",
"pc.reloptions","pc.relpartbound"]
 
        },
        {
          "Node Type": "Hash",
          "Parent Relationship": "Inner",
          "Parallel Aware": false,
          "Output": ["pc_t.oid", "pc_t.relfilenode"],
          "Hash Key": "pc_t.oid",
          "Plans": [
            {
              "Node Type": "Seq Scan",
              "Parent Relationship": "Outer",
              "Parallel Aware": false,
              "Relation Name": "pg_class",
              "Schema": "pg_catalog",
              "Alias": "pc_t",
              "Project": ["pc_t.oid", "pc_t.relfilenode"]
            }
          ]
        }
      ]
    }
  }
]

and in plain text:

Hash Left Join
  Project: (pc.oid)::regclass, pc.relkind, pc.relfilenode, (pc_t.oid)::regclass, pc_t.relfilenode
  Inner Unique: true
  Hash Cond: (pc.reltoastrelid = pc_t.oid)
  Outer Hash Key: pc.reltoastrelid
  ->  Seq Scan on pg_catalog.pg_class pc
        Output: pc.oid, pc.relname, pc.relnamespace, pc.reltype, pc.reloftype, pc.relowner, pc.relam, pc.relfilenode,
pc.reltablespace,pc.relpages, pc.reltuples, pc.relallvisible, pc.reltoastrelid, pc.relhasindex, pc.relisshared,
pc.relpersistence,pc.relkind, pc.relnatts, pc.relchecks, pc.relhasrules, pc.relhastriggers, pc.relhassubclass,
pc.relrowsecurity,pc.relforcerowsecurity, pc.relispopulated, pc.relreplident, pc.relispartition, pc.relrewrite,
pc.relfrozenxid,pc.relminmxid, pc.relacl, pc.reloptions, pc.relpartbound
 
  ->  Hash
        Output: pc_t.oid, pc_t.relfilenode
        Hash Key: pc_t.oid
        ->  Seq Scan on pg_catalog.pg_class pc_t
              Project: pc_t.oid, pc_t.relfilenode

which also serves as an example about my previous point about
potentially just hiding the 'Output: ' bit when no projection is done:
It's very verbose, without adding much, while hiding that there's
actually nothing being done at the SeqScan level.

I've attached a rebased version of the patcheset. No changes except for
a minor conflict, and removing some already applied bugfixes.

Greetings,

Andres Freund

[1] https://www.postgresql.org/message-id/20191028231526.wcnwag7lllkra4qt%40alap3.anarazel.de

Attachment

pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: JIT performance bug/regression & JIT EXPLAIN
Next
From: Michael Paquier
Date:
Subject: Re: [PATCH] Do not use StdRdOptions in Access Methods