Re: JIT performance bug/regression & JIT EXPLAIN - Mailing list pgsql-hackers

From Andres Freund
Subject Re: JIT performance bug/regression & JIT EXPLAIN
Date
Msg-id 20200127174103.i4nxrzromqk24pfn@alap3.anarazel.de
In response to Re: JIT performance bug/regression & JIT EXPLAIN  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: JIT performance bug/regression & JIT EXPLAIN  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
Hi,

On 2020-01-27 12:15:53 -0500, Tom Lane wrote:
> Maciek Sakrejda <m.sakrejda@gmail.com> writes:
> > On Fri, Nov 15, 2019 at 5:49 AM Robert Haas <robertmhaas@gmail.com> wrote:
> >> Personally, I don't care very much about backward-compatibility, or
> >> about how hard it is for tools to parse. I want it to be possible, but
> >> if it takes a little extra effort, so be it.
>
> > I think these are two separate issues. I agree on
> > backward-compatibility (especially if we can embed a server version in
> > structured EXPLAIN output to make it easier for tools to track format
> > differences), but not caring how hard it is for tools to parse? What's
> > the point of structured formats, then?
>
> I'd not been paying any attention to this thread, but Andres just
> referenced it in another discussion, so I went back and read it.
> Here's my two cents:
>
> * I agree with Robert that conditionally changing "Output" to "Project" is
> an absolutely horrid idea.

Yea, I think I'm convinced on that front. I never liked the idea, and
the opposition has been pretty unanimous...


> That will break every tool that looks at this stuff, and it just flies
> in the face of the design principle that the output schema should be
> stable, and it'll be a long term pain-in-the-rear for regression test
> back-patching, and it will confuse users much more than it will help
> them.  The other idea of suppressing "Output" in cases where no
> projection is happening might be all right, but only in text format
> where we don't worry about schema stability.  Another idea perhaps is
> to emit "Output: all columns" (in text formats, less sure what to do
> in structured formats).

I think I like the "all columns" idea. Not what I'd do on a green field,
but...

If we were just dealing with the XML format, we could just add a
<Projecting>True/False</Projecting>
to the current
<Output>
   <Item>a</Item>
   <Item>b</Item>
   ...
</Output>

and it'd make plenty of sense. But for JSON's
    "Output": ["a", "b"]
and YAML's
    Output:
      - "a"
      - "b"
that's not an option as far as I can tell. Not sure what to do about
that.
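
To make the difficulty concrete, here is a minimal Python sketch of what a consumer of the JSON format might do today, and what happens if "Output" stops being a plain list. Everything beyond the "Output": ["a", "b"] shape quoted above is an assumption for illustration: the wrapped object and its key names "Projecting" and "Items" are hypothetical and are not a format proposed in this thread.

```python
import json

# Fragment of one plan node in the current JSON shape quoted above.
current = json.loads('{"Node Type": "Seq Scan", "Output": ["a", "b"]}')

# Hypothetical alternative: wrap the list in an object so it can carry a
# flag. The key names "Projecting" and "Items" are invented for this sketch.
wrapped = json.loads(
    '{"Node Type": "Seq Scan",'
    ' "Output": {"Projecting": false, "Items": ["a", "b"]}}'
)


def output_columns(node):
    """Read the output columns the way a simple consumer might today."""
    return [str(item) for item in node.get("Output", [])]


cols_current = output_columns(current)  # the column names
cols_wrapped = output_columns(wrapped)  # iterates the object's keys instead
```

A tool written against the list shape does not fail loudly on the wrapped shape; it silently reports the object's keys as column names, which is the kind of schema break the stability argument is about.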



> * In the structured formats, I think it should be okay to convert
> expression-ish fields from being raw strings to being {Expression}
> sub-nodes with the raw string as one field.  Aside from making it easy
> to inject JIT info, that would also open the door to someday showing
> expressions in some more-parse-able format than a string, since other
> representations could also be added as new fields.  (I have a vague
> recollection of wanting a list of all the Vars used in an expression,
> for example.)

Cool. Being extensible seems like a good direction. The lack of
extensibility is what I primarily dislike about the various workarounds
that associate JIT information with an expression via a "related" name.

That'd, e.g., open the door to having both a normalized and an original
expression in the EXPLAIN output, which would be quite valuable for
some monitoring tools.
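
As a rough illustration of the sub-node idea from a tool's point of view, the following Python sketch accepts an expression-ish field either as a raw string (the current form) or as an object with the raw string as one field. The key names "Expression", "Normalized", and "JIT" are assumptions made up for this sketch; the thread does not settle on any names.

```python
from typing import Any, Optional


def expression_text(field: Any) -> Optional[str]:
    """Return the raw expression string from either representation."""
    if isinstance(field, str):
        # Current form: the field is the raw expression string.
        return field
    if isinstance(field, dict):
        # Hypothetical form: a sub-node holding the raw string as one
        # field ("Expression" is an invented key name).
        return field.get("Expression")
    return None


# Current shape of an expression-ish field.
old_style = {"Filter": "(a > 1)"}

# Hypothetical extended shape; every key inside "Filter" is invented.
new_style = {
    "Filter": {
        "Expression": "(a > 1)",
        "Normalized": "(a > $1)",
        "JIT": {"Compiled": True},
    }
}

old_text = expression_text(old_style["Filter"])
new_text = expression_text(new_style["Filter"])
```

Because extra information arrives as additional fields, a consumer that only wants the expression text can ignore fields it does not know about.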


> * Unfortunately that does nothing for the problem of how to show
> per-expression JIT info in text format.  Maybe we just shouldn't.
> I do not think that the readability-vs-usefulness tradeoff is going
> to be all that good there, anyway.  Certainly for testing purposes
> it's going to be more useful to examine portions of a structured output.

I think I can live with that; I don't think it's going to be a very
commonly used option. It's basically useful for regression tests, JIT
improvements, and people who want to see whether they can change their
query / schema to make better use of JIT - the latter category won't be
many, I think.

Since this is going to be a default-off option anyway, I don't think
we'd need to be as concerned with compatibility. But even leaving
compatibility aside, it's not that clear how best to attach the
information in the current text format without being confusing.

Greetings,

Andres Freund


