Re: Hook for extensible parsing. - Mailing list pgsql-hackers

From Julien Rouhaud
Subject Re: Hook for extensible parsing.
Date
Msg-id CAOBaU_Y5XoVb+aQG2Ub3y5V4heEkTo6D3e8uMwqFEC-h5aZ+dw@mail.gmail.com
Whole thread Raw
In response to Re: Hook for extensible parsing.  (Jim Mlodgenski <jimmy76@gmail.com>)
Responses Re: Hook for extensible parsing.  (Jim Mlodgenski <jimmy76@gmail.com>)
List pgsql-hackers
Thanks for the review Jim!

On Wed, Jul 7, 2021 at 3:26 AM Jim Mlodgenski <jimmy76@gmail.com> wrote:
>
> On Sat, Jun 12, 2021 at 4:29 AM Julien Rouhaud <rjuju123@gmail.com> wrote:
>
> The patches all build properly and pass all regressions tests.

Note that the cfbot reports a compilation error on windows.  That's on
the grammar extension part, so I'm really really interested in trying
to fix that for now, as it's mostly a quick POC to demonstrate how one
could implement a different grammar and validate that everything works
as expected.

Also, if this patch is eventually committed and having some code to
experience the hook is wanted it would probably be better to have a
very naive parser (based on a few strcmp() calls or something like
that) to validate the behavior rather than having a real parser.

> > pg_parse_query() will instruct plugins to parse a query at a time.  They're
> > free to ignore that mode if they want to implement the 3rd mode.  If so, they
> > should either return multiple RawStmt, a single RawStmt with a 0 or
> > strlen(query_string) stmt_len, or error out.  Otherwise, they will implement
> > either mode 1 or 2, and they should always return a List containing a single
> > RawStmt with properly set stmt_len, even if the underlying statement is NULL.
> > This is required to properly skip valid strings that don't contain a
> > statements, and pg_parse_query() will skip RawStmt that don't contain an
> > underlying statement.
>
> Wouldn't we want to only loop through the individual statements if parser_hook
> exists? The current patch seems to go through the new code path regardless
> of the hook being grabbed.

I did think about it, but I eventually chose to write it this way.
Having a different code path for the no-hook situation won't make the
with-hook code any easier (it should only remove some check for the
hook in some places that have 2 or 3 other checks already).  On the
other hand, having a single code path avoid some (minimal) code
duplication, and also ensure that the main loop is actively tested
even without the hook being set.  That's not 100% coverage, but it's
better than nothing.  Performance wise, it shouldn't make any
noticeable difference for the no-hook case.



pgsql-hackers by date:

Previous
From: Julien Rouhaud
Date:
Subject: Re: track_planning causing performance regression
Next
From: Ronan Dunklau
Date:
Subject: Re: [PATCH] Use optimized single-datum tuplesort in ExecSort