Re: build remaining Flex files standalone - Mailing list pgsql-hackers

From John Naylor
Subject Re: build remaining Flex files standalone
Date
Msg-id CAFBsxsEospoUX=QYkfC=WcJqNB+iZtBf=BaRwn-zbHa48X0NKQ@mail.gmail.com
Whole thread Raw
In response to Re: build remaining Flex files standalone  (Andres Freund <andres@anarazel.de>)
Responses Re: build remaining Flex files standalone
List pgsql-hackers
For v3, I addressed some comments and added .h files to the
headerscheck exceptions.

On Tue, Aug 16, 2022 at 1:11 AM Andres Freund <andres@anarazel.de> wrote:

> On 2022-08-13 15:39:06 +0700, John Naylor wrote:
> > Here are the rest. Most of it was pretty straightforward, with the
> > main exception of jsonpath_scan.c, which is not quite finished. That
> > one passes tests but still has one compiler warning. I'm unsure how
> > much of what is there already is really necessary or was cargo-culted
> > from elsewhere without explanation. For starters, I'm not sure why the
> > grammar has a forward declaration of "union YYSTYPE". It's noteworthy
> > that it used to compile standalone, but with a bit more stuff, and
> > that was reverted in 550b9d26f80fa30. I can hack on it some more later
> > but I ran out of steam today.

I've got it in half-way decent shape now, with an *internal.h header
and some cleanups.

> > - Include order seems to matter for the grammar's .h file. I didn't
> > test if that was the case every time, and after a few miscompiles just
> > always made it the last inclusion, but I'm wondering if we should keep
> > those inclusions outside %top{} and put it at the start of the next
> > %{} ?
>
> I think we have a few of those dependencies already, see e.g.
> /*
>  * NB: include gram.h only AFTER including scanner.h, because scanner.h
>  * is what #defines YYLTYPE.
>  */

Went with something like this in all cases:

/*
 * NB: include bootparse.h only AFTER including bootstrap.h, because bootstrap.h
 * includes node definitions needed for YYSTYPE.
 */

Future cleanup: I see this in headerscheck:

# We can't make these Bison output files compilable standalone
# without using "%code require", which old Bison versions lack.
# parser/gram.h will be included by parser/gramparse.h anyway.

That directive has been supported in Bison since 2.4.2.

> > From d723ba14acf56fd432e9e263db937fcc13fc0355 Mon Sep 17 00:00:00 2001
> > From: John Naylor <john.naylor@postgresql.org>
> > Date: Thu, 11 Aug 2022 19:38:37 +0700
> > Subject: [PATCH v201 1/9] Build guc-file.c standalone
>
> Might be worth doing some of the moving around here separately from the
> parser/scanner specific bits.

Done in 0001/0003.

> > +/* functions shared between guc.c and guc-file.l */
> > [...]
> I think I prefer your suggestion of a guc_internal.h upthread.

Started in 0002, but left open the headerscheck failure.

Also, if such a thing is meant to be #include'd only by two generated
files, maybe it should just live in the directory where they live, and
not in the src/include dir?

> > From 7d4ecfcb3e91f3b45e94b9e64c7c40f1bbd22aa8 Mon Sep 17 00:00:00 2001
> > From: John Naylor <john.naylor@postgresql.org>
> > Date: Fri, 12 Aug 2022 15:45:24 +0700
> > Subject: [PATCH v201 2/9] Build booscanner.c standalone
>
> > -# bootscanner is compiled as part of bootparse
> > -bootparse.o: bootscanner.c
> > +# See notes in src/backend/parser/Makefile about the following two rules
> > +bootparse.h: bootparse.c
> > +     touch $@
> > +
> > +bootparse.c: BISONFLAGS += -d
> > +
> > +# Force these dependencies to be known even without dependency info built:
> > +bootparse.o bootscan.o: bootparse.h
>
> Wonder if we could / should wrap this is something common. It's somewhat
> annoying to repeat this stuff everywhere.

I haven't looked at the Meson effort recently, but if the build rule
is less annoying there, I'm inclined to leave this as a wart until
autotools are retired.

> > diff --git a/src/test/isolation/specscanner.l b/src/test/isolation/specscanner.l
> > index aa6e89268e..2dc292c21d 100644
> > --- a/src/test/isolation/specscanner.l
> > +++ b/src/test/isolation/specscanner.l
> > @@ -1,4 +1,4 @@
> > -%{
> > +%top{
> >  /*-------------------------------------------------------------------------
> >   *
> >   * specscanner.l
> > @@ -9,7 +9,14 @@
> >   *
> >   *-------------------------------------------------------------------------
> >   */
> > +#include "postgres_fe.h"
>
> Miniscule nitpick: I think we typically leave an empty line between header and
> first include.

In a small unscientific sample it seems like the opposite is true
actually, but I'll at least try to be consistent within the patch set.

> > diff --git a/contrib/cube/cubedata.h b/contrib/cube/cubedata.h
> > index dbe7d4f742..0b373048b5 100644
> > --- a/contrib/cube/cubedata.h
> > +++ b/contrib/cube/cubedata.h
> > @@ -67,3 +67,7 @@ extern void cube_scanner_finish(void);
> >
> >  /* in cubeparse.y */
> >  extern int   cube_yyparse(NDBOX **result);
> > +
> > +/* All grammar constructs return strings */
> > +#define YYSTYPE char *
>
> Why does this need to be defined in a semi-public header? If we do this in
> multiple files we'll end up with the danger of macro redefinition warnings.

I tried to put all the Flex/Bison stuff in another *_internal header,
but that breaks the build. Putting just this one symbol in a header is
silly, but done that way for now. Maybe two copies of the symbol?

Another future cleanup: "%define api.prefix {cube_yy}" etc would cause
it to be spelled CUBE_YYSTYPE (other macros too), sidestepping this
problem (requires Bison 2.6). IIUC, doing it our way has been
deprecated for 9 years.

> > +extern int scanbuflen;
>
> The code around scanbuflen seems pretty darn grotty. Allocating enough memory
> for the entire list by allocating the entire string size... I don't know
> anything about contrib/cube, but isn't that in effect O(inputlen^2) memory?

Neither do I.




--
John Naylor
EDB: http://www.enterprisedb.com

Attachment

pgsql-hackers by date:

Previous
From: Daniel Gustafsson
Date:
Subject: Re: pg_receivewal and SIGTERM
Next
From: Peter Eisentraut
Date:
Subject: Move NON_EXEC_STATIC from c.h