Thread: Build failure with GCC 15 (defaults to -std=gnu23)

Build failure with GCC 15 (defaults to -std=gnu23)

From
Sam James
Date:
Hi,

postgres-17.1 fails to build with upcoming GCC 15 which defaults to
-std=gnu23 as follows:
```
In file included from ../../src/include/postgres_fe.h:25,
                 from checksum_helper.c:17:
../../src/include/c.h:456:23: error: two or more data types in declaration specifiers
  456 | typedef unsigned char bool;
      |                       ^~~~
../../src/include/c.h:456:1: warning: useless type name in empty declaration
  456 | typedef unsigned char bool;
      | ^~~~~~~
```

src/include/c.h does attempt to check for whether bool is already defined but
the check doesn't work.

There may be more issues afterwards but I haven't tried papering over
the above issue. It should be possible to reproduce w/
CFLAGS="... -std=gnu23" or CFLAGS="... -std=c23" on older GCC or Clang.

thanks,
sam



Re: Build failure with GCC 15 (defaults to -std=gnu23)

From
Tom Lane
Date:
Sam James <sam@gentoo.org> writes:
> postgres-17.1 fails to build with upcoming GCC 15 which defaults to
> -std=gnu23 as follows:

I do not think we claim to support C23 yet.

Having said that, I can reproduce this on gcc 14 using -std=gnu23.
It appears that configure is deciding that <stdbool.h> is not
conformant to C99 because it doesn't declare "bool" as a macro.
Did C23 really remove that !?

            regards, tom lane
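
To see what configure is reacting to, here is a minimal sketch of an `#ifdef bool` style probe. The names `demo_bool_is_macro` and `demo_check_bool_usable` are made up for illustration and are not Autoconf or PostgreSQL code. C99 through C17 require `<stdbool.h>` to define `bool` as a macro; in C23 `bool` is a keyword, so the probe is expected to find no macro even though `bool` works fine:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical probe: 1 if <stdbool.h> defined bool as a macro, else 0. */
int demo_bool_is_macro(void)
{
#ifdef bool
    return 1;                   /* C99..C17 style header */
#else
    return 0;                   /* bool is a keyword, as in C23 */
#endif
}

/* Whichever way bool is provided, it must behave like a boolean type. */
void demo_check_bool_usable(void)
{
    bool t = true;
    bool f = false;

    assert(t && !f);
}
```

A conformance test that insists on the macro therefore rejects a perfectly usable C23 `bool`, which would match the configure behaviour described above.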



Re: Build failure with GCC 15 (defaults to -std=gnu23)

From
Thomas Munro
Date:
On Mon, Nov 18, 2024 at 7:12 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Sam James <sam@gentoo.org> writes:
> > postgres-17.1 fails to build with upcoming GCC 15 which defaults to
> > -std=gnu23 as follows:
>
> I do not think we claim to support C23 yet.
>
> Having said that, I can reproduce this on gcc 14 using -std=gnu23.
> It appears that configure is deciding that <stdbool.h> is not
> conformant to C99 because it doesn't declare "bool" as a macro.
> Did C23 really remove that !?

Yes, seems to be a general pattern: features introduced as keyword
_Xxx with a library macro xxx -> _Xxx (usually where xxx is already a
keyword in C++ but the C committee was afraid to unleash a new keyword
directly on the world, I guess?), and now xxx is finally graduating to
real keyword status.  Other examples: static_assert, thread_local,
alignas.
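
As a small illustration of that pattern (a sketch only, with the hypothetical names `demo_padded` and `demo_padded_alignment`), the lowercase spellings below compile under C11/C17, where the headers supply them as macros, and should also compile under C23, where they are keywords and the headers have nothing left to define:

```c
#include <assert.h>             /* static_assert as a macro before C23 */
#include <stdalign.h>           /* alignas/alignof as macros before C23 */
#include <stdbool.h>            /* bool/true/false as macros before C23 */
#include <stddef.h>

/* File-scope check written with the lowercase spelling. */
static_assert(sizeof(bool) >= 1, "bool must occupy storage");

/* Hypothetical struct whose member requests 16-byte alignment. */
struct demo_padded
{
    alignas(16) char buf[16];
};

/* Reports the alignment the compiler chose for the struct. */
size_t demo_padded_alignment(void)
{
    return alignof(struct demo_padded);
}
```

Code that spells these names only through the standard headers keeps working across the transition; code that defines its own `bool`, `static_assert`, or `alignas` is what breaks.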



Re: Build failure with GCC 15 (defaults to -std=gnu23)

From
Tom Lane
Date:
Thomas Munro <thomas.munro@gmail.com> writes:
> On Mon, Nov 18, 2024 at 7:12 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Having said that, I can reproduce this on gcc 14 using -std=gnu23.
>> It appears that configure is deciding that <stdbool.h> is not
>> conformant to C99 because it doesn't declare "bool" as a macro.
>> Did C23 really remove that !?

> Yes, seems to be a general pattern: features introduced as keyword
> _Xxx with a library macro xxx -> _Xxx (usually where xxx is already a
> keyword in C++ but the C committee was afraid to unleash a new keyword
> directly on the world, I guess?), and now xxx is finally graduating to
> real keyword status.  Other examples: static_assert, thread_local,
> alignas.

Fun.  Well, now that we insist on C99 support in all branches,
I wonder whether we can just remove all the non-stdbool support.
The one thing that looks tricky is that we insist on sizeof(bool)
being 1, but are there any remaining supported platforms where
it isn't?  The buildfarm doesn't have any examples.

            regards, tom lane



Re: Build failure with GCC 15 (defaults to -std=gnu23)

From
Thomas Munro
Date:
On Mon, Nov 18, 2024 at 9:26 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Fun.  Well, now that we insist on C99 support in all branches,
> I wonder whether we can just remove all the non-stdbool support.
> The one thing that looks tricky is that we insist on sizeof(bool)
> being 1, but are there any remaining supported platforms where
> it isn't?  The buildfarm doesn't have any examples.

So far I have found only Apple/Darwin PPC (RIP), where this was
occasionally an issue.  Some projects would apparently compile with
-mone-byte-bool to unbreak local assumptions, but risk breaking ABI
with other libraries (as we do?).  GCC filed that switch under Darwin
options rather than somewhere more general... can we call that a clue
that it was highly unusual?

https://gcc.gnu.org/onlinedocs/gcc/Darwin-Options.html

There may be a systematic way to survey ABIs from the LLVM or GCC
source trees... hmm, I am no expert so take this with a grain of salt
but I found where the LLVM project defines BoolWidth and BoolAlign,
starting from the commit where they removed Darwin PPC support
(2fe49ea0), and it looks like it was the only target ABI that overrode
the default of 8 there (it had 32, meaning bits).

BTW animal "alligator" has just shown the failure.  Ah, yes, due to
this GCC switch being flipped a couple of days ago:

https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=55e3bd376b2214e200fa76d12b67ff259b06c212



Re: Build failure with GCC 15 (defaults to -std=gnu23)

From
Tom Lane
Date:
Thomas Munro <thomas.munro@gmail.com> writes:
> On Mon, Nov 18, 2024 at 9:26 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Fun.  Well, now that we insist on C99 support in all branches,
>> I wonder whether we can just remove all the non-stdbool support.
>> The one thing that looks tricky is that we insist on sizeof(bool)
>> being 1, but are there any remaining supported platforms where
>> it isn't?  The buildfarm doesn't have any examples.

> So far I have found only Apple/Darwin PPC (RIP), where this was
> occasionally an issue.

Yeah.  Well, what say we leave the "typedef unsigned char bool"
pathway in place, but set things up to use that only if sizeof
the stdbool type isn't 1 --- and then it's up to any hypothetical
users of that pathway to choose a compiler and compiler options
that won't choke on it.

            regards, tom lane
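
That suggestion could be pictured roughly as follows. This is a minimal sketch, not the actual c.h: `DEMO_SIZEOF_BOOL` stands in for whatever configure would measure, and `demo_bool` is a made-up name chosen so that the example never tries to redefine `bool` itself (the thing that fails under C23):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a configure-time measurement of sizeof(bool). */
#ifndef DEMO_SIZEOF_BOOL
#define DEMO_SIZEOF_BOOL 1
#endif

#if DEMO_SIZEOF_BOOL == 1
typedef bool demo_bool;             /* normal path: the standard type */
#else
typedef unsigned char demo_bool;    /* legacy one-byte fallback */
#endif

/*
 * Normalize explicitly: assigning 256 to an unsigned char would yield 0,
 * whereas the comparison result is always 0 or 1 on either path.
 */
demo_bool demo_bool_from_int(int v)
{
    demo_bool b = (v != 0);

    assert(b == 0 || b == 1);
    return b;
}

int demo_bool_width(void)
{
    return (int) sizeof(demo_bool);
}
```

With the default of 1 the fallback branch is never compiled, so a C23 compiler has nothing to choke on; only a platform that really reports a wider `bool` would reach the legacy typedef.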



Re: Build failure with GCC 15 (defaults to -std=gnu23)

From
Thomas Munro
Date:
On Mon, Nov 18, 2024 at 10:49 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Yeah.  Well, what say we leave the "typedef unsigned char bool"
> pathway in place, but set things up to use that only if sizeof
> the stdbool type isn't 1 --- and then it's up to any hypothetical
> users of that pathway to choose a compiler and compiler options
> that won't choke on it.

It sounds like we should stop using the old and broken
AC_CHECK_HEADER_STDBOOL macro.  I think it was doing two jobs in old
times: there were some systems that shipped a defective/pre-standard
stdbool.h, and some systems without it.  I think both classes of
system are gone from the universe.  Later autoconf versions check for
C99 "or later", but we're stuck with the old one and I doubt we are
going to upgrade it.  Found in their NEWS:

*** AC_HEADER_STDBOOL, AC_CHECK_HEADER_STDBOOL are obsolescent and less picky.
These macros are now obsolescent, as programs can simply include
stdbool.h unconditionally. If you use these macros, they now accept
a stdbool.h that exists but does nothing, so long as ‘bool’, ‘true’,
and ‘false’ work anyway. This is for compatibility with C 2023 and
with C++.



Re: Build failure with GCC 15 (defaults to -std=gnu23)

From
Tom Lane
Date:
Thomas Munro <thomas.munro@gmail.com> writes:
> It sounds like we should stop using the old and broken
> AC_CHECK_HEADER_STDBOOL macro.

Yeah, that's what I was imagining: assume that <stdbool.h> exists
and works, and check only to see if sizeof(bool) is acceptable.

> ... Later autoconf versions check for
> C99 "or later", but we're stuck with the old one and I doubt we are
> going to upgrade it.

I'm not sure --- there was some discussion a week or two ago about
upgrading autoconf after all.  But whether we do or not, it's hard
to see what AC_HEADER_STDBOOL is buying us.

            regards, tom lane



Re: Build failure with GCC 15 (defaults to -std=gnu23)

From
Tom Lane
Date:
Thomas Munro <thomas.munro@gmail.com> writes:
> I found a few lines we could just delete in master.  I wonder if we
> should also just require sizeof(bool) == 1 more explicitly going
> forward with an error, since we don't have coverage or any expectation
> of ever getting any for the alternative code AFAICS, even if it is
> small.

Yeah, that's a fair criticism.  I don't think we've tested that code
path since I retired prairiedog, so who's to say that it works even
now?  Maybe it's best to just delete that code, and if we ever find a
new platform with wider bool, figure out what to do at that time.

            regards, tom lane
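
Requiring a one-byte bool "with an error" could look something like the sketch below. It uses the standard `static_assert` rather than any PostgreSQL macro, and `demo_count_set` is a hypothetical helper added only to show code that leans on the assumption:

```c
#include <assert.h>             /* static_assert as a macro before C23 */
#include <stdbool.h>

/* Fail the build outright instead of switching to an untested fallback. */
static_assert(sizeof(bool) == 1, "this code assumes sizeof(bool) == 1");

/* Hypothetical helper: counts the true entries in an array of flags. */
int demo_count_set(const bool *flags, int n)
{
    int count = 0;

    for (int i = 0; i < n; i++)
    {
        if (flags[i])
            count++;
    }
    return count;
}
```

On a platform with a wider `bool` this stops at compile time with a clear message, which is the point at which someone would have to "figure out what to do".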



Re: Build failure with GCC 15 (defaults to -std=gnu23)

From
Peter Eisentraut
Date:
On 18.11.24 02:30, Thomas Munro wrote:
> On Mon, Nov 18, 2024 at 11:50 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Thomas Munro <thomas.munro@gmail.com> writes:
>>> It sounds like we should stop using the old and broken
>>> AC_CHECK_HEADER_STDBOOL macro.
>>
>> Yeah, that's what I was imagining: assume that <stdbool.h> exists
>> and works, and check only to see if sizeof(bool) is acceptable.
> 
> I think this is the minimal change, which I'd push back to 13 post-freeze.

Note that if we backpatch C23 support, we also need to backpatch at 
least commits a67a49648d9 and d2b4b4c2259.




Re: Build failure with GCC 15 (defaults to -std=gnu23)

From
Tom Lane
Date:
Peter Eisentraut <peter@eisentraut.org> writes:
> Note that if we backpatch C23 support, we also need to backpatch at 
> least commits a67a49648d9 and d2b4b4c2259.

Yeah.  Our normal theory for this kind of thing is "people are
likely to build our old branches with modern toolchains", so
we are going to have to back-patch C23 compatibility sooner or
later.  In fact, we'll have to back-patch to 9.2, or else
decide that those branches are unbuildable on modern platforms
and hence out of scope for compatibility testing.

We have a little bit of grace time before this needs to happen,
but perhaps not very much.

            regards, tom lane



Re: Build failure with GCC 15 (defaults to -std=gnu23)

From
Peter Eisentraut
Date:
On 20.11.24 16:32, Tom Lane wrote:
> Peter Eisentraut <peter@eisentraut.org> writes:
>> Note that if we backpatch C23 support, we also need to backpatch at
>> least commits a67a49648d9 and d2b4b4c2259.
> 
> Yeah.  Our normal theory for this kind of thing is "people are
> likely to build our old branches with modern toolchains", so
> we are going to have to back-patch C23 compatibility sooner or
> later.  In fact, we'll have to back-patch to 9.2, or else
> decide that those branches are unbuildable on modern platforms
> and hence out of scope for compatibility testing.

I have checked that with this patch and the two above (well, one is just 
to remove a warning), you can get PG16 and up building cleanly with 
gcc-14 -std=gnu23.

Before that, you get a ton of warnings and errors related to the node 
tree walker routines.  This is presumably related to commit 1c27d16e6e5.

Going further back, the bool patch proposed here assumes that stdbool.h 
exists unconditionally, which is C99, which is not the baseline for 
older branches.  I think for those it's probably best to leave it alone 
and just use gcc-15 -std=gnu89 or whatever.




Re: Build failure with GCC 15 (defaults to -std=gnu23)

From
Thomas Munro
Date:
On Thu, Nov 21, 2024 at 8:23 PM Peter Eisentraut <peter@eisentraut.org> wrote:
> On 20.11.24 16:32, Tom Lane wrote:
> > Yeah.  Our normal theory for this kind of thing is "people are
> > likely to build our old branches with modern toolchains", so
> > we are going to have to back-patch C23 compatibility sooner or
> > later.  In fact, we'll have to back-patch to 9.2, or else
> > decide that those branches are unbuildable on modern platforms
> > and hence out of scope for compatibility testing.
>
> I have checked that with this patch and the two above (well, one is just
> to remove a warning), you can get PG16 and up building cleanly with
> gcc-14 -std=gnu23.

Thanks.  I pushed the <stdbool.h> thing, which didn't require going
back very far.

> Before that, you get a ton of warnings and errors related to the node
> tree walker routines.  This is presumably related to commit 1c27d16e6e5.

Alligator is now getting past the bool troubles and reaching that
stuff.  I was expecting it to be green in master.  It's OK on my
slightly older "gcc version 15.0.0 20241110 (experimental) (FreeBSD
Ports Collection)" with -std=gnu23, but alligator now shows a weird
error with tsearch data types.  Something about flexible array members
(casting from non-flex to flex?, without saying where the cast is?),
but IDK, it's an internal error asking for a bug to be filed, not a
user-facing one.

This might be relevant: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113688

> Going further back, the bool patch proposed here assumes that stdbool.h
> exists unconditionally, which is C99, which is not the baseline for
> older branches.  I think for those it's probably best to leave it alone
> and just use gcc-15 -std=gnu89 or whatever.

There's only one C89 branch that knows about <stdbool.h>:
REL_11_STABLE.  That's recent enough that it's still easy to work
with, so I just changed it to use AC_CHECK_HEADER instead.  In other
words, we've removed the bogus "conforms" check.  Whether you still
need a presence check depends on the C version, and only for 11 is the
answer yes.  Obviously nobody is really going to build with an actual
C89 system so the presence check is never going to fail, but it would
be weird on principle to suddenly require a C99 thing...



Re: Build failure with GCC 15 (defaults to -std=gnu23)

From
Peter Eisentraut
Date:
On 25.11.24 11:01, Thomas Munro wrote:
>> I have checked that with this patch and the two above (well, one is just
>> to remove a warning), you can get PG16 and up building cleanly with
>> gcc-14 -std=gnu23.
> 
> Thanks.  I pushed the <stdbool.h> thing, which didn't require going
> back very far.
> 
>> Before that, you get a ton of warnings and errors related to the node
>> tree walker routines.  This is presumably related to commit 1c27d16e6e5.
> 
> Alligator is now getting past the bool troubles and reaching that
> stuff.  I was expecting it to be green in master.  It's OK on my
> slightly older "gcc version 15.0.0 20241110 (experimental) (FreeBSD
> Ports Collection)" with -std=gnu23, but alligator now shows a weird
> error with tsearch data types.  Something about flexible array members
> (casting from non-flex to flex?, without saying where the cast is?),
> but IDK, it's an internal error asking for a bug to be filed, not a
> user-facing one.
> 
> This might be relevant: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113688
> 
>> Going further back, the bool patch proposed here assumes that stdbool.h
>> exists unconditionally, which is C99, which is not the baseline for
>> older branches.  I think for those it's probably best to leave it alone
>> and just use gcc-15 -std=gnu89 or whatever.
> 
> There's only one C89 branch that knows about <stdbool.h>:
> REL_11_STABLE.  That's recent enough that it's still easy to work
> with, so I just changed it to use AC_CHECK_HEADER instead.  In other
> words, we've removed the bogus "conforms" check.  Whether you still
> need a presence check depends on the C version, and only for 11 is the
> answer yes.  Obviously nobody is really going to build with an actual
> C89 system so the presence check is never going to fail, but it would
> be weird on principle to suddenly require a C99 thing...

Where does this leave us regarding backpatching the other two 
C23-related patches?  The node tree walker issue looks like a very hard 
barrier.  I don't want to spend too much effort backpatching anything to 
ancient versions if there's little hope of getting the whole thing working.



Re: Build failure with GCC 15 (defaults to -std=gnu23)

From
Thomas Munro
Date:
On Tue, Nov 26, 2024 at 4:57 AM Peter Eisentraut <peter@eisentraut.org> wrote:
> Where does this leave us regarding backpatching the other two
> C23-related patches?  The node tree walker issue looks like a very hard
> barrier.  I don't want to spend too much effort backpatching anything to
> ancient versions if there's little hope of getting the whole thing working.

Oh.  Yeah.  1c27d16e6e5 was not back-patchable.  And what f9a56e72 did
in 15 and older doesn't seem to have any equivalent in C23, at least
without going way overboard.  -Wdeprecated-non-prototype was
recognising a category of function type that no longer exists, so the
code now falls into the more general case of
-Wincompatible-pointer-types in C23, which you certainly wouldn't want
to suppress.  So perhaps we actually can't make any branch older than
PostgreSQL 16 into a valid C23 program, and if that's true, I needn't
have back-patched the <stdbool.h> change any further back than 16.
Perhaps we should reconsider that, then.  And if it can't be all the
back-branches, we could even decide to focus just on master.  Where do
we want our C23 support to begin?

Coincidental observation: We added -Wdeprecated-non-prototype back
when Clang 15 invented it, but I noticed that GCC 15 has just now
added it too[1], so alligator started detecting and using that in
REL_15_STABLE last week[2].  Of course it doesn't help once you're
talking C23.

[1] https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git;h=701d8e7e60b85809cae348c1e9edb3b0f4924325
[2] https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=alligator&dt=2024-11-18%2019%3A23%3A30&stg=configure
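
For readers unfamiliar with the underlying language change: in C17 and earlier a declarator such as `bool (*walker) ()` leaves the parameters unspecified, while C23 reads the same spelling as `(void)`, so passing callbacks with other signatures becomes an incompatible-pointer-type problem. The sketch below shows one shape that is valid in both dialects, a fully prototyped callback taking a `void *` context. It is only an illustration with made-up names (`demo_walker_fn`, `demo_walk`, and so on) and is not a description of what 1c27d16e6e5 actually does:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback type: prototyped, so C17 and C23 agree on it. */
typedef bool (*demo_walker_fn) (int *node, void *context);

/* Visit each node until the callback asks to stop by returning true. */
bool demo_walk(int *nodes, size_t n, demo_walker_fn walker, void *context)
{
    assert(walker != NULL);

    for (size_t i = 0; i < n; i++)
    {
        if (walker(&nodes[i], context))
            return true;
    }
    return false;
}

/* Example callback: the typed context is recovered from the void pointer. */
static bool demo_count_nonzero_cb(int *node, void *context)
{
    size_t *count = context;

    if (*node != 0)
        (*count)++;
    return false;               /* keep walking */
}

size_t demo_count_nonzero(void)
{
    int nodes[] = {0, 3, 0, 7};
    size_t count = 0;

    (void) demo_walk(nodes, sizeof(nodes) / sizeof(nodes[0]),
                     demo_count_nonzero_cb, &count);
    return count;
}
```

The cost of this style is that every callback has to accept `void *` and convert it, which is why retrofitting it onto a large body of existing walker functions is invasive.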



Re: Build failure with GCC 15 (defaults to -std=gnu23)

From
Tom Lane
Date:
Thomas Munro <thomas.munro@gmail.com> writes:
> Oh.  Yeah.  1c27d16e6e5 was not back-patchable.  And what f9a56e72 did
> in 15 and older doesn't seem to have any equivalent in C23, at least
> without going way overboard.  -Wdeprecated-non-prototype was
> recognising a category of function type that no longer exists, so the
> code now falls into the more general case of
> -Wincompatible-pointer-types in C23, which you certainly wouldn't want
> to suppress.  So perhaps we actually can't make any branch older than
> PostgreSQL 16 into a valid C23 program, and if that's true, I needn't
> have back-patched the <stdbool.h> change any further back than 16.
> Perhaps we should reconsider that, then.  And if it can't be all the
> back-branches, we could even decide to focus just on master.  Where do
> we want our C23 support to begin?

Unless somebody has a better idea than 1c27d16e6e5, it would seem
reasonable to say that we'll support C23 in v16 and later, but
to build an older branch you have to back off to an older C version.

I don't feel a need to revert those <stdbool.h> changes.
If nothing else, that saved a bit of configure runtime.

If we leave it like this, alligator will need some configuration
adjustments, and so will other BF animals when they migrate to new
gcc, and so will individual hackers when they're trying to build
old branches.  A possible compromise to reduce the manual pain
level could be to adjust configure to add "-std=gnu99" or so to
CFLAGS in the pre-v16 branches, if the compiler accepts that.
(OTOH, maybe that'd cause pain for some extensions?)

            regards, tom lane



Re: Build failure with GCC 15 (defaults to -std=gnu23)

From
Thomas Munro
Date:
On Tue, Nov 26, 2024 at 9:07 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> If we leave it like this, alligator will need some configuration
> adjustments, and so will other BF animals when they migrate to new
> gcc, and so will individual hackers when they're trying to build
> old branches.  A possible compromise to reduce the manual pain
> level could be to adjust configure to add "-std=gnu99" or so to
> CFLAGS in the pre-v16 branches, if the compiler accepts that.

We already have tests to see if we need to add -std=c99 to go forwards
in time (from a quick look at the build farm, now used only by EOL
distros testing GCC 4).  Something tells me we might want to be less
draconian when travelling backwards, but I dunno... our stuff is
working fine with (implicit) -std=c17 all over the place, and we also
have:

# Do we need -std=c99 to compile C99 code? We don't want to add -std=c99
# unnecessarily, because we optionally rely on newer features.

I noticed that later autoconf killed AC_PROG_CC_C99 and its AC_PROG_CC
is figuring out the highest C standard available and requesting that,
though it doesn't know about C23 yet.

> (OTOH, maybe that'd cause pain for some extensions?)

So we're talking about -std=XXX suddenly appearing in pg_config
--cflags?  Of course we want to try quite hard to keep emitting
nothing for that if we can.  We already emit -std=gnu99 from old tests
that keep those GCC 4 build farm zombies at bay (did extensions ever
complain about that back in the GCC 4 days?  I'd guess not), but no
one really uses those in real life.  If we one day dare to dream about
moving our own baseline to C11/C17, we'll still emit nothing as the
compilers are already there by default.  We will start to emit a new
flag to disable C23 if required, but I think it might be unlikely to
upset anyone if it works something like this:

* For 16+ nothing, we're going to be C23 clean (after a couple more
back-patches)
* For 9.2-15 on GCC < 15 it'll stay as nothing too
* For 9.2-15 on early GCC 15 adopter distros like Fedora/Gentoo etc
we'll detect C23, and perhaps start spitting out -std=c17 (if you've
detected C23, I think you can assume that C17 is available so we don't
have to do a C17-C11-C99[-C89] search?)
* When 12-15 fall out of support and all compilers are eventually C23+
compilers, they'll eventually always be getting -std=c17 by the above
rules but no one will mind about that in the ancient branches
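
The rules in the list above can be sketched as a small decision function. Everything here is hypothetical: the names, the use of an integer major version (9.2 would be passed as 9), and the `__STDC_VERSION__ > 201710L` test used to recognise a compiler running in C23 mode are illustrative assumptions, not the configure logic itself:

```c
#include <assert.h>
#include <string.h>

/* 1 if this translation unit is being compiled as something newer than C17. */
int demo_compiled_as_c23(void)
{
#if defined(__STDC_VERSION__) && __STDC_VERSION__ > 201710L
    return 1;
#else
    return 0;
#endif
}

/*
 * Hypothetical encoding of the proposal: branches 16 and later add no flag;
 * older branches pin C17 only when the compiler would otherwise use C23.
 */
const char *demo_extra_cflag(int pg_major, int compiler_is_c23)
{
    assert(pg_major > 0);

    if (pg_major >= 16)
        return "";
    return compiler_is_c23 ? "-std=c17" : "";
}
```

For example, `demo_extra_cflag(15, demo_compiled_as_c23())` yields an empty string on a compiler that still defaults to C17, so nothing new would show up in the flags there.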

I think if someone writes new extension code in C23 and wants to use
it with PostgreSQL 15 that came out in 2022, they can expect a few
time travel problems, but by the time C23 really starts to take off,
15 will be retired, and it's great that we got the hardest part of
this into 16.  I don't think the problems would be too hard to deal
with if you do it.  One saving grace here is that all this stuff is
converging with C++, and we already ensure our headers are valid C++.
As for what they actually mean, we also know that C++ extensions are
happily using the tree walker stuff in the wild, which I think must be
about the same level of C calling convention abuse whether you do it
from C23 or C++, and apparently doesn't break.  Example:

https://github.com/duckdb/pg_duckdb/blob/d53247f004b154dc81275e9c4b1184c792f4865c/src/pgduckdb_hooks.cpp#L123

The C++ people aren't using --cflags of course.   I guess a
really-written-in-C23 extension using --cflags to compile its own code
would need to append -std=c23 on the end when building against 12-15
(both clang and gcc will take the last of multiple -std flags), if we
decide to start emitting -std=c17 in 12-15 because we've detected a
C23 compiler.  In 16+ they could put -std=c23 on the end, or not if
they somehow know it's the default.

We could suppress it in pg_config --cflags even though we need it to
build the backend (by the arguments above, we know the headers are
more acceptable as C23 than the backend .c files), but that seems
bound to confuse matters.  I don't know how exactly, I'm no expert in
the gnarly details of extension buildfiles, and these are just some first
thoughts.  I might try some ideas out in a few days when my local
gcc15-devel package catches up with the new defaults, and see if
everything I wrote is complete nonsense.



Re: Build failure with GCC 15 (defaults to -std=gnu23)

From
Tom Lane
Date:
Thomas Munro <thomas.munro@gmail.com> writes:
> ... I think it might be unlikely to
> upset anyone if it works something like this:

> * For 16+ nothing, we're going to be C23 clean (after a couple more
> back-patches)
> * For 9.2-15 on GCC < 15 it'll stay as nothing too
> * For 9.2-15 on early GCC 15 adopter distros like Fedora/Gentoo etc
> we'll detect C23, and perhaps start spitting out -std=c17 (if you've
> detected C23, I think you can assume that C17 is available so we don't
> have to do a C17-C11-C99[-C89] search?)
> * When 12-15 fall out of support and all compilers are eventually C23+
> compilers, they'll eventually always be getting -std=c17 by the above
> rules but no one will mind about that in the ancient branches

Sounds plausible to me.  Will you work on making that happen?

            regards, tom lane



Re: Build failure with GCC 15 (defaults to -std=gnu23)

From
Thomas Munro
Date:
On Tue, Nov 26, 2024 at 3:13 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Sounds plausible to me.  Will you work on making that happen?

OK, trying it out...



Re: Build failure with GCC 15 (defaults to -std=gnu23)

From
Thomas Munro
Date:
Pushed.  Alligator is turning green in the back branches, only one more to go.

I see that Peter also pushed the reserved word patches.  Apparently
that gcc bug it's blowing up on might go away with -g0 (see link
earlier), but anyway it's a nightly build compiler so fingers crossed
for a fix soon.  The newer branches are building and running for me on
{ gcc14, gcc15, clang18 } -std=gnu23, and apparently the bug was even
in gcc14, so it must require some unlikely conditions that alligator
has stumbled on.

(Yeah I had missed 10, thanks for the nudge.)



Re: Build failure with GCC 15 (defaults to -std=gnu23)

From
Tom Lane
Date:
Thomas Munro <thomas.munro@gmail.com> writes:
> On Mon, Nov 18, 2024 at 2:59 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Yeah, that's a fair criticism.  I don't think we've tested that code
>> path since I retired prairiedog, so who's to say that it works even
>> now?  Maybe it's best to just delete that code, and if we ever find a
>> new platform with wider bool, figure out what to do at that time.

> Here's a draft patch for that.

Passes an eyeball check, but I've not really tried to test it.
I suppose the only meaningful test is likely to be letting the
buildfarm loose on it.

            regards, tom lane



Re: Build failure with GCC 15 (defaults to -std=gnu23)

From
Tom Lane
Date:
Thomas Munro <thomas.munro@gmail.com> writes:
> Pushed.  Alligator is turning green in the back branches, only one more to go.
> I see that Peter also pushed the reserved word patches.  Apparently
> that gcc bug it's blowing up on might go away with -g0 (see link
> earlier), but anyway it's a nightly build compiler so fingers crossed
> for a fix soon.  The newer branches are building and running for me on
> { gcc14, gcc15, clang18 } -std=gnu23, and apparently the bug was even
> in gcc14, so it must require some unlikely conditions that alligator
> has stumbled on.

Looks like flaviventris and serinus just updated to the same broken
compiler version that alligator is using :-(.  Maybe we'd better
file a formal bug report?

            regards, tom lane



Re: Build failure with GCC 15 (defaults to -std=gnu23)

From
Andres Freund
Date:
Hi,

On 2024-11-27 13:28:24 -0500, Tom Lane wrote:
> Thomas Munro <thomas.munro@gmail.com> writes:
> > Pushed.  Alligator is turning green in the back branches, only one more to go.
> > I see that Peter also pushed the reserved word patches.  Apparently
> > that gcc bug it's blowing up on might go away with -g0 (see link
> > earlier), but anyway it's a nightly build compiler so fingers crossed
> > for a fix soon.  The newer branches are building and running for me on
> > { gcc14, gcc15, clang18 } -std=gnu23, and apparently the bug was even
> > in gcc14, so it must require some unlikely conditions that alligator
> > has stumbled on.
> 
> Looks like flaviventris and serinus just updated to the same broken
> compiler version that alligator is using :-(.  Maybe we'd better
> file a formal bug report?

I run a development gcc locally, and I had just updated it this morning
(4a868591169). Interestingly I don't see the ICE with it.

But I can reproduce it with debian sid's gcc-snapshot, with exactly the same
compiler arguments. The snapshot's version:
     gcc (Debian 20241123-1) 15.0.0 20241123 (experimental) [master r15-5606-g4aa4162e365]

so it looks like the bug might have been fixed recently?

I'm not sure this is really the bug linked to earlier [1]. I can't repro the
issue with 14, for example.

It's possible it requires specific gcc configure flags to be triggered?

Luckily -g1 does, at least locally, work around the issue with
gcc-snapshot. So I guess I'll make flaviventris and serinus use that for now
:/

Greetings,

Andres Freund

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113688



Re: Build failure with GCC 15 (defaults to -std=gnu23)

From
Sam James
Date:
Andres Freund <andres@anarazel.de> writes:

> Hi,
>
> On 2024-11-27 13:28:24 -0500, Tom Lane wrote:
>> Thomas Munro <thomas.munro@gmail.com> writes:
>> > Pushed.  Alligator is turning green in the back branches, only one more to go.
>> > I see that Peter also pushed the reserved word patches.  Apparently
>> > that gcc bug it's blowing up on might go away with -g0 (see link
>> > earlier), but anyway it's a nightly build compiler so fingers crossed
>> > for a fix soon.  The newer branches are building and running for me on
>> > { gcc14, gcc15, clang18 } -std=gnu23, and apparently the bug was even
>> > in gcc14, so it must require some unlikely conditions that alligator
>> > has stumbled on.
>> 
>> Looks like flaviventris and serinus just updated to the same broken
>> compiler version that alligator is using :-(.  Maybe we'd better
>> file a formal bug report?
>
> I run a development gcc locally, and I just had updated it this morning
> (4a868591169). Interestingly I don't see the ICE with it.
>
> But I can reproduce it with debian sid's gcc-snapshot, with exactly the same
> compiler arguments. The snapshot's version:
>      gcc (Debian 20241123-1) 15.0.0 20241123 (experimental) [master r15-5606-g4aa4162e365]
>
> so it looks like the bug might have been fixed recently?
>
> I'm not sure this is really the bug linked to earlier [1]. I can't repro the
> issue with 14, for example.
>
> It's possible it requires specific gcc configure flags to be triggered?
>
> Luckily -g1 does, at least locally, work around the issue with
> gcc-snapshot. So I guess I'll make flaviventris and serinus use that for now
> :/
>
> Greetings,
>
> Andres Freund
>
> [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113688

See https://gcc.gnu.org/PR117724 as well. The issues are related in that
canonicalisation of struct types keeps needing revisiting, more so in
light of C23 changes.

Note also that the ICE is only with "checking" (~assertions) which is
enabled at a stricter level for non-releases by default, so some of it
may affect 14 but not show up there.

Martin Uecker has posted a patch which is currently being reviewed. I
wouldn't worry about it until that lands unless the build failures continue.

thanks,
sam



Re: Build failure with GCC 15 (defaults to -std=gnu23)

From
Andres Freund
Date:
Hi,

On 2024-11-27 19:01:36 +0000, Sam James wrote:
> > [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113688
> See https://gcc.gnu.org/PR117724 as well. The issues are related in that
> canonicalisation of struct types keeps needing revisiting, more so in
> light of C23 changes.

Thanks.


> Note also that the ICE is only with "checking" (~assertions) which is
> enabled at a stricter level for non-releases by default, so some of it
> may affect 14 but not show up there.

Ah, that explains that.


> Martin Uecker has posted a patch which is currently being reviewed. I
> wouldn't worry about it until that lands unless the build failures continue.

I changed my local build to use the same --checking as debian's gcc-snapshot,
confirmed that that reproduces the issue. Then applied Martin's patch. It does
fix the problem.

Greetings,

Andres Freund



Re: Build failure with GCC 15 (defaults to -std=gnu23)

From
Andres Freund
Date:
Hi,

On 2024-11-27 13:50:59 -0500, Andres Freund wrote:
> Luckily -g1 does, at least locally, work around the issue with
> gcc-snapshot. So I guess I'll make flaviventris and serinus use that for now
> :/

Done. It did fix flaviventris/HEAD, I'm sure the others will follow soon.

Greetings,

Andres Freund



Re: Build failure with GCC 15 (defaults to -std=gnu23)

From
Robins Tharakan
Date:
Hi,
 
> Done. It did fix flaviventris/HEAD, I'm sure the others will follow soon.

alligator/HEAD is still failing though.
Let me know if changing something on alligator can help in some way.

-
robins

Re: Build failure with GCC 15 (defaults to -std=gnu23)

From
Tom Lane
Date:
Robins Tharakan <tharakan@gmail.com> writes:
> alligator/HEAD is still failing though.
> Let me know if changing something on alligator can help in some way.

I think we're waiting on the gcc crew to fix their bug.  As long
as alligator is faithfully rebuilding gcc from upstream every day,
there's not much more to do than wait.

(You could adopt Andres' -g1 workaround, but then we'd not know
when the gcc bug is fixed.  So unless this drags on quite a while,
I think alligator is best left as-is.)

            regards, tom lane



Re: Build failure with GCC 15 (defaults to -std=gnu23)

From
Robins Tharakan
Date:

On Sat, 30 Nov 2024 at 02:02, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robins Tharakan <tharakan@gmail.com> writes:
> > alligator/HEAD is still failing though.
> > Let me know if changing something on alligator can help in some way.
>
> I think we're waiting on the gcc crew to fix their bug.  As long
> as alligator is faithfully rebuilding gcc from upstream every day,
> there's not much more to do than wait.
>
> (You could adopt Andres' -g1 workaround, but then we'd not know
> when the gcc bug is fixed.  So unless this drags on quite a while,
> I think alligator is best left as-is.)
 
It's been a few days since alligator lit up the buildfarm red. IMHO at this
point, it is just noise. So I've reduced its v16+ build frequency to daily
(instead of a few minutes). I'll revert that once gcc is fixed upstream.

-
robins

Re: Build failure with GCC 15 (defaults to -std=gnu23)

From
Tom Lane
Date:
Robins Tharakan <tharakan@gmail.com> writes:
> It's been a few days since alligator lit up the buildfarm red. IMHO at this
> point, it is just noise. So I've reduced its v16+ build frequency to daily
> (instead of a few minutes). I'll revert that once gcc is fixed upstream.

Looks like alligator just went green with the 20241212 gcc build.

            regards, tom lane



Re: Build failure with GCC 15 (defaults to -std=gnu23)

From
Robins Tharakan
Date:

On Fri, 13 Dec 2024 at 11:52, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robins Tharakan <tharakan@gmail.com> writes:
> > It's been a few days since alligator lit up the buildfarm red. IMHO at this
> > point, it is just noise. So I've reduced its v16+ build frequency to daily
> > (instead of a few minutes). I'll revert that once gcc is fixed upstream.
>
> Looks like alligator just went green with the 20241212 gcc build.

Nice! Just forced REL_17_STABLE / REL_16_STABLE and they are past
that failure point too (still running).

For now I've restored the original build frequency for these branches.

-
robins

Re: Build failure with GCC 15 (defaults to -std=gnu23)

From
Robins Tharakan
Date:

On Fri, 13 Dec 2024 at 12:31, Robins Tharakan <tharakan@gmail.com> wrote:
> Just forced REL_17_STABLE / REL_16_STABLE and they are past
> that failure point too (still running).

This seems unrelated to the thread, but v17 has failed.

REL_17_STABLE failed on misc-recovery, and one bit of context I can add here is
that I triggered both REL_16_STABLE and REL_17_STABLE together and they
were running neck-and-neck, wherein v16 went past this test (in ~4 minutes)
and v17 got stuck for ~10 min and failed.


-
robins
 

Re: Build failure with GCC 15 (defaults to -std=gnu23)

From
Robins Tharakan
Date:

On Fri, 13 Dec 2024 at 12:46, Robins Tharakan <tharakan@gmail.com> wrote:
> REL_17_STABLE failed on misc-recovery, and one bit of context I can add here is
> that I triggered both REL_16_STABLE and REL_17_STABLE together and they
> were running neck-and-neck, wherein v16 went past this test (in ~4 minutes)
> and v17 got stuck for ~10 min and failed.

Is it possible that 2 concurrent runs (of different branches) could step on each other?

These are the logs that I captured, and v16 [2] / v17 [1] literally ran at the same
time (seconds apart).


v16 log:
alligator:REL_16_STABLE [12:24:42] running bin test scripts ...
alligator:REL_16_STABLE [12:25:28] running test misc-recovery ...
alligator:REL_16_STABLE [12:29:40] running test misc-subscription ...


v17 log:
alligator:REL_17_STABLE [12:25:09] running bin test psql ...
alligator:REL_17_STABLE [12:25:19] running bin test scripts ...
alligator:REL_17_STABLE [12:26:00] running test misc-recovery ...
alligator:REL_17_STABLE [12:37:15] failed at stage recoveryCheck
$


Anyway, I have triggered another v17 run. If the above was purely coincidental, then
that run should fail too.

Ref:
-
robins

Re: Build failure with GCC 15 (defaults to -std=gnu23)

From
Tom Lane
Date:
Robins Tharakan <tharakan@gmail.com> writes:
> Is it possible that 2 concurrent runs (of different branches) could step on
> each other?

Looks that way, doesn't it?  But I suspect it's just a timing
problem --- perhaps one triggered by the background load caused
by another test run, but not directly related to it.  Looking at
019_replslot_limit.pl, I see

-----
# freeze walsender and walreceiver. Slot will still be active, but walreceiver
# won't get anything anymore.
kill 'STOP', $senderpid, $receiverpid;
$node_primary3->advance_wal(2);
-----

This fragment seems to assume that the effect of the kill 'STOP'
is instantaneous.  If it's not, it seems possible that the
walsender process could manage to push out the WAL data added by
"advance_wal" before it gets frozen, in which case the subsequent
loop that's watching for it to get killed would watch in vain,
which matches your symptoms.

As far as a quick grep finds, this is the only test we have that
relies on kill 'STOP'.  I suspect that it's too clever for its
own good.
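
To illustrate the race described above, here is a minimal sketch (in Python rather than the test's Perl, and not taken from the PostgreSQL tree) of confirming that a process has really stopped before relying on it. It assumes the target is a direct child of the caller, which is not true of the walsender in 019_replslot_limit.pl; the helper names stop_and_wait and demo are hypothetical. kill() only queues the signal, so the sketch blocks in waitpid() with WUNTRACED until the kernel reports the stop.

```python
import os
import signal
import subprocess
import sys


def stop_and_wait(proc: subprocess.Popen) -> bool:
    """Send SIGSTOP to a child process and block until it has stopped.

    Hypothetical helper: it only works for direct children, because
    waitpid() cannot observe unrelated processes.
    """
    os.kill(proc.pid, signal.SIGSTOP)
    # WUNTRACED makes waitpid() return when the child stops,
    # not only when it exits.
    _, status = os.waitpid(proc.pid, os.WUNTRACED)
    return os.WIFSTOPPED(status)


def demo() -> bool:
    """Start a sleeping child, stop it, and report whether the stop was seen."""
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )
    try:
        return stop_and_wait(child)
    finally:
        # SIGKILL is delivered even to a stopped process.
        child.kill()
        child.wait()
```

For a process that is not a child, such as a walsender seen from a TAP test, something else would be needed (for example polling the process state reported by the operating system); that part is outside this sketch.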

I trawled the last three months' worth of buildfarm logs and
didn't find any other matches to "not ok 19 - walsender termination
logged".  So this is a pretty improbable failure mode, whatever
the explanation is.

            regards, tom lane



Re: Build failure with GCC 15 (defaults to -std=gnu23)

From
Andrew Dunstan
Date:


On 2024-12-12 Th 9:31 PM, Robins Tharakan wrote:
> On Fri, 13 Dec 2024 at 12:46, Robins Tharakan <tharakan@gmail.com> wrote:
> > REL_17_STABLE failed on misc-recovery, and one bit of context I can add here is
> > that I triggered both REL_16_STABLE and REL_17_STABLE together and they
> > were running neck-and-neck, wherein v16 went past this test (in ~4 minutes)
> > and v17 got stuck for ~10 min and failed.
>
> Is it possible that 2 concurrent runs (of different branches) could step on each other?
>
> These are the logs that I captured, and v16 [2] / v17 [1] literally ran at the same
> time (seconds apart).
>
> v16 log:
> alligator:REL_16_STABLE [12:24:42] running bin test scripts ...
> alligator:REL_16_STABLE [12:25:28] running test misc-recovery ...
> alligator:REL_16_STABLE [12:29:40] running test misc-subscription ...
>
> v17 log:
> alligator:REL_17_STABLE [12:25:09] running bin test psql ...
> alligator:REL_17_STABLE [12:25:19] running bin test scripts ...
> alligator:REL_17_STABLE [12:26:00] running test misc-recovery ...
> alligator:REL_17_STABLE [12:37:15] failed at stage recoveryCheck
> $

We actually have a good deal of protection against concurrent runs clobbering each other.

It's not clear to me if you're using "run_branches.pl --run-parallel" or not. If not, you might like to consider changing to that - it's the recommended way of doing concurrent runs. Apart from any other reason it removes the need for a lot of redundant git fetches. By default it staggers concurrent build starts by 60 seconds.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: Build failure with GCC 15 (defaults to -std=gnu23)

From
Robins Tharakan
Date:

On Sat, 14 Dec 2024 at 00:18, Andrew Dunstan <andrew@dunslane.net> wrote:
> We actually have a good deal of protection against concurrent runs clobbering each other.
>
> It's not clear to me if you're using "run_branches.pl --run-parallel" or not. If not, you might like to consider changing to that - it's the recommended way of doing concurrent runs. Apart from any other reason it removes the need for a lot of redundant git fetches. By default it staggers concurrent build starts by 60 seconds.

In this case I didn't use run_branches.pl. I just opened up two sessions and triggered two
separate runs for v16 / v17 (with --force) since master just came out green. Efficiency aside,
at worst I was expecting two concurrent runs to be slower, but not error out.

Unrelated, for a slow system my understanding was that it's quite inefficient to keep running older
branches every few minutes (like HEAD does) - so for some of the animals I explicitly run older
branches (e.g. v13) every few hours, but HEAD runs every few minutes.

Are you saying it's still a good idea to run all together every few minutes (and let older branches
skip if there's nothing to do)?

-
robins

Re: Build failure with GCC 15 (defaults to -std=gnu23)

From
Tom Lane
Date:
Robins Tharakan <tharakan@gmail.com> writes:
> Unrelated, for a slow system my understanding was that it's quite
> inefficient to keep running older branches every few minutes (like
> HEAD does) - so for some of the animals I explicitly run older
> branches (e.g. v13) every few hours, but HEAD runs every few minutes.

> Are you saying it's still a good idea to run all together every few
> minutes (and let older branches skip if there's nothing to do)?

Nowadays run_branches' check for whether there's something to do is
cheap enough that it's not worth skipping.  So it's recommendable
to just launch that every so often, and not complicate your life
with manual per-branch scheduling.

            regards, tom lane



Re: Build failure with GCC 15 (defaults to -std=gnu23)

From
Robins Tharakan
Date:

On Sun, 15 Dec 2024 at 03:35, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Nowadays run_branches' check for whether there's something to do is
> cheap enough that it's not worth skipping.  So it's recommendable
> to just launch that every so often, and not complicate your life
> with manual per-branch scheduling.

Thanks for explaining / confirming.
I'll update the remaining machine configurations soon.

-
robins

Re: Build failure with GCC 15 (defaults to -std=gnu23)

From
Thomas Munro
Date:
On Sun, Dec 15, 2024 at 8:04 PM Robins Tharakan <tharakan@gmail.com> wrote:
> [unexplained failure on alligator]

2024-12-19 12:38:26.223 ACDT [1956030:2] [unknown] FATAL:  could not
open shared memory segment "/PostgreSQL.2167569412": No such file or
directory

That smells like systemd's RemoveIPC feature:

https://www.postgresql.org/docs/current/kernel-resources.html#SYSTEMD-REMOVEIPC
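
The linked documentation suggests setting RemoveIPC=no in /etc/systemd/logind.conf (or running the server under a system account). As a rough way to check what a given machine has configured, a sketch along these lines could be used. It is only an illustration under stated assumptions: the function name remove_ipc_enabled is hypothetical, drop-in directories are ignored, and the fallback of "yes" for a missing key reflects stock systemd and may differ per distribution.

```python
import configparser


def remove_ipc_enabled(logind_conf_text: str, default: bool = True) -> bool:
    """Return the RemoveIPC value found in the given logind.conf text.

    `default` is returned when the key is absent or commented out.
    Assumption: stock systemd defaults to yes; distributions may differ.
    """
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.optionxform = str  # keep key case as written in the file
    parser.read_string(logind_conf_text)
    if not parser.has_option("Login", "RemoveIPC"):
        return default
    # getboolean() accepts yes/no, true/false, on/off and 1/0.
    return parser.getboolean("Login", "RemoveIPC")
```

On a buildfarm animal this might be called as remove_ipc_enabled(open("/etc/systemd/logind.conf").read()); a result of True would be consistent with the shared memory segment disappearing when the owning user's last session ends.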