Re: Speed up clean meson builds by ~25% - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Speed up clean meson builds by ~25%
Msg-id 20240409223310.q7o4tctltsnqcm4j@awork3.anarazel.de
In response to Re: Speed up clean meson builds by ~25%  (Thomas Munro <thomas.munro@gmail.com>)
List pgsql-hackers
Hi,

On 2024-04-09 17:13:52 +1200, Thomas Munro wrote:
> On Tue, Apr 9, 2024 at 5:01 PM Michael Paquier <michael@paquier.xyz> wrote:
> > On Mon, Apr 08, 2024 at 12:23:56PM +0300, Nazir Bilal Yavuz wrote:
> > > On Mon, 8 Apr 2024 at 11:00, Alexander Lakhin <exclusion@gmail.com> wrote:
> > >> As I wrote in [1], I didn't observe the issue with clang-18, so maybe it
> > >> is fixed already.
> > >> Perhaps it's worth rechecking...
> > >>
> > >> [1] https://www.postgresql.org/message-id/d2bf3727-bae4-3aee-65f6-caec2c4ebaa8%40gmail.com
> > >
> > > I had this problem on my local computer. My build times are:
> > >
> > > gcc: 20s
> > > clang-15: 24s
> > > clang-16: 105s
> > > clang-17: 111s
> > > clang-18: 25s
> >
> > Interesting.  A parallel build of ecpg shows similar numbers here:
> > clang-16: 101s
> > clang-17: 112s
> > clang-18: 14s
> > gcc: 10s
>
> I don't expect it to get fixed BTW, because it's present in 16.0.6,
> and .6 is the terminal release, if I understand their system
> correctly.  They're currently only doing bug fixes for 18, and even
> there not for much longer. Interesting that not everyone saw this at
> first, perhaps the bug arrived in a minor release that some people
> didn't have yet?  Or perhaps there is something special required to
> trigger it?

I think we need to do something about the compile time of this file, even with
gcc. Our main grammar already is an issue and stacking all the ecpg stuff on
top makes it considerably worse.

ISTM there's a bunch of pretty pointless stuff in the generated preproc.y,
which does seem to have some impact on compile time. E.g. a good bit of the
file is just stuff like

  reserved_keyword:
      ALL
      {
          $$ = mm_strdup("all");
      }
  ...


Why are we strduping all of these? We could instead just use the value of the
token, instead of forcing the compiler to generate branches for all the
individual keywords etc.

I don't know off-hand if the keyword lookup machinery ends up with an
uppercase keyword, but if so, that'd be easy enough to change.
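
To illustrate the "easy enough to change" part, here is a minimal sketch in C,
assuming the token value arrives in upper case. keyword_text_lower() is a
hypothetical helper, not something that exists in ecpg, and it uses plain
malloc() rather than ecpg's allocation wrappers:

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not part of ecpg): return a freshly allocated,
 * lower-cased copy of the scanned keyword text.  One shared routine like
 * this could stand in for a separate mm_strdup("...") action per keyword.
 */
static char *
keyword_text_lower(const char *token_text)
{
    size_t      len = strlen(token_text);
    char       *result = malloc(len + 1);

    if (result == NULL)
        return NULL;

    /* Cast to unsigned char: tolower() is undefined for negative values. */
    for (size_t i = 0; i < len; i++)
        result[i] = (char) tolower((unsigned char) token_text[i]);
    result[len] = '\0';

    return result;
}
```

With something along these lines the lower-casing happens once, in one place,
and the per-keyword actions no longer need to carry a string constant each.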


It looks to me like the many calls to mm_strdup() might actually be
what's driving clang nuts. I hacked up preproc.y to not need those calls for
  unreserved_keyword
  col_name_keyword
  type_func_name_keyword
  reserved_keyword
  bare_label_keyword
by removing the actions and defining those tokens to be of type str. There are
many more such calls that could be dealt with similarly.

That alone reduced compile times with
    clang-16 -O1 from  18.268s to  12.516s
    clang-16 -O2 from 345.188s to 158.084s
    clang-19 -O2 from  26.018s to  15.200s


I suspect what is happening is that clang tries to optimize the number of
calls to mm_strdup(), by separating the argument setup from the function
call. That leads to a control flow graph with *many* incoming edges to the
basic block containing the function call to mm_strdup(), triggering some
normally harmless O(N^2) (or similar) behavior in the optimizer.
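
The code shape in question can be sketched in plain C. This is only an
illustration of the pattern, not a reproducer for the clang slowdown: the enum,
the function names and dup_string() (a stand-in for mm_strdup()) are all
hypothetical, and the real parser has hundreds of such cases rather than three.

```c
#include <stdlib.h>
#include <string.h>

enum kw { KW_ALL, KW_AND, KW_ANY, KW_COUNT };

/* Stand-in for mm_strdup(): copy a string into freshly allocated memory. */
static char *
dup_string(const char *s)
{
    size_t      len = strlen(s) + 1;
    char       *copy = malloc(len);

    if (copy != NULL)
        memcpy(copy, s, len);
    return copy;
}

/*
 * Shape A: one call per keyword, as with one grammar action per keyword.
 * An optimizer may merge the calls into a single block that every case
 * jumps to after setting up its own argument.
 */
static char *
dup_per_case(enum kw k)
{
    switch (k)
    {
        case KW_ALL: return dup_string("all");
        case KW_AND: return dup_string("and");
        case KW_ANY: return dup_string("any");
        default:     return NULL;
    }
}

/* Shape B: the strings live in a table and there is a single call site. */
static const char *const kw_names[KW_COUNT] = {"all", "and", "any"};

static char *
dup_from_table(enum kw k)
{
    if ((int) k < 0 || (int) k >= KW_COUNT)
        return NULL;
    return dup_string(kw_names[k]);
}
```

Both functions return the same strings; the difference is only in how many
branches and call sites the compiler has to chew through, which is the part
that removing the per-keyword actions takes away.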


Greetings,

Andres Freund


