Re: Move bki file pre-processing from initdb to bootstrap - Mailing list pgsql-hackers

From Krishnakumar R
Subject Re: Move bki file pre-processing from initdb to bootstrap
Date
Msg-id CAPMWgZ94rvAR5a9T-VBgcweu4EnQGtHaiXgW6g+sH62feZPwow@mail.gmail.com
Whole thread Raw
In response to Re: Move bki file pre-processing from initdb to bootstrap  (Peter Eisentraut <peter@eisentraut.org>)
Responses Re: Move bki file pre-processing from initdb to bootstrap
Re: Move bki file pre-processing from initdb to bootstrap
List pgsql-hackers
Thank you, Peter, Andres and Tom for your comments and thoughts.

Hi Peter,

> For example, for NAMEDATALEN, SIZEOF_POINTER, ALIGNOF_POINTER,
> FLOAT8PASSBYVAL, these are known at build time, so we could have
> genbki.pl substitute them at build time.

I have modified the patch to use genbki to generate these ones during
build time.

> The locale-related symbols (ENCODING, LC_COLLATE, etc.), I wonder
> whether we can eliminate the need for them.  Right now, these are only
> used in the bki entry for the template1 database.  How about initdb
> creates template0 first, with hardcoded default encoding, collation,
> etc., and then creates template1 from that, using the normal CREATE
> DATABASE command with the appropriate options.  Or initdb could just run
> an UPDATE on pg_database to put the right settings in place.

Using a combination of this suggestion and discussions Andres pointed
to in this thread, updated the patch to add placeholder values first
into template1 and then do UPDATEs in initdb itself.

> You should use an option letter that isn't already in use in one of the
> other modes of "postgres".  We try to keep those consistent.
>
> New options should be added to the --help output (usage() in main.c).

Used a -b option under bootstrap mode and added help.

> elog(INFO, "Open bki file %s\n", bki_file);
> +   boot_yyin = fopen(bki_file, "r");
>
> Why is this needed?  It already reads the bki file from stdin?

We no longer open the bki file in initdb and pass to postgres to parse
from stdin, instead we open the bki file directly in bootstrap and
pass the file stream to the parser. Hence the need to switch the yyin.
Have added a comment in the commit logs to capture this.

The version comparison has been moved from initdb to bootstrap. This
created some compatibility problems with windows tests. For now I kept
the version check to not have \n added, which worked fine and serves
the purpose. However hoping to have something better in v3 in addition
to addressing any other comments.

Please let me know your thoughts and review comments.

--
Thanks and Regards,
Krishnakumar (KK).
[Microsoft]

On Tue, Sep 19, 2023 at 3:18 AM Peter Eisentraut <peter@eisentraut.org> wrote:
>
> On 01.09.23 10:01, Krishnakumar R wrote:
> > This patch moves the pre-processing for tokens in the bki file from
> > initdb to bootstrap. With these changes the bki file will only be
> > opened once in bootstrap and parsing will be done by the bootstrap
> > parser.
>
> I did some rough performance tests on this.  I get about a 10%
> improvement on initdb run time, so this appears to have merit.
>
> I wonder whether we can reduce the number of symbols that we need this
> treatment for.
>
> For example, for NAMEDATALEN, SIZEOF_POINTER, ALIGNOF_POINTER,
> FLOAT8PASSBYVAL, these are known at build time, so we could have
> genbki.pl substitute them at build time.
>
> The locale-related symbols (ENCODING, LC_COLLATE, etc.), I wonder
> whether we can eliminate the need for them.  Right now, these are only
> used in the bki entry for the template1 database.  How about initdb
> creates template0 first, with hardcoded default encoding, collation,
> etc., and then creates template1 from that, using the normal CREATE
> DATABASE command with the appropriate options.  Or initdb could just run
> an UPDATE on pg_database to put the right settings in place.
>
> I don't like this part so much, because it adds like 4 more places each
> of these variables is mentioned, which increases the mental and testing
> overhead for dealing with these features (which are an area of active
> development).
>
> In general, it would be good if this could be factored a bit more so
> each variable doesn't have to be hardcoded in so many places.
>
>
> Some more detailed comments on the code:
>
> +                   boot_yylval.str = pstrdup(yytext);
> +                   sprintf(boot_yylval.str, "%d", NAMEDATALEN);
>
> This is weird.  You are first assigning the text and then overwriting it
> with the numeric value?
>
> You can also use boot_yylval.ival for storing numbers.
>
> +                   if (bootp_null(ebootp, ebootp->username)) return
> NULLVAL;
>
> Add proper line breaks in the code.
>
> +bool bootp_null(extra_bootstrap_params *e, char *s)
>
> Add a comment what this function is supposed to do.
>
> This function could be static.
>
> +   while ((flag = getopt(argc, argv, "B:c:d:D:Fi:kr:X:-:")) != -1)
>
> You should use an option letter that isn't already in use in one of the
> other modes of "postgres".  We try to keep those consistent.
>
> New options should be added to the --help output (usage() in main.c).
>
> +   elog(INFO, "Open bki file %s\n", bki_file);
> +   boot_yyin = fopen(bki_file, "r");
>
> Why is this needed?  It already reads the bki file from stdin?
>
> +   printfPQExpBuffer(&cmd, "\"%s\" --boot -X %d %s %s %s %s -i
> %s=%s,%s=%s,%s=%s,"
> +                     "%s=%s,%s=%s,%s=%s,%s=%s,%s=%c",
> +                     backend_exec,
> +                     wal_segment_size_mb * (1024 * 1024),
> +                     boot_options, extra_options,
> +                     data_checksums ? "-k" : "",
> +                     debug ? "-d 5" : "",
>
> This appears to undo some of the changes done in cccdbc5d95.
>
> +#define BOOT_LC_COLLATE "lc_collate"
> +#define BOOT_LC_CTYPE "lc_ctype"
> +#define BOOT_ICU_LOCALE "icu_locale"
>
> etc.  This doesn't look particularly useful.  You can just use the
> strings directly.
>

Attachment

pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: pg16: invalid page/page verification failed
Next
From: Michael Paquier
Date:
Subject: Re: Remove distprep