Thread: Set arbitrary GUC options during initdb

Set arbitrary GUC options during initdb

From
Tom Lane
Date:
The attached patch responds to the discussion at [1] about how
we ought to offer a way to set any server GUC from the initdb
command line.  Currently, if for some reason the server won't
start with default parameters, the only way to get through initdb
is to change the installed version of postgresql.conf.sample.
And even that is just a kluge, because the initial probes to
choose max_connections and shared_buffers will all fail, causing
initdb to choose rock-bottom-minimum values of those settings.
You can fix that up after the fact if you notice it, but you
might not.

So this invents an initdb switch "-c NAME=VALUE" just like the
one that the server itself has long had.  The specified settings
are applied on the command line of the initial probe calls
(which happen before we've made any config files), and then they
are added to postgresql.auto.conf, which causes them to take
effect for the bootstrap backend runs as well as subsequent
postmaster starts.

I also invented "--set NAME=VALUE", mainly because just about
every other initdb switch has a long form.  The server itself
doesn't have that spelling, so I'm not wedded to that part.

Thoughts?

            regards, tom lane

[1] https://www.postgresql.org/message-id/flat/17757-dbdfc1f1c954a6db%40postgresql.org

diff --git a/doc/src/sgml/ref/initdb.sgml b/doc/src/sgml/ref/initdb.sgml
index 5b2bdac101..904cee7858 100644
--- a/doc/src/sgml/ref/initdb.sgml
+++ b/doc/src/sgml/ref/initdb.sgml
@@ -433,6 +433,23 @@ PostgreSQL documentation
     Other, less commonly used, options are also available:

     <variablelist>
+     <varlistentry id="app-initdb-option-set">
+      <term><option>-c <replaceable>name</replaceable>=<replaceable>value</replaceable></option></term>
+      <term><option>--set <replaceable>name</replaceable>=<replaceable>value</replaceable></option></term>
+      <listitem>
+       <para>
+        Forcibly set the server parameter <replaceable>name</replaceable>
+        to <replaceable>value</replaceable> during <command>initdb</command>,
+        and also set up that assignment in the
+        generated <filename>postgresql.auto.conf</filename> file, as though
+        it had been issued via <command>ALTER SYSTEM SET</command>.
+        This option can be used more than once to set several parameters.
+        It is primarily useful when the environment is such that the server
+        will not start at all using the default parameters.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry id="app-initdb-option-debug">
       <term><option>-d</option></term>
       <term><option>--debug</option></term>
diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c
index 7a58c33ace..8daad6a1f6 100644
--- a/src/bin/initdb/initdb.c
+++ b/src/bin/initdb/initdb.c
@@ -82,6 +82,13 @@
 /* Ideally this would be in a .h file, but it hardly seems worth the trouble */
 extern const char *select_default_timezone(const char *share_path);

+/* simple list of strings */
+typedef struct _stringlist
+{
+    char       *str;
+    struct _stringlist *next;
+} _stringlist;
+
 static const char *const auth_methods_host[] = {
     "trust", "reject", "scram-sha-256", "md5", "password", "ident", "radius",
 #ifdef ENABLE_GSS
@@ -142,6 +149,8 @@ static char *pwfilename = NULL;
 static char *superuser_password = NULL;
 static const char *authmethodhost = NULL;
 static const char *authmethodlocal = NULL;
+static _stringlist *extra_guc_names = NULL;
+static _stringlist *extra_guc_values = NULL;
 static bool debug = false;
 static bool noclean = false;
 static bool noinstructions = false;
@@ -253,6 +262,7 @@ static void check_input(char *path);
 static void write_version_file(const char *extrapath);
 static void set_null_conf(void);
 static void test_config_settings(void);
+static bool test_specific_config_settings(int test_conns, int test_buffs);
 static void setup_config(void);
 static void bootstrap_template1(void);
 static void setup_auth(FILE *cmdfd);
@@ -359,6 +369,27 @@ escape_quotes_bki(const char *src)
     return result;
 }

+/*
+ * Add an item at the end of a stringlist.
+ */
+static void
+add_stringlist_item(_stringlist **listhead, const char *str)
+{
+    _stringlist *newentry = pg_malloc(sizeof(_stringlist));
+    _stringlist *oldentry;
+
+    newentry->str = pg_strdup(str);
+    newentry->next = NULL;
+    if (*listhead == NULL)
+        *listhead = newentry;
+    else
+    {
+        for (oldentry = *listhead; oldentry->next; oldentry = oldentry->next)
+             /* skip */ ;
+        oldentry->next = newentry;
+    }
+}
+
 /*
  * make a copy of the array of lines, with token replaced by replacement
  * the first time it occurs on each line.
@@ -873,11 +904,9 @@ test_config_settings(void)
         400, 300, 200, 100, 50
     };

-    char        cmd[MAXPGPATH];
     const int    connslen = sizeof(trial_conns) / sizeof(int);
     const int    bufslen = sizeof(trial_bufs) / sizeof(int);
     int            i,
-                status,
                 test_conns,
                 test_buffs,
                 ok_buffers = 0;
@@ -903,19 +932,7 @@ test_config_settings(void)
         test_conns = trial_conns[i];
         test_buffs = MIN_BUFS_FOR_CONNS(test_conns);

-        snprintf(cmd, sizeof(cmd),
-                 "\"%s\" --check %s %s "
-                 "-c max_connections=%d "
-                 "-c shared_buffers=%d "
-                 "-c dynamic_shared_memory_type=%s "
-                 "< \"%s\" > \"%s\" 2>&1",
-                 backend_exec, boot_options, extra_options,
-                 test_conns, test_buffs,
-                 dynamic_shared_memory_type,
-                 DEVNULL, DEVNULL);
-        fflush(NULL);
-        status = system(cmd);
-        if (status == 0)
+        if (test_specific_config_settings(test_conns, test_buffs))
         {
             ok_buffers = test_buffs;
             break;
@@ -940,19 +957,7 @@ test_config_settings(void)
             break;
         }

-        snprintf(cmd, sizeof(cmd),
-                 "\"%s\" --check %s %s "
-                 "-c max_connections=%d "
-                 "-c shared_buffers=%d "
-                 "-c dynamic_shared_memory_type=%s "
-                 "< \"%s\" > \"%s\" 2>&1",
-                 backend_exec, boot_options, extra_options,
-                 n_connections, test_buffs,
-                 dynamic_shared_memory_type,
-                 DEVNULL, DEVNULL);
-        fflush(NULL);
-        status = system(cmd);
-        if (status == 0)
+        if (test_specific_config_settings(n_connections, test_buffs))
             break;
     }
     n_buffers = test_buffs;
@@ -968,6 +973,48 @@ test_config_settings(void)
     printf("%s\n", default_timezone ? default_timezone : "GMT");
 }

+/*
+ * Test a specific combination of configuration settings.
+ */
+static bool
+test_specific_config_settings(int test_conns, int test_buffs)
+{
+    PQExpBuffer cmd = createPQExpBuffer();
+    _stringlist *gnames,
+               *gvalues;
+    int            status;
+
+    /* Set up the test postmaster invocation */
+    printfPQExpBuffer(cmd,
+                      "\"%s\" --check %s %s "
+                      "-c max_connections=%d "
+                      "-c shared_buffers=%d "
+                      "-c dynamic_shared_memory_type=%s",
+                      backend_exec, boot_options, extra_options,
+                      test_conns, test_buffs,
+                      dynamic_shared_memory_type);
+
+    /* Add any user-given setting overrides */
+    for (gnames = extra_guc_names, gvalues = extra_guc_values;
+         gnames != NULL;        /* assume lists have the same length */
+         gnames = gnames->next, gvalues = gvalues->next)
+    {
+        appendPQExpBuffer(cmd, " -c %s=", gnames->str);
+        appendShellString(cmd, gvalues->str);
+    }
+
+    appendPQExpBuffer(cmd,
+                      " < \"%s\" > \"%s\" 2>&1",
+                      DEVNULL, DEVNULL);
+
+    fflush(NULL);
+    status = system(cmd->data);
+
+    destroyPQExpBuffer(cmd);
+
+    return (status == 0);
+}
+
 /*
  * Calculate the default wal_size with a "pretty" unit.
  */
@@ -994,7 +1041,10 @@ setup_config(void)
     char      **conflines;
     char        repltok[MAXPGPATH];
     char        path[MAXPGPATH];
-    char       *autoconflines[3];
+    char      **autoconflines;
+    int            numautoconflines;
+    _stringlist *gnames,
+               *gvalues;

     fputs(_("creating configuration files ... "), stdout);
     fflush(stdout);
@@ -1152,15 +1202,32 @@ setup_config(void)
     if (chmod(path, pg_file_create_mode) != 0)
         pg_fatal("could not change permissions of \"%s\": %m", path);

+    free(conflines);
+
+
     /*
-     * create the automatic configuration file to store the configuration
-     * parameters set by ALTER SYSTEM command. The parameters present in this
-     * file will override the value of parameters that exists before parse of
-     * this file.
+     * Create the automatic configuration file that stores the configuration
+     * parameters set by the ALTER SYSTEM command.  We also place any
+     * parameter values set with initdb's -c option into this file.
      */
+    numautoconflines = 2;        /* fixed header lines */
+    for (gnames = extra_guc_names; gnames != NULL; gnames = gnames->next)
+        numautoconflines++;
+    autoconflines = (char **) pg_malloc((numautoconflines + 1) * sizeof(char *));
+
     autoconflines[0] = pg_strdup("# Do not edit this file manually!\n");
     autoconflines[1] = pg_strdup("# It will be overwritten by the ALTER SYSTEM command.\n");
-    autoconflines[2] = NULL;
+    numautoconflines = 2;
+    for (gnames = extra_guc_names, gvalues = extra_guc_values;
+         gnames != NULL;        /* assume lists have the same length */
+         gnames = gnames->next, gvalues = gvalues->next)
+    {
+        autoconflines[numautoconflines++] =
+            psprintf("%s = '%s'\n",
+                     gnames->str,
+                     escape_quotes(gvalues->str));
+    }
+    autoconflines[numautoconflines] = NULL;

     sprintf(path, "%s/postgresql.auto.conf", pg_data);

@@ -1168,7 +1235,7 @@ setup_config(void)
     if (chmod(path, pg_file_create_mode) != 0)
         pg_fatal("could not change permissions of \"%s\": %m", path);

-    free(conflines);
+    free(autoconflines);


     /* pg_hba.conf */
@@ -2124,6 +2191,7 @@ usage(const char *progname)
     printf(_("  -X, --waldir=WALDIR       location for the write-ahead log directory\n"));
     printf(_("      --wal-segsize=SIZE    size of WAL segments, in megabytes\n"));
     printf(_("\nLess commonly used options:\n"));
+    printf(_("  -c, --set NAME=VALUE      override default setting for server parameter\n"));
     printf(_("  -d, --debug               generate lots of debugging output\n"));
     printf(_("      --discard-caches      set debug_discard_caches=1\n"));
     printf(_("  -L DIRECTORY              where to find the input files\n"));
@@ -2759,6 +2827,7 @@ main(int argc, char *argv[])
         {"nosync", no_argument, NULL, 'N'}, /* for backwards compatibility */
         {"no-sync", no_argument, NULL, 'N'},
         {"no-instructions", no_argument, NULL, 13},
+        {"set", required_argument, NULL, 'c'},
         {"sync-only", no_argument, NULL, 'S'},
         {"waldir", required_argument, NULL, 'X'},
         {"wal-segsize", required_argument, NULL, 12},
@@ -2808,7 +2877,8 @@ main(int argc, char *argv[])

     /* process command-line options */

-    while ((c = getopt_long(argc, argv, "A:dD:E:gkL:nNsST:U:WX:", long_options, &option_index)) != -1)
+    while ((c = getopt_long(argc, argv, "A:c:dD:E:gkL:nNsST:U:WX:",
+                            long_options, &option_index)) != -1)
     {
         switch (c)
         {
@@ -2831,6 +2901,24 @@ main(int argc, char *argv[])
             case 11:
                 authmethodhost = pg_strdup(optarg);
                 break;
+            case 'c':
+                {
+                    char       *buf = pg_strdup(optarg);
+                    char       *equals = strchr(buf, '=');
+
+                    if (!equals)
+                    {
+                        pg_log_error("-c %s requires a value", buf);
+                        pg_log_error_hint("Try \"%s --help\" for more information.",
+                                          progname);
+                        exit(1);
+                    }
+                    *equals++ = '\0';    /* terminate variable name */
+                    add_stringlist_item(&extra_guc_names, buf);
+                    add_stringlist_item(&extra_guc_values, equals);
+                    pfree(buf);
+                }
+                break;
             case 'D':
                 pg_data = pg_strdup(optarg);
                 break;
diff --git a/src/bin/initdb/t/001_initdb.pl b/src/bin/initdb/t/001_initdb.pl
index 772769acab..1e23ac8d0c 100644
--- a/src/bin/initdb/t/001_initdb.pl
+++ b/src/bin/initdb/t/001_initdb.pl
@@ -48,7 +48,13 @@ mkdir $datadir;
     local (%ENV) = %ENV;
     delete $ENV{TZ};

-    command_ok([ 'initdb', '-N', '-T', 'german', '-X', $xlogdir, $datadir ],
+    # while we are here, also exercise -T and -c options
+    command_ok(
+        [
+            'initdb', '-N', '-T', 'german', '-c',
+            'default_text_search_config=german',
+            '-X', $xlogdir, $datadir
+        ],
         'successful creation');

     # Permissions on PGDATA should be default
@@ -147,4 +153,7 @@ command_fails(
     ],
     'fails for invalid option combination');

+command_fails([ 'initdb', '--no-sync', '--set', 'foo=bar', "$tempdir/dataX" ],
+    'fails for invalid --set option');
+
 done_testing();

Re: Set arbitrary GUC options during initdb

From
Andres Freund
Date:
Hi,

On 2023-01-25 16:25:19 -0500, Tom Lane wrote:
> The attached patch responds to the discussion at [1] about how
> we ought to offer a way to set any server GUC from the initdb
> command line.

Are you thinking of backpatching this, to offer the people affected by the
issue in [1] a way out?


> So this invents an initdb switch "-c NAME=VALUE" just like the
> one that the server itself has long had.

I still am +1 on the idea. I've actually wanted this for development purposes
a couple times...


> The specified settings are applied on the command line of the initial probe
> calls (which happen before we've made any config files), and then they are
> added to postgresql.auto.conf, which causes them to take effect for the
> bootstrap backend runs as well as subsequent postmaster starts.

I think this means that if you set e.g. max_connections as an initdb
parameter, the probes won't do much. Probably fine?


Perhaps worth memorializing the priority of the -c options in a test?
E.g. setting shared_buffers = 20MB or so and then testing that that's the
value when starting the server?


> I also invented "--set NAME=VALUE", mainly because just about
> every other initdb switch has a long form.  The server itself
> doesn't have that spelling, so I'm not wedded to that part.

Fine with me, but also fine to leave out.

Greetings,

Andres Freund



Re: Set arbitrary GUC options during initdb

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> On 2023-01-25 16:25:19 -0500, Tom Lane wrote:
>> The attached patch responds to the discussion at [1] about how
>> we ought to offer a way to set any server GUC from the initdb
>> command line.

> Are you thinking of backpatching this, to offer the people affected by the
> issue in [1] a way out?

We could ... it's a new feature for sure, but it seems quite unlikely
to break things for anybody not using it.

>> The specified settings are applied on the command line of the initial probe
>> calls (which happen before we've made any config files), and then they are
>> added to postgresql.auto.conf, which causes them to take effect for the
>> bootstrap backend runs as well as subsequent postmaster starts.

> I think this means that if you set e.g. max_connections as an initdb
> parameter, the probes won't do much. Probably fine?

Right, the probed value will be overridden.

> Perhaps worth memorializing the priority of the -c options in a test?
> E.g. setting shared_buffers = 20MB or so and then testing that that's the
> value when starting the server?

Given that it's written into postgresql.auto.conf, I imagine that
we have test coverage of that point already.

There is a more subtle issue, which is that -c max_connections or
-c shared_buffers should override the probe values *during the
probe steps*.  My first thought about implementation had been to
create postgresql.auto.conf right off the bat, but that would
fail to have this property because server command line overrides
config file.  I can't think of any very portable way to check
that though.

            regards, tom lane



Re: Set arbitrary GUC options during initdb

From
Peter Eisentraut
Date:
On 25.01.23 22:25, Tom Lane wrote:
> So this invents an initdb switch "-c NAME=VALUE" just like the
> one that the server itself has long had.

This seems useful.

> The specified settings
> are applied on the command line of the initial probe calls
> (which happen before we've made any config files), and then they
> are added to postgresql.auto.conf, which causes them to take
> effect for the bootstrap backend runs as well as subsequent
> postmaster starts.

I would have expected them to be edited into postgresql.conf.  What are 
the arguments for one or the other?

Btw., something that I have had in my notes for a while, but with this 
it would now be officially exposed:  Not all options can be safely set 
during bootstrap.  For example,

     initdb -D data -c track_commit_timestamp=on

will fail an assertion.  This might be an exception, or there might be 
others.




Re: Set arbitrary GUC options during initdb

From
Isaac Morland
Date:
On Fri, 27 Jan 2023 at 09:49, Peter Eisentraut <peter.eisentraut@enterprisedb.com> wrote:
On 25.01.23 22:25, Tom Lane wrote:
> So this invents an initdb switch "-c NAME=VALUE" just like the
> one that the server itself has long had.

This seems useful.

> The specified settings
> are applied on the command line of the initial probe calls
> (which happen before we've made any config files), and then they
> are added to postgresql.auto.conf, which causes them to take
> effect for the bootstrap backend runs as well as subsequent
> postmaster starts.

I would have expected them to be edited into postgresql.conf.  What are
the arguments for one or the other?

That would be my expectation also. I believe that is how it works now for options which can be set by initdb, such as locale and port. I view postgresql.auto.conf being for temporary changes, or changes related to different instances within a replication setup, or whatever other uses people come up with - but not for the permanent configuration established by initdb.

In particular, I would be surprised if removing a postgresql.auto.conf completely disabled an instance. Obviously, in my replication setup example, the replication would be broken, but the basic operation of the instance would still be possible.

Also, if somebody wants to put a change in postgresql.auto.conf, they can easily do it using ALTER SYSTEM once the instance is running, or by just writing out their own postgresql.auto.conf before starting it. Putting a change in postgresql.conf programmatically is a bit of a pain.

Re: Set arbitrary GUC options during initdb

From
Robert Haas
Date:
On Wed, Jan 25, 2023 at 4:26 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> So this invents an initdb switch "-c NAME=VALUE" just like the
> one that the server itself has long had.

HUGE +1 from me. This will, I think, be extremely convenient in many situations.

> The specified settings
> are applied on the command line of the initial probe calls
> (which happen before we've made any config files), and then they
> are added to postgresql.auto.conf, which causes them to take
> effect for the bootstrap backend runs as well as subsequent
> postmaster starts.

I agree with others that it would seem more natural to edit them in
postgresql.conf itself, but I also think it doesn't matter nearly as
much as getting the feature in some form.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: Set arbitrary GUC options during initdb

From
Tom Lane
Date:
Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:
> On 25.01.23 22:25, Tom Lane wrote:
>> The specified settings
>> are applied on the command line of the initial probe calls
>> (which happen before we've made any config files), and then they
>> are added to postgresql.auto.conf, which causes them to take
>> effect for the bootstrap backend runs as well as subsequent
>> postmaster starts.

> I would have expected them to be edited into postgresql.conf.  What are 
> the arguments for one or the other?

TBH, the driving reason was that the string-munging code we have in
initdb isn't up to snuff for that: it wants to substitute for an
exactly-known string, which we won't have in the case of an
arbitrary GUC.

One idea if we want to make it work like that could be to stop
trying to edit out the default value, and instead make the file
contents look like, say,

#huge_pages = try                   # on, off, or try
huge_pages = off                    # set by initdb

Then we just need to be able to find the GUC's entry.

> Btw., something that I have had in my notes for a while, but with this 
> it would now be officially exposed:  Not all options can be safely set 
> during bootstrap.  For example,
>      initdb -D data -c track_commit_timestamp=on
> will fail an assertion.  This might be an exception, or there might be 
> others.

Interesting.  We'd probably want to sprinkle some more
do-nothing-in-bootstrap-mode tests as we discover that sort of thing.

            regards, tom lane



Re: Set arbitrary GUC options during initdb

From
Robert Haas
Date:
On Fri, Jan 27, 2023 at 10:34 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> One idea if we want to make it work like that could be to stop
> trying to edit out the default value, and instead make the file
> contents look like, say,
>
> #huge_pages = try                   # on, off, or try
> huge_pages = off                    # set by initdb

How about just making replace_token() a little smarter, and maybe renaming it?

The idea is that instead of:

replace_token(conflines, "#max_connections = 100", repltok);

You'd write something like:

replace_guc_value(conflines, "max_connections", repltok);

Which would look for a line matching /^#max_connections\s+=\s/, and
then identify everything following that point up to the first #. It
would replace all that stuff with repltok, but if the replacement is
shorter than the original, it would pad with spaces to get back to the
original length. And otherwise it would add a single space, so that if
you set a super long GUC value there's still at least one space
between the end of the value and the comment that follows.

There might be some quoting-related problems with this idea, not sure.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: Set arbitrary GUC options during initdb

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> The idea is that instead of:

> replace_token(conflines, "#max_connections = 100", repltok);

> You'd write something like:

> replace_guc_value(conflines, "max_connections", repltok);

> Which would look for a line matching /^#max_connections\s+=\s/, and
> then identify everything following that point up to the first #. It
> would replace all that stuff with repltok, but if the replacement is
> shorter than the original, it would pad with spaces to get back to the
> original length. And otherwise it would add a single space, so that if
> you set a super long GUC value there's still at least one space
> between the end of the value and the comment that follows.

Well, yeah, I was trying to avoid writing that ;-).  There's even
one more wrinkle: we might already have removed the initial '#',
if one does say "-c max_connections=N", because this logic won't
know whether the -c switch matches one of initdb's predetermined
substitutions.

> There might be some quoting-related problems with this idea, not sure.

'#' in a value might confuse it, but we could probably take the last '#'
not the first.

Anyway, it seems like I gotta work harder.  I'll produce a
new patch.

            regards, tom lane



Re: Set arbitrary GUC options during initdb

From
"David G. Johnston"
Date:
On Fri, Jan 27, 2023 at 8:53 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Robert Haas <robertmhaas@gmail.com> writes:
> The idea is that instead of:

> replace_token(conflines, "#max_connections = 100", repltok);

> You'd write something like:

> replace_guc_value(conflines, "max_connections", repltok);

> Which would look for a line matching /^#max_connections\s+=\s/, and
> then identify everything following that point up to the first #. It
> would replace all that stuff with repltok, but if the replacement is
> shorter than the original, it would pad with spaces to get back to the
> original length. And otherwise it would add a single space, so that if
> you set a super long GUC value there's still at least one space
> between the end of the value and the comment that follows.

Well, yeah, I was trying to avoid writing that ;-).  There's even
one more wrinkle: we might already have removed the initial '#',
if one does say "-c max_connections=N", because this logic won't
know whether the -c switch matches one of initdb's predetermined
substitutions.

> There might be some quoting-related problems with this idea, not sure.

'#' in a value might confuse it, but we could probably take the last '#'
not the first.

Anyway, it seems like I gotta work harder.  I'll produce a
new patch.


How about just adding a "section" to the end of the file as needed:

# AdHoc Settings Specified During InitDB
max_connections=75
...

David J.

Re: Set arbitrary GUC options during initdb

From
Tom Lane
Date:
"David G. Johnston" <david.g.johnston@gmail.com> writes:
> On Fri, Jan 27, 2023 at 8:53 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Anyway, it seems like I gotta work harder.  I'll produce a
>> new patch.

> How about just adding a "section" to the end of the file as needed:

> # AdHoc Settings Specified During InitDB
> max_connections=75
> ...

Nah, I think that would be impossibly confusing.  One way or another
the live setting has to be near where the GUC is documented.

We will have to do add-at-the-end for custom GUCs, of course,
but in that case there's no matching comment to confuse you.

            regards, tom lane



Re: Set arbitrary GUC options during initdb

From
Tom Lane
Date:
I wrote:
>>> Anyway, it seems like I gotta work harder.  I'll produce a
>>> new patch.

The string-hacking was fully as tedious as I expected.  However, the
output looks pretty nice, and this does have the advantage that the
pre-programmed substitutions become a lot more robust: they are no
longer dependent on the initdb code exactly matching what is in
postgresql.conf.sample.

            regards, tom lane

diff --git a/doc/src/sgml/ref/initdb.sgml b/doc/src/sgml/ref/initdb.sgml
index 5b2bdac101..724188b1b5 100644
--- a/doc/src/sgml/ref/initdb.sgml
+++ b/doc/src/sgml/ref/initdb.sgml
@@ -433,6 +433,23 @@ PostgreSQL documentation
     Other, less commonly used, options are also available:

     <variablelist>
+     <varlistentry id="app-initdb-option-set">
+      <term><option>-c <replaceable>name</replaceable>=<replaceable>value</replaceable></option></term>
+      <term><option>--set <replaceable>name</replaceable>=<replaceable>value</replaceable></option></term>
+      <listitem>
+       <para>
+        Forcibly set the server parameter <replaceable>name</replaceable>
+        to <replaceable>value</replaceable> during <command>initdb</command>,
+        and also install that setting in the
+        generated <filename>postgresql.conf</filename> file,
+        so that it will apply during future server runs.
+        This option can be given more than once to set several parameters.
+        It is primarily useful when the environment is such that the server
+        will not start at all using the default parameters.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry id="app-initdb-option-debug">
       <term><option>-d</option></term>
       <term><option>--debug</option></term>
diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c
index 7a58c33ace..c955c0837e 100644
--- a/src/bin/initdb/initdb.c
+++ b/src/bin/initdb/initdb.c
@@ -48,6 +48,7 @@

 #include "postgres_fe.h"

+#include <ctype.h>
 #include <dirent.h>
 #include <fcntl.h>
 #include <netdb.h>
@@ -82,6 +83,13 @@
 /* Ideally this would be in a .h file, but it hardly seems worth the trouble */
 extern const char *select_default_timezone(const char *share_path);

+/* simple list of strings */
+typedef struct _stringlist
+{
+    char       *str;
+    struct _stringlist *next;
+} _stringlist;
+
 static const char *const auth_methods_host[] = {
     "trust", "reject", "scram-sha-256", "md5", "password", "ident", "radius",
 #ifdef ENABLE_GSS
@@ -142,6 +150,8 @@ static char *pwfilename = NULL;
 static char *superuser_password = NULL;
 static const char *authmethodhost = NULL;
 static const char *authmethodlocal = NULL;
+static _stringlist *extra_guc_names = NULL;
+static _stringlist *extra_guc_values = NULL;
 static bool debug = false;
 static bool noclean = false;
 static bool noinstructions = false;
@@ -242,7 +252,10 @@ static char backend_exec[MAXPGPATH];

 static char **replace_token(char **lines,
                             const char *token, const char *replacement);
-
+static char **replace_guc_value(char **lines,
+                                const char *guc_name, const char *guc_value,
+                                bool mark_as_comment);
+static bool guc_value_requires_quotes(const char *guc_value);
 static char **readfile(const char *path);
 static void writefile(char *path, char **lines);
 static FILE *popen_check(const char *command, const char *mode);
@@ -253,6 +266,7 @@ static void check_input(char *path);
 static void write_version_file(const char *extrapath);
 static void set_null_conf(void);
 static void test_config_settings(void);
+static bool test_specific_config_settings(int test_conns, int test_buffs);
 static void setup_config(void);
 static void bootstrap_template1(void);
 static void setup_auth(FILE *cmdfd);
@@ -360,9 +374,34 @@ escape_quotes_bki(const char *src)
 }

 /*
- * make a copy of the array of lines, with token replaced by replacement
+ * Add an item at the end of a stringlist.
+ */
+static void
+add_stringlist_item(_stringlist **listhead, const char *str)
+{
+    _stringlist *newentry = pg_malloc(sizeof(_stringlist));
+    _stringlist *oldentry;
+
+    newentry->str = pg_strdup(str);
+    newentry->next = NULL;
+    if (*listhead == NULL)
+        *listhead = newentry;
+    else
+    {
+        for (oldentry = *listhead; oldentry->next; oldentry = oldentry->next)
+             /* skip */ ;
+        oldentry->next = newentry;
+    }
+}
+
+/*
+ * Make a copy of the array of lines, with token replaced by replacement
  * the first time it occurs on each line.
  *
+ * The original data structure is not changed, but we share any unchanged
+ * strings with it.  (This definition lends itself to memory leaks, but
+ * we don't care too much about leaks in this program.)
+ *
  * This does most of what sed was used for in the shell script, but
  * doesn't need any regexp stuff.
  */
@@ -416,6 +455,168 @@ replace_token(char **lines, const char *token, const char *replacement)
     return result;
 }

+/*
+ * Make a copy of the array of lines, replacing the possibly-commented-out
+ * assignment of parameter guc_name with a live assignment of guc_value.
+ * The value will be suitably quoted.
+ *
+ * If mark_as_comment is true, the replacement line is prefixed with '#'.
+ * This is used for fixing up cases where the effective default might not
+ * match what is in postgresql.conf.sample.
+ *
+ * We assume there's at most one matching assignment.  If we find no match,
+ * append a new line with the desired assignment.
+ *
+ * The original data structure is not changed, but we share any unchanged
+ * strings with it.  (This definition lends itself to memory leaks, but
+ * we don't care too much about leaks in this program.)
+ */
+static char **
+replace_guc_value(char **lines, const char *guc_name, const char *guc_value,
+                  bool mark_as_comment)
+{
+    char      **result;
+    int            namelen = strlen(guc_name);
+    PQExpBuffer newline = createPQExpBuffer();
+    int            numlines = 0;
+    int            i;
+
+    /* prepare the replacement line, except for possible comment and newline */
+    if (mark_as_comment)
+        appendPQExpBufferChar(newline, '#');
+    appendPQExpBuffer(newline, "%s = ", guc_name);
+    if (guc_value_requires_quotes(guc_value))
+        appendPQExpBuffer(newline, "'%s'", escape_quotes(guc_value));
+    else
+        appendPQExpBufferStr(newline, guc_value);
+
+    /* create the new pointer array */
+    for (i = 0; lines[i]; i++)
+        numlines++;
+
+    /* leave room for one extra string in case we need to append */
+    result = (char **) pg_malloc((numlines + 2) * sizeof(char *));
+
+    /* initialize result with all the same strings */
+    memcpy(result, lines, (numlines + 1) * sizeof(char *));
+
+    for (i = 0; i < numlines; i++)
+    {
+        const char *where;
+
+        /*
+         * Look for a line assigning to guc_name.  Typically it will be
+         * preceded by '#', but that might not be the case if a -c switch
+         * overrides a previous assignment.  We allow leading whitespace too,
+         * although normally there wouldn't be any.
+         */
+        where = result[i];
+        while (*where == '#' || isspace((unsigned char) *where))
+            where++;
+        if (strncmp(where, guc_name, namelen) != 0)
+            continue;
+        where += namelen;
+        while (isspace((unsigned char) *where))
+            where++;
+        if (*where != '=')
+            continue;
+
+        /* found it -- append the original comment if any */
+        where = strrchr(where, '#');
+        if (where)
+        {
+            /*
+             * We try to preserve original indentation, which is tedious.
+             * oldindent and newindent are measured in de-tab-ified columns.
+             */
+            const char *ptr;
+            int            oldindent = 0;
+            int            newindent;
+
+            for (ptr = result[i]; ptr < where; ptr++)
+            {
+                if (*ptr == '\t')
+                    oldindent += 8 - (oldindent % 8);
+                else
+                    oldindent++;
+            }
+            /* ignore the possibility of tabs in guc_value */
+            newindent = newline->len;
+            /* append appropriate tabs and spaces, forcing at least one */
+            oldindent = Max(oldindent, newindent + 1);
+            while (newindent < oldindent)
+            {
+                int            newindent_if_tab = newindent + 8 - (newindent % 8);
+
+                if (newindent_if_tab <= oldindent)
+                {
+                    appendPQExpBufferChar(newline, '\t');
+                    newindent = newindent_if_tab;
+                }
+                else
+                {
+                    appendPQExpBufferChar(newline, ' ');
+                    newindent++;
+                }
+            }
+            /* and finally append the old comment */
+            appendPQExpBufferStr(newline, where);
+            /* we'll have appended the original newline; don't add another */
+        }
+        else
+            appendPQExpBufferChar(newline, '\n');
+
+        result[i] = newline->data;
+
+        break;                    /* assume there's only one match */
+    }
+
+    if (i >= numlines)
+    {
+        /*
+         * No match, so append a new entry.  (We rely on the bootstrap server
+         * to complain if it's not a valid GUC name.)
+         */
+        appendPQExpBufferChar(newline, '\n');
+        result[numlines++] = newline->data;
+        result[numlines] = NULL;    /* keep the array null-terminated */
+    }
+
+    return result;
+}
+
+/*
+ * Decide if we should quote a replacement GUC value.  We aren't too tense
+ * here, but we'd like to avoid quoting simple identifiers and numbers
+ * with units, which are common cases.
+ */
+static bool
+guc_value_requires_quotes(const char *guc_value)
+{
+    /* Don't use <ctype.h> macros here, they might accept too much */
+#define LETTERS    "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
+#define DIGITS    "0123456789"
+
+    if (*guc_value == '\0')
+        return true;            /* empty string must be quoted */
+    if (strchr(LETTERS, *guc_value))
+    {
+        if (strspn(guc_value, LETTERS DIGITS) == strlen(guc_value))
+            return false;        /* it's an identifier */
+        return true;            /* nope */
+    }
+    if (strchr(DIGITS, *guc_value))
+    {
+        /* skip over digits */
+        guc_value += strspn(guc_value, DIGITS);
+        /* there can be zero or more unit letters after the digits */
+        if (strspn(guc_value, LETTERS) == strlen(guc_value))
+            return false;        /* it's a number, possibly with units */
+        return true;            /* nope */
+    }
+    return true;                /* all else must be quoted */
+}
+
 /*
  * get the lines from a text file
  */
@@ -873,11 +1074,9 @@ test_config_settings(void)
         400, 300, 200, 100, 50
     };

-    char        cmd[MAXPGPATH];
     const int    connslen = sizeof(trial_conns) / sizeof(int);
     const int    bufslen = sizeof(trial_bufs) / sizeof(int);
     int            i,
-                status,
                 test_conns,
                 test_buffs,
                 ok_buffers = 0;
@@ -903,19 +1102,7 @@ test_config_settings(void)
         test_conns = trial_conns[i];
         test_buffs = MIN_BUFS_FOR_CONNS(test_conns);

-        snprintf(cmd, sizeof(cmd),
-                 "\"%s\" --check %s %s "
-                 "-c max_connections=%d "
-                 "-c shared_buffers=%d "
-                 "-c dynamic_shared_memory_type=%s "
-                 "< \"%s\" > \"%s\" 2>&1",
-                 backend_exec, boot_options, extra_options,
-                 test_conns, test_buffs,
-                 dynamic_shared_memory_type,
-                 DEVNULL, DEVNULL);
-        fflush(NULL);
-        status = system(cmd);
-        if (status == 0)
+        if (test_specific_config_settings(test_conns, test_buffs))
         {
             ok_buffers = test_buffs;
             break;
@@ -940,19 +1127,7 @@ test_config_settings(void)
             break;
         }

-        snprintf(cmd, sizeof(cmd),
-                 "\"%s\" --check %s %s "
-                 "-c max_connections=%d "
-                 "-c shared_buffers=%d "
-                 "-c dynamic_shared_memory_type=%s "
-                 "< \"%s\" > \"%s\" 2>&1",
-                 backend_exec, boot_options, extra_options,
-                 n_connections, test_buffs,
-                 dynamic_shared_memory_type,
-                 DEVNULL, DEVNULL);
-        fflush(NULL);
-        status = system(cmd);
-        if (status == 0)
+        if (test_specific_config_settings(n_connections, test_buffs))
             break;
     }
     n_buffers = test_buffs;
@@ -968,6 +1143,48 @@ test_config_settings(void)
     printf("%s\n", default_timezone ? default_timezone : "GMT");
 }

+/*
+ * Test a specific combination of configuration settings.
+ */
+static bool
+test_specific_config_settings(int test_conns, int test_buffs)
+{
+    PQExpBuffer cmd = createPQExpBuffer();
+    _stringlist *gnames,
+               *gvalues;
+    int            status;
+
+    /* Set up the test postmaster invocation */
+    printfPQExpBuffer(cmd,
+                      "\"%s\" --check %s %s "
+                      "-c max_connections=%d "
+                      "-c shared_buffers=%d "
+                      "-c dynamic_shared_memory_type=%s",
+                      backend_exec, boot_options, extra_options,
+                      test_conns, test_buffs,
+                      dynamic_shared_memory_type);
+
+    /* Add any user-given setting overrides */
+    for (gnames = extra_guc_names, gvalues = extra_guc_values;
+         gnames != NULL;        /* assume lists have the same length */
+         gnames = gnames->next, gvalues = gvalues->next)
+    {
+        appendPQExpBuffer(cmd, " -c %s=", gnames->str);
+        appendShellString(cmd, gvalues->str);
+    }
+
+    appendPQExpBuffer(cmd,
+                      " < \"%s\" > \"%s\" 2>&1",
+                      DEVNULL, DEVNULL);
+
+    fflush(NULL);
+    status = system(cmd->data);
+
+    destroyPQExpBuffer(cmd);
+
+    return (status == 0);
+}
+
 /*
  * Calculate the default wal_size with a "pretty" unit.
  */
@@ -995,6 +1212,8 @@ setup_config(void)
     char        repltok[MAXPGPATH];
     char        path[MAXPGPATH];
     char       *autoconflines[3];
+    _stringlist *gnames,
+               *gvalues;

     fputs(_("creating configuration files ... "), stdout);
     fflush(stdout);
@@ -1003,120 +1222,106 @@ setup_config(void)

     conflines = readfile(conf_file);

-    snprintf(repltok, sizeof(repltok), "max_connections = %d", n_connections);
-    conflines = replace_token(conflines, "#max_connections = 100", repltok);
+    snprintf(repltok, sizeof(repltok), "%d", n_connections);
+    conflines = replace_guc_value(conflines, "max_connections",
+                                  repltok, false);

     if ((n_buffers * (BLCKSZ / 1024)) % 1024 == 0)
-        snprintf(repltok, sizeof(repltok), "shared_buffers = %dMB",
+        snprintf(repltok, sizeof(repltok), "%dMB",
                  (n_buffers * (BLCKSZ / 1024)) / 1024);
     else
-        snprintf(repltok, sizeof(repltok), "shared_buffers = %dkB",
+        snprintf(repltok, sizeof(repltok), "%dkB",
                  n_buffers * (BLCKSZ / 1024));
-    conflines = replace_token(conflines, "#shared_buffers = 128MB", repltok);
+    conflines = replace_guc_value(conflines, "shared_buffers",
+                                  repltok, false);

-    snprintf(repltok, sizeof(repltok), "#unix_socket_directories = '%s'",
-             DEFAULT_PGSOCKET_DIR);
-    conflines = replace_token(conflines, "#unix_socket_directories = '/tmp'",
-                              repltok);
+    conflines = replace_guc_value(conflines, "lc_messages", lc_messages, false);

-#if DEF_PGPORT != 5432
-    snprintf(repltok, sizeof(repltok), "#port = %d", DEF_PGPORT);
-    conflines = replace_token(conflines, "#port = 5432", repltok);
-#endif
+    conflines = replace_guc_value(conflines, "lc_monetary", lc_monetary, false);

-    /* set default max_wal_size and min_wal_size */
-    snprintf(repltok, sizeof(repltok), "min_wal_size = %s",
-             pretty_wal_size(DEFAULT_MIN_WAL_SEGS));
-    conflines = replace_token(conflines, "#min_wal_size = 80MB", repltok);
+    conflines = replace_guc_value(conflines, "lc_numeric", lc_numeric, false);

-    snprintf(repltok, sizeof(repltok), "max_wal_size = %s",
-             pretty_wal_size(DEFAULT_MAX_WAL_SEGS));
-    conflines = replace_token(conflines, "#max_wal_size = 1GB", repltok);
-
-    snprintf(repltok, sizeof(repltok), "lc_messages = '%s'",
-             escape_quotes(lc_messages));
-    conflines = replace_token(conflines, "#lc_messages = 'C'", repltok);
-
-    snprintf(repltok, sizeof(repltok), "lc_monetary = '%s'",
-             escape_quotes(lc_monetary));
-    conflines = replace_token(conflines, "#lc_monetary = 'C'", repltok);
-
-    snprintf(repltok, sizeof(repltok), "lc_numeric = '%s'",
-             escape_quotes(lc_numeric));
-    conflines = replace_token(conflines, "#lc_numeric = 'C'", repltok);
-
-    snprintf(repltok, sizeof(repltok), "lc_time = '%s'",
-             escape_quotes(lc_time));
-    conflines = replace_token(conflines, "#lc_time = 'C'", repltok);
+    conflines = replace_guc_value(conflines, "lc_time", lc_time, false);

     switch (locale_date_order(lc_time))
     {
         case DATEORDER_YMD:
-            strcpy(repltok, "datestyle = 'iso, ymd'");
+            strcpy(repltok, "iso, ymd");
             break;
         case DATEORDER_DMY:
-            strcpy(repltok, "datestyle = 'iso, dmy'");
+            strcpy(repltok, "iso, dmy");
             break;
         case DATEORDER_MDY:
         default:
-            strcpy(repltok, "datestyle = 'iso, mdy'");
+            strcpy(repltok, "iso, mdy");
             break;
     }
-    conflines = replace_token(conflines, "#datestyle = 'iso, mdy'", repltok);
+    conflines = replace_guc_value(conflines, "datestyle", repltok, false);

     snprintf(repltok, sizeof(repltok),
-             "default_text_search_config = 'pg_catalog.%s'",
-             escape_quotes(default_text_search_config));
-    conflines = replace_token(conflines,
-                              "#default_text_search_config = 'pg_catalog.simple'",
-                              repltok);
+             "pg_catalog.%s", default_text_search_config);
+    conflines = replace_guc_value(conflines, "default_text_search_config",
+                                  repltok, false);

     if (default_timezone)
     {
-        snprintf(repltok, sizeof(repltok), "timezone = '%s'",
-                 escape_quotes(default_timezone));
-        conflines = replace_token(conflines, "#timezone = 'GMT'", repltok);
-        snprintf(repltok, sizeof(repltok), "log_timezone = '%s'",
-                 escape_quotes(default_timezone));
-        conflines = replace_token(conflines, "#log_timezone = 'GMT'", repltok);
+        conflines = replace_guc_value(conflines, "timezone",
+                                      default_timezone, false);
+        conflines = replace_guc_value(conflines, "log_timezone",
+                                      default_timezone, false);
     }

-    snprintf(repltok, sizeof(repltok), "dynamic_shared_memory_type = %s",
-             dynamic_shared_memory_type);
-    conflines = replace_token(conflines, "#dynamic_shared_memory_type = posix",
-                              repltok);
+    conflines = replace_guc_value(conflines, "dynamic_shared_memory_type",
+                                  dynamic_shared_memory_type, false);
+
+    /*
+     * Fix up various entries to match the true compile-time defaults.  Since
+     * these are indeed defaults, keep the postgresql.conf lines commented.
+     */
+    conflines = replace_guc_value(conflines, "unix_socket_directories",
+                                  DEFAULT_PGSOCKET_DIR, true);
+
+#if DEF_PGPORT != 5432
+    snprintf(repltok, sizeof(repltok), "%d", DEF_PGPORT);
+    conflines = replace_guc_value(conflines, "port",
+                                  repltok, true);
+#endif
+
+    conflines = replace_guc_value(conflines, "min_wal_size",
+                                  pretty_wal_size(DEFAULT_MIN_WAL_SEGS), true);
+
+    conflines = replace_guc_value(conflines, "max_wal_size",
+                                  pretty_wal_size(DEFAULT_MAX_WAL_SEGS), true);

 #if DEFAULT_BACKEND_FLUSH_AFTER > 0
-    snprintf(repltok, sizeof(repltok), "#backend_flush_after = %dkB",
+    snprintf(repltok, sizeof(repltok), "%dkB",
              DEFAULT_BACKEND_FLUSH_AFTER * (BLCKSZ / 1024));
-    conflines = replace_token(conflines, "#backend_flush_after = 0",
-                              repltok);
+    conflines = replace_guc_value(conflines, "backend_flush_after",
+                                  repltok, true);
 #endif

 #if DEFAULT_BGWRITER_FLUSH_AFTER > 0
-    snprintf(repltok, sizeof(repltok), "#bgwriter_flush_after = %dkB",
+    snprintf(repltok, sizeof(repltok), "%dkB",
              DEFAULT_BGWRITER_FLUSH_AFTER * (BLCKSZ / 1024));
-    conflines = replace_token(conflines, "#bgwriter_flush_after = 0",
-                              repltok);
+    conflines = replace_guc_value(conflines, "bgwriter_flush_after",
+                                  repltok, true);
 #endif

 #if DEFAULT_CHECKPOINT_FLUSH_AFTER > 0
-    snprintf(repltok, sizeof(repltok), "#checkpoint_flush_after = %dkB",
+    snprintf(repltok, sizeof(repltok), "%dkB",
              DEFAULT_CHECKPOINT_FLUSH_AFTER * (BLCKSZ / 1024));
-    conflines = replace_token(conflines, "#checkpoint_flush_after = 0",
-                              repltok);
+    conflines = replace_guc_value(conflines, "checkpoint_flush_after",
+                                  repltok, true);
 #endif

 #ifndef USE_PREFETCH
-    conflines = replace_token(conflines,
-                              "#effective_io_concurrency = 1",
-                              "#effective_io_concurrency = 0");
+    conflines = replace_guc_value(conflines, "effective_io_concurrency",
+                                  "0", true);
 #endif

 #ifdef WIN32
-    conflines = replace_token(conflines,
-                              "#update_process_title = on",
-                              "#update_process_title = off");
+    conflines = replace_guc_value(conflines, "update_process_title",
+                                  "off", true);
 #endif

     /*
@@ -1128,9 +1333,8 @@ setup_config(void)
         (strcmp(authmethodhost, "md5") == 0 &&
          strcmp(authmethodlocal, "scram-sha-256") != 0))
     {
-        conflines = replace_token(conflines,
-                                  "#password_encryption = scram-sha-256",
-                                  "password_encryption = md5");
+        conflines = replace_guc_value(conflines, "password_encryption",
+                                      "md5", false);
     }

     /*
@@ -1141,11 +1345,22 @@ setup_config(void)
      */
     if (pg_dir_create_mode == PG_DIR_MODE_GROUP)
     {
-        conflines = replace_token(conflines,
-                                  "#log_file_mode = 0600",
-                                  "log_file_mode = 0640");
+        conflines = replace_guc_value(conflines, "log_file_mode",
+                                      "0640", false);
     }

+    /*
+     * Now replace anything that's overridden via -c switches.
+     */
+    for (gnames = extra_guc_names, gvalues = extra_guc_values;
+         gnames != NULL;        /* assume lists have the same length */
+         gnames = gnames->next, gvalues = gvalues->next)
+    {
+        conflines = replace_guc_value(conflines, gnames->str,
+                                      gvalues->str, false);
+    }
+
+    /* ... and write out the finished postgresql.conf file */
     snprintf(path, sizeof(path), "%s/postgresql.conf", pg_data);

     writefile(path, conflines);
@@ -2124,6 +2339,7 @@ usage(const char *progname)
     printf(_("  -X, --waldir=WALDIR       location for the write-ahead log directory\n"));
     printf(_("      --wal-segsize=SIZE    size of WAL segments, in megabytes\n"));
     printf(_("\nLess commonly used options:\n"));
+    printf(_("  -c, --set NAME=VALUE      override default setting for server parameter\n"));
     printf(_("  -d, --debug               generate lots of debugging output\n"));
     printf(_("      --discard-caches      set debug_discard_caches=1\n"));
     printf(_("  -L DIRECTORY              where to find the input files\n"));
@@ -2759,6 +2975,7 @@ main(int argc, char *argv[])
         {"nosync", no_argument, NULL, 'N'}, /* for backwards compatibility */
         {"no-sync", no_argument, NULL, 'N'},
         {"no-instructions", no_argument, NULL, 13},
+        {"set", required_argument, NULL, 'c'},
         {"sync-only", no_argument, NULL, 'S'},
         {"waldir", required_argument, NULL, 'X'},
         {"wal-segsize", required_argument, NULL, 12},
@@ -2808,7 +3025,8 @@ main(int argc, char *argv[])

     /* process command-line options */

-    while ((c = getopt_long(argc, argv, "A:dD:E:gkL:nNsST:U:WX:", long_options, &option_index)) != -1)
+    while ((c = getopt_long(argc, argv, "A:c:dD:E:gkL:nNsST:U:WX:",
+                            long_options, &option_index)) != -1)
     {
         switch (c)
         {
@@ -2831,6 +3049,24 @@ main(int argc, char *argv[])
             case 11:
                 authmethodhost = pg_strdup(optarg);
                 break;
+            case 'c':
+                {
+                    char       *buf = pg_strdup(optarg);
+                    char       *equals = strchr(buf, '=');
+
+                    if (!equals)
+                    {
+                        pg_log_error("-c %s requires a value", buf);
+                        pg_log_error_hint("Try \"%s --help\" for more information.",
+                                          progname);
+                        exit(1);
+                    }
+                    *equals++ = '\0';    /* terminate variable name */
+                    add_stringlist_item(&extra_guc_names, buf);
+                    add_stringlist_item(&extra_guc_values, equals);
+                    pfree(buf);
+                }
+                break;
             case 'D':
                 pg_data = pg_strdup(optarg);
                 break;
diff --git a/src/bin/initdb/t/001_initdb.pl b/src/bin/initdb/t/001_initdb.pl
index 772769acab..1e23ac8d0c 100644
--- a/src/bin/initdb/t/001_initdb.pl
+++ b/src/bin/initdb/t/001_initdb.pl
@@ -48,7 +48,13 @@ mkdir $datadir;
     local (%ENV) = %ENV;
     delete $ENV{TZ};

-    command_ok([ 'initdb', '-N', '-T', 'german', '-X', $xlogdir, $datadir ],
+    # while we are here, also exercise -T and -c options
+    command_ok(
+        [
+            'initdb', '-N', '-T', 'german', '-c',
+            'default_text_search_config=german',
+            '-X', $xlogdir, $datadir
+        ],
         'successful creation');

     # Permissions on PGDATA should be default
@@ -147,4 +153,7 @@ command_fails(
     ],
     'fails for invalid option combination');

+command_fails([ 'initdb', '--no-sync', '--set', 'foo=bar', "$tempdir/dataX" ],
+    'fails for invalid --set option');
+
 done_testing();

Re: Set arbitrary GUC options during initdb

From
Robert Haas
Date:
On Fri, Jan 27, 2023 at 3:02 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> The string-hacking was fully as tedious as I expected.  However, the
> output looks pretty nice, and this does have the advantage that the
> pre-programmed substitutions become a lot more robust: they are no
> longer dependent on the initdb code exactly matching what is in
> postgresql.conf.sample.

Awesome! Thank you very much.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: Set arbitrary GUC options during initdb

From
Peter Eisentraut
Date:
On 27.01.23 21:02, Tom Lane wrote:
> I wrote:
>>>> Anyway, it seems like I gotta work harder.  I'll produce a
>>>> new patch.
> 
> The string-hacking was fully as tedious as I expected.  However, the
> output looks pretty nice, and this does have the advantage that the
> pre-programmed substitutions become a lot more robust: they are no
> longer dependent on the initdb code exactly matching what is in
> postgresql.conf.sample.

This patch looks good to me.  It's a very nice simplification of the 
initdb.c code, even without the new feature.

I found that the addition of

#include <ctype.h>

didn't appear to be necessary.  Maybe it was required before 
guc_value_requires_quotes() was changed?

I would remove the

#if DEF_PGPORT != 5432

This was in the previous code too, but now if we remove it, then we 
don't have any more hardcoded 5432 left, which seems like a nice 
improvement in cleanliness.




Re: Set arbitrary GUC options during initdb

From
Tom Lane
Date:
Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:
> This patch looks good to me.  It's a very nice simplification of the 
> initdb.c code, even without the new feature.

Thanks for looking!

> I found that the addition of
> #include <ctype.h>
> didn't appear to be necessary.  Maybe it was required before 
> guc_value_requires_quotes() was changed?

There's still an isspace() added by the patch ... oh, the #include
is not needed because port.h includes ctype.h.  That's spectacularly
awful from an include-footprint standpoint, but I can't say that
I want to go fix it right this minute.

> I would remove the
> #if DEF_PGPORT != 5432
> This was in the previous code too, but now if we remove it, then we 
> don't have any more hardcoded 5432 left, which seems like a nice 
> improvement in cleanliness.

Hm.  That'll waste a few cycles during initdb; not sure if the extra
cleanliness is worth it.  It's not like that number is going to change.

            regards, tom lane



Re: Set arbitrary GUC options during initdb

From
Tom Lane
Date:
I wrote:
> Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:
>> I would remove the
>> #if DEF_PGPORT != 5432
>> This was in the previous code too, but now if we remove it, then we 
>> don't have any more hardcoded 5432 left, which seems like a nice 
>> improvement in cleanliness.

> Hm.  That'll waste a few cycles during initdb; not sure if the extra
> cleanliness is worth it.  It's not like that number is going to change.

After further thought I did it as you suggest.  I think the only case
where we really care about shaving milliseconds from initdb is in debug
builds (e.g. buildfarm), which very likely get built with nondefault
DEF_PGPORT anyway.

I did get a bee in my bonnet about how replace_token (and now
replace_guc_value) leak memory like there's no tomorrow.  The leakage
amounts to about a megabyte per run according to valgrind, and it's
not going anywhere but up as we add more calls of those functions.
So I made a quick change to redefine them in a less leak-prone way.

            regards, tom lane



Re: Set arbitrary GUC options during initdb

From
Andres Freund
Date:
Hi,

This commit unfortunately broke --wal-segsize. If I use a slightly larger than
the default setting, I get:

initdb --wal-segsize 64 somepath
running bootstrap script ... 2023-03-22 13:06:41.282 PDT [639848] FATAL:  "min_wal_size" must be at least twice
"wal_segment_size"

Greetings,

Andres Freund



Re: Set arbitrary GUC options during initdb

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> This commit unfortunately broke --wal-segsize. If I use a slightly larger than
> the default setting, I get:
> initdb --wal-segsize 64 somepath
> running bootstrap script ... 2023-03-22 13:06:41.282 PDT [639848] FATAL:  "min_wal_size" must be at least twice
"wal_segment_size"

[ confused... ]  Oh, I see the problem.  This:

    /* set default max_wal_size and min_wal_size */
    snprintf(repltok, sizeof(repltok), "min_wal_size = %s",
             pretty_wal_size(DEFAULT_MIN_WAL_SEGS));
    conflines = replace_token(conflines, "#min_wal_size = 80MB", repltok);

looks like it's setting a compile-time-constant value of min_wal_size;
at least that's what I thought it was doing when I revised the code.
But it isn't, because somebody had the brilliant idea of making
pretty_wal_size() depend on the wal_segment_size_mb variable.

Will fix, thanks for report.

            regards, tom lane



Re: Set arbitrary GUC options during initdb

From
Peter Eisentraut
Date:
On 27.01.23 15:48, Peter Eisentraut wrote:
> Btw., something that I have had in my notes for a while, but with this 
> it would now be officially exposed:  Not all options can be safely set 
> during bootstrap.  For example,
> 
>      initdb -D data -c track_commit_timestamp=on
> 
> will fail an assertion.  This might be an exception, or there might be 
> others.

I ran a test across all changeable boolean parameters with initdb 
setting it to the opposite of their default.  The only one besides 
track_commit_timestamp that caused initdb to not complete was 
default_transaction_read_only, which is to be expected.

We should fix track_commit_timestamp, but it doesn't look like there is 
wider impact.  (Obviously, this tested only boolean settings.  If 
someone wants to fuzz-test the others ...)