Thread: Enforce "max_wal_size/ min_wal_size must be at least twice wal_segment_size" limit while setting GUCs

Hi,

Currently postgres allows setting any value for max_wal_size or
min_wal_size, not enforcing "at least twice as wal_segment_size" limit
[1]. This isn't a problem if the server continues to run, however, the
server can't come up after a crash or restart or maintenance or
upgrade (goes to a continuous restart loop with FATAL errors [1]).

How about we add GUC check hooks for both max_wal_size and
min_wal_size where we can either emit ERROR or WARNING if values are
not "at least twice as wal_segment_size"?

Thoughts?

[1]
FATAL:  "max_wal_size" must be at least twice "wal_segment_size"
FATAL:  "min_wal_size" must be at least twice "wal_segment_size"

[2]
./initdb -D data
./pg_ctl -D data -l logfile start
./psql -c "alter system set max_wal_size='2MB'" postgres
./psql -c "alter system set min_wal_size='2MB'" postgres
./psql -c "select pg_reload_conf()" postgres
./pg_ctl -D data -l logfile restart

Regards,
Bharath Rupireddy.



Hi Bharath,

Could you explain why min wal size must be at least twice but not
equal to wal_segment_size ?

thanks
Rajesh

On Sat, May 21, 2022 at 7:08 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
>
> Hi,
>
> Currently postgres allows setting any value for max_wal_size or
> min_wal_size, not enforcing "at least twice as wal_segment_size" limit
> [1]. This isn't a problem if the server continues to run, however, the
> server can't come up after a crash or restart or maintenance or
> upgrade (goes to a continuous restart loop with FATAL errors [1]).
>
> How about we add GUC check hooks for both max_wal_size and
> min_wal_size where we can either emit ERROR or WARNING if values are
> not "at least twice as wal_segment_size"?
>
> Thoughts?
>
> [1]
> FATAL:  "max_wal_size" must be at least twice "wal_segment_size"
> FATAL:  "min_wal_size" must be at least twice "wal_segment_size"
>
> [2]
> ./initdb -D data
> ./pg_ctl -D data -l logfile start
> ./psql -c "alter system set max_wal_size='2MB'" postgres
> ./psql -c "alter system set min_wal_size='2MB'" postgres
> ./psql -c "select pg_reload_conf()" postgres
> ./pg_ctl -D data -l logfile restart
>
> Regards,
> Bharath Rupireddy.
>
>



Hi Bharath,

+1.
This seems to be good idea to have checks on upper bound for the max_wal_size and min_wal_size. We have seen customers play with these parameters and ran into issues.
It will also be better to consider all the control parameters and have a min/max checks on them as well.

Thanks,
Mahendrakar.

On Sat, 21 May 2022 at 23:26, rajesh singarapu <rajesh.rs0541@gmail.com> wrote:
Hi Bharath,

Could you explain why min wal size must be at least twice but not
equal to wal_segment_size ?

thanks
Rajesh

On Sat, May 21, 2022 at 7:08 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
>
> Hi,
>
> Currently postgres allows setting any value for max_wal_size or
> min_wal_size, not enforcing "at least twice as wal_segment_size" limit
> [1]. This isn't a problem if the server continues to run, however, the
> server can't come up after a crash or restart or maintenance or
> upgrade (goes to a continuous restart loop with FATAL errors [1]).
>
> How about we add GUC check hooks for both max_wal_size and
> min_wal_size where we can either emit ERROR or WARNING if values are
> not "at least twice as wal_segment_size"?
>
> Thoughts?
>
> [1]
> FATAL:  "max_wal_size" must be at least twice "wal_segment_size"
> FATAL:  "min_wal_size" must be at least twice "wal_segment_size"
>
> [2]
> ./initdb -D data
> ./pg_ctl -D data -l logfile start
> ./psql -c "alter system set max_wal_size='2MB'" postgres
> ./psql -c "alter system set min_wal_size='2MB'" postgres
> ./psql -c "select pg_reload_conf()" postgres
> ./pg_ctl -D data -l logfile restart
>
> Regards,
> Bharath Rupireddy.
>
>


On Sat, May 21, 2022 at 11:26 PM rajesh singarapu
<rajesh.rs0541@gmail.com> wrote:
>
> On Sat, May 21, 2022 at 7:08 PM Bharath Rupireddy
> <bharath.rupireddyforpostgres@gmail.com> wrote:
> >
> > Hi,
> >
> > Currently postgres allows setting any value for max_wal_size or
> > min_wal_size, not enforcing "at least twice as wal_segment_size" limit
> > [1]. This isn't a problem if the server continues to run, however, the
> > server can't come up after a crash or restart or maintenance or
> > upgrade (goes to a continuous restart loop with FATAL errors [1]).
> >
> > How about we add GUC check hooks for both max_wal_size and
> > min_wal_size where we can either emit ERROR or WARNING if values are
> > not "at least twice as wal_segment_size"?
> >
> > Thoughts?
> >
> > [1]
> > FATAL:  "max_wal_size" must be at least twice "wal_segment_size"
> > FATAL:  "min_wal_size" must be at least twice "wal_segment_size"
> Hi Bharath,
>
> Could you explain why min wal size must be at least twice but not
> equal to wal_segment_size ?

It is because postgres always needs/keeps at least one WAL file and
the usage of max_wal_size/min_wal_size is in terms of number of WAL
segments/WAL files. It doesn't make sense to set
max_wal_size/min_wal_size to, say, 20MB (where wal_segment_size =
16MB) and expect the server to honor it and do something. Hence the
'at least twice' requirement.

Regards,
Bharath Rupireddy.



At Sat, 21 May 2022 19:08:06 +0530, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote in 
> How about we add GUC check hooks for both max_wal_size and
> min_wal_size where we can either emit ERROR or WARNING if values are
> not "at least twice as wal_segment_size"?

It should be ERROR.

As you say, it should have been changed when the unit of them is
changed to MB and wal_segment_size became variable.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



On Mon, May 23, 2022 at 10:45 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
>
> At Sat, 21 May 2022 19:08:06 +0530, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote in
> > How about we add GUC check hooks for both max_wal_size and
> > min_wal_size where we can either emit ERROR or WARNING if values are
> > not "at least twice as wal_segment_size"?
>
> It should be ERROR.
>
> As you say, it should have been changed when the unit of them is
> changed to MB and wal_segment_size became variable.

Thanks. Having check hooks for min_wal_size and max_wal_size that
throw errors if they aren't at least twice the wal_segment_size has a
"BIG" problem - ./initdb -D data --wal-segsize=1 (or a value < 16)
fails. This is because during the bootstrap mode the min_wal_size is
calculated using the supplied wal-segsize and written to
postgresql.conf file, but in the "post-bootstrap initialization" in
single user mode, the min_wal_size's default value is calculated as
80MB using default wal_segment_size 16MB
(PostmasterMain->InitializeGUCOptions->InitializeOneGUCOption->check_hook)
as wal_segment_size isn't read from control file yet, see [1] and [2]
for reference.

Maybe we have a fundamental problem here that in single user mode we
aren't reading control file. I have no further thoughts to offer at
this moment

[1]
foobaralicebob@foobaralicebob:~/postgres/inst/bin$ ./initdb -D data
--wal-segsize=1
The files belonging to this database system will be owned by user
"foobaralicebob".
This user must also own the server process.

The database cluster will be initialized with locale "C.UTF-8".
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".

Data page checksums are disabled.

creating directory data ... ok
creating subdirectories ... ok
selecting dynamic shared memory implementation ... posix
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting default time zone ... Etc/UTC
creating configuration files ... ok
running bootstrap script ... 2022-05-23 11:57:35.999 GMT [3277331]
LOG:  min_wal_size 80, wal_segment_size 16777216
2022-05-23 11:57:36.000 GMT [3277331] LOG:  min_wal_size 5,
wal_segment_size 1048576
ok
performing post-bootstrap initialization ... 2022-05-23 11:57:36.178
GMT [3277333] LOG:  min_wal_size 80, wal_segment_size 16777216
2022-05-23 11:57:36.179 GMT [3277333] LOG:  min_wal_size 5,
wal_segment_size 16777216
2022-05-23 11:57:36.179 GMT [3277333] LOG:  "min_wal_size" must be at
least twice "wal_segment_size"
2022-05-23 11:57:36.179 UTC [3277333] FATAL:  configuration file
"/home/foobaralicebob/postgres/inst/bin/data/postgresql.conf" contains
errors
child process exited with exit code 1
initdb: removing data directory "data"

[2]
(gdb) bt
#0  0x00007f879105cfaa in __GI___select (nfds=0, readfds=0x0,
writefds=0x0, exceptfds=0x0, timeout=0x7ffd31e040c0) at
../sysdeps/unix/sysv/linux/select.c:41
#1  0x0000556cee068326 in pg_usleep (microsec=1000000) at pgsleep.c:56
#2  0x0000556ced9cc06e in check_min_wal_size (newval=0x7ffd31e04240,
extra=0x7ffd31e04248, source=PGC_S_FILE) at xlog.c:4327
#3  0x0000556cee016e2e in call_int_check_hook (conf=0x556cee365c58
<ConfigureNamesInt+9912>, newval=0x7ffd31e04240, extra=0x7ffd31e04248,
source=PGC_S_FILE, elevel=15) at guc.c:11786
#4  0x0000556cee00eb28 in parse_and_validate_value
(record=0x556cee365c58 <ConfigureNamesInt+9912>, name=0x556cef5a9778
"min_wal_size", value=0x556cef5a97a0 "5MB", source=PGC_S_FILE,
    elevel=15, newval=0x7ffd31e04240, newextra=0x7ffd31e04248) at guc.c:7413
#5  0x0000556cee00f908 in set_config_option (name=0x556cef5a9778
"min_wal_size", value=0x556cef5a97a0 "5MB", context=PGC_POSTMASTER,
source=PGC_S_FILE, action=GUC_ACTION_SET,
    changeVal=true, elevel=15, is_reload=false) at guc.c:7922
#6  0x0000556cee01b1b2 in ProcessConfigFileInternal
(context=PGC_POSTMASTER, applySettings=true, elevel=15) at
guc-file.l:441
#7  0x0000556cee01ab2d in ProcessConfigFile (context=PGC_POSTMASTER)
at guc-file.l:155
#8  0x0000556cee00c859 in SelectConfigFiles (userDoption=0x0,
progname=0x556cef584eb0 "postgres") at guc.c:6196
#9  0x0000556cede249c6 in PostgresSingleUserMain (argc=12,
argv=0x556cef585800, username=0x556cef58cc60 "foobaralicebob") at
postgres.c:3991
#10 0x0000556cedc34a72 in main (argc=12, argv=0x556cef585800) at main.c:199
(gdb)

Regards,
Bharath Rupireddy.



Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> writes:
> Thanks. Having check hooks for min_wal_size and max_wal_size that
> throw errors if they aren't at least twice the wal_segment_size has a
> "BIG" problem - ./initdb -D data --wal-segsize=1 (or a value < 16)
> fails.

In general, you can't do that (i.e. try to enforce constraints between
GUC values via check hooks).  It's been tried in the past and failed
miserably, because the hooks can't know whether the other value is
about to be changed.

            regards, tom lane



At Mon, 23 May 2022 10:08:54 -0400, Tom Lane <tgl@sss.pgh.pa.us> wrote in 
> Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> writes:
> > Thanks. Having check hooks for min_wal_size and max_wal_size that
> > throw errors if they aren't at least twice the wal_segment_size has a
> > "BIG" problem - ./initdb -D data --wal-segsize=1 (or a value < 16)
> > fails.
> 
> In general, you can't do that (i.e. try to enforce constraints between
> GUC values via check hooks).  It's been tried in the past and failed
> miserably, because the hooks can't know whether the other value is
> about to be changed.

I thought that wal_segment_size is a semi-constant for a server life.
But looking at the startup sequence closely, postmaster tries
changing max_wal_size before reading control file.

Couldn't we use PGC_S_TEST for this purpose?  AlterSystemSetConfigFile
is calling parse_and_validate_value() with PGC_S_FILE, but it is
actually a "value to be used later"(@guc.h:93). So it can be thought
that PG_S_TEST is the right choice there.  If it is still not work
perfectly, we could have a new source value, say PGC_S_ALTER_SYSTEM,
exactly for this use. (but I don't see a following users comes in
future..)

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



At Tue, 24 May 2022 10:19:53 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in 
> At Mon, 23 May 2022 10:08:54 -0400, Tom Lane <tgl@sss.pgh.pa.us> wrote in 
> > Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> writes:
> > > Thanks. Having check hooks for min_wal_size and max_wal_size that
> > > throw errors if they aren't at least twice the wal_segment_size has a
> > > "BIG" problem - ./initdb -D data --wal-segsize=1 (or a value < 16)
> > > fails.
> > 
> > In general, you can't do that (i.e. try to enforce constraints between
> > GUC values via check hooks).  It's been tried in the past and failed
> > miserably, because the hooks can't know whether the other value is
> > about to be changed.
> 
> I thought that wal_segment_size is a semi-constant for a server life.
> But looking at the startup sequence closely, postmaster tries
> changing max_wal_size before reading control file.
> 
> Couldn't we use PGC_S_TEST for this purpose?  AlterSystemSetConfigFile
> is calling parse_and_validate_value() with PGC_S_FILE, but it is
> actually a "value to be used later"(@guc.h:93). So it can be thought
> that PG_S_TEST is the right choice there.  If it is still not work
> perfectly, we could have a new source value, say PGC_S_ALTER_SYSTEM,
> exactly for this use. (but I don't see a following users comes in
> future..)

This duscussion is based on the assumption that "wal_segment_size can
be assumed to be a constant when a check function is called with
PGC_S_FILE".

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



On Mon, May 23, 2022 at 10:08:54AM -0400, Tom Lane wrote:
> In general, you can't do that (i.e. try to enforce constraints between
> GUC values via check hooks).  It's been tried in the past and failed
> miserably, because the hooks can't know whether the other value is
> about to be changed.

+1.  Aka, cough, 41aadee followed by 414c2fd.
--
Michael

Attachment