Thread: pgsql: Add vacuum_truncate configuration parameter.

pgsql: Add vacuum_truncate configuration parameter.

From
Nathan Bossart
Date:
Add vacuum_truncate configuration parameter.

This new parameter works just like the storage parameter of the
same name: if set to true (which is the default), autovacuum and
VACUUM attempt to truncate any empty pages at the end of the table.
It is primarily intended to help users avoid locking issues on hot
standbys.  The setting can be overridden with the storage parameter
or VACUUM's TRUNCATE option.

Since there's presently no way to determine whether a Boolean
storage parameter is explicitly set or has just picked up the
default value, this commit also introduces an isset_offset member
to relopt_parse_elt.

Suggested-by: Will Storey <will@summercat.com>
Author: Nathan Bossart <nathandbossart@gmail.com>
Co-authored-by: Gurjeet Singh <gurjeet@singh.im>
Reviewed-by: Laurenz Albe <laurenz.albe@cybertec.at>
Reviewed-by: Fujii Masao <masao.fujii@oss.nttdata.com>
Reviewed-by: Robert Treat <rob@xzilla.net>
Discussion: https://postgr.es/m/Z2DE4lDX4tHqNGZt%40dev.null

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/0164a0f9ee12e0eff9e4c661358a272ecd65c2d4

Modified Files
--------------
doc/src/sgml/config.sgml                      | 29 +++++++++++++++++++++++++++
doc/src/sgml/ref/create_table.sgml            | 13 ++++--------
doc/src/sgml/ref/vacuum.sgml                  |  3 ++-
src/backend/access/common/reloptions.c        | 14 ++++++++++++-
src/backend/commands/vacuum.c                 | 17 ++++++++++++----
src/backend/utils/misc/guc_tables.c           | 10 +++++++++
src/backend/utils/misc/postgresql.conf.sample |  4 ++++
src/include/access/reloptions.h               |  1 +
src/include/commands/vacuum.h                 |  1 +
src/include/utils/guc_tables.h                |  1 +
src/include/utils/rel.h                       |  1 +
src/test/regress/expected/vacuum.out          | 27 +++++++++++++++++++++++++
src/test/regress/sql/vacuum.sql               | 10 +++++++++
13 files changed, 116 insertions(+), 15 deletions(-)


Re: pgsql: Add vacuum_truncate configuration parameter.

From
Tom Lane
Date:
Nathan Bossart <nathan@postgresql.org> writes:
> Since there's presently no way to determine whether a Boolean
> storage parameter is explicitly set or has just picked up the
> default value, this commit also introduces an isset_offset member
> to relopt_parse_elt.

Uh, what?  Why is it a good idea to distinguish those states?
Seems like that risks some very surprising behavior, ie if the
default is "true", why shouldn't that act exactly like an
explicit setting of "true"?

            regards, tom lane



Re: pgsql: Add vacuum_truncate configuration parameter.

From
"David G. Johnston"
Date:
On Thursday, March 20, 2025, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Nathan Bossart <nathan@postgresql.org> writes:
> Since there's presently no way to determine whether a Boolean
> storage parameter is explicitly set or has just picked up the
> default value, this commit also introduces an isset_offset member
> to relopt_parse_elt.

Uh, what?  Why is it a good idea to distinguish those states?
Seems like that risks some very surprising behavior, ie if the
default is "true", why shouldn't that act exactly like an
explicit setting of "true"?

In order to implement what amounts to coalesce(…) for settings (use global value unless the table value overrides) one needs a sentinel value that means unset because settings cannot take on the null value.  There is no such possible sentinel value for boolean so another field is required.  The hazards of choosing Boolean instead of text for settings.

David J.

Re: pgsql: Add vacuum_truncate configuration parameter.

From
David Rowley
Date:
On Fri, 21 Mar 2025 at 04:30, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Nathan Bossart <nathan@postgresql.org> writes:
> > Since there's presently no way to determine whether a Boolean
> > storage parameter is explicitly set or has just picked up the
> > default value, this commit also introduces an isset_offset member
> > to relopt_parse_elt.
>
> Uh, what?  Why is it a good idea to distinguish those states?
> Seems like that risks some very surprising behavior, ie if the
> default is "true", why shouldn't that act exactly like an
> explicit setting of "true"?

I was thinking about this yesterday as the topic of a
user-configurable options for truncation threshold came up in [1]. I
find it a bit annoying the boolean vacuum_truncate reloption was added
(a few years ago) as we could have instead added a more flexible
truncate_scale_factor that could be set to -1 to disable truncation.
Maybe it's too late now as reloptions are not easy to remove. Adding
this GUC now does put us a bit further down the path of the boolean
option.

From [2], it seems there are people around unhappy with the current
compile-time settings. If we do end up making those user configurable,
then we're going to have a redundant "vacuum_truncate" reloption and a
redundant GUC.

David

[1] https://postgr.es/m/CAApHDvrfngqCpQ09LKdr-BnJEVHW0%3DwAhxfRBnw%3DHJ2645cJPg%40mail.gmail.com
[2] https://postgr.es/m/CA%2BYyo5Trs9x03-sc4PpmEXPY9K_B_jbEbNd01RbF%2BvOxs3zVUA%40mail.gmail.com



Re: pgsql: Add vacuum_truncate configuration parameter.

From
"David G. Johnston"
Date:
On Thu, Mar 20, 2025 at 4:07 PM David Rowley <dgrowleyml@gmail.com> wrote:
On Fri, 21 Mar 2025 at 04:30, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Nathan Bossart <nathan@postgresql.org> writes:
> > Since there's presently no way to determine whether a Boolean
> > storage parameter is explicitly set or has just picked up the
> > default value, this commit also introduces an isset_offset member
> > to relopt_parse_elt.
>
> Uh, what?  Why is it a good idea to distinguish those states?
> Seems like that risks some very surprising behavior, ie if the
> default is "true", why shouldn't that act exactly like an
> explicit setting of "true"?

I was thinking about this yesterday as the topic of a
user-configurable options for truncation threshold came up in [1]. I
find it a bit annoying the boolean vacuum_truncate reloption was added
(a few years ago) as we could have instead added a more flexible
truncate_scale_factor that could be set to -1 to disable truncation.
Maybe it's too late now as reloptions are not easy to remove. Adding
this GUC now does put us a bit further down the path of the boolean
option.

Not seeing much point in trying to get rid of the on/off switch.  It just won't make sense to choose a tunable value of zero to disable something, and probably should be prohibited.

I'm seeing an implementation detail discussion here, not a behavior one.  The field complaint that we don't let the DBA control this at the GUC level is valid and reasonably solved.  The "default" behavior hasn't changed but now instead of hard-coding the default we moved it to a GUC.  The storage parameter is no longer documented as having a default, which is correct.  It now behaves like most of the other storage parameters in that if unset a GUC provides the value.

David J.

Re: pgsql: Add vacuum_truncate configuration parameter.

From
David Rowley
Date:
On Sat, 22 Mar 2025 at 05:40, David G. Johnston
<david.g.johnston@gmail.com> wrote:
> Not seeing much point in trying to get rid of the on/off switch.  It just won't make sense to choose a tunable value
ofzero to disable something, and probably should be prohibited. 

Can you explain what does not make sense about it?  We have plenty of
GUCs and reloptions where -1 means "inherit the setting from somewhere
else". Do those also not make sense, or is this one somehow different?

> I'm seeing an implementation detail discussion here, not a behavior one.  The field complaint that we don't let the
DBAcontrol this at the GUC level is valid and reasonably solved.  The "default" behavior hasn't changed but now instead
ofhard-coding the default we moved it to a GUC.  The storage parameter is no longer documented as having a default,
whichis correct.  It now behaves like most of the other storage parameters in that if unset a GUC provides the value. 

The reason I was pointing this out is that I wanted to ensure that
this was considered before we release code with the new GUC. It's true
removing a GUC isn't as hard as removing a reloption and we already
have the reloption. If everyone thinks we'll likely not need
user-tunable options to specify the number of pages required before
truncation occurs then that's fine.  The problem I see is that we
already have lots of GUCs and it'd be nice not to have semi-redundant
ones in the future because we've failed to consider something.  I was
just pointing out the "something" part that we might want to consider.

David