Thread: vacuum_truncate configuration parameter and isset_offset

vacuum_truncate configuration parameter and isset_offset

From
Nikolay Shaplov
Date:
Hi! I've tired to rebase my reloptions patch (https://
commitfest.postgresql.org/patch/4688/)
and have stuck with 0164a0f9 commit.

I am deeply involved with reloptions topic, and I should say that I do not 
like the idea of isset_offset field

All other types, as you have mentioned in commit messages, all other types 
manages "Not set" feature via some specific default value (like -1 for positive 
integer for example or "auto" for enums). But you can't do that in boolean, 
and this is true.

If you add isset_offset in this case, it would be useless for all types, except 
boolean. It will make whole design inconsistent, and this is bad.

I would suggest to use enum here to achieve same goal, or add some "trilean" 
data type that can be "true/false/unset"

Current reloption code is already a big mess, there is no reason to make it 
worse. 

Can we revert it, and do it again, without adding isset_offset field? I can 
write some code for it too.

Or we can patch it over, but I am afraid something will go wrong and we will 
stuck with isset_offset  forever. I do not want it to happen. 

PS. I've just looked at code for vacuum_index_cleanup it has very same logic, 
just replace "auto" with anything you like, and uses enum.
Grep for StdRdOptIndexCleanupValues for more info

StdRdOptIndexCleanupValues


-- 
Nikolay Shaplov aka Nataraj
Fuzzing Engineer at Postgres Professional
Matrix IM: @dhyan:nataraj.su

Attachment

Re: vacuum_truncate configuration parameter and isset_offset

From
Nikolay Shaplov
Date:
В письме от понедельник, 24 марта 2025 г. 15:27:35 MSK пользователь Nikolay
Shaplov написал:

> PS. I've just looked at code for vacuum_index_cleanup it has very same
> logic, just replace "auto" with anything you like, and uses enum.
> Grep for StdRdOptIndexCleanupValues for more info

I thought about it a bit more...

ALTER TABLE test SET (vacuum_truncate);
ALTER TABLE test SET (vacuum_truncate=false);
ALTER TABLE test RESET (vacuum_truncate);

Nobody will ever guess that these are three different cases.

since for enum solution
ALTER TABLE test SET (vacuum_truncate=on);
ALTER TABLE test SET (vacuum_truncate=off);
ALTER TABLE test SET (vacuum_truncate=unset);
ALTER TABLE test RESET (vacuum_truncate);

it is quite clear that first three are different, and 4th is most probably is
equivalent for 3rd.

So, my suggestion is to change it as soon as possible.


--
Nikolay Shaplov aka Nataraj
Fuzzing Engineer at Postgres Professional
Matrix IM: @dhyan:nataraj.su

Attachment

Re: vacuum_truncate configuration parameter and isset_offset

From
Nathan Bossart
Date:
On Mon, Mar 24, 2025 at 03:27:35PM +0300, Nikolay Shaplov wrote:
> I would suggest to use enum here to achieve same goal, or add some "trilean" 
> data type that can be "true/false/unset"

I did first try making the vacuum_truncate reloption an enum, but I didn't
proceed with that approach for a couple of reasons:

* vacuum_truncate is already a Boolean reloption, so switching it to an
  enum would require enumerating all possible values accepted by
  parse_bool(), which has the following comment:

    /*
     * Try to interpret value as boolean value.  Valid values are: true,
     * false, yes, no, on, off, 1, 0; as well as unique prefixes thereof.
     * If the string parses okay, return true, else false.
     * If okay and result is not NULL, return the value in *result.
     */

  I also considered adding some sort of pluggable parsing functionality or
  something like StdRdOptIndexCleanupValues (i.e., a subset of the most
  popular values accepted by parse_bool()), but those seemed a little too
  complicated for the problem at hand.

* I wasn't sure whether it would be better to expose this third "unset"
  value or to make it hidden and for internal use only.  The former would
  mean we have two ways to declare a reloption "unset."  You could either
  not set it, or you could explicitly set it to "unset."  The latter would
  require additional code somewhere, or we could just leave it
  undocumented.  Both of these approaches seemed like weird user
  experiences to me.

Overall, the biggest reason I didn't proceed with the enum is because it
felt like it was making it the user's problem.  Rather than just teaching
our code how to determine if a reloption was explicitly set, we'd be
introducing unnecessary complexity to the user, or at least we'd be trying
to closely match the existing functionality in an attempt to hide this
complexity from them.

-- 
nathan



Re: vacuum_truncate configuration parameter and isset_offset

From
Robert Haas
Date:
On Mon, Mar 24, 2025 at 10:43 AM Nathan Bossart
<nathandbossart@gmail.com> wrote:
> Overall, the biggest reason I didn't proceed with the enum is because it
> felt like it was making it the user's problem.  Rather than just teaching
> our code how to determine if a reloption was explicitly set, we'd be
> introducing unnecessary complexity to the user, or at least we'd be trying
> to closely match the existing functionality in an attempt to hide this
> complexity from them.

+1. Giving the user the ability to set the option to a value called
"unset" doesn't seem right to me.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: vacuum_truncate configuration parameter and isset_offset

From
Nikolay Shaplov
Date:
В письме от понедельник, 24 марта 2025 г. 17:43:38 MSK пользователь Nathan
Bossart написал:


> * vacuum_truncate is already a Boolean reloption,
Yes, I guess schema update would be best solution here.

> so switching it to an
>   enum would require enumerating all possible values accepted by
>   parse_bool(), which has the following comment:
>
>     /*
>      * Try to interpret value as boolean value.  Valid values are: true,
>      * false, yes, no, on, off, 1, 0; as well as unique prefixes thereof.
>      * If the string parses okay, return true, else false.
>      * If okay and result is not NULL, return the value in *result.
>      */

We already do it for some option. May be we need more generic solution, but
boolean should remain boolean, with only two possible values.


> * I wasn't sure whether it would be better to expose this third "unset"
>   value or to make it hidden and for internal use only.
If you hide something from user,  you will confuse him. If the solution is
more explicit, then it is more easy to use it, and understand it. If something
is implying (like deleted option means third value), then there would be
confusion. Please do not do that.


--
Nikolay Shaplov aka Nataraj
Fuzzing Engineer at Postgres Professional
Matrix IM: @dhyan:nataraj.su

Attachment

Re: vacuum_truncate configuration parameter and isset_offset

From
Nikolay Shaplov
Date:
В письме от понедельник, 24 марта 2025 г. 17:48:25 MSK пользователь Robert
Haas написал:
> On Mon, Mar 24, 2025 at 10:43 AM Nathan Bossart
>
> <nathandbossart@gmail.com> wrote:
> > Overall, the biggest reason I didn't proceed with the enum is because it
> > felt like it was making it the user's problem.  Rather than just teaching
> > our code how to determine if a reloption was explicitly set, we'd be
> > introducing unnecessary complexity to the user, or at least we'd be trying
> > to closely match the existing functionality in an attempt to hide this
> > complexity from them.
>
> +1. Giving the user the ability to set the option to a value called
> "unset" doesn't seem right to me.

1. Enum allows you to make default value "unreachable" for setting. But this
is not the most important thing.

2.  This option has three possible values: on/off/system_default. We can find
better name for system_default, but it would not change it's meaning.
I would prefer to let user set these three values explicitly.

Nobody would guess that

ALTER TABLE test SET (vacuum_truncate=false);
means "off"

and
ALTER TABLE test RESET (vacuum_truncate);
means "system_default"

This will lead to a lot of confusion.

So I strongly against this  implying third value for boolean. This is nasty
thing. Boolean should have two values.


--
Nikolay Shaplov aka Nataraj
Fuzzing Engineer at Postgres Professional
Matrix IM: @dhyan:nataraj.su

Attachment

Re: vacuum_truncate configuration parameter and isset_offset

From
Robert Haas
Date:
On Mon, Mar 24, 2025 at 11:12 AM Nikolay Shaplov <dhyan@nataraj.su> wrote:
> Nobody would guess that
>
> ALTER TABLE test SET (vacuum_truncate=false);
> means "off"
>
> and
> ALTER TABLE test RESET (vacuum_truncate);
> means "system_default"
>
> This will lead to a lot of confusion.

I agree that this confuses people, but I don't think it's more
confusing here than for other vacuum reloptions. I have seen people
try to unset vacuum reloptions by using SET to configure them to the
default value rather than by using RESET to remove them. But then
later they change the system default and that table is still nailed to
the old default. I always find myself slightly bemused by this,
because it doesn't seem that hard to me to figure out how it actually
works, but it's definitely a real issue. However, I don't see why the
issue is any more acute for this parameter than any other, and it
certainly does not seem like a good idea to make this parameter work
differently from the other ones.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: vacuum_truncate configuration parameter and isset_offset

From
"David G. Johnston"
Date:
On Mon, Mar 24, 2025 at 8:26 AM Robert Haas <robertmhaas@gmail.com> wrote:
On Mon, Mar 24, 2025 at 11:12 AM Nikolay Shaplov <dhyan@nataraj.su> wrote:
> Nobody would guess that
>
> ALTER TABLE test SET (vacuum_truncate=false);
> means "off"
>
> and
> ALTER TABLE test RESET (vacuum_truncate);
> means "system_default"
>
> This will lead to a lot of confusion.

I agree that this confuses people, but I don't think it's more
confusing here than for other vacuum reloptions. I have seen people
try to unset vacuum reloptions by using SET to configure them to the
default value rather than by using RESET to remove them. But then
later they change the system default and that table is still nailed to
the old default. I always find myself slightly bemused by this,
because it doesn't seem that hard to me to figure out how it actually
works, but it's definitely a real issue. However, I don't see why the
issue is any more acute for this parameter than any other, and it
certainly does not seem like a good idea to make this parameter work
differently from the other ones.


+1

David J.

Re: vacuum_truncate configuration parameter and isset_offset

From
Nikolay Shaplov
Date:
В письме от понедельник, 24 марта 2025 г. 18:25:44 MSK пользователь Robert
Haas написал:

> > and
> > ALTER TABLE test RESET (vacuum_truncate);
> > means "system_default"
> >
> > This will lead to a lot of confusion.
>
> I agree that this confuses people,
Though a bit more, I think that as a user I would prefer to be able to
explicitly set "system_defaults" value. If I have not set any, most probably I
do not care, or have not thought about it. But if I set option to
"system_defaults" then I do it intentionally and have some good reasons for
it.


> but I don't think it's more
> confusing here than for other vacuum reloptions. I have seen people
> try to unset vacuum reloptions by using SET to configure them to the
> default value rather than by using RESET to remove them. But then
> later they change the system default and that table is still nailed to
> the old default. I always find myself slightly bemused by this,
> because it doesn't seem that hard to me to figure out how it actually
> works, but it's definitely a real issue. However, I don't see why the
> issue is any more acute for this parameter than any other, and it
> certainly does not seem like a good idea to make this parameter work
> differently from the other ones.

May be I can agree with you. But this does not explain why we should switch
boolean from two possible values into three...  It can't have three values for
any practical reason. This should not be boolean anymore in this case.



--
Nikolay Shaplov aka Nataraj
Fuzzing Engineer at Postgres Professional
Matrix IM: @dhyan:nataraj.su

Attachment

Re: vacuum_truncate configuration parameter and isset_offset

From
"David G. Johnston"
Date:
On Mon, Mar 24, 2025 at 8:45 AM Nikolay Shaplov <dhyan@nataraj.su> wrote:
> but I don't think it's more
> confusing here than for other vacuum reloptions. I have seen people
> try to unset vacuum reloptions by using SET to configure them to the
> default value rather than by using RESET to remove them. But then
> later they change the system default and that table is still nailed to
> the old default. I always find myself slightly bemused by this,
> because it doesn't seem that hard to me to figure out how it actually
> works, but it's definitely a real issue. However, I don't see why the
> issue is any more acute for this parameter than any other, and it
> certainly does not seem like a good idea to make this parameter work
> differently from the other ones.

May be I can agree with you. But this does not explain why we should switch
boolean from two possible values into three...  It can't have three values for
any practical reason. This should not be boolean anymore in this case.


The boolean is two-valued.  The state of the reloption is also binary - set or unset (i.e., it is listed in pg_class.reloptions, or not).  In the unset state the effective value of a reloption comes from the corresponding GUC (or, lacking that, some hard-coded default).  If set, the value comes from the associated boolean.  RESET places a reloption into the unset state.

David J.

Re: vacuum_truncate configuration parameter and isset_offset

From
Álvaro Herrera
Date:
Hello

I don't understand why this shouldn't work exactly like
vacuum_index_cleanup (cf. vacuum_rel lines 2170ff).  That would require
no new mechanism.

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"Java is clearly an example of money oriented programming"  (A. Stepanov)



Re: vacuum_truncate configuration parameter and isset_offset

From
Nikolay Shaplov
Date:
В письме от понедельник, 24 марта 2025 г. 19:00:51 MSK пользователь Álvaro
Herrera написал:

> I don't understand why this shouldn't work exactly like
> vacuum_index_cleanup (cf. vacuum_rel lines 2170ff).  That would require
> no new mechanism.

In my opinion it should.

"no new mechanism" is good. "new mechanism" is bad. :-)

--
Nikolay Shaplov aka Nataraj
Fuzzing Engineer at Postgres Professional
Matrix IM: @dhyan:nataraj.su

Attachment

Re: vacuum_truncate configuration parameter and isset_offset

From
"David G. Johnston"
Date:
On Mon, Mar 24, 2025 at 9:00 AM Álvaro Herrera <alvherre@alvh.no-ip.org> wrote:
Hello

I don't understand why this shouldn't work exactly like
vacuum_index_cleanup (cf. vacuum_rel lines 2170ff).  That would require
no new mechanism.


That reloption is already an enum and there is no GUC to defer to when the value is unset.  It doesn't seem like an equivalent scenario.  AUTO is a perfectly useful value as opposed to an undocumented sentinel for unset/missing.

David J.

Re: vacuum_truncate configuration parameter and isset_offset

From
"David G. Johnston"
Date:
On Mon, Mar 24, 2025 at 9:08 AM David G. Johnston <david.g.johnston@gmail.com> wrote:
On Mon, Mar 24, 2025 at 9:00 AM Álvaro Herrera <alvherre@alvh.no-ip.org> wrote:
Hello

I don't understand why this shouldn't work exactly like
vacuum_index_cleanup (cf. vacuum_rel lines 2170ff).  That would require
no new mechanism.


That reloption is already an enum and there is no GUC to defer to when the value is unset.  It doesn't seem like an equivalent scenario.  AUTO is a perfectly useful value as opposed to an undocumented sentinel for unset/missing.


Sorry, the "already an enum" comment is wrong - I see the commit now where we basically re-implemented boolean value processing logic and added an "auto" option.

Basically we'd do this to make a boolean-compatible enum adding an undocumented value "null" as a valid and default set value and then interpret "null" as meaning "go use the vacuum_truncate GUC".

It's too late to argue against sentinel values so I suppose this would have to be acceptable.

David J.

Re: vacuum_truncate configuration parameter and isset_offset

From
"David G. Johnston"
Date:
On Mon, Mar 24, 2025 at 5:42 AM Nikolay Shaplov <dhyan@nataraj.su> wrote:
В письме от понедельник, 24 марта 2025 г. 15:27:35 MSK пользователь Nikolay
Shaplov написал:

> PS. I've just looked at code for vacuum_index_cleanup it has very same
> logic, just replace "auto" with anything you like, and uses enum.
> Grep for StdRdOptIndexCleanupValues for more info

I thought about it a bit more...

ALTER TABLE test SET (vacuum_truncate);
ALTER TABLE test SET (vacuum_truncate=false);
ALTER TABLE test RESET (vacuum_truncate);

Nobody will ever guess that these are three different cases.

since for enum solution
ALTER TABLE test SET (vacuum_truncate=on);
ALTER TABLE test SET (vacuum_truncate=off);
ALTER TABLE test SET (vacuum_truncate=unset);
ALTER TABLE test RESET (vacuum_truncate);


Actually, we'd want to prohibit vacuum_truncate=unset (or whatever internal enum option value we choose) as input since none of the other reloptions allow the user to specify the sentinel value but instead require the use of reset to achieve it.

i.e., the reset value for parallel_workers is -1:

{
"parallel_workers",
"Number of parallel processes that can be used per executor node for this relation.",
RELOPT_KIND_HEAP,
ShareUpdateExclusiveLock
},
-1, 0, 1024

 but:

alter table vct set (parallel_workers=-1);
ERROR:  value -1 out of bounds for option "parallel_workers"
DETAIL:  Valid values are between "0" and "1024".

So, given the precedent of vacuum_index_cleanup and the above, we should turn this into an enum that accepts all existing boolean literal inputs and also has a undocumented "unset" default value that the user is not allowed to explicitly set but instead only gets used to resolve an unset reloption at runtime.

David J.

Re: vacuum_truncate configuration parameter and isset_offset

From
Nathan Bossart
Date:
On Mon, Mar 24, 2025 at 05:00:51PM +0100, Álvaro Herrera wrote:
> I don't understand why this shouldn't work exactly like
> vacuum_index_cleanup (cf. vacuum_rel lines 2170ff).  That would require
> no new mechanism.

I explained my reasons for not proceeding with that approach upthread [0].

I don't think there are any existing reloptions where there are two ways to
say "use the GUC."  For ones with corresponding GUCs, unsetting the storage
parameter is the mechanism for doing so.  vacuum_index_cleanup's AUTO
option would be the closest thing, but there's no backing GUC, so the
meaning of AUTO doesn't change unless someone changes the code.

TBH I'm not understanding the pushback for adding a way to determine
whether the storage parameter is actually set.  It's very simple, and it
seems like it could be useful elsewhere.  But more importantly, it allows
us to more closely match the behavior of the existing reloptions with GUCs,
and it prevents type mismatches (e.g., the reloption is an enum but the GUC
is a Boolean).

[0] https://postgr.es/m/Z-Fvmnf57z6zM8Ky%40nathan

-- 
nathan



Re: vacuum_truncate configuration parameter and isset_offset

From
Nikolay Shaplov
Date:
В письме от понедельник, 24 марта 2025 г. 19:41:27 MSK пользователь Nathan
Bossart написал:
> But more importantly, it allows
> us to more closely match the behavior of the existing reloptions with GUCs,
> and it prevents type mismatches (e.g., the reloption is an enum but the GUC
> is a Boolean).

Please do not forget, that reloptions is not the only application of options
mechanism. We have attribute options, opclass options and may have more.
You've just changed the whole engine, for what is seems to be an exceptional
case, that can be solved via existing means. Changing of an engine should not
be done carelessly. You've just make changes for any boolean option of any
postgres object that can  ever have options for years and years on. This
change should be done after really good consideration. I strongly suggest to
revert it back, solve the problem via existing means if it needs immediate
solving, and then carefully design boolean-nullable option, or whatever you
want to have, keeping in mind that this changes the whole engine for all
options postgres will ever have. A general design of a new option fature
without anything like "Here I feel like to have this, and there I feel like to
have that". A option type that will be useful for everyone. If it is carefully
considered, I will support it.

--
Nikolay Shaplov aka Nataraj
Fuzzing Engineer at Postgres Professional
Matrix IM: @dhyan:nataraj.su

Attachment

Re: vacuum_truncate configuration parameter and isset_offset

From
"David G. Johnston"
Date:
On Mon, Mar 24, 2025 at 9:41 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
TBH I'm not understanding the pushback for adding a way to determine
whether the storage parameter is actually set.  It's very simple, and it
seems like it could be useful elsewhere.

IMO this is superior to using sentinel values for the same purpose, but those already exist and having both is reasonably unappealing.

  But more importantly, it allows
us to more closely match the behavior of the existing reloptions with GUCs,
and it prevents type mismatches (e.g., the reloption is an enum but the GUC
is a Boolean).


Good point; we could solve this in documentation by simply keeping boolean if no non-boolean enum values are publicly accessible.  But this is part of why having "set/unset" not use sentinel values is better.

David J.

Re: vacuum_truncate configuration parameter and isset_offset

From
Nathan Bossart
Date:
On Mon, Mar 24, 2025 at 09:40:24AM -0700, David G. Johnston wrote:
> So, given the precedent of vacuum_index_cleanup and the above, we should
> turn this into an enum that accepts all existing boolean literal inputs and
> also has a undocumented "unset" default value that the user is not allowed
> to explicitly set but instead only gets used to resolve an unset reloption
> at runtime.

This would involve adding a field to relopt_enum_elt_def to declare a value
as "unsettable," right?  That seems feasible, but IMHO it's comparable to
adding a field to reopt_parse_elt.

-- 
nathan



Re: vacuum_truncate configuration parameter and isset_offset

From
Nikolay Shaplov
Date:
В письме от понедельник, 24 марта 2025 г. 19:58:38 MSK пользователь Nathan
Bossart написал:
> On Mon, Mar 24, 2025 at 09:40:24AM -0700, David G. Johnston wrote:
> > So, given the precedent of vacuum_index_cleanup and the above, we should
> > turn this into an enum that accepts all existing boolean literal inputs
> > and
> > also has a undocumented "unset" default value that the user is not allowed
> > to explicitly set but instead only gets used to resolve an unset reloption
> > at runtime.
>
> This would involve adding a field to relopt_enum_elt_def to declare a value
> as "unsettable," right?  That seems feasible, but IMHO it's comparable to
> adding a field to reopt_parse_elt.

If you look at view's check_option option, you will see, how unsettable enum
default can be implemented using existing code.


--
Nikolay Shaplov aka Nataraj
Fuzzing Engineer at Postgres Professional
Matrix IM: @dhyan:nataraj.su

Attachment

Re: vacuum_truncate configuration parameter and isset_offset

From
Nathan Bossart
Date:
On Mon, Mar 24, 2025 at 07:53:43PM +0300, Nikolay Shaplov wrote:
> You've just changed the whole engine, for what is seems to be an exceptional 
> case, that can be solved via existing means.

I have not changed the whole engine.  I have added an optional integer
field to a single struct.

-- 
nathan



Re: vacuum_truncate configuration parameter and isset_offset

From
Nikolay Shaplov
Date:
В письме от понедельник, 24 марта 2025 г. 20:09:23 MSK пользователь Nathan
Bossart написал:

> > You've just changed the whole engine, for what is seems to be an
> > exceptional case, that can be solved via existing means.
> I have not changed the whole engine.  I have added an optional integer
> field to a single struct.

That potentially changes the behaviour of all boolean options. They will never
be what they were before.


--
Nikolay Shaplov aka Nataraj
Fuzzing Engineer at Postgres Professional
Matrix IM: @dhyan:nataraj.su

Attachment

Re: vacuum_truncate configuration parameter and isset_offset

From
"David G. Johnston"
Date:
On Mon, Mar 24, 2025 at 10:27 AM Nikolay Shaplov <dhyan@nataraj.su> wrote:
В письме от понедельник, 24 марта 2025 г. 20:09:23 MSK пользователь Nathan
Bossart написал:

> > You've just changed the whole engine, for what is seems to be an
> > exceptional case, that can be solved via existing means.
> I have not changed the whole engine.  I have added an optional integer
> field to a single struct.

That potentially changes the behaviour of all boolean options. They will never
be what they were before.

The implementation sure looks like a pure opt-in from this angle. Can you provide a concrete behavior change example of a boolean option that hasn't opted-in to help me see what I'm missing.  I do agree that this should have been a separate commit but let us at least try and do some analysis before throwing the whole thing away.  A refactoring/change like this would need an example to work with anyway so that thread you want to create is this one in practice.  It has a poor subject line for the material, true, but that doesn't seem uncommon around here...

My main concern when first seeing this was adding an integer to every single option in the entire system for something that is going to be zero 99.9% of the time.  A bit bloated but not directly impacting behavior.  I wanted to avoid that by just looking in pg_class.reloptions for the vacuum_truncate setting when needed.  I'd rather do that then turn this into an enum that is masquerading as a boolean.

David J.



Re: vacuum_truncate configuration parameter and isset_offset

From
Robert Haas
Date:
On Mon, Mar 24, 2025 at 1:27 PM Nikolay Shaplov <dhyan@nataraj.su> wrote:
> > > You've just changed the whole engine, for what is seems to be an
> > > exceptional case, that can be solved via existing means.
> > I have not changed the whole engine.  I have added an optional integer
> > field to a single struct.
>
> That potentially changes the behaviour of all boolean options. They will never
> be what they were before.

If you want to convince people to change something, you need to make
real arguments, not just wildly accuse Nathan of having broken
everything. He has not "changed the behavior of the whole engine," and
this does not change the behavior of any Boolean options that don't
elect to use it. It is of course possible that there is some better
way to solve the problem than what Nathan picked, but this is really a
very minor code change that appears to solve the problem in a very
natural way.

I don't understand why you're upset about this, and I don't think it's
fair to beat up Nathan for doing something that looks totally normal
without a really clear reason.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: vacuum_truncate configuration parameter and isset_offset

From
Nathan Bossart
Date:
For the sake of discussion, here is what the enum approach would look like.

-- 
nathan

Attachment

Re: vacuum_truncate configuration parameter and isset_offset

From
Nikolay Shaplov
Date:
В письме от понедельник, 24 марта 2025 г. 20:37:50 MSK пользователь David G.
Johnston написал:

> My main concern when first seeing this was adding an integer to every
> single option in the entire system for something that is going to be zero
> 99.9% of the time.  A bit bloated but not directly impacting behavior.

That is 100% true.

> I  wanted to avoid that by just looking in pg_class.reloptions for the
> vacuum_truncate setting when needed.

This goes against all current practice of options usage. Option is mapped to a
C structure, once loaded it is cached, and reused while cache is still alive.
Doing other way will increase the mess.

> I'd rather do that then turn this into an enum that is masquerading as a
boolean.

I think the core of the problem, is that you still thinking about this option
as a boolean. It is no longer a boolean. It is nullable-boolean, or call it
whatever you want, but not a boolean anymore. It has 3 values available. The
most simple way to implement that is to use enum.

If you want to have not-set/null status marked as a separate flag, you can have
it. But then you should redesign all options to follow that logic: do not use
unreachable default value, but use null/unset flag. That is acceptable, if you
really need this, but should be done to all options. But I do not think it
worth efforts.

--
Nikolay Shaplov aka Nataraj
Fuzzing Engineer at Postgres Professional
Matrix IM: @dhyan:nataraj.su

Attachment

Re: vacuum_truncate configuration parameter and isset_offset

From
Nikolay Shaplov
Date:
В письме от понедельник, 24 марта 2025 г. 20:48:29 MSK пользователь Nathan
Bossart написал:

> For the sake of discussion, here is what the enum approach would look like.

In my point of view this solution is much-much better: it achieves all goals,
has no drawbacks, and do not change reloption engine.

--
Nikolay Shaplov aka Nataraj
Fuzzing Engineer at Postgres Professional
Matrix IM: @dhyan:nataraj.su

Attachment

Re: vacuum_truncate configuration parameter and isset_offset

From
Nathan Bossart
Date:
On Mon, Mar 24, 2025 at 09:03:11PM +0300, Nikolay Shaplov wrote:
> В письме от понедельник, 24 марта 2025 г. 20:48:29 MSK пользователь Nathan 
> Bossart написал:
> 
>> For the sake of discussion, here is what the enum approach would look like.
> 
> In my point of view this solution is much-much better: it achieves all goals, 
> has no drawbacks, and do not change reloption engine. 

I think this change would probably work fine, but it does have a couple of
drawbacks:

* We'll need to make it really clear in comments why you would want to use
  this enum instead of making it a Boolean reloption.  That's not a big
  deal.

* We'd need to decide what to say on the documentation side.  My first
  instinct is that we should just leave it as "boolean" because otherwise
  we're going to describe something that sounds an awful lot like a
  Boolean.

* I don't think this matches the parse_bool() behavior exactly.  For
  example, parse_bool() appears to accept inputs like "t" to mean "true".
  This is also probably not a huge deal.

That being said, I'm open to applying this patch if it seems like a
majority of folks prefer this implementation.  FWIW I'm still partial to
the isset_offset approach for the reasons I've already discussed.  

-- 
nathan



Re: vacuum_truncate configuration parameter and isset_offset

From
Robert Haas
Date:
On Mon, Mar 24, 2025 at 2:45 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
> * I don't think this matches the parse_bool() behavior exactly.  For
>   example, parse_bool() appears to accept inputs like "t" to mean "true".
>   This is also probably not a huge deal.

It's not great. We allow t/f in most places and some people may rely
on it. Having one place be randomly different is not ideal.

I still don't understand what problem we're trying to solve here.
Using a Boolean flag variable to indicate whether some other value is
or is not initialized is an extremely standard programming practice.
I've done it hundreds of times, and I'm sure some of those times are
now part of PostgreSQL. If we're going to change this -- especially in
a way that has potentially user-visible consequences -- we should have
some non-hyperbolic explanation of why it's a problem in this
particular case.

I think that the answer here might be that Nikolay doesn't like this
because it interacts badly with his "New [relation] options engine,"
mentioned upthread. While I'm sympathetic from the point of view that
it is no fun at all to have your patch get sideswiped, either that
patch can be easily adapted to handle the possibility of an is-set
flag or it can't. If it can't, then I think that's a flaw in that
patch, not this one. I think it's been demonstrated that there are
some things that somebody might like to do with the current reloptions
engine that are actually fairly hard to make work. We should be
looking for an engine that is more flexible and can easily accommodate
new needs as they arise. An isset flag is such a minor deviation from
existing practice that it shouldn't be a big deal; in fact, I think
it's surprising we haven't needed it sooner, and I think if an engine
is so opinionated about how things have to work that it can't
accommodate that, it's not a good idea for us to adopt it.

And if I'm wrong in thinking that, then I would like a detailed
technical argument explaining exactly why.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: vacuum_truncate configuration parameter and isset_offset

From
Nikolay Shaplov
Date:
В письме от понедельник, 24 марта 2025 г. 21:45:27 MSK пользователь Nathan
Bossart написал:
> * We'd need to decide what to say on the documentation side.  My first
>   instinct is that we should just leave it as "boolean" because otherwise
>   we're going to describe something that sounds an awful lot like a
>   Boolean.

I think that "boolean-like" is a good word for describing it.

> * I don't think this matches the parse_bool() behavior exactly.  For
>   example, parse_bool() appears to accept inputs like "t" to mean "true".
>   This is also probably not a huge deal.

That was me, who added Enum Reloption to the postgres code, and it was Alvaro
(if I recall correctly) who improved StdRdOptIndexCleanupValues option list to
it's current state. I would trust him here, since, if I recall correctly he is
original author of reloptions the way we know them.

But if "t" is really missing I think we can patch both
StdRdOptIndexCleanupValues and StdRdOptBoolValues to have "t". But in separate
commit.

Moreover if this Boolean-Based enum is a common case, we can think about
adding some basic template, that can be extended by adding "auto" value, or
not-set value and so on. But I do not think that two cases makes it common
case.

> That being said, I'm open to applying this patch if it seems like a
> majority of folks prefer this implementation.  FWIW I'm still partial to
> the isset_offset approach for the reasons I've already discussed.

I am for a long while working with reloptions, and can say, that almost nobody
fully understand the idea, how it work and how it was designed. So I would
suggest follow Alvaro, since he is keeper of that knowledge.

And second general idea: changing engine is bad, at least when you can manage
without changing it. We can, your patch proves it.

--
Nikolay Shaplov aka Nataraj
Fuzzing Engineer at Postgres Professional
Matrix IM: @dhyan:nataraj.su

Attachment

Re: vacuum_truncate configuration parameter and isset_offset

From
Nathan Bossart
Date:
On Mon, Mar 24, 2025 at 10:12:05PM +0300, Nikolay Shaplov wrote:
> And second general idea: changing engine is bad, at least when you can manage 
> without changing it.

You have asserted this a couple of times without providing any reasons why.
I know of no general project policy about changing the reloption code.  I
would expect this code to evolve just like any other part of Postgres,
whether it's to improve performance or to expand the feature set.

-- 
nathan



Re: vacuum_truncate configuration parameter and isset_offset

From
Nikolay Shaplov
Date:
В письме от понедельник, 24 марта 2025 г. 22:11:19 MSK пользователь Robert
Haas написал:

> I think that the answer here might be that Nikolay doesn't like this
> because it interacts badly with his "New [relation] options engine,"
There is no problem with adding isset_offset into "New [relation] options
engine", not a bit deal. But it will break all internal logic of reloption.
Both current version and my patch

> And if I'm wrong in thinking that, then I would like a detailed
> technical argument explaining exactly why.

You can't have both isset_offset flag and default_value. Either one or another.
You can't have them both, it will make whole design inconsistent. We already
have default_value. Even for boolean.
If you want custom behaviour when option is not set, you set unreachable
default value. This is the way it is done for _all_ options that has custom
behavior for unset option. Even those that a technically boolean. Why
vacuum_truncate should do another way? Either do it the way it have been done
before, or suggest total redisign for all oprions that has custom unset
behaviour via unreachable defaults. This will be consistent.

--
Nikolay Shaplov aka Nataraj
Fuzzing Engineer at Postgres Professional
Matrix IM: @dhyan:nataraj.su

Attachment

Re: vacuum_truncate configuration parameter and isset_offset

From
Nikolay Shaplov
Date:
В письме от понедельник, 24 марта 2025 г. 22:26:17 MSK пользователь Nathan
Bossart написал:

> > And second general idea: changing engine is bad, at least when you can
> > manage without changing it.
>
> You have asserted this a couple of times without providing any reasons why.
> I know of no general project policy about changing the reloption code.  I
> would expect this code to evolve just like any other part of Postgres,
> whether it's to improve performance or to expand the feature set.
Because this code was carefully designed, and it is intentionally was made the
way it is.
It can be redesigned, but redesigning is not just adding another field to a C
structure. It requires very carefully consideration, not as a part of option
patch, but as a redesign patch. See my answer to Robert. We can have
isset_offset, but then we have redesign all options with custom unset behavior
to use it, instead of unreachable default value. This will make it consistent
then.

--
Nikolay Shaplov aka Nataraj
Fuzzing Engineer at Postgres Professional
Matrix IM: @dhyan:nataraj.su

Attachment

Re: vacuum_truncate configuration parameter and isset_offset

From
Nathan Bossart
Date:
On Mon, Mar 24, 2025 at 10:35:41PM +0300, Nikolay Shaplov wrote:
> We can have isset_offset, but then we have redesign all options with
> custom unset behavior to use it, instead of unreachable default value.
> This will make it consistent then.

I don't see any reason why we are compelled to redesign all such options,
but in any case, I would think that would be preferable to magic special
values.  For example,

    /* -1 is used to disable max threshold */
    vac_max_thresh = (relopts && relopts->vacuum_max_threshold >= -1)
        ? relopts->vacuum_max_threshold
        : autovacuum_vac_max_thresh;

would become something like

    if (relopts && relopts->vacuum_max_threshold_set)
        vac_max_thresh = relopts->vacuum_max_threshold;
    else
        vac_max_thresh = autovacuum_vac_max_thresh;

The former requires you to know that the reloption defaults to -2 if not
set, which does not seem particularly obvious to me.  At least, I did not
find this out-of-range reloption default technique obvious when I was
working on autovacuum_vacuum_max_threshold.

But again, I don't see any strong reason why we must change all such
reloptions.

-- 
nathan



Re: vacuum_truncate configuration parameter and isset_offset

From
Nikolay Shaplov
Date:
В письме от понедельник, 24 марта 2025 г. 23:04:39 MSK пользователь Nathan
Bossart написал:
> > We can have isset_offset, but then we have redesign all options with
> > custom unset behavior to use it, instead of unreachable default value.
> > This will make it consistent then.
>
> I don't see any reason why we are compelled to redesign all such options,
> but in any case, I would think that would be preferable to magic special
> values.  For example,
>
>         /* -1 is used to disable max threshold */
>         vac_max_thresh = (relopts && relopts->vacuum_max_threshold >= -1)
>                 ? relopts->vacuum_max_threshold
>
>                 : autovacuum_vac_max_thresh;
>
> would become something like
>
>         if (relopts && relopts->vacuum_max_threshold_set)
>                 vac_max_thresh = relopts->vacuum_max_threshold;
>         else
>                 vac_max_thresh = autovacuum_vac_max_thresh;

I totally agree. This looks much nicer. But see how the code would change for
fillfactor, where default value is in valid range and to be actually used?

What would happen to default_val attribute in relopt_* structures? Should we
keep it? Should we remove it? If we keep it, what should we set for
vacuum_max_threshold? If we remove it how we should treat default value for
fillfactor?

Then you see, you will have extra int field for at least for every option with
unreachable defaults. Do we really need this?

default_val and isset_* flags are two completely different ways to do similar
thing. You will never find a nice way to use them both. So I am up to keeping
it the way it is: there exist only default_val and it is used to detect unset
option when needed.

> But again, I don't see any strong reason why we must change all such
> reloptions.

Because code of the engine should be consistent. We can't have two different
ways to do same thing. If we have isset flag, we should go for it everywhere,
where isset logic exists. Or do not use it at all. Other ways will lead us to
a much bigger mess, then we have today.


--
Nikolay Shaplov aka Nataraj
Fuzzing Engineer at Postgres Professional
Matrix IM: @dhyan:nataraj.su

Attachment

Re: vacuum_truncate configuration parameter and isset_offset

From
Nathan Bossart
Date:
On Mon, Mar 24, 2025 at 11:27:15PM +0300, Nikolay Shaplov wrote:
> В письме от понедельник, 24 марта 2025 г. 23:04:39 MSK пользователь Nathan 
> Bossart написал:
>> But again, I don't see any strong reason why we must change all such
>> reloptions.
> 
> Because code of the engine should be consistent. We can't have two different 
> ways to do same thing. If we have isset flag, we should go for it everywhere, 
> where isset logic exists. Or do not use it at all. Other ways will lead us to 
> a much bigger mess, then we have today.

I don't disagree that it might be desirable for all reloptions with
corresponding GUCs to use isset_offset, but I am not following why this is
critically important.  The out-of-range default approach has worked just
fine for years, and I'm not aware of any reason that isset_offset isn't
also a suitable solution.

-- 
nathan



Re: vacuum_truncate configuration parameter and isset_offset

From
"David G. Johnston"
Date:
On Mon, Mar 24, 2025 at 11:45 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
On Mon, Mar 24, 2025 at 09:03:11PM +0300, Nikolay Shaplov wrote:
> В письме от понедельник, 24 марта 2025 г. 20:48:29 MSK пользователь Nathan
> Bossart написал:
>
>> For the sake of discussion, here is what the enum approach would look like.
>
> In my point of view this solution is much-much better: it achieves all goals,
> has no drawbacks, and do not change reloption engine.

I am in favor of changing over to an enum at this point (let's try to keep users believing its a boolean if possible).

This isn't an endorsement on this idea and framing of things (say were we in a green-field situation) but rather that this patch is basically the "null hypothesis" and the alternate (presently committed) patch doesn't do enough to reject the null hypothesis, IME. (More detail on this below.)



I think this change would probably work fine, but it does have a couple of
drawbacks:

* We'll need to make it really clear in comments why you would want to use
  this enum instead of making it a Boolean reloption.  That's not a big
  deal.

Agreed.


* We'd need to decide what to say on the documentation side.  My first
  instinct is that we should just leave it as "boolean" because otherwise
  we're going to describe something that sounds an awful lot like a
  Boolean.

I'd start here too and wait to be convinced its pointless because the user is going to be exposed to the enum aspect anyway via system introspection.
 

* I don't think this matches the parse_bool() behavior exactly.  For
  example, parse_bool() appears to accept inputs like "t" to mean "true".
  This is also probably not a huge deal.

Fixable for sure.


That being said, I'm open to applying this patch if it seems like a
majority of folks prefer this implementation.  FWIW I'm still partial to
the isset_offset approach for the reasons I've already discussed. 


I'm preferable to turning the implementation into an enum.  Slight preference to documenting it as the boolean it presently is; pending any information regarding where in user-space, particularly in psql but also any client-side details, this internal detail might show through.  It avoids having to tell the reader that we choose enum in order to accommodate optionality.  But it isn't the end of the world if we do say that either.

I'm not convinced our existing way of approaching/framing this is great - defaults and is-set are better treated as being orthogonal; but, I'm also not fond of reserving space for a bunch of zeros and having to using naming conventions to match up vacuum_truncate and vacuum_truncate_set.  So while "isset_offset" better matches my preferred "framing" of this system I agree forcing it in here with an implementation that isn't ideal doesn't make sense when we can just continue on doing what we've been doing and use this experience to bring a fresh perspective to the larger work that apparently is going on in this area.

We do at least have the user experience correct either way - use reset to change to the state that denotes "this option has not been set".

David J.

Re: vacuum_truncate configuration parameter and isset_offset

From
Nathan Bossart
Date:
On Mon, Mar 24, 2025 at 08:57:27PM -0700, David G. Johnston wrote:
> On Mon, Mar 24, 2025 at 11:45 AM Nathan Bossart <nathandbossart@gmail.com>
> wrote:
>> * I don't think this matches the parse_bool() behavior exactly.  For
>>   example, parse_bool() appears to accept inputs like "t" to mean "true".
>>   This is also probably not a huge deal.
> 
> Fixable for sure.

I've attached an attempt at this.

>> That being said, I'm open to applying this patch if it seems like a
>> majority of folks prefer this implementation.  FWIW I'm still partial to
>> the isset_offset approach for the reasons I've already discussed.
>
> I'm preferable to turning the implementation into an enum.  Slight
> preference to documenting it as the boolean it presently is; pending any
> information regarding where in user-space, particularly in psql but also
> any client-side details, this internal detail might show through.  It
> avoids having to tell the reader that we choose enum in order to
> accommodate optionality.  But it isn't the end of the world if we do say
> that either.

The only other place I found that reveals this internal implementation
detail is parse_one_reloption.  In the attached patch, I've modified it to
emit the same error message as for RELOPT_TYPE_BOOL to further uphold the
illusion that it is a Boolean reloption.  I'm not sure I've patched up all
such holes, though.

TBH I find the attached to be even more of a hack than isset_offset.  We
have to try to match behavior in multiple places, and we need lengthy
commentary about why you'd want to use this enum.  Furthermore, it's only
useful for Boolean-style reloptions, unlike isset_offset which can be used
for any type.  But perhaps more importantly, I agree with Robert that it's
not clear what problem it's trying to solve.  The concrete arguments in
this thread range from avoiding extra bytes in the struct to maintaining
consistency with other options with backing GUCs, which I'd argue are more
style preferences than anything.

In any case, AFAICT the votes are somewhat evenly divided at the moment, so
I do not intend to proceed with this patch for now.

-- 
nathan

Attachment

Re: vacuum_truncate configuration parameter and isset_offset

From
Nikolay Shaplov
Date:
В письме от вторник, 25 марта 2025 г. 17:57:46 MSK пользователь Nathan Bossart
написал:

> In any case, AFAICT the votes are somewhat evenly divided at the moment, so
> I do not intend to proceed with this patch for now.

Counting votes does not lead anywhere, as I can ask friends and colleagues to
vote for me for example. This is wrong way.

There are people who involved it supporting reloption engine. If Alvaro said
"isset_offset" is good, let's have it, I will shut keep my mouth shut, right
after saying "As you say". Because he is the author of that part of the code,
I respect this.

I am also the person that dedicated lot's of efforts to work with reloptions. I
will do it for may years ahead, I think. This isset_offset I will have to
support from now on, if we do not revert it. I do not like it, I see there is
no logic in it, at least the way you suggest it.

If you want to dedicate part of your life to reloptions, you are welcome. I
need help with a code to review. If you really need isset_offset let's together
redisign options to use it instead of default_value. Go for it, but not
partly, but totally.

But if you want to fix one option and leave, please do not leave us with
isset_offset.

This is not about democracy, this is about who will deal with that part of the
code later. I guess it will be me and Alvaro, not you. Please let us have the
code the way we like it, since we support it most of the time.

--
Nikolay Shaplov aka Nataraj
Fuzzing Engineer at Postgres Professional
Matrix IM: @dhyan:nataraj.su

Attachment

Re: vacuum_truncate configuration parameter and isset_offset

From
Robert Haas
Date:
On Tue, Mar 25, 2025 at 12:35 PM Nikolay Shaplov <dhyan@nataraj.su> wrote:
> This is not about democracy, this is about who will deal with that part of the
> code later. I guess it will be me and Alvaro, not you. Please let us have the
> code the way we like it, since we support it most of the time.

Actually, we decide many things around here by counting votes.

But we also decide things by providing reasons that other people can
engage with intellectually, and I still say you're not really doing
that. You're just saying you like X better than Y, but without really
giving any reason that anybody else can understand and agree with.

Also, the last time you authored a commit to this file was in 2019.
Even if it were last week and even if you were a committer, that
wouldn't give you some kind of maintainer-level control over the file
to prevent other people from doing logical things with it that you
happen not to like.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: vacuum_truncate configuration parameter and isset_offset

From
"David G. Johnston"
Date:
On Wednesday, March 26, 2025, Robert Haas <robertmhaas@gmail.com> wrote:

But we also decide things by providing reasons that other people can
engage with intellectually, and I still say you're not really doing
that. You're just saying you like X better than Y, but without really
giving any reason that anybody else can understand and agree with.

The argument being made is that the enum patch adheres to established practices; and when adding new code that new code is encouraged to adhere to how existing code is written.  A vote to keep to such guidelines seems reasonable and sufficient; and can outweigh quite a bit of deficiency such existing code may have relative to the new proposal.  The burden is on the new code to justify why it should violate established conventions.

So, the question posed is should the new way of doing this, established by the committed patch, be allowed to stay and exist side-by-side with the original convention for handling the determination of whether an option is set? I’m +.75 for no and +.25 for yes.  Convention tips the balance for me.

David J.

Re: vacuum_truncate configuration parameter and isset_offset

From
Álvaro Herrera
Date:
On 2025-Mar-26, David G. Johnston wrote:

> The argument being made is that the enum patch adheres to established
> practices; and when adding new code that new code is encouraged to adhere
> to how existing code is written.  A vote to keep to such guidelines seems
> reasonable and sufficient; and can outweigh quite a bit of deficiency such
> existing code may have relative to the new proposal.  The burden is on the
> new code to justify why it should violate established conventions.

I think the goal of unifying the various pieces of code that handle
options by building a generic "engine" is a good one, and I tend to
agree with the argument that the less things this "engine" has to
support (while keeping the functionality the same), the better.  Given
that we've gone far and long without needing the "isset" flag, and that
it's not clear that we really need the concept, I'd rather go with not
having it.

Now -- Nikolay also appears to be saying that we either choose the
concept of a "default" or the concept of the "isset" flag, but not both.
That says to me that the other option would be to remove all default
values and rely solely on isset.  Is that a workable option?  If it's
not, then it would be clear to me that we'd rather stick with defaults.
But if it is, and if that would get us to an overall simplification of
the concepts, then perhaps that is better.  (In other words: for the
other enums that are currently imitating booleans, can we remove that
and instead use isset?  And are there other uses of default values that
aren't booleans that can be turned into isset or into something
different?)

-- 
Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"Las navajas y los monos deben estar siempre distantes"   (Germán Poo)



Re: vacuum_truncate configuration parameter and isset_offset

From
Nikolay Shaplov
Date:
В письме от среда, 26 марта 2025 г. 16:52:09 MSK пользователь Robert Haas
написал:
> But we also decide things by providing reasons that other people can
> engage with intellectually, and I still say you're not really doing
> that. You're just saying you like X better than Y, but without really
> giving any reason that anybody else can understand and agree with.

Then please, listen to a reason. If we add isset_offset, it will make code
inconsistent, because it follow same purpose as unreachable default_value in
other non boolean options. Having two conflicting ways to do same thing is a
bad design.
We can get rid of unreachable default_value in every option and use
isset_offset everywhere to detect unset options, but this will give us lot's of
extra flags in StdRdOptions, and more over we still have case when we have
useful default_value. So code will have to either ignore default_value in
option definition when author imply that he will use isset_offset flag later, or
ignore isset_offset flag when processing option with useful default value. These
two cases, I emphasize, are not directly stated in the code, they are implied
somewhere in author's head, and you will never guess which one is in this
case, until you finish reading, or one need to write a lot of comments.

Please try to expand all this isset_offset-based schema for fillfactor (option
with default value), vacuum_max_threshold (option with unreachable default
value) and vacuum_truncate. If you do it carefully you will see, there is no
way to do it without any conflicts.

Better way, if you want to track if vacuum_truncate is, is to encode it in the
default value somehow, as it have been done before. It will not be boolean
anymore, but it can keep boolean-like behaviour for user.

The most simple way to do this, using current code base, is to use enum. If
this will become common case (And I think it will some day), then we should
add separate data type for it. This would be better then enum. But it will
still have to use unreachable default value to detect unset option, not some
kind of flag.

--
Nikolay Shaplov aka Nataraj
Fuzzing Engineer at Postgres Professional
Matrix IM: @dhyan:nataraj.su

Attachment

Re: vacuum_truncate configuration parameter and isset_offset

From
Nathan Bossart
Date:
On Wed, Mar 26, 2025 at 03:30:49PM +0100, Álvaro Herrera wrote:
> On 2025-Mar-26, David G. Johnston wrote:
>> The argument being made is that the enum patch adheres to established
>> practices; and when adding new code that new code is encouraged to adhere
>> to how existing code is written.  A vote to keep to such guidelines seems
>> reasonable and sufficient; and can outweigh quite a bit of deficiency such
>> existing code may have relative to the new proposal.  The burden is on the
>> new code to justify why it should violate established conventions.

There's no existing code for an enum reloption that has been hacked to look
like a bool reloption.  There are a couple of enums that accept many of the
same values as bool relopts but that also accept another value, but AFAIK
we don't try to pretend that they are actually bools.

> Now -- Nikolay also appears to be saying that we either choose the
> concept of a "default" or the concept of the "isset" flag, but not both.
> That says to me that the other option would be to remove all default
> values and rely solely on isset.  Is that a workable option?  If it's
> not, then it would be clear to me that we'd rather stick with defaults.
> But if it is, and if that would get us to an overall simplification of
> the concepts, then perhaps that is better.  (In other words: for the
> other enums that are currently imitating booleans, can we remove that
> and instead use isset?  And are there other uses of default values that
> aren't booleans that can be turned into isset or into something
> different?)

Many of the autovacuum reloptions could use isset.  Right now, we use an
out-of-range value as the default so that we can determine whether the
reloption is actually set for the relation.  We'd likely still need a
default value field for reloptions without corresponding GUCs, but that
doesn't seem too troublesome to me.  TBH I was quite surprised that nobody
had encountered this problem with Boolean reloptions yet.

-- 
nathan



Re: vacuum_truncate configuration parameter and isset_offset

From
Robert Haas
Date:
On Wed, Mar 26, 2025 at 10:17 AM David G. Johnston
<david.g.johnston@gmail.com> wrote:
> The argument being made is that the enum patch adheres to established practices; and when adding new code that new
codeis encouraged to adhere to how existing code is written.  A vote to keep to such guidelines seems reasonable and
sufficient;and can outweigh quite a bit of deficiency such existing code may have relative to the new proposal.  The
burdenis on the new code to justify why it should violate established conventions. 

I kind of agree with that, but:

1. We're talking about a minor deviation resulting in a very small
amount of additional code. It's entirely unclear to me why anyone
thinks this is a big deal either way, even if one accepts that the
patch is "wrong", which I don't.

2. While it is true that reloptions have up until not had any case
where the default value wasn't representable by the underlying data
type, but there's a lot of PostgreSQL code and there is ample
precedent across the broader code base for either (a) using a second
Boolean to track whether a first Boolean is set or (b) converting a
Boolean to a three-valued enum. There are many examples of both
styles.

3. It seems like the fact that this has not been needed up until now
is partly just good luck. It's probably true that most integer or
real-valued reloptions can use 0 or -1 for this purpose, strings can
probably mostly use the empty string, and enums can add a value just
for this purpose. But I can't see any intrinsic reason why every
reloption anyone ever adds in the future has to work out nicely in
that way. I feel like you're trying to take an accidental property of
the status quo ante and elevating it to a rule that doesn't seem to
have clearly been intended and wouldn't be a particularly sound design
principle anyway.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: vacuum_truncate configuration parameter and isset_offset

From
Robert Haas
Date:
On Wed, Mar 26, 2025 at 10:40 AM Nikolay Shaplov <dhyan@nataraj.su> wrote:
> Then please, listen to a reason. If we add isset_offset, it will make code
> inconsistent, because it follow same purpose as unreachable default_value in
> other non boolean options. Having two conflicting ways to do same thing is a
> bad design.

As Nathan also says, I'm not sure why we're obliged to do everything
one way or the other.

If you look at postgresql.conf, there are some settings where a
designated value is used to disable something -- for instance
temp_file_limit=-1 disables any limit on the size of temp files, and
backend_flush_after=0 also disables the corresponding behavior -- but
there are other settings where that doesn't make sense and we add a
separate switch e.g. ssl=off disables SSL, separate from any of the
other SSL-affecting GUCs, and archive_mode=off disables archiving
separately from the archive_command setting. Your argument seems to
lead to the conclusion that it is bad design, but I don't see it that
way. I think it would be bad design to add a separate
temp_file_limit_enabled=on/off setting when we could just use
temp_file_limit=-1 to mean that, and I think it would be equally a bad
idea to enable ssl based on some non-obvious criterion like whether
ssl_cert_file is set to the empty string. The cases are different, and
it's OK to treat them differently, IMHO.

> We can get rid of unreachable default_value in every option and use
> isset_offset everywhere to detect unset options, but this will give us lot's of
> extra flags in StdRdOptions,

I agree, that doesn't seem like a good idea.

> and more over we still have case when we have
> useful default_value. So code will have to either ignore default_value in
> option definition when author imply that he will use isset_offset flag later, or
> ignore isset_offset flag when processing option with useful default value. These
> two cases, I emphasize, are not directly stated in the code, they are implied
> somewhere in author's head, and you will never guess which one is in this
> case, until you finish reading, or one need to write a lot of comments.

I'm surprised that you think it will be hard to clarify this in the comments.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: vacuum_truncate configuration parameter and isset_offset

From
Nikolay Shaplov
Date:
В письме от среда, 26 марта 2025 г. 17:42:23 MSK пользователь Robert Haas
написал:

> 1. We're talking about a minor deviation resulting in a very small
> amount of additional code. It's entirely unclear to me why anyone
> thinks this is a big deal either way, even if one accepts that the
> patch is "wrong", which I don't.
This code is important part of my life for 8 years. There are not many
commits, but I do code rebases all the time, I port features that have been
added for all these years. I do care about it, it is big deal for me.

--
Nikolay Shaplov aka Nataraj
Fuzzing Engineer at Postgres Professional
Matrix IM: @dhyan:nataraj.su

Attachment

Re: vacuum_truncate configuration parameter and isset_offset

From
"David G. Johnston"
Date:
On Wed, Mar 26, 2025 at 7:42 AM Robert Haas <robertmhaas@gmail.com> wrote:
On Wed, Mar 26, 2025 at 10:17 AM David G. Johnston
<david.g.johnston@gmail.com> wrote:
> The argument being made is that the enum patch adheres to established practices; and when adding new code that new code is encouraged to adhere to how existing code is written.  A vote to keep to such guidelines seems reasonable and sufficient; and can outweigh quite a bit of deficiency such existing code may have relative to the new proposal.  The burden is on the new code to justify why it should violate established conventions.

I kind of agree with that, but:

1. We're talking about a minor deviation resulting in a very small
amount of additional code. It's entirely unclear to me why anyone
thinks this is a big deal either way, even if one accepts that the
patch is "wrong", which I don't.


I'm willing to say "I don't know why this is so very important to Nikolay, but I trust him that it is, and since my opinion isn't that strong and this isn't a big deal, so I will accommodate the person screaming that adding this will make their life miserable in a real way."  Maybe others need more evidence of what that misery looks like?

David J.

Re: vacuum_truncate configuration parameter and isset_offset

From
Nathan Bossart
Date:
On Wed, Mar 26, 2025 at 08:09:53AM -0700, David G. Johnston wrote:
> I'm willing to say "I don't know why this is so very important to Nikolay,
> but I trust him that it is, and since my opinion isn't that strong and this
> isn't a big deal, so I will accommodate the person screaming that adding
> this will make their life miserable in a real way."  Maybe others need more
> evidence of what that misery looks like?

To summarize the discussion thus far, Robert and I seem to be okay with
isset_offset, David is leaning no (+.25 yes and -.75 no), and Niolay and
Álvaro appear to favor an enum approach, which would look like this [0].
That's +2.25 and -2.75.

Is that an accurate analysis of where folks stand?

[0] https://postgr.es/m/attachment/174762/v2-0001-change-vacuum_truncate-relopt-to-enum.patch

-- 
nathan



Re: vacuum_truncate configuration parameter and isset_offset

From
Robert Haas
Date:
On Wed, Mar 26, 2025 at 11:10 AM David G. Johnston
<david.g.johnston@gmail.com> wrote:
> I'm willing to say "I don't know why this is so very important to Nikolay, but I trust him that it is, and since my
opinionisn't that strong and this isn't a big deal, so I will accommodate the person screaming that adding this will
maketheir life miserable in a real way."  Maybe others need more evidence of what that misery looks like? 

What I'm upset about is that it feels to me like Nikolay is trying to
win the argument by yelling. I don't want that to be the way we do
things around here. I admit that sometimes it is, and I think that is
bad, no matter who the yeller is and who is getting yelled at. People
get upset, including me, and that is life, but whether people are
upset should never be the determinant of what goes into the tree. I
fear that this argument will make Nathan - or other committers - more
cautious in committing routine patches for no real reason, and I think
that would be bad for the project. Committing patches frequently
results in a bunch of people getting annoyed at you because of some
oversight, and it's one of the reasons why it takes so long to get a
patch committed around here -- people are risk-averse and afraid of
doing something wrong, so they do nothing. I don't really know how we
can solve that problem, but I definitely do not want to see us start
getting upset with people when they didn't even do anything wrong. And
that's how I see this case. Somebody could have chosen to solve this
problem in a few ways, Nathan picked one of the reasonable ones, and
now we're spending dozens of email messages arguing about whether
Nathan is a terrible person for doing something that looks totally
normal.

In my mind, that is absolutely clear. Nathan is not a terrible person
for that, and I feel rather strongly that saying otherwise is harmful
to the development community and to the project as a whole. I have no
problem with a rational discussion of what the best option is here,
but I am absolutely not OK with vitriolic rhetoric about how things
are awful when, AFAICS, nothing has happened here beyond the totally
routine. In a certain sense, the damage here has already been done.
Nathan has already had to spend a significant amount of time engaging
with this thread over what I think should be a complete non-event, and
will probably have to spend more, and all that takes away from time
that could, for example, be spent reviewing and committing other
patches. And for what?

If it ever happens that the design decision that this patch made
caused a real problem, some future patch could have reverted it and
substituted something else and probably nobody would have cared. Even
now, if some committer other than Nathan cares enough to change
something here, I doubt that Nathan will really care. But I cannot see
any world in which pinning Nathan beyond a barrel and demanding action
is a win for the project overall. If we argued this much about every
design detail of my patches, I would have quit working on this project
long ago.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: vacuum_truncate configuration parameter and isset_offset

From
Nathan Bossart
Date:
On Wed, Mar 26, 2025 at 11:41:13AM -0400, Robert Haas wrote:
> What I'm upset about is that it feels to me like Nikolay is trying to
> win the argument by yelling. I don't want that to be the way we do
> things around here. I admit that sometimes it is, and I think that is
> bad, no matter who the yeller is and who is getting yelled at. People
> get upset, including me, and that is life, but whether people are
> upset should never be the determinant of what goes into the tree.

+1

> I have no
> problem with a rational discussion of what the best option is here,
> but I am absolutely not OK with vitriolic rhetoric about how things
> are awful when, AFAICS, nothing has happened here beyond the totally
> routine.

+1

> In a certain sense, the damage here has already been done.
> Nathan has already had to spend a significant amount of time engaging
> with this thread over what I think should be a complete non-event, and
> will probably have to spend more, and all that takes away from time
> that could, for example, be spent reviewing and committing other
> patches. And for what?

I've debated bringing this up, but this has been the most frustrating part
of the discussion for me.  While I'm trying to responsibly commit a couple
of other big patches for v18 (for which I am not the primary author), I'm
also spending a huge amount of time trying to have some sort of rational
discussion about a handful of lines of code that seem to work just fine.

FWIW one of the big reasons I didn't proceed with the enum approach
initially is because I worried that I'd end up in a similar discussion
about how terrible _that_ approach is.  When I look at that patch [0], I
genuinely wonder if folks would accept that without the isset_offset
context.  Maybe I misjudged...

> If it ever happens that the design decision that this patch made
> caused a real problem, some future patch could have reverted it and
> substituted something else and probably nobody would have cared. Even
> now, if some committer other than Nathan cares enough to change
> something here, I doubt that Nathan will really care. But I cannot see
> any world in which pinning Nathan beyond a barrel and demanding action
> is a win for the project overall. If we argued this much about every
> design detail of my patches, I would have quit working on this project
> long ago.

I change others' code all the time, and I fully expect that people will
change my code from time to time, too.  The vacuum_truncate code is no
exception.  As long as it advances the project in some way, I'm happy.

[0] https://postgr.es/m/attachment/174762/v2-0001-change-vacuum_truncate-relopt-to-enum.patch

-- 
nathan



Re: vacuum_truncate configuration parameter and isset_offset

From
"David G. Johnston"
Date:
On Wed, Mar 26, 2025 at 8:41 AM Robert Haas <robertmhaas@gmail.com> wrote:
Nathan has already had to spend a significant amount of time engaging
with this thread over what I think should be a complete non-event, and
will probably have to spend more, and all that takes away from time
that could, for example, be spent reviewing and committing other
patches. And for what?

A broader awareness of what is going on in this little corner of an expansive codebase.  And making explicit those little unwritten rules happened upon by accident so many years ago, and their consequences.

I don't see where anyone has done something wrong or bad.  This touched new territory in a patch that otherwise few people likely cared about - the vacuum_truncate feature itself being already developed - a post-commit realization and discussion was the most likely outcome.  Nikolay's desire for an API-only thread/patch in which to have this discussion does match with ideal circumstances but is not a practical reality.  I wouldn't be too hard on him for expressing that desire - I'm sure we've all wished to not be surprised in this way.

A more concrete explanation of the effect of this patch on other work would be nice to have; and maybe we keep the patch in until we all get a better feel of the negative side-effects that are presently just being alluded to.  This can all easily wait until May.

David J.

Re: vacuum_truncate configuration parameter and isset_offset

From
"David G. Johnston"
Date:
On Wed, Mar 26, 2025 at 9:14 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
FWIW one of the big reasons I didn't proceed with the enum approach
initially is because I worried that I'd end up in a similar discussion
about how terrible _that_ approach is.  When I look at that patch [0], I
genuinely wonder if folks would accept that without the isset_offset
context.  Maybe I misjudged...


I think this discussion was going to happen no matter which approach was actually committed.  The concept of "is set" is too obvious and clean a solution to not be brought up and considered; while at the same time this argument about staying consistent with nearby code, even in the face of a hack-ish implementation, was going to need to be explicitly considered.

This discussion was preordained the moment we decided to add vacuum_truncate to the system.  It was only a matter of when.  And while hindsight is 20-20 your own comments regarding your uncertainty suggests you at least had an inkling of suspicion that this discussion was going to be part of the outcome of committing this.

I would have been fine with: "I committed this approach because it's cleaner, and here is why I dislike the enum approach.  Let's have a discussion in May if this choice is unappealing for reasons."  Getting in the user-facing feature "DBA choice of default" before feature freeze was warranted and the patch as committed did meet all the necessary requirements.

David J.

Re: vacuum_truncate configuration parameter and isset_offset

From
Nikolay Shaplov
Date:
В письме от среда, 26 марта 2025 г. 18:41:13 MSK пользователь Robert Haas
написал:

> What I'm upset about is that it feels to me like Nikolay is trying to
> win the argument by yelling.

I do not think it is fair. When votes are against you (and I did not ask
anybody else's help for it) it turned out, that I am yelling, and thus voting
is not valid.

I do not demand enum, you can create new reloption data type that can be true
false and unset. I even would agree with isset_offset, it you manage to make
code consistent. I doubt anybody manages do it, but may be I am wrong. You did
not even try, you just insist on keeping code inconsistent, because nobody
care. Nikolay cares, but he is yelling while caring, so let's set him aside.

The only thing I am asking for: please keep code consistent. Better keep
reloption core the way it was, but if it is not possible, then keep it
consistent then.

Enum is the cheapest way to achieve the goal Nathan want, and keep code
consistent. Cheapest, not the best.

When the dust settles down I will try to invent boolean-based data type, that
would not be ugly as this enum, and even may be suitable for
vacuum_index_cleanup option. I see there is a demand for this. But this will
need carefull thinking and development. And starting from ugly Enum with
boolean values would be much easy then from boolean with isset_offset.

--
Nikolay Shaplov aka Nataraj
Fuzzing Engineer at Postgres Professional
Matrix IM: @dhyan:nataraj.su

Attachment

Re: vacuum_truncate configuration parameter and isset_offset

From
"David G. Johnston"
Date:
On Wed, Mar 26, 2025 at 10:29 AM Nikolay Shaplov <dhyan@nataraj.su> wrote:
The only thing I am asking for: please keep code consistent. Better keep
reloption core the way it was, but if it is not possible, then keep it
consistent then.

Consistency at the expense of all else is a debatable position that will benefit from being actively justified.  Incremental improvement has a lot of merit.  The cost of incremental improvement is quite low for generally high benefit.  Wholesale rewriting has a much higher cost for basically the same benefit.  That's a lot of added cost in order to stay "consistent", so what are the added benefits of consistency for incurring that cost?

There is a baseline benefit to consistency that any incremental improvement needs to overcome to be worthwhile.  I don't think the isset_offset implementation to handle this rare boolean option overcomes that baseline.  I've tossed my weight, expect others to do so as well, and we'll move forward.  But none of the positions proposed here are considered absolute by the project.

David J.

Re: vacuum_truncate configuration parameter and isset_offset

From
Robert Haas
Date:
On Wed, Mar 26, 2025 at 1:29 PM Nikolay Shaplov <dhyan@nataraj.su> wrote:
> > What I'm upset about is that it feels to me like Nikolay is trying to
> > win the argument by yelling.
>
> I do not think it is fair. When votes are against you (and I did not ask
> anybody else's help for it) it turned out, that I am yelling, and thus voting
> is not valid.

I'm completely willing to admit that I may be misreading your emails.
It's very difficult to judge tone on the Internet, and I may very well
have misread the tone that you intended to convey. I can tell you how
they came across to me, but, of course, I cannot say how you mean
them.

But I do not think it is true that I am accusing you of yelling
because the voting is against you. The reason why I felt that way is
because you were using what seemed to me to be very strong language
over what seemed to me to be a very minor issue. Phrases like "it will
make whole design inconsistent", "this is nasty thing", "you've just
changed the whole engine", and "this goes against all current practice
of options usage," seem like very strong wording and, in my opinion,
are just not accurate.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: vacuum_truncate configuration parameter and isset_offset

From
Nikolay Shaplov
Date:
В письме от среда, 26 марта 2025 г. 21:16:20 MSK пользователь Robert Haas
написал:

> The reason why I felt that way is
> because you were using what seemed to me to be very strong language
> over what seemed to me to be a very minor issue.

Being not native speaker has it's own difficulties. Plus possible cultural
differences and so on. I feel emotions about "my precious thing have been
broken for no reason at all", and this should affect the text, and I guess I
can not properly keep proper emotional degree as I do not feel fine meaning of
the words. Yes, I feel emotions, No, I am not yelling,  and certainly I am not
yelling as final argument.
Can you please skip the form, that may be not perfect, and see the content. I
hope that I managed to express the core idea quite clear.

--
Nikolay Shaplov aka Nataraj
Fuzzing Engineer at Postgres Professional
Matrix IM: @dhyan:nataraj.su

Attachment

Re: vacuum_truncate configuration parameter and isset_offset

From
Robert Haas
Date:
On Wed, Mar 26, 2025 at 2:34 PM Nikolay Shaplov <dhyan@nataraj.su> wrote:
> В письме от среда, 26 марта 2025 г. 21:16:20 MSK пользователь Robert Haas
> написал:
> > The reason why I felt that way is
> > because you were using what seemed to me to be very strong language
> > over what seemed to me to be a very minor issue.
>
> Being not native speaker has it's own difficulties. Plus possible cultural
> differences and so on. I feel emotions about "my precious thing have been
> broken for no reason at all", and this should affect the text, and I guess I
> can not properly keep proper emotional degree as I do not feel fine meaning of
> the words. Yes, I feel emotions, No, I am not yelling,  and certainly I am not
> yelling as final argument.

OK, fair enough.

> Can you please skip the form, that may be not perfect, and see the content. I
> hope that I managed to express the core idea quite clear.

Honestly, I'm still pretty baffled. I understand that different people
may have different stylistic preferences here, and that is fair
enough, but I personally think what Nathan did is more elegant than
inventing an enum type with three possible values. Even if a majority
of people feel otherwise, why is it worth arguing about? I cannot see
how this really makes any great difference.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: vacuum_truncate configuration parameter and isset_offset

From
Nikolay Shaplov
Date:
В письме от среда, 26 марта 2025 г. 21:43:42 MSK пользователь Robert Haas
написал:

> why is it worth arguing about?
I have full model of current reloptions and my patch unrolled in my head. They
have some harmony. (Mine has more harmony, but it is not the case).
isset_offset the way it is introduced breaks that harmony. Harmony worth it.

> I cannot see how this really makes any great difference.

git blame reloptions.c | grep "Robert Haas" | wc -l
112

git blame reloptions.c | grep Álvaro | wc -l
792

For me Alvary is the keeper of the concept of reloption. And he sees
difference. If he would said isset_offset is good, I will follow. Until then I
will follow my sense of harmony and code beauty as much as I can.

--
Nikolay Shaplov aka Nataraj
Fuzzing Engineer at Postgres Professional
Matrix IM: @dhyan:nataraj.su

Attachment

Re: vacuum_truncate configuration parameter and isset_offset

From
Álvaro Herrera
Date:
On 2025-Mar-26, Nikolay Shaplov wrote:

> git blame reloptions.c | grep "Robert Haas" | wc -l
> 112
> 
> git blame reloptions.c | grep Álvaro | wc -l
> 792
> 
> For me Alvary is the keeper of the concept of reloption. And he sees 
> difference. If he would said isset_offset is good, I will follow. Until then I 
> will follow my sense of harmony and code beauty as much as I can.

Well, as I said upthread, I wrote this code far too long ago and I don't
consider myself as its "owner" in the sense that I can block others from
moving it forward.  I really wish I had had time to review and get your
new options patch to completion earlier -- we probably wouldn't be
having this discussion if I had.  But that regretfully didn't happen.

I think the opinions have been sufficiently voiced, and I wouldn't want
to block Nathan from being able to make progress on the other projects
he has for this release.  Given that continued discussion has not
convinced several committers that this is a change worth making, it
seems reasonable to give up on requiring that your proposed change be done.

I'd recommend you to take a break from it for a while.  Maybe a design
solution will occur to you that will not break everything (for example:
some options allow a default value to be expressed, others allow the
"isset" flag to be used).

Thanks for your persistence

-- 
Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"[PostgreSQL] is a great group; in my opinion it is THE best open source
development communities in existence anywhere."                (Lamar Owen)