Thread: vacuum_truncate configuration parameter and isset_offset
Hi! I've tired to rebase my reloptions patch (https:// commitfest.postgresql.org/patch/4688/) and have stuck with 0164a0f9 commit. I am deeply involved with reloptions topic, and I should say that I do not like the idea of isset_offset field All other types, as you have mentioned in commit messages, all other types manages "Not set" feature via some specific default value (like -1 for positive integer for example or "auto" for enums). But you can't do that in boolean, and this is true. If you add isset_offset in this case, it would be useless for all types, except boolean. It will make whole design inconsistent, and this is bad. I would suggest to use enum here to achieve same goal, or add some "trilean" data type that can be "true/false/unset" Current reloption code is already a big mess, there is no reason to make it worse. Can we revert it, and do it again, without adding isset_offset field? I can write some code for it too. Or we can patch it over, but I am afraid something will go wrong and we will stuck with isset_offset forever. I do not want it to happen. PS. I've just looked at code for vacuum_index_cleanup it has very same logic, just replace "auto" with anything you like, and uses enum. Grep for StdRdOptIndexCleanupValues for more info StdRdOptIndexCleanupValues -- Nikolay Shaplov aka Nataraj Fuzzing Engineer at Postgres Professional Matrix IM: @dhyan:nataraj.su
Attachment
В письме от понедельник, 24 марта 2025 г. 15:27:35 MSK пользователь Nikolay Shaplov написал: > PS. I've just looked at code for vacuum_index_cleanup it has very same > logic, just replace "auto" with anything you like, and uses enum. > Grep for StdRdOptIndexCleanupValues for more info I thought about it a bit more... ALTER TABLE test SET (vacuum_truncate); ALTER TABLE test SET (vacuum_truncate=false); ALTER TABLE test RESET (vacuum_truncate); Nobody will ever guess that these are three different cases. since for enum solution ALTER TABLE test SET (vacuum_truncate=on); ALTER TABLE test SET (vacuum_truncate=off); ALTER TABLE test SET (vacuum_truncate=unset); ALTER TABLE test RESET (vacuum_truncate); it is quite clear that first three are different, and 4th is most probably is equivalent for 3rd. So, my suggestion is to change it as soon as possible. -- Nikolay Shaplov aka Nataraj Fuzzing Engineer at Postgres Professional Matrix IM: @dhyan:nataraj.su
Attachment
On Mon, Mar 24, 2025 at 03:27:35PM +0300, Nikolay Shaplov wrote: > I would suggest to use enum here to achieve same goal, or add some "trilean" > data type that can be "true/false/unset" I did first try making the vacuum_truncate reloption an enum, but I didn't proceed with that approach for a couple of reasons: * vacuum_truncate is already a Boolean reloption, so switching it to an enum would require enumerating all possible values accepted by parse_bool(), which has the following comment: /* * Try to interpret value as boolean value. Valid values are: true, * false, yes, no, on, off, 1, 0; as well as unique prefixes thereof. * If the string parses okay, return true, else false. * If okay and result is not NULL, return the value in *result. */ I also considered adding some sort of pluggable parsing functionality or something like StdRdOptIndexCleanupValues (i.e., a subset of the most popular values accepted by parse_bool()), but those seemed a little too complicated for the problem at hand. * I wasn't sure whether it would be better to expose this third "unset" value or to make it hidden and for internal use only. The former would mean we have two ways to declare a reloption "unset." You could either not set it, or you could explicitly set it to "unset." The latter would require additional code somewhere, or we could just leave it undocumented. Both of these approaches seemed like weird user experiences to me. Overall, the biggest reason I didn't proceed with the enum is because it felt like it was making it the user's problem. Rather than just teaching our code how to determine if a reloption was explicitly set, we'd be introducing unnecessary complexity to the user, or at least we'd be trying to closely match the existing functionality in an attempt to hide this complexity from them. -- nathan
On Mon, Mar 24, 2025 at 10:43 AM Nathan Bossart <nathandbossart@gmail.com> wrote: > Overall, the biggest reason I didn't proceed with the enum is because it > felt like it was making it the user's problem. Rather than just teaching > our code how to determine if a reloption was explicitly set, we'd be > introducing unnecessary complexity to the user, or at least we'd be trying > to closely match the existing functionality in an attempt to hide this > complexity from them. +1. Giving the user the ability to set the option to a value called "unset" doesn't seem right to me. -- Robert Haas EDB: http://www.enterprisedb.com
В письме от понедельник, 24 марта 2025 г. 17:43:38 MSK пользователь Nathan Bossart написал: > * vacuum_truncate is already a Boolean reloption, Yes, I guess schema update would be best solution here. > so switching it to an > enum would require enumerating all possible values accepted by > parse_bool(), which has the following comment: > > /* > * Try to interpret value as boolean value. Valid values are: true, > * false, yes, no, on, off, 1, 0; as well as unique prefixes thereof. > * If the string parses okay, return true, else false. > * If okay and result is not NULL, return the value in *result. > */ We already do it for some option. May be we need more generic solution, but boolean should remain boolean, with only two possible values. > * I wasn't sure whether it would be better to expose this third "unset" > value or to make it hidden and for internal use only. If you hide something from user, you will confuse him. If the solution is more explicit, then it is more easy to use it, and understand it. If something is implying (like deleted option means third value), then there would be confusion. Please do not do that. -- Nikolay Shaplov aka Nataraj Fuzzing Engineer at Postgres Professional Matrix IM: @dhyan:nataraj.su
Attachment
В письме от понедельник, 24 марта 2025 г. 17:48:25 MSK пользователь Robert Haas написал: > On Mon, Mar 24, 2025 at 10:43 AM Nathan Bossart > > <nathandbossart@gmail.com> wrote: > > Overall, the biggest reason I didn't proceed with the enum is because it > > felt like it was making it the user's problem. Rather than just teaching > > our code how to determine if a reloption was explicitly set, we'd be > > introducing unnecessary complexity to the user, or at least we'd be trying > > to closely match the existing functionality in an attempt to hide this > > complexity from them. > > +1. Giving the user the ability to set the option to a value called > "unset" doesn't seem right to me. 1. Enum allows you to make default value "unreachable" for setting. But this is not the most important thing. 2. This option has three possible values: on/off/system_default. We can find better name for system_default, but it would not change it's meaning. I would prefer to let user set these three values explicitly. Nobody would guess that ALTER TABLE test SET (vacuum_truncate=false); means "off" and ALTER TABLE test RESET (vacuum_truncate); means "system_default" This will lead to a lot of confusion. So I strongly against this implying third value for boolean. This is nasty thing. Boolean should have two values. -- Nikolay Shaplov aka Nataraj Fuzzing Engineer at Postgres Professional Matrix IM: @dhyan:nataraj.su
Attachment
On Mon, Mar 24, 2025 at 11:12 AM Nikolay Shaplov <dhyan@nataraj.su> wrote: > Nobody would guess that > > ALTER TABLE test SET (vacuum_truncate=false); > means "off" > > and > ALTER TABLE test RESET (vacuum_truncate); > means "system_default" > > This will lead to a lot of confusion. I agree that this confuses people, but I don't think it's more confusing here than for other vacuum reloptions. I have seen people try to unset vacuum reloptions by using SET to configure them to the default value rather than by using RESET to remove them. But then later they change the system default and that table is still nailed to the old default. I always find myself slightly bemused by this, because it doesn't seem that hard to me to figure out how it actually works, but it's definitely a real issue. However, I don't see why the issue is any more acute for this parameter than any other, and it certainly does not seem like a good idea to make this parameter work differently from the other ones. -- Robert Haas EDB: http://www.enterprisedb.com
On Mon, Mar 24, 2025 at 8:26 AM Robert Haas <robertmhaas@gmail.com> wrote:
On Mon, Mar 24, 2025 at 11:12 AM Nikolay Shaplov <dhyan@nataraj.su> wrote:
> Nobody would guess that
>
> ALTER TABLE test SET (vacuum_truncate=false);
> means "off"
>
> and
> ALTER TABLE test RESET (vacuum_truncate);
> means "system_default"
>
> This will lead to a lot of confusion.
I agree that this confuses people, but I don't think it's more
confusing here than for other vacuum reloptions. I have seen people
try to unset vacuum reloptions by using SET to configure them to the
default value rather than by using RESET to remove them. But then
later they change the system default and that table is still nailed to
the old default. I always find myself slightly bemused by this,
because it doesn't seem that hard to me to figure out how it actually
works, but it's definitely a real issue. However, I don't see why the
issue is any more acute for this parameter than any other, and it
certainly does not seem like a good idea to make this parameter work
differently from the other ones.
+1
David J.
В письме от понедельник, 24 марта 2025 г. 18:25:44 MSK пользователь Robert Haas написал: > > and > > ALTER TABLE test RESET (vacuum_truncate); > > means "system_default" > > > > This will lead to a lot of confusion. > > I agree that this confuses people, Though a bit more, I think that as a user I would prefer to be able to explicitly set "system_defaults" value. If I have not set any, most probably I do not care, or have not thought about it. But if I set option to "system_defaults" then I do it intentionally and have some good reasons for it. > but I don't think it's more > confusing here than for other vacuum reloptions. I have seen people > try to unset vacuum reloptions by using SET to configure them to the > default value rather than by using RESET to remove them. But then > later they change the system default and that table is still nailed to > the old default. I always find myself slightly bemused by this, > because it doesn't seem that hard to me to figure out how it actually > works, but it's definitely a real issue. However, I don't see why the > issue is any more acute for this parameter than any other, and it > certainly does not seem like a good idea to make this parameter work > differently from the other ones. May be I can agree with you. But this does not explain why we should switch boolean from two possible values into three... It can't have three values for any practical reason. This should not be boolean anymore in this case. -- Nikolay Shaplov aka Nataraj Fuzzing Engineer at Postgres Professional Matrix IM: @dhyan:nataraj.su
Attachment
On Mon, Mar 24, 2025 at 8:45 AM Nikolay Shaplov <dhyan@nataraj.su> wrote:
> but I don't think it's more
> confusing here than for other vacuum reloptions. I have seen people
> try to unset vacuum reloptions by using SET to configure them to the
> default value rather than by using RESET to remove them. But then
> later they change the system default and that table is still nailed to
> the old default. I always find myself slightly bemused by this,
> because it doesn't seem that hard to me to figure out how it actually
> works, but it's definitely a real issue. However, I don't see why the
> issue is any more acute for this parameter than any other, and it
> certainly does not seem like a good idea to make this parameter work
> differently from the other ones.
May be I can agree with you. But this does not explain why we should switch
boolean from two possible values into three... It can't have three values for
any practical reason. This should not be boolean anymore in this case.
The boolean is two-valued. The state of the reloption is also binary - set or unset (i.e., it is listed in pg_class.reloptions, or not). In the unset state the effective value of a reloption comes from the corresponding GUC (or, lacking that, some hard-coded default). If set, the value comes from the associated boolean. RESET places a reloption into the unset state.
David J.
Hello I don't understand why this shouldn't work exactly like vacuum_index_cleanup (cf. vacuum_rel lines 2170ff). That would require no new mechanism. -- Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/ "Java is clearly an example of money oriented programming" (A. Stepanov)
В письме от понедельник, 24 марта 2025 г. 19:00:51 MSK пользователь Álvaro Herrera написал: > I don't understand why this shouldn't work exactly like > vacuum_index_cleanup (cf. vacuum_rel lines 2170ff). That would require > no new mechanism. In my opinion it should. "no new mechanism" is good. "new mechanism" is bad. :-) -- Nikolay Shaplov aka Nataraj Fuzzing Engineer at Postgres Professional Matrix IM: @dhyan:nataraj.su
Attachment
On Mon, Mar 24, 2025 at 9:00 AM Álvaro Herrera <alvherre@alvh.no-ip.org> wrote:
Hello
I don't understand why this shouldn't work exactly like
vacuum_index_cleanup (cf. vacuum_rel lines 2170ff). That would require
no new mechanism.
That reloption is already an enum and there is no GUC to defer to when the value is unset. It doesn't seem like an equivalent scenario. AUTO is a perfectly useful value as opposed to an undocumented sentinel for unset/missing.
David J.
On Mon, Mar 24, 2025 at 9:08 AM David G. Johnston <david.g.johnston@gmail.com> wrote:
On Mon, Mar 24, 2025 at 9:00 AM Álvaro Herrera <alvherre@alvh.no-ip.org> wrote:Hello
I don't understand why this shouldn't work exactly like
vacuum_index_cleanup (cf. vacuum_rel lines 2170ff). That would require
no new mechanism.That reloption is already an enum and there is no GUC to defer to when the value is unset. It doesn't seem like an equivalent scenario. AUTO is a perfectly useful value as opposed to an undocumented sentinel for unset/missing.
Sorry, the "already an enum" comment is wrong - I see the commit now where we basically re-implemented boolean value processing logic and added an "auto" option.
Basically we'd do this to make a boolean-compatible enum adding an undocumented value "null" as a valid and default set value and then interpret "null" as meaning "go use the vacuum_truncate GUC".
It's too late to argue against sentinel values so I suppose this would have to be acceptable.
David J.
On Mon, Mar 24, 2025 at 5:42 AM Nikolay Shaplov <dhyan@nataraj.su> wrote:
В письме от понедельник, 24 марта 2025 г. 15:27:35 MSK пользователь Nikolay
Shaplov написал:
> PS. I've just looked at code for vacuum_index_cleanup it has very same
> logic, just replace "auto" with anything you like, and uses enum.
> Grep for StdRdOptIndexCleanupValues for more info
I thought about it a bit more...
ALTER TABLE test SET (vacuum_truncate);
ALTER TABLE test SET (vacuum_truncate=false);
ALTER TABLE test RESET (vacuum_truncate);
Nobody will ever guess that these are three different cases.
since for enum solution
ALTER TABLE test SET (vacuum_truncate=on);
ALTER TABLE test SET (vacuum_truncate=off);
ALTER TABLE test SET (vacuum_truncate=unset);
ALTER TABLE test RESET (vacuum_truncate);
Actually, we'd want to prohibit vacuum_truncate=unset (or whatever internal enum option value we choose) as input since none of the other reloptions allow the user to specify the sentinel value but instead require the use of reset to achieve it.
i.e., the reset value for parallel_workers is -1:
{
"parallel_workers",
"Number of parallel processes that can be used per executor node for this relation.",
RELOPT_KIND_HEAP,
ShareUpdateExclusiveLock
},
-1, 0, 1024
"parallel_workers",
"Number of parallel processes that can be used per executor node for this relation.",
RELOPT_KIND_HEAP,
ShareUpdateExclusiveLock
},
-1, 0, 1024
but:
alter table vct set (parallel_workers=-1);
ERROR: value -1 out of bounds for option "parallel_workers"
DETAIL: Valid values are between "0" and "1024".
ERROR: value -1 out of bounds for option "parallel_workers"
DETAIL: Valid values are between "0" and "1024".
So, given the precedent of vacuum_index_cleanup and the above, we should turn this into an enum that accepts all existing boolean literal inputs and also has a undocumented "unset" default value that the user is not allowed to explicitly set but instead only gets used to resolve an unset reloption at runtime.
David J.
On Mon, Mar 24, 2025 at 05:00:51PM +0100, Álvaro Herrera wrote: > I don't understand why this shouldn't work exactly like > vacuum_index_cleanup (cf. vacuum_rel lines 2170ff). That would require > no new mechanism. I explained my reasons for not proceeding with that approach upthread [0]. I don't think there are any existing reloptions where there are two ways to say "use the GUC." For ones with corresponding GUCs, unsetting the storage parameter is the mechanism for doing so. vacuum_index_cleanup's AUTO option would be the closest thing, but there's no backing GUC, so the meaning of AUTO doesn't change unless someone changes the code. TBH I'm not understanding the pushback for adding a way to determine whether the storage parameter is actually set. It's very simple, and it seems like it could be useful elsewhere. But more importantly, it allows us to more closely match the behavior of the existing reloptions with GUCs, and it prevents type mismatches (e.g., the reloption is an enum but the GUC is a Boolean). [0] https://postgr.es/m/Z-Fvmnf57z6zM8Ky%40nathan -- nathan
В письме от понедельник, 24 марта 2025 г. 19:41:27 MSK пользователь Nathan Bossart написал: > But more importantly, it allows > us to more closely match the behavior of the existing reloptions with GUCs, > and it prevents type mismatches (e.g., the reloption is an enum but the GUC > is a Boolean). Please do not forget, that reloptions is not the only application of options mechanism. We have attribute options, opclass options and may have more. You've just changed the whole engine, for what is seems to be an exceptional case, that can be solved via existing means. Changing of an engine should not be done carelessly. You've just make changes for any boolean option of any postgres object that can ever have options for years and years on. This change should be done after really good consideration. I strongly suggest to revert it back, solve the problem via existing means if it needs immediate solving, and then carefully design boolean-nullable option, or whatever you want to have, keeping in mind that this changes the whole engine for all options postgres will ever have. A general design of a new option fature without anything like "Here I feel like to have this, and there I feel like to have that". A option type that will be useful for everyone. If it is carefully considered, I will support it. -- Nikolay Shaplov aka Nataraj Fuzzing Engineer at Postgres Professional Matrix IM: @dhyan:nataraj.su
Attachment
On Mon, Mar 24, 2025 at 9:41 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
TBH I'm not understanding the pushback for adding a way to determine
whether the storage parameter is actually set. It's very simple, and it
seems like it could be useful elsewhere.
IMO this is superior to using sentinel values for the same purpose, but those already exist and having both is reasonably unappealing.
But more importantly, it allows
us to more closely match the behavior of the existing reloptions with GUCs,
and it prevents type mismatches (e.g., the reloption is an enum but the GUC
is a Boolean).
Good point; we could solve this in documentation by simply keeping boolean if no non-boolean enum values are publicly accessible. But this is part of why having "set/unset" not use sentinel values is better.
David J.
On Mon, Mar 24, 2025 at 09:40:24AM -0700, David G. Johnston wrote: > So, given the precedent of vacuum_index_cleanup and the above, we should > turn this into an enum that accepts all existing boolean literal inputs and > also has a undocumented "unset" default value that the user is not allowed > to explicitly set but instead only gets used to resolve an unset reloption > at runtime. This would involve adding a field to relopt_enum_elt_def to declare a value as "unsettable," right? That seems feasible, but IMHO it's comparable to adding a field to reopt_parse_elt. -- nathan
В письме от понедельник, 24 марта 2025 г. 19:58:38 MSK пользователь Nathan Bossart написал: > On Mon, Mar 24, 2025 at 09:40:24AM -0700, David G. Johnston wrote: > > So, given the precedent of vacuum_index_cleanup and the above, we should > > turn this into an enum that accepts all existing boolean literal inputs > > and > > also has a undocumented "unset" default value that the user is not allowed > > to explicitly set but instead only gets used to resolve an unset reloption > > at runtime. > > This would involve adding a field to relopt_enum_elt_def to declare a value > as "unsettable," right? That seems feasible, but IMHO it's comparable to > adding a field to reopt_parse_elt. If you look at view's check_option option, you will see, how unsettable enum default can be implemented using existing code. -- Nikolay Shaplov aka Nataraj Fuzzing Engineer at Postgres Professional Matrix IM: @dhyan:nataraj.su
Attachment
On Mon, Mar 24, 2025 at 07:53:43PM +0300, Nikolay Shaplov wrote: > You've just changed the whole engine, for what is seems to be an exceptional > case, that can be solved via existing means. I have not changed the whole engine. I have added an optional integer field to a single struct. -- nathan
В письме от понедельник, 24 марта 2025 г. 20:09:23 MSK пользователь Nathan Bossart написал: > > You've just changed the whole engine, for what is seems to be an > > exceptional case, that can be solved via existing means. > I have not changed the whole engine. I have added an optional integer > field to a single struct. That potentially changes the behaviour of all boolean options. They will never be what they were before. -- Nikolay Shaplov aka Nataraj Fuzzing Engineer at Postgres Professional Matrix IM: @dhyan:nataraj.su
Attachment
On Mon, Mar 24, 2025 at 10:27 AM Nikolay Shaplov <dhyan@nataraj.su> wrote:
В письме от понедельник, 24 марта 2025 г. 20:09:23 MSK пользователь Nathan
Bossart написал:
> > You've just changed the whole engine, for what is seems to be an
> > exceptional case, that can be solved via existing means.
> I have not changed the whole engine. I have added an optional integer
> field to a single struct.
That potentially changes the behaviour of all boolean options. They will never
be what they were before.
The implementation sure looks like a pure opt-in from this angle. Can you provide a concrete behavior change example of a boolean option that hasn't opted-in to help me see what I'm missing. I do agree that this should have been a separate commit but let us at least try and do some analysis before throwing the whole thing away. A refactoring/change like this would need an example to work with anyway so that thread you want to create is this one in practice. It has a poor subject line for the material, true, but that doesn't seem uncommon around here...
My main concern when first seeing this was adding an integer to every single option in the entire system for something that is going to be zero 99.9% of the time. A bit bloated but not directly impacting behavior. I wanted to avoid that by just looking in pg_class.reloptions for the vacuum_truncate setting when needed. I'd rather do that then turn this into an enum that is masquerading as a boolean.
David J.
On Mon, Mar 24, 2025 at 1:27 PM Nikolay Shaplov <dhyan@nataraj.su> wrote: > > > You've just changed the whole engine, for what is seems to be an > > > exceptional case, that can be solved via existing means. > > I have not changed the whole engine. I have added an optional integer > > field to a single struct. > > That potentially changes the behaviour of all boolean options. They will never > be what they were before. If you want to convince people to change something, you need to make real arguments, not just wildly accuse Nathan of having broken everything. He has not "changed the behavior of the whole engine," and this does not change the behavior of any Boolean options that don't elect to use it. It is of course possible that there is some better way to solve the problem than what Nathan picked, but this is really a very minor code change that appears to solve the problem in a very natural way. I don't understand why you're upset about this, and I don't think it's fair to beat up Nathan for doing something that looks totally normal without a really clear reason. -- Robert Haas EDB: http://www.enterprisedb.com
For the sake of discussion, here is what the enum approach would look like. -- nathan
Attachment
В письме от понедельник, 24 марта 2025 г. 20:37:50 MSK пользователь David G. Johnston написал: > My main concern when first seeing this was adding an integer to every > single option in the entire system for something that is going to be zero > 99.9% of the time. A bit bloated but not directly impacting behavior. That is 100% true. > I wanted to avoid that by just looking in pg_class.reloptions for the > vacuum_truncate setting when needed. This goes against all current practice of options usage. Option is mapped to a C structure, once loaded it is cached, and reused while cache is still alive. Doing other way will increase the mess. > I'd rather do that then turn this into an enum that is masquerading as a boolean. I think the core of the problem, is that you still thinking about this option as a boolean. It is no longer a boolean. It is nullable-boolean, or call it whatever you want, but not a boolean anymore. It has 3 values available. The most simple way to implement that is to use enum. If you want to have not-set/null status marked as a separate flag, you can have it. But then you should redesign all options to follow that logic: do not use unreachable default value, but use null/unset flag. That is acceptable, if you really need this, but should be done to all options. But I do not think it worth efforts. -- Nikolay Shaplov aka Nataraj Fuzzing Engineer at Postgres Professional Matrix IM: @dhyan:nataraj.su
Attachment
В письме от понедельник, 24 марта 2025 г. 20:48:29 MSK пользователь Nathan Bossart написал: > For the sake of discussion, here is what the enum approach would look like. In my point of view this solution is much-much better: it achieves all goals, has no drawbacks, and do not change reloption engine. -- Nikolay Shaplov aka Nataraj Fuzzing Engineer at Postgres Professional Matrix IM: @dhyan:nataraj.su
Attachment
On Mon, Mar 24, 2025 at 09:03:11PM +0300, Nikolay Shaplov wrote: > В письме от понедельник, 24 марта 2025 г. 20:48:29 MSK пользователь Nathan > Bossart написал: > >> For the sake of discussion, here is what the enum approach would look like. > > In my point of view this solution is much-much better: it achieves all goals, > has no drawbacks, and do not change reloption engine. I think this change would probably work fine, but it does have a couple of drawbacks: * We'll need to make it really clear in comments why you would want to use this enum instead of making it a Boolean reloption. That's not a big deal. * We'd need to decide what to say on the documentation side. My first instinct is that we should just leave it as "boolean" because otherwise we're going to describe something that sounds an awful lot like a Boolean. * I don't think this matches the parse_bool() behavior exactly. For example, parse_bool() appears to accept inputs like "t" to mean "true". This is also probably not a huge deal. That being said, I'm open to applying this patch if it seems like a majority of folks prefer this implementation. FWIW I'm still partial to the isset_offset approach for the reasons I've already discussed. -- nathan
On Mon, Mar 24, 2025 at 2:45 PM Nathan Bossart <nathandbossart@gmail.com> wrote: > * I don't think this matches the parse_bool() behavior exactly. For > example, parse_bool() appears to accept inputs like "t" to mean "true". > This is also probably not a huge deal. It's not great. We allow t/f in most places and some people may rely on it. Having one place be randomly different is not ideal. I still don't understand what problem we're trying to solve here. Using a Boolean flag variable to indicate whether some other value is or is not initialized is an extremely standard programming practice. I've done it hundreds of times, and I'm sure some of those times are now part of PostgreSQL. If we're going to change this -- especially in a way that has potentially user-visible consequences -- we should have some non-hyperbolic explanation of why it's a problem in this particular case. I think that the answer here might be that Nikolay doesn't like this because it interacts badly with his "New [relation] options engine," mentioned upthread. While I'm sympathetic from the point of view that it is no fun at all to have your patch get sideswiped, either that patch can be easily adapted to handle the possibility of an is-set flag or it can't. If it can't, then I think that's a flaw in that patch, not this one. I think it's been demonstrated that there are some things that somebody might like to do with the current reloptions engine that are actually fairly hard to make work. We should be looking for an engine that is more flexible and can easily accommodate new needs as they arise. An isset flag is such a minor deviation from existing practice that it shouldn't be a big deal; in fact, I think it's surprising we haven't needed it sooner, and I think if an engine is so opinionated about how things have to work that it can't accommodate that, it's not a good idea for us to adopt it. And if I'm wrong in thinking that, then I would like a detailed technical argument explaining exactly why. -- Robert Haas EDB: http://www.enterprisedb.com
В письме от понедельник, 24 марта 2025 г. 21:45:27 MSK пользователь Nathan Bossart написал: > * We'd need to decide what to say on the documentation side. My first > instinct is that we should just leave it as "boolean" because otherwise > we're going to describe something that sounds an awful lot like a > Boolean. I think that "boolean-like" is a good word for describing it. > * I don't think this matches the parse_bool() behavior exactly. For > example, parse_bool() appears to accept inputs like "t" to mean "true". > This is also probably not a huge deal. That was me, who added Enum Reloption to the postgres code, and it was Alvaro (if I recall correctly) who improved StdRdOptIndexCleanupValues option list to it's current state. I would trust him here, since, if I recall correctly he is original author of reloptions the way we know them. But if "t" is really missing I think we can patch both StdRdOptIndexCleanupValues and StdRdOptBoolValues to have "t". But in separate commit. Moreover if this Boolean-Based enum is a common case, we can think about adding some basic template, that can be extended by adding "auto" value, or not-set value and so on. But I do not think that two cases makes it common case. > That being said, I'm open to applying this patch if it seems like a > majority of folks prefer this implementation. FWIW I'm still partial to > the isset_offset approach for the reasons I've already discussed. I am for a long while working with reloptions, and can say, that almost nobody fully understand the idea, how it work and how it was designed. So I would suggest follow Alvaro, since he is keeper of that knowledge. And second general idea: changing engine is bad, at least when you can manage without changing it. We can, your patch proves it. -- Nikolay Shaplov aka Nataraj Fuzzing Engineer at Postgres Professional Matrix IM: @dhyan:nataraj.su
Attachment
On Mon, Mar 24, 2025 at 10:12:05PM +0300, Nikolay Shaplov wrote: > And second general idea: changing engine is bad, at least when you can manage > without changing it. You have asserted this a couple of times without providing any reasons why. I know of no general project policy about changing the reloption code. I would expect this code to evolve just like any other part of Postgres, whether it's to improve performance or to expand the feature set. -- nathan
В письме от понедельник, 24 марта 2025 г. 22:11:19 MSK пользователь Robert Haas написал: > I think that the answer here might be that Nikolay doesn't like this > because it interacts badly with his "New [relation] options engine," There is no problem with adding isset_offset into "New [relation] options engine", not a bit deal. But it will break all internal logic of reloption. Both current version and my patch > And if I'm wrong in thinking that, then I would like a detailed > technical argument explaining exactly why. You can't have both isset_offset flag and default_value. Either one or another. You can't have them both, it will make whole design inconsistent. We already have default_value. Even for boolean. If you want custom behaviour when option is not set, you set unreachable default value. This is the way it is done for _all_ options that has custom behavior for unset option. Even those that a technically boolean. Why vacuum_truncate should do another way? Either do it the way it have been done before, or suggest total redisign for all oprions that has custom unset behaviour via unreachable defaults. This will be consistent. -- Nikolay Shaplov aka Nataraj Fuzzing Engineer at Postgres Professional Matrix IM: @dhyan:nataraj.su
Attachment
В письме от понедельник, 24 марта 2025 г. 22:26:17 MSK пользователь Nathan Bossart написал: > > And second general idea: changing engine is bad, at least when you can > > manage without changing it. > > You have asserted this a couple of times without providing any reasons why. > I know of no general project policy about changing the reloption code. I > would expect this code to evolve just like any other part of Postgres, > whether it's to improve performance or to expand the feature set. Because this code was carefully designed, and it is intentionally was made the way it is. It can be redesigned, but redesigning is not just adding another field to a C structure. It requires very carefully consideration, not as a part of option patch, but as a redesign patch. See my answer to Robert. We can have isset_offset, but then we have redesign all options with custom unset behavior to use it, instead of unreachable default value. This will make it consistent then. -- Nikolay Shaplov aka Nataraj Fuzzing Engineer at Postgres Professional Matrix IM: @dhyan:nataraj.su
Attachment
On Mon, Mar 24, 2025 at 10:35:41PM +0300, Nikolay Shaplov wrote: > We can have isset_offset, but then we have redesign all options with > custom unset behavior to use it, instead of unreachable default value. > This will make it consistent then. I don't see any reason why we are compelled to redesign all such options, but in any case, I would think that would be preferable to magic special values. For example, /* -1 is used to disable max threshold */ vac_max_thresh = (relopts && relopts->vacuum_max_threshold >= -1) ? relopts->vacuum_max_threshold : autovacuum_vac_max_thresh; would become something like if (relopts && relopts->vacuum_max_threshold_set) vac_max_thresh = relopts->vacuum_max_threshold; else vac_max_thresh = autovacuum_vac_max_thresh; The former requires you to know that the reloption defaults to -2 if not set, which does not seem particularly obvious to me. At least, I did not find this out-of-range reloption default technique obvious when I was working on autovacuum_vacuum_max_threshold. But again, I don't see any strong reason why we must change all such reloptions. -- nathan
В письме от понедельник, 24 марта 2025 г. 23:04:39 MSK пользователь Nathan Bossart написал: > > We can have isset_offset, but then we have redesign all options with > > custom unset behavior to use it, instead of unreachable default value. > > This will make it consistent then. > > I don't see any reason why we are compelled to redesign all such options, > but in any case, I would think that would be preferable to magic special > values. For example, > > /* -1 is used to disable max threshold */ > vac_max_thresh = (relopts && relopts->vacuum_max_threshold >= -1) > ? relopts->vacuum_max_threshold > > : autovacuum_vac_max_thresh; > > would become something like > > if (relopts && relopts->vacuum_max_threshold_set) > vac_max_thresh = relopts->vacuum_max_threshold; > else > vac_max_thresh = autovacuum_vac_max_thresh; I totally agree. This looks much nicer. But see how the code would change for fillfactor, where default value is in valid range and to be actually used? What would happen to default_val attribute in relopt_* structures? Should we keep it? Should we remove it? If we keep it, what should we set for vacuum_max_threshold? If we remove it how we should treat default value for fillfactor? Then you see, you will have extra int field for at least for every option with unreachable defaults. Do we really need this? default_val and isset_* flags are two completely different ways to do similar thing. You will never find a nice way to use them both. So I am up to keeping it the way it is: there exist only default_val and it is used to detect unset option when needed. > But again, I don't see any strong reason why we must change all such > reloptions. Because code of the engine should be consistent. We can't have two different ways to do same thing. If we have isset flag, we should go for it everywhere, where isset logic exists. Or do not use it at all. Other ways will lead us to a much bigger mess, then we have today. -- Nikolay Shaplov aka Nataraj Fuzzing Engineer at Postgres Professional Matrix IM: @dhyan:nataraj.su
Attachment
On Mon, Mar 24, 2025 at 11:27:15PM +0300, Nikolay Shaplov wrote: > В письме от понедельник, 24 марта 2025 г. 23:04:39 MSK пользователь Nathan > Bossart написал: >> But again, I don't see any strong reason why we must change all such >> reloptions. > > Because code of the engine should be consistent. We can't have two different > ways to do same thing. If we have isset flag, we should go for it everywhere, > where isset logic exists. Or do not use it at all. Other ways will lead us to > a much bigger mess, then we have today. I don't disagree that it might be desirable for all reloptions with corresponding GUCs to use isset_offset, but I am not following why this is critically important. The out-of-range default approach has worked just fine for years, and I'm not aware of any reason that isset_offset isn't also a suitable solution. -- nathan
On Mon, Mar 24, 2025 at 11:45 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
On Mon, Mar 24, 2025 at 09:03:11PM +0300, Nikolay Shaplov wrote:
> В письме от понедельник, 24 марта 2025 г. 20:48:29 MSK пользователь Nathan
> Bossart написал:
>
>> For the sake of discussion, here is what the enum approach would look like.
>
> In my point of view this solution is much-much better: it achieves all goals,
> has no drawbacks, and do not change reloption engine.
I am in favor of changing over to an enum at this point (let's try to keep users believing its a boolean if possible).
This isn't an endorsement on this idea and framing of things (say were we in a green-field situation) but rather that this patch is basically the "null hypothesis" and the alternate (presently committed) patch doesn't do enough to reject the null hypothesis, IME. (More detail on this below.)
I think this change would probably work fine, but it does have a couple of
drawbacks:
* We'll need to make it really clear in comments why you would want to use
this enum instead of making it a Boolean reloption. That's not a big
deal.
Agreed.
* We'd need to decide what to say on the documentation side. My first
instinct is that we should just leave it as "boolean" because otherwise
we're going to describe something that sounds an awful lot like a
Boolean.
I'd start here too and wait to be convinced its pointless because the user is going to be exposed to the enum aspect anyway via system introspection.
* I don't think this matches the parse_bool() behavior exactly. For
example, parse_bool() appears to accept inputs like "t" to mean "true".
This is also probably not a huge deal.
Fixable for sure.
That being said, I'm open to applying this patch if it seems like a
majority of folks prefer this implementation. FWIW I'm still partial to
the isset_offset approach for the reasons I've already discussed.
I'm preferable to turning the implementation into an enum. Slight preference to documenting it as the boolean it presently is; pending any information regarding where in user-space, particularly in psql but also any client-side details, this internal detail might show through. It avoids having to tell the reader that we choose enum in order to accommodate optionality. But it isn't the end of the world if we do say that either.
I'm not convinced our existing way of approaching/framing this is great - defaults and is-set are better treated as being orthogonal; but, I'm also not fond of reserving space for a bunch of zeros and having to using naming conventions to match up vacuum_truncate and vacuum_truncate_set. So while "isset_offset" better matches my preferred "framing" of this system I agree forcing it in here with an implementation that isn't ideal doesn't make sense when we can just continue on doing what we've been doing and use this experience to bring a fresh perspective to the larger work that apparently is going on in this area.
We do at least have the user experience correct either way - use reset to change to the state that denotes "this option has not been set".
David J.
On Mon, Mar 24, 2025 at 08:57:27PM -0700, David G. Johnston wrote: > On Mon, Mar 24, 2025 at 11:45 AM Nathan Bossart <nathandbossart@gmail.com> > wrote: >> * I don't think this matches the parse_bool() behavior exactly. For >> example, parse_bool() appears to accept inputs like "t" to mean "true". >> This is also probably not a huge deal. > > Fixable for sure. I've attached an attempt at this. >> That being said, I'm open to applying this patch if it seems like a >> majority of folks prefer this implementation. FWIW I'm still partial to >> the isset_offset approach for the reasons I've already discussed. > > I'm preferable to turning the implementation into an enum. Slight > preference to documenting it as the boolean it presently is; pending any > information regarding where in user-space, particularly in psql but also > any client-side details, this internal detail might show through. It > avoids having to tell the reader that we choose enum in order to > accommodate optionality. But it isn't the end of the world if we do say > that either. The only other place I found that reveals this internal implementation detail is parse_one_reloption. In the attached patch, I've modified it to emit the same error message as for RELOPT_TYPE_BOOL to further uphold the illusion that it is a Boolean reloption. I'm not sure I've patched up all such holes, though. TBH I find the attached to be even more of a hack than isset_offset. We have to try to match behavior in multiple places, and we need lengthy commentary about why you'd want to use this enum. Furthermore, it's only useful for Boolean-style reloptions, unlike isset_offset which can be used for any type. But perhaps more importantly, I agree with Robert that it's not clear what problem it's trying to solve. The concrete arguments in this thread range from avoiding extra bytes in the struct to maintaining consistency with other options with backing GUCs, which I'd argue are more style preferences than anything. In any case, AFAICT the votes are somewhat evenly divided at the moment, so I do not intend to proceed with this patch for now. -- nathan
Attachment
В письме от вторник, 25 марта 2025 г. 17:57:46 MSK пользователь Nathan Bossart написал: > In any case, AFAICT the votes are somewhat evenly divided at the moment, so > I do not intend to proceed with this patch for now. Counting votes does not lead anywhere, as I can ask friends and colleagues to vote for me for example. This is wrong way. There are people who involved it supporting reloption engine. If Alvaro said "isset_offset" is good, let's have it, I will shut keep my mouth shut, right after saying "As you say". Because he is the author of that part of the code, I respect this. I am also the person that dedicated lot's of efforts to work with reloptions. I will do it for may years ahead, I think. This isset_offset I will have to support from now on, if we do not revert it. I do not like it, I see there is no logic in it, at least the way you suggest it. If you want to dedicate part of your life to reloptions, you are welcome. I need help with a code to review. If you really need isset_offset let's together redisign options to use it instead of default_value. Go for it, but not partly, but totally. But if you want to fix one option and leave, please do not leave us with isset_offset. This is not about democracy, this is about who will deal with that part of the code later. I guess it will be me and Alvaro, not you. Please let us have the code the way we like it, since we support it most of the time. -- Nikolay Shaplov aka Nataraj Fuzzing Engineer at Postgres Professional Matrix IM: @dhyan:nataraj.su
Attachment
On Tue, Mar 25, 2025 at 12:35 PM Nikolay Shaplov <dhyan@nataraj.su> wrote: > This is not about democracy, this is about who will deal with that part of the > code later. I guess it will be me and Alvaro, not you. Please let us have the > code the way we like it, since we support it most of the time. Actually, we decide many things around here by counting votes. But we also decide things by providing reasons that other people can engage with intellectually, and I still say you're not really doing that. You're just saying you like X better than Y, but without really giving any reason that anybody else can understand and agree with. Also, the last time you authored a commit to this file was in 2019. Even if it were last week and even if you were a committer, that wouldn't give you some kind of maintainer-level control over the file to prevent other people from doing logical things with it that you happen not to like. -- Robert Haas EDB: http://www.enterprisedb.com
On Wednesday, March 26, 2025, Robert Haas <robertmhaas@gmail.com> wrote:
But we also decide things by providing reasons that other people can
engage with intellectually, and I still say you're not really doing
that. You're just saying you like X better than Y, but without really
giving any reason that anybody else can understand and agree with.
The argument being made is that the enum patch adheres to established practices; and when adding new code that new code is encouraged to adhere to how existing code is written. A vote to keep to such guidelines seems reasonable and sufficient; and can outweigh quite a bit of deficiency such existing code may have relative to the new proposal. The burden is on the new code to justify why it should violate established conventions.
So, the question posed is should the new way of doing this, established by the committed patch, be allowed to stay and exist side-by-side with the original convention for handling the determination of whether an option is set? I’m +.75 for no and +.25 for yes. Convention tips the balance for me.
David J.
On 2025-Mar-26, David G. Johnston wrote: > The argument being made is that the enum patch adheres to established > practices; and when adding new code that new code is encouraged to adhere > to how existing code is written. A vote to keep to such guidelines seems > reasonable and sufficient; and can outweigh quite a bit of deficiency such > existing code may have relative to the new proposal. The burden is on the > new code to justify why it should violate established conventions. I think the goal of unifying the various pieces of code that handle options by building a generic "engine" is a good one, and I tend to agree with the argument that the less things this "engine" has to support (while keeping the functionality the same), the better. Given that we've gone far and long without needing the "isset" flag, and that it's not clear that we really need the concept, I'd rather go with not having it. Now -- Nikolay also appears to be saying that we either choose the concept of a "default" or the concept of the "isset" flag, but not both. That says to me that the other option would be to remove all default values and rely solely on isset. Is that a workable option? If it's not, then it would be clear to me that we'd rather stick with defaults. But if it is, and if that would get us to an overall simplification of the concepts, then perhaps that is better. (In other words: for the other enums that are currently imitating booleans, can we remove that and instead use isset? And are there other uses of default values that aren't booleans that can be turned into isset or into something different?) -- Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/ "Las navajas y los monos deben estar siempre distantes" (Germán Poo)
В письме от среда, 26 марта 2025 г. 16:52:09 MSK пользователь Robert Haas написал: > But we also decide things by providing reasons that other people can > engage with intellectually, and I still say you're not really doing > that. You're just saying you like X better than Y, but without really > giving any reason that anybody else can understand and agree with. Then please, listen to a reason. If we add isset_offset, it will make code inconsistent, because it follow same purpose as unreachable default_value in other non boolean options. Having two conflicting ways to do same thing is a bad design. We can get rid of unreachable default_value in every option and use isset_offset everywhere to detect unset options, but this will give us lot's of extra flags in StdRdOptions, and more over we still have case when we have useful default_value. So code will have to either ignore default_value in option definition when author imply that he will use isset_offset flag later, or ignore isset_offset flag when processing option with useful default value. These two cases, I emphasize, are not directly stated in the code, they are implied somewhere in author's head, and you will never guess which one is in this case, until you finish reading, or one need to write a lot of comments. Please try to expand all this isset_offset-based schema for fillfactor (option with default value), vacuum_max_threshold (option with unreachable default value) and vacuum_truncate. If you do it carefully you will see, there is no way to do it without any conflicts. Better way, if you want to track if vacuum_truncate is, is to encode it in the default value somehow, as it have been done before. It will not be boolean anymore, but it can keep boolean-like behaviour for user. The most simple way to do this, using current code base, is to use enum. If this will become common case (And I think it will some day), then we should add separate data type for it. This would be better then enum. But it will still have to use unreachable default value to detect unset option, not some kind of flag. -- Nikolay Shaplov aka Nataraj Fuzzing Engineer at Postgres Professional Matrix IM: @dhyan:nataraj.su
Attachment
On Wed, Mar 26, 2025 at 03:30:49PM +0100, Álvaro Herrera wrote: > On 2025-Mar-26, David G. Johnston wrote: >> The argument being made is that the enum patch adheres to established >> practices; and when adding new code that new code is encouraged to adhere >> to how existing code is written. A vote to keep to such guidelines seems >> reasonable and sufficient; and can outweigh quite a bit of deficiency such >> existing code may have relative to the new proposal. The burden is on the >> new code to justify why it should violate established conventions. There's no existing code for an enum reloption that has been hacked to look like a bool reloption. There are a couple of enums that accept many of the same values as bool relopts but that also accept another value, but AFAIK we don't try to pretend that they are actually bools. > Now -- Nikolay also appears to be saying that we either choose the > concept of a "default" or the concept of the "isset" flag, but not both. > That says to me that the other option would be to remove all default > values and rely solely on isset. Is that a workable option? If it's > not, then it would be clear to me that we'd rather stick with defaults. > But if it is, and if that would get us to an overall simplification of > the concepts, then perhaps that is better. (In other words: for the > other enums that are currently imitating booleans, can we remove that > and instead use isset? And are there other uses of default values that > aren't booleans that can be turned into isset or into something > different?) Many of the autovacuum reloptions could use isset. Right now, we use an out-of-range value as the default so that we can determine whether the reloption is actually set for the relation. We'd likely still need a default value field for reloptions without corresponding GUCs, but that doesn't seem too troublesome to me. TBH I was quite surprised that nobody had encountered this problem with Boolean reloptions yet. -- nathan
On Wed, Mar 26, 2025 at 10:17 AM David G. Johnston <david.g.johnston@gmail.com> wrote: > The argument being made is that the enum patch adheres to established practices; and when adding new code that new codeis encouraged to adhere to how existing code is written. A vote to keep to such guidelines seems reasonable and sufficient;and can outweigh quite a bit of deficiency such existing code may have relative to the new proposal. The burdenis on the new code to justify why it should violate established conventions. I kind of agree with that, but: 1. We're talking about a minor deviation resulting in a very small amount of additional code. It's entirely unclear to me why anyone thinks this is a big deal either way, even if one accepts that the patch is "wrong", which I don't. 2. While it is true that reloptions have up until not had any case where the default value wasn't representable by the underlying data type, but there's a lot of PostgreSQL code and there is ample precedent across the broader code base for either (a) using a second Boolean to track whether a first Boolean is set or (b) converting a Boolean to a three-valued enum. There are many examples of both styles. 3. It seems like the fact that this has not been needed up until now is partly just good luck. It's probably true that most integer or real-valued reloptions can use 0 or -1 for this purpose, strings can probably mostly use the empty string, and enums can add a value just for this purpose. But I can't see any intrinsic reason why every reloption anyone ever adds in the future has to work out nicely in that way. I feel like you're trying to take an accidental property of the status quo ante and elevating it to a rule that doesn't seem to have clearly been intended and wouldn't be a particularly sound design principle anyway. -- Robert Haas EDB: http://www.enterprisedb.com
On Wed, Mar 26, 2025 at 10:40 AM Nikolay Shaplov <dhyan@nataraj.su> wrote: > Then please, listen to a reason. If we add isset_offset, it will make code > inconsistent, because it follow same purpose as unreachable default_value in > other non boolean options. Having two conflicting ways to do same thing is a > bad design. As Nathan also says, I'm not sure why we're obliged to do everything one way or the other. If you look at postgresql.conf, there are some settings where a designated value is used to disable something -- for instance temp_file_limit=-1 disables any limit on the size of temp files, and backend_flush_after=0 also disables the corresponding behavior -- but there are other settings where that doesn't make sense and we add a separate switch e.g. ssl=off disables SSL, separate from any of the other SSL-affecting GUCs, and archive_mode=off disables archiving separately from the archive_command setting. Your argument seems to lead to the conclusion that it is bad design, but I don't see it that way. I think it would be bad design to add a separate temp_file_limit_enabled=on/off setting when we could just use temp_file_limit=-1 to mean that, and I think it would be equally a bad idea to enable ssl based on some non-obvious criterion like whether ssl_cert_file is set to the empty string. The cases are different, and it's OK to treat them differently, IMHO. > We can get rid of unreachable default_value in every option and use > isset_offset everywhere to detect unset options, but this will give us lot's of > extra flags in StdRdOptions, I agree, that doesn't seem like a good idea. > and more over we still have case when we have > useful default_value. So code will have to either ignore default_value in > option definition when author imply that he will use isset_offset flag later, or > ignore isset_offset flag when processing option with useful default value. These > two cases, I emphasize, are not directly stated in the code, they are implied > somewhere in author's head, and you will never guess which one is in this > case, until you finish reading, or one need to write a lot of comments. I'm surprised that you think it will be hard to clarify this in the comments. -- Robert Haas EDB: http://www.enterprisedb.com
В письме от среда, 26 марта 2025 г. 17:42:23 MSK пользователь Robert Haas написал: > 1. We're talking about a minor deviation resulting in a very small > amount of additional code. It's entirely unclear to me why anyone > thinks this is a big deal either way, even if one accepts that the > patch is "wrong", which I don't. This code is important part of my life for 8 years. There are not many commits, but I do code rebases all the time, I port features that have been added for all these years. I do care about it, it is big deal for me. -- Nikolay Shaplov aka Nataraj Fuzzing Engineer at Postgres Professional Matrix IM: @dhyan:nataraj.su
Attachment
On Wed, Mar 26, 2025 at 7:42 AM Robert Haas <robertmhaas@gmail.com> wrote:
On Wed, Mar 26, 2025 at 10:17 AM David G. Johnston
<david.g.johnston@gmail.com> wrote:
> The argument being made is that the enum patch adheres to established practices; and when adding new code that new code is encouraged to adhere to how existing code is written. A vote to keep to such guidelines seems reasonable and sufficient; and can outweigh quite a bit of deficiency such existing code may have relative to the new proposal. The burden is on the new code to justify why it should violate established conventions.
I kind of agree with that, but:
1. We're talking about a minor deviation resulting in a very small
amount of additional code. It's entirely unclear to me why anyone
thinks this is a big deal either way, even if one accepts that the
patch is "wrong", which I don't.
I'm willing to say "I don't know why this is so very important to Nikolay, but I trust him that it is, and since my opinion isn't that strong and this isn't a big deal, so I will accommodate the person screaming that adding this will make their life miserable in a real way." Maybe others need more evidence of what that misery looks like?
David J.
On Wed, Mar 26, 2025 at 08:09:53AM -0700, David G. Johnston wrote: > I'm willing to say "I don't know why this is so very important to Nikolay, > but I trust him that it is, and since my opinion isn't that strong and this > isn't a big deal, so I will accommodate the person screaming that adding > this will make their life miserable in a real way." Maybe others need more > evidence of what that misery looks like? To summarize the discussion thus far, Robert and I seem to be okay with isset_offset, David is leaning no (+.25 yes and -.75 no), and Niolay and Álvaro appear to favor an enum approach, which would look like this [0]. That's +2.25 and -2.75. Is that an accurate analysis of where folks stand? [0] https://postgr.es/m/attachment/174762/v2-0001-change-vacuum_truncate-relopt-to-enum.patch -- nathan
On Wed, Mar 26, 2025 at 11:10 AM David G. Johnston <david.g.johnston@gmail.com> wrote: > I'm willing to say "I don't know why this is so very important to Nikolay, but I trust him that it is, and since my opinionisn't that strong and this isn't a big deal, so I will accommodate the person screaming that adding this will maketheir life miserable in a real way." Maybe others need more evidence of what that misery looks like? What I'm upset about is that it feels to me like Nikolay is trying to win the argument by yelling. I don't want that to be the way we do things around here. I admit that sometimes it is, and I think that is bad, no matter who the yeller is and who is getting yelled at. People get upset, including me, and that is life, but whether people are upset should never be the determinant of what goes into the tree. I fear that this argument will make Nathan - or other committers - more cautious in committing routine patches for no real reason, and I think that would be bad for the project. Committing patches frequently results in a bunch of people getting annoyed at you because of some oversight, and it's one of the reasons why it takes so long to get a patch committed around here -- people are risk-averse and afraid of doing something wrong, so they do nothing. I don't really know how we can solve that problem, but I definitely do not want to see us start getting upset with people when they didn't even do anything wrong. And that's how I see this case. Somebody could have chosen to solve this problem in a few ways, Nathan picked one of the reasonable ones, and now we're spending dozens of email messages arguing about whether Nathan is a terrible person for doing something that looks totally normal. In my mind, that is absolutely clear. Nathan is not a terrible person for that, and I feel rather strongly that saying otherwise is harmful to the development community and to the project as a whole. I have no problem with a rational discussion of what the best option is here, but I am absolutely not OK with vitriolic rhetoric about how things are awful when, AFAICS, nothing has happened here beyond the totally routine. In a certain sense, the damage here has already been done. Nathan has already had to spend a significant amount of time engaging with this thread over what I think should be a complete non-event, and will probably have to spend more, and all that takes away from time that could, for example, be spent reviewing and committing other patches. And for what? If it ever happens that the design decision that this patch made caused a real problem, some future patch could have reverted it and substituted something else and probably nobody would have cared. Even now, if some committer other than Nathan cares enough to change something here, I doubt that Nathan will really care. But I cannot see any world in which pinning Nathan beyond a barrel and demanding action is a win for the project overall. If we argued this much about every design detail of my patches, I would have quit working on this project long ago. -- Robert Haas EDB: http://www.enterprisedb.com
On Wed, Mar 26, 2025 at 11:41:13AM -0400, Robert Haas wrote: > What I'm upset about is that it feels to me like Nikolay is trying to > win the argument by yelling. I don't want that to be the way we do > things around here. I admit that sometimes it is, and I think that is > bad, no matter who the yeller is and who is getting yelled at. People > get upset, including me, and that is life, but whether people are > upset should never be the determinant of what goes into the tree. +1 > I have no > problem with a rational discussion of what the best option is here, > but I am absolutely not OK with vitriolic rhetoric about how things > are awful when, AFAICS, nothing has happened here beyond the totally > routine. +1 > In a certain sense, the damage here has already been done. > Nathan has already had to spend a significant amount of time engaging > with this thread over what I think should be a complete non-event, and > will probably have to spend more, and all that takes away from time > that could, for example, be spent reviewing and committing other > patches. And for what? I've debated bringing this up, but this has been the most frustrating part of the discussion for me. While I'm trying to responsibly commit a couple of other big patches for v18 (for which I am not the primary author), I'm also spending a huge amount of time trying to have some sort of rational discussion about a handful of lines of code that seem to work just fine. FWIW one of the big reasons I didn't proceed with the enum approach initially is because I worried that I'd end up in a similar discussion about how terrible _that_ approach is. When I look at that patch [0], I genuinely wonder if folks would accept that without the isset_offset context. Maybe I misjudged... > If it ever happens that the design decision that this patch made > caused a real problem, some future patch could have reverted it and > substituted something else and probably nobody would have cared. Even > now, if some committer other than Nathan cares enough to change > something here, I doubt that Nathan will really care. But I cannot see > any world in which pinning Nathan beyond a barrel and demanding action > is a win for the project overall. If we argued this much about every > design detail of my patches, I would have quit working on this project > long ago. I change others' code all the time, and I fully expect that people will change my code from time to time, too. The vacuum_truncate code is no exception. As long as it advances the project in some way, I'm happy. [0] https://postgr.es/m/attachment/174762/v2-0001-change-vacuum_truncate-relopt-to-enum.patch -- nathan
On Wed, Mar 26, 2025 at 8:41 AM Robert Haas <robertmhaas@gmail.com> wrote:
Nathan has already had to spend a significant amount of time engaging
with this thread over what I think should be a complete non-event, and
will probably have to spend more, and all that takes away from time
that could, for example, be spent reviewing and committing other
patches. And for what?
A broader awareness of what is going on in this little corner of an expansive codebase. And making explicit those little unwritten rules happened upon by accident so many years ago, and their consequences.
I don't see where anyone has done something wrong or bad. This touched new territory in a patch that otherwise few people likely cared about - the vacuum_truncate feature itself being already developed - a post-commit realization and discussion was the most likely outcome. Nikolay's desire for an API-only thread/patch in which to have this discussion does match with ideal circumstances but is not a practical reality. I wouldn't be too hard on him for expressing that desire - I'm sure we've all wished to not be surprised in this way.
A more concrete explanation of the effect of this patch on other work would be nice to have; and maybe we keep the patch in until we all get a better feel of the negative side-effects that are presently just being alluded to. This can all easily wait until May.
David J.
On Wed, Mar 26, 2025 at 9:14 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
FWIW one of the big reasons I didn't proceed with the enum approach
initially is because I worried that I'd end up in a similar discussion
about how terrible _that_ approach is. When I look at that patch [0], I
genuinely wonder if folks would accept that without the isset_offset
context. Maybe I misjudged...
I think this discussion was going to happen no matter which approach was actually committed. The concept of "is set" is too obvious and clean a solution to not be brought up and considered; while at the same time this argument about staying consistent with nearby code, even in the face of a hack-ish implementation, was going to need to be explicitly considered.
This discussion was preordained the moment we decided to add vacuum_truncate to the system. It was only a matter of when. And while hindsight is 20-20 your own comments regarding your uncertainty suggests you at least had an inkling of suspicion that this discussion was going to be part of the outcome of committing this.
I would have been fine with: "I committed this approach because it's cleaner, and here is why I dislike the enum approach. Let's have a discussion in May if this choice is unappealing for reasons." Getting in the user-facing feature "DBA choice of default" before feature freeze was warranted and the patch as committed did meet all the necessary requirements.
David J.
В письме от среда, 26 марта 2025 г. 18:41:13 MSK пользователь Robert Haas написал: > What I'm upset about is that it feels to me like Nikolay is trying to > win the argument by yelling. I do not think it is fair. When votes are against you (and I did not ask anybody else's help for it) it turned out, that I am yelling, and thus voting is not valid. I do not demand enum, you can create new reloption data type that can be true false and unset. I even would agree with isset_offset, it you manage to make code consistent. I doubt anybody manages do it, but may be I am wrong. You did not even try, you just insist on keeping code inconsistent, because nobody care. Nikolay cares, but he is yelling while caring, so let's set him aside. The only thing I am asking for: please keep code consistent. Better keep reloption core the way it was, but if it is not possible, then keep it consistent then. Enum is the cheapest way to achieve the goal Nathan want, and keep code consistent. Cheapest, not the best. When the dust settles down I will try to invent boolean-based data type, that would not be ugly as this enum, and even may be suitable for vacuum_index_cleanup option. I see there is a demand for this. But this will need carefull thinking and development. And starting from ugly Enum with boolean values would be much easy then from boolean with isset_offset. -- Nikolay Shaplov aka Nataraj Fuzzing Engineer at Postgres Professional Matrix IM: @dhyan:nataraj.su
Attachment
On Wed, Mar 26, 2025 at 10:29 AM Nikolay Shaplov <dhyan@nataraj.su> wrote:
The only thing I am asking for: please keep code consistent. Better keep
reloption core the way it was, but if it is not possible, then keep it
consistent then.
Consistency at the expense of all else is a debatable position that will benefit from being actively justified. Incremental improvement has a lot of merit. The cost of incremental improvement is quite low for generally high benefit. Wholesale rewriting has a much higher cost for basically the same benefit. That's a lot of added cost in order to stay "consistent", so what are the added benefits of consistency for incurring that cost?
There is a baseline benefit to consistency that any incremental improvement needs to overcome to be worthwhile. I don't think the isset_offset implementation to handle this rare boolean option overcomes that baseline. I've tossed my weight, expect others to do so as well, and we'll move forward. But none of the positions proposed here are considered absolute by the project.
David J.
On Wed, Mar 26, 2025 at 1:29 PM Nikolay Shaplov <dhyan@nataraj.su> wrote: > > What I'm upset about is that it feels to me like Nikolay is trying to > > win the argument by yelling. > > I do not think it is fair. When votes are against you (and I did not ask > anybody else's help for it) it turned out, that I am yelling, and thus voting > is not valid. I'm completely willing to admit that I may be misreading your emails. It's very difficult to judge tone on the Internet, and I may very well have misread the tone that you intended to convey. I can tell you how they came across to me, but, of course, I cannot say how you mean them. But I do not think it is true that I am accusing you of yelling because the voting is against you. The reason why I felt that way is because you were using what seemed to me to be very strong language over what seemed to me to be a very minor issue. Phrases like "it will make whole design inconsistent", "this is nasty thing", "you've just changed the whole engine", and "this goes against all current practice of options usage," seem like very strong wording and, in my opinion, are just not accurate. -- Robert Haas EDB: http://www.enterprisedb.com
В письме от среда, 26 марта 2025 г. 21:16:20 MSK пользователь Robert Haas написал: > The reason why I felt that way is > because you were using what seemed to me to be very strong language > over what seemed to me to be a very minor issue. Being not native speaker has it's own difficulties. Plus possible cultural differences and so on. I feel emotions about "my precious thing have been broken for no reason at all", and this should affect the text, and I guess I can not properly keep proper emotional degree as I do not feel fine meaning of the words. Yes, I feel emotions, No, I am not yelling, and certainly I am not yelling as final argument. Can you please skip the form, that may be not perfect, and see the content. I hope that I managed to express the core idea quite clear. -- Nikolay Shaplov aka Nataraj Fuzzing Engineer at Postgres Professional Matrix IM: @dhyan:nataraj.su
Attachment
On Wed, Mar 26, 2025 at 2:34 PM Nikolay Shaplov <dhyan@nataraj.su> wrote: > В письме от среда, 26 марта 2025 г. 21:16:20 MSK пользователь Robert Haas > написал: > > The reason why I felt that way is > > because you were using what seemed to me to be very strong language > > over what seemed to me to be a very minor issue. > > Being not native speaker has it's own difficulties. Plus possible cultural > differences and so on. I feel emotions about "my precious thing have been > broken for no reason at all", and this should affect the text, and I guess I > can not properly keep proper emotional degree as I do not feel fine meaning of > the words. Yes, I feel emotions, No, I am not yelling, and certainly I am not > yelling as final argument. OK, fair enough. > Can you please skip the form, that may be not perfect, and see the content. I > hope that I managed to express the core idea quite clear. Honestly, I'm still pretty baffled. I understand that different people may have different stylistic preferences here, and that is fair enough, but I personally think what Nathan did is more elegant than inventing an enum type with three possible values. Even if a majority of people feel otherwise, why is it worth arguing about? I cannot see how this really makes any great difference. -- Robert Haas EDB: http://www.enterprisedb.com
В письме от среда, 26 марта 2025 г. 21:43:42 MSK пользователь Robert Haas написал: > why is it worth arguing about? I have full model of current reloptions and my patch unrolled in my head. They have some harmony. (Mine has more harmony, but it is not the case). isset_offset the way it is introduced breaks that harmony. Harmony worth it. > I cannot see how this really makes any great difference. git blame reloptions.c | grep "Robert Haas" | wc -l 112 git blame reloptions.c | grep Álvaro | wc -l 792 For me Alvary is the keeper of the concept of reloption. And he sees difference. If he would said isset_offset is good, I will follow. Until then I will follow my sense of harmony and code beauty as much as I can. -- Nikolay Shaplov aka Nataraj Fuzzing Engineer at Postgres Professional Matrix IM: @dhyan:nataraj.su
Attachment
On 2025-Mar-26, Nikolay Shaplov wrote: > git blame reloptions.c | grep "Robert Haas" | wc -l > 112 > > git blame reloptions.c | grep Álvaro | wc -l > 792 > > For me Alvary is the keeper of the concept of reloption. And he sees > difference. If he would said isset_offset is good, I will follow. Until then I > will follow my sense of harmony and code beauty as much as I can. Well, as I said upthread, I wrote this code far too long ago and I don't consider myself as its "owner" in the sense that I can block others from moving it forward. I really wish I had had time to review and get your new options patch to completion earlier -- we probably wouldn't be having this discussion if I had. But that regretfully didn't happen. I think the opinions have been sufficiently voiced, and I wouldn't want to block Nathan from being able to make progress on the other projects he has for this release. Given that continued discussion has not convinced several committers that this is a change worth making, it seems reasonable to give up on requiring that your proposed change be done. I'd recommend you to take a break from it for a while. Maybe a design solution will occur to you that will not break everything (for example: some options allow a default value to be expressed, others allow the "isset" flag to be used). Thanks for your persistence -- Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/ "[PostgreSQL] is a great group; in my opinion it is THE best open source development communities in existence anywhere." (Lamar Owen)