Re: [PATCH] Stop ALTER SYSTEM from making bad assumptions - Mailing list pgsql-hackers

From Stephen Frost
Subject Re: [PATCH] Stop ALTER SYSTEM from making bad assumptions
Date
Msg-id 20190805142139.GW29202@tamriel.snowman.net
Whole thread Raw
In response to Re: [PATCH] Stop ALTER SYSTEM from making bad assumptions  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Responses Re: [PATCH] Stop ALTER SYSTEM from making bad assumptions
Re: [PATCH] Stop ALTER SYSTEM from making bad assumptions
Re: [PATCH] Stop ALTER SYSTEM from making bad assumptions
List pgsql-hackers
Greetings,

* Tomas Vondra (tomas.vondra@2ndquadrant.com) wrote:
> On Fri, Aug 02, 2019 at 06:38:46PM -0400, Stephen Frost wrote:
> >On Fri, Aug 2, 2019 at 18:27 Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >>Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
> >>> There seems to be a consensus that this this not a pg_basebackup issue
> >>> (i.e. duplicate values don't make the file invalid), and it should be
> >>> handled in ALTER SYSTEM.
> >>
> >>Yeah.  I doubt pg_basebackup is the only actor that can create such
> >>situations.
> >>
> >>> The proposal seems to be to run through the .auto.conf file, remove any
> >>> duplicates, and append the new entry at the end. That seems reasonable.
> >>
> >>+1
> >
> >I disagree that this should only be addressed in alter system, as I’ve said
> >before and as others have agreed with.  Having one set of code that can be
> >used to update parameters in the auto.conf and then have that be used by
> >pg_basebackup, alter system, and external tools, is the right approach.
> >
> >The idea that alter system should be the only thing that doesn’t just
> >append changes to the file is just going to lead to confusion and bugs down
> >the road.
>
> I don't remember any suggestions ALTER SYSTEM should be the only thing
> that can rewrite the config file, but maybe it's buried somewhere in the
> thread history. The current proposal certainly does not prohibit any
> external tool from doing so, it just says we should expect duplicates.

There's an ongoing assumption that's been made that only ALTER SYSTEM
could make these changes because nothing else has the full GUC system
and a running PG instance to validate everything.

The suggestion that an external tool could do it goes against that.

If we can't, for whatever reason, work our way towards having code that
external tools could leverage to manage .auto.conf, then if we could at
least document what the expectations are and what tools can/can't do
with the file, that would put us in a better position than where we are
now.

I strongly believe that whatever the rules and expectations are that we
come up with, both ALTER SYSTEM and the in-core and external tools
should follow them.

If we say to that tools should expect duplicates in the file, then
ALTER SYSTEM should as well, which was the whole issue in the first
place- ALTER SYSTEM didn't expect duplicates, but the external tools and
the GUC system did.

If we say that it's acceptable for something to remove duplicate GUC
entries from the file, keeping the last one, then external tools should
feel comfortable doing that too and we should make it clear what
"duplicate" means here and how to identify one.

If we say it's optional for a tool to remove duplicates, then we should
point out the risk of "running out of disk space" for tool authors to
consider.  I don't agree with the idea that tool authors should be asked
to depend on someone running ALTER SYSTEM to address that risk.  If
there's a strong feeling that tool authors should be able to depend on
PG to perform that cleanup for them, then we should use a mechanism to
do so which doesn't involve an entirely optional feature.

For reference, all of the above, while not as cleanly as it could have
been, was addressed with the recovery.conf/recovery.done system.  Tool
authors had a good sense that they could replace that file, and that PG
would clean it up at exactly the right moment, and there wasn't this
ugly interaction with ALTER SYSTEM to have to worry about.  That none of
this was really even discussed or addressed previously even after being
pointed out is really disappointing.

Just to be clear, I brought up this exact concern back in *November*:

https://www.postgresql.org/message-id/20181127153405.GX3415%40tamriel.snowman.net

And now because this was pushed forward and the concerns that I raised
ignored, we're having to deal with this towards the end of the release
cycle instead of during normal development.  The things we're talking
about now and which I'm getting push-back on because of the release
cycle situation were specifically suggestions I made in the above email
in November where I also brought up concern that ALTER SYSTEM would be
confused by the duplicates- giving external tools guideance on how to
modify .auto.conf, or providing them a tool (or library), or both.

None of this should be coming as a surprise to anyone who was following
and I feel we should be upset that this was left to such a late point in
the release cycle to address these issues.

> >>There was a discussion whether to print warnings about the duplicates. I
> >>> personally see not much point in doing that - if we consider duplicates
> >>> to be expected, and if ALTER SYSTEM has the license to rework the config
> >>> file any way it wants, why warn about it?
> >>
> >>Personally I agree that warnings are unnecessary.
> >
> >And at least Magnus and I disagree with that, as I recall from this
> >thread.  Let’s have a clean and clear way to modify the auto.conf and have
> >everything that touches the file update it in a consistent way.
>
> Well, I personally don't feel very strongly about it. I think the
> warnings will be a nuisance bothering people with expeced stuff, but I'm
> not willing to fight against it.

I'd be happier with one set of code at least being the recommended
approach to modifying the file and only one set of code in our codebase
that's messing with .auto.conf, so that, hopefully, it's done
consistently and properly and in a way that everything agrees on and
expects, but if we can't get there due to concerns about where we are in
the release cycle, et al, then let's at least document what is
*supposed* to happen and have our code do so.

Thanks,

Stephen

Attachment

pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: SSL Connection still showing TLSv1.3 even it is disabled in ssl_ciphers
Next
From: Stephen Frost
Date:
Subject: Re: block-level incremental backup