Thread: A recent message added to pg_upgade

A recent message added to pg_upgade

From

Kyotaro Horiguchi

Date:

27 October 2023, 02:57:59

Hello.

Some messages recently introduced by commit 29d0a77fa6 seem to deviate
slightly from our standards.

+        if (*invalidated && SlotIsLogical(s) && IsBinaryUpgrade)
+        {
+            ereport(ERROR,
+                    errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+                    errmsg("replication slots must not be invalidated during the upgrade"),
+                    errhint("\"max_slot_wal_keep_size\" must be set to -1 during the upgrade"));

The message for errhint is not a complete sentence. And errmsg is not
in telegraph style.  The first attached makes minimum changes.

However, if allowed, I'd like to propose an alternative set of
messages as follows:

+                    errmsg("replication slot is invalidated during upgrade"),
+                    errhint("Set \"max_slot_wal_keep_size\" to -1 to avoid invalidation."));

The second attached does this.

What do you think about those?

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Attachment

Re: A recent message added to pg_upgade

From

Bharath Rupireddy

Date:

27 October 2023, 03:21:47

On Fri, Oct 27, 2023 at 8:28 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
>
> Hello.
>
> Some messages recently introduced by commit 29d0a77fa6 seem to deviate
> slightly from our standards.
>
> +               if (*invalidated && SlotIsLogical(s) && IsBinaryUpgrade)
> +               {
> +                       ereport(ERROR,
> +                                       errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> +                                       errmsg("replication slots must not be invalidated during the upgrade"),
> +                                       errhint("\"max_slot_wal_keep_size\" must be set to -1 during the upgrade"));
>
> The message for errhint is not a complete sentence.

Yeah, the hint message should have ended with a period -
https://www.postgresql.org/docs/current/error-style-guide.html#ERROR-STYLE-GUIDE-GRAMMAR-PUNCTUATION.

> The second attached does this.
>
> What do you think about those?

+                    errmsg("replication slot is invalidated during upgrade"),
+                    errhint("Set \"max_slot_wal_keep_size\" to -1 to
avoid invalidation."));
         }

The above errhint LGTM. How about a slightly different errmsg, like
the following?

+                    errmsg("cannot invalidate replication slots when
in binary upgrade mode"),
+                    errhint("Set \"max_slot_wal_keep_size\" to -1 to
avoid invalidation."));

".... when in binary upgrade mode" is being used in many places.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Re: A recent message added to pg_upgade

From

Amit Kapila

Date:

27 October 2023, 04:06:13

On Fri, Oct 27, 2023 at 8:52 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
>
> On Fri, Oct 27, 2023 at 8:28 AM Kyotaro Horiguchi:
> The above errhint LGTM. How about a slightly different errmsg, like
> the following?
>
> +                    errmsg("cannot invalidate replication slots when
> in binary upgrade mode"),
> +                    errhint("Set \"max_slot_wal_keep_size\" to -1 to
> avoid invalidation."));
>
> ".... when in binary upgrade mode" is being used in many places.
>

By this time slot may be already invalidated, so how about:
"replication slot was invalidated when in binary upgrade mode"?

--
With Regards,
Amit Kapila.

Re: A recent message added to pg_upgade

From

Peter Smith

Date:

27 October 2023, 04:06:50

On Fri, Oct 27, 2023 at 1:58 PM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
>
> Hello.
>
> Some messages recently introduced by commit 29d0a77fa6 seem to deviate
> slightly from our standards.
>
> +               if (*invalidated && SlotIsLogical(s) && IsBinaryUpgrade)
> +               {
> +                       ereport(ERROR,
> +                                       errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> +                                       errmsg("replication slots must not be invalidated during the upgrade"),
> +                                       errhint("\"max_slot_wal_keep_size\" must be set to -1 during the upgrade"));
>
> The message for errhint is not a complete sentence. And errmsg is not
> in telegraph style.  The first attached makes minimum changes.
>
> However, if allowed, I'd like to propose an alternative set of
> messages as follows:
>
> +                                       errmsg("replication slot is invalidated during upgrade"),
> +                                       errhint("Set \"max_slot_wal_keep_size\" to -1 to avoid invalidation."));
>
> The second attached does this.
>
> What do you think about those?
>

IIUC the only possible way to reach this error (according to the
comment preceding it) is by the user overriding the GUC value (which
was defaulted -1) on the command line.

+ /*
+ * The logical replication slots shouldn't be invalidated as
+ * max_slot_wal_keep_size GUC is set to -1 during the upgrade.
+ *
+ * The following is just a sanity check.
+ */

Given that, I felt a more relevant msg/hint might be like:

errmsg("\"max_slot_wal_keep_size\" must be set to -1 during the upgrade"),
errhint("Do not override \"max_slot_wal_keep_size\" using command line
options."));

======
Kind Regards,
Peter Smith.
Fujitsu Australia

Re: A recent message added to pg_upgade

From

Amit Kapila

Date:

27 October 2023, 04:21:43

On Fri, Oct 27, 2023 at 9:37 AM Peter Smith <smithpb2250@gmail.com> wrote:
>
> On Fri, Oct 27, 2023 at 1:58 PM Kyotaro Horiguchi
> <horikyota.ntt@gmail.com> wrote:
> >
> > Hello.
> >
> > Some messages recently introduced by commit 29d0a77fa6 seem to deviate
> > slightly from our standards.
> >
> > +               if (*invalidated && SlotIsLogical(s) && IsBinaryUpgrade)
> > +               {
> > +                       ereport(ERROR,
> > +                                       errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> > +                                       errmsg("replication slots must not be invalidated during the upgrade"),
> > +                                       errhint("\"max_slot_wal_keep_size\" must be set to -1 during the
upgrade"));
> >
> > The message for errhint is not a complete sentence. And errmsg is not
> > in telegraph style.  The first attached makes minimum changes.
> >
> > However, if allowed, I'd like to propose an alternative set of
> > messages as follows:
> >
> > +                                       errmsg("replication slot is invalidated during upgrade"),
> > +                                       errhint("Set \"max_slot_wal_keep_size\" to -1 to avoid invalidation."));
> >
> > The second attached does this.
> >
> > What do you think about those?
> >
>
> IIUC the only possible way to reach this error (according to the
> comment preceding it) is by the user overriding the GUC value (which
> was defaulted -1) on the command line.
>

Yeah, this is my understanding as well.

> + /*
> + * The logical replication slots shouldn't be invalidated as
> + * max_slot_wal_keep_size GUC is set to -1 during the upgrade.
> + *
> + * The following is just a sanity check.
> + */
>
> Given that, I felt a more relevant msg/hint might be like:
>
> errmsg("\"max_slot_wal_keep_size\" must be set to -1 during the upgrade"),
> errhint("Do not override \"max_slot_wal_keep_size\" using command line
> options."));
>

But OTOH, we don't have a value of user-passed options to ensure that.
So, how about a slightly different message: "This can be caused by
overriding \"max_slot_wal_keep_size\" using command line options." or
something along those lines? I see a somewhat similar message in the
existing code (errhint("This can be caused ...")).

--
With Regards,
Amit Kapila.

Re: A recent message added to pg_upgade

From

Bharath Rupireddy

Date:

27 October 2023, 04:59:28

On Fri, Oct 27, 2023 at 9:36 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Oct 27, 2023 at 8:52 AM Bharath Rupireddy
> <bharath.rupireddyforpostgres@gmail.com> wrote:
> >
> > On Fri, Oct 27, 2023 at 8:28 AM Kyotaro Horiguchi:
> > The above errhint LGTM. How about a slightly different errmsg, like
> > the following?
> >
> > +                    errmsg("cannot invalidate replication slots when
> > in binary upgrade mode"),
> > +                    errhint("Set \"max_slot_wal_keep_size\" to -1 to
> > avoid invalidation."));
> >
> > ".... when in binary upgrade mode" is being used in many places.
> >
>
> By this time slot may be already invalidated, so how about:
> "replication slot was invalidated when in binary upgrade mode"?

In this error spot, the is invalidated in memory but the invalidated
state is not persisted to disk which happens after somewhere later:

        else
        {
            /*
             * We hold the slot now and have already invalidated it; flush it
             * to ensure that state persists.
             *

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Re: A recent message added to pg_upgade

From

Bharath Rupireddy

Date:

27 October 2023, 05:29:52

On Fri, Oct 27, 2023 at 9:52 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> > errmsg("\"max_slot_wal_keep_size\" must be set to -1 during the upgrade"),
> > errhint("Do not override \"max_slot_wal_keep_size\" using command line
> > options."));
> >
>
> But OTOH, we don't have a value of user-passed options to ensure that.
> So, how about a slightly different message: "This can be caused by
> overriding \"max_slot_wal_keep_size\" using command line options." or
> something along those lines? I see a somewhat similar message in the
> existing code (errhint("This can be caused ...")).

I get it. I think having errdetail explaining the possible cause of
the error is wanted here, something like:

errmsg("cannot invalidate replication slots when in binary upgrade mode"),
errdetail("This can be caused by overriding \"max_slot_wal_keep_size\"
using command line options."));
errhint("Do not override or set \"max_slot_wal_keep_size\" to -1 ."));

Thoughts?

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Re: A recent message added to pg_upgade

From

Kyotaro Horiguchi

Date:

27 October 2023, 05:35:12

At Fri, 27 Oct 2023 09:51:43 +0530, Amit Kapila <amit.kapila16@gmail.com> wrote in 
> On Fri, Oct 27, 2023 at 9:37 AM Peter Smith <smithpb2250@gmail.com> wrote:
> > IIUC the only possible way to reach this error (according to the
> > comment preceding it) is by the user overriding the GUC value (which
> > was defaulted -1) on the command line.
> >
> 
> Yeah, this is my understanding as well.
> 
> > + /*
> > + * The logical replication slots shouldn't be invalidated as
> > + * max_slot_wal_keep_size GUC is set to -1 during the upgrade.
> > + *
> > + * The following is just a sanity check.
> > + */
> >
> > Given that, I felt a more relevant msg/hint might be like:
> >
> > errmsg("\"max_slot_wal_keep_size\" must be set to -1 during the upgrade"),
> > errhint("Do not override \"max_slot_wal_keep_size\" using command line
> > options."));
> >
> 
> But OTOH, we don't have a value of user-passed options to ensure that.
> So, how about a slightly different message: "This can be caused by
> overriding \"max_slot_wal_keep_size\" using command line options." or
> something along those lines? I see a somewhat similar message in the
> existing code (errhint("This can be caused ...")).

The suggested error message looks to me like that of the GUC
mechanism. While I don't have the wider picture about the feature,
might we consider rejecting the parameter setting? With that
modification, this message can be changed to elog one.

I believe it's somewhat inconsiderate to point out what shouldn't have
been done only after a problem has occurred.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Re: A recent message added to pg_upgade

From

Alvaro Herrera

Date:

27 October 2023, 08:31:54

On 2023-Oct-27, Kyotaro Horiguchi wrote:

> @@ -1433,8 +1433,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
>          {
>              ereport(ERROR,
>                      errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> -                    errmsg("replication slots must not be invalidated during the upgrade"),
> -                    errhint("\"max_slot_wal_keep_size\" must be set to -1 during the upgrade"));

Hmm, if I read this code right, this error is going to be thrown by the
checkpointer while finishing a checkpoint.  Fortunately, the checkpoint
record has already been written, but I don't know what would happen if
this is thrown while trying to write the shutdown checkpoint.  Probably
nothing terribly good.

I don't think this is useful.  If the setting is invalid during binary
upgrade, let's prevent it from being set at all right from the start of
the upgrade process.  In InvalidatePossiblyObsoleteSlot() we could have
just an Assert() or elog(PANIC).

-- 
Álvaro Herrera        Breisgau, Deutschland  —  https://www.EnterpriseDB.com/

Re: A recent message added to pg_upgade

From

Amit Kapila

Date:

27 October 2023, 09:27:10

On Fri, Oct 27, 2023 at 2:02 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>
> On 2023-Oct-27, Kyotaro Horiguchi wrote:
>
> > @@ -1433,8 +1433,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
> >               {
> >                       ereport(ERROR,
> >                                       errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> > -                                     errmsg("replication slots must not be invalidated during the upgrade"),
> > -                                     errhint("\"max_slot_wal_keep_size\" must be set to -1 during the upgrade"));
>
> Hmm, if I read this code right, this error is going to be thrown by the
> checkpointer while finishing a checkpoint.  Fortunately, the checkpoint
> record has already been written, but I don't know what would happen if
> this is thrown while trying to write the shutdown checkpoint.  Probably
> nothing terribly good.
>
> I don't think this is useful.  If the setting is invalid during binary
> upgrade, let's prevent it from being set at all right from the start of
> the upgrade process.

We are already forcing the required setting
"max_slot_wal_keep_size=-1" during the upgrade similar to some of the
other settings like "full_page_writes". However, the user can provide
an option for "max_slot_wal_keep_size" in which case it will be
overridden. Now, I think (a) we can ensure that our setting always
takes precedence in this case. The other idea is (b) to parse the
user-provided options and check if "max_slot_wal_keep_size" has a
value different than expected and raise an error accordingly. Or we
can simply (c) document the usage of max_slot_wal_keep_size in the
upgrade. I am not sure whether it is worth complicating the code for
this as the user shouldn't be using such an option during the upgrade.
So, I think doing (a) and (c) could be simpler.

>
>  In InvalidatePossiblyObsoleteSlot() we could have
> just an Assert() or elog(PANIC).
>

Yeah, we can change to either of those.

--
With Regards,
Amit Kapila.

Re: A recent message added to pg_upgade

From

Kyotaro Horiguchi

Date:

30 October 2023, 02:28:55

At Fri, 27 Oct 2023 14:57:10 +0530, Amit Kapila <amit.kapila16@gmail.com> wrote in 
> On Fri, Oct 27, 2023 at 2:02 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> >
> > On 2023-Oct-27, Kyotaro Horiguchi wrote:
> >
> > > @@ -1433,8 +1433,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
> > >               {
> > >                       ereport(ERROR,
> > >                                       errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> > > -                                     errmsg("replication slots must not be invalidated during the upgrade"),
> > > -                                     errhint("\"max_slot_wal_keep_size\" must be set to -1 during the
upgrade"));
> >
> > Hmm, if I read this code right, this error is going to be thrown by the
> > checkpointer while finishing a checkpoint.  Fortunately, the checkpoint
> > record has already been written, but I don't know what would happen if
> > this is thrown while trying to write the shutdown checkpoint.  Probably
> > nothing terribly good.
> >
> > I don't think this is useful.  If the setting is invalid during binary
> > upgrade, let's prevent it from being set at all right from the start of
> > the upgrade process.
> 
> We are already forcing the required setting
> "max_slot_wal_keep_size=-1" during the upgrade similar to some of the
> other settings like "full_page_writes". However, the user can provide
> an option for "max_slot_wal_keep_size" in which case it will be
> overridden. Now, I think (a) we can ensure that our setting always
> takes precedence in this case. The other idea is (b) to parse the
> user-provided options and check if "max_slot_wal_keep_size" has a
> value different than expected and raise an error accordingly. Or we
> can simply (c) document the usage of max_slot_wal_keep_size in the
> upgrade. I am not sure whether it is worth complicating the code for
> this as the user shouldn't be using such an option during the upgrade.
> So, I think doing (a) and (c) could be simpler.
> >
> >  In InvalidatePossiblyObsoleteSlot() we could have
> > just an Assert() or elog(PANIC).
> >
> 
> Yeah, we can change to either of those.

This discussion seems like a bit off from my point. I suggested adding
a check for that setting when IsBinaryUpgraded is true at the GUC
level as shown in the attached patch. I believe Álvaro made a similar
suggestion.  While the error message is somewhat succinct, I think it
is sufficient given the low possilibility of the scenario and the fact
that it cannot occur inadvertently.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Attachment

inhibit_m_s_w_k_s_during_upgrade.txt

Re: A recent message added to pg_upgade

From

Amit Kapila

Date:

30 October 2023, 03:21:18

On Mon, Oct 30, 2023 at 7:58 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
>
> At Fri, 27 Oct 2023 14:57:10 +0530, Amit Kapila <amit.kapila16@gmail.com> wrote in
> > On Fri, Oct 27, 2023 at 2:02 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> > >
> > > On 2023-Oct-27, Kyotaro Horiguchi wrote:
> > >
> > > > @@ -1433,8 +1433,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
> > > >               {
> > > >                       ereport(ERROR,
> > > >                                       errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> > > > -                                     errmsg("replication slots must not be invalidated during the upgrade"),
> > > > -                                     errhint("\"max_slot_wal_keep_size\" must be set to -1 during the
upgrade"));
> > >
> > > Hmm, if I read this code right, this error is going to be thrown by the
> > > checkpointer while finishing a checkpoint.  Fortunately, the checkpoint
> > > record has already been written, but I don't know what would happen if
> > > this is thrown while trying to write the shutdown checkpoint.  Probably
> > > nothing terribly good.
> > >
> > > I don't think this is useful.  If the setting is invalid during binary
> > > upgrade, let's prevent it from being set at all right from the start of
> > > the upgrade process.
> >
> > We are already forcing the required setting
> > "max_slot_wal_keep_size=-1" during the upgrade similar to some of the
> > other settings like "full_page_writes". However, the user can provide
> > an option for "max_slot_wal_keep_size" in which case it will be
> > overridden. Now, I think (a) we can ensure that our setting always
> > takes precedence in this case. The other idea is (b) to parse the
> > user-provided options and check if "max_slot_wal_keep_size" has a
> > value different than expected and raise an error accordingly. Or we
> > can simply (c) document the usage of max_slot_wal_keep_size in the
> > upgrade. I am not sure whether it is worth complicating the code for
> > this as the user shouldn't be using such an option during the upgrade.
> > So, I think doing (a) and (c) could be simpler.
> > >
> > >  In InvalidatePossiblyObsoleteSlot() we could have
> > > just an Assert() or elog(PANIC).
> > >
> >
> > Yeah, we can change to either of those.
>
> This discussion seems like a bit off from my point. I suggested adding
> a check for that setting when IsBinaryUpgraded is true at the GUC
> level as shown in the attached patch. I believe Álvaro made a similar
> suggestion.  While the error message is somewhat succinct, I think it
> is sufficient given the low possilibility of the scenario and the fact
> that it cannot occur inadvertently.
>

I think we can simply change that error message to assert if we want
to go with the check hook idea of yours. BTW, can we add
GUC_check_errdetail() with a better message as some of the other check
function uses? Also, I guess we can add some comments or at least
refer to the existing comments to explain the reason of such a check.

--
With Regards,
Amit Kapila.

RE: A recent message added to pg_upgade

From

"Zhijie Hou (Fujitsu)"

Date:

30 October 2023, 03:36:41

On Monday, October 30, 2023 10:29 AM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:
> 
> At Fri, 27 Oct 2023 14:57:10 +0530, Amit Kapila <amit.kapila16@gmail.com>
> wrote in
> > On Fri, Oct 27, 2023 at 2:02 PM Alvaro Herrera <alvherre@alvh.no-ip.org>
> wrote:
> > >
> > > On 2023-Oct-27, Kyotaro Horiguchi wrote:
> > >
> > > > @@ -1433,8 +1433,8 @@
> InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
> > > >               {
> > > >                       ereport(ERROR,
> > > >
> errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> > > > -                                     errmsg("replication slots must not
> be invalidated during the upgrade"),
> > > > -
> errhint("\"max_slot_wal_keep_size\" must be set to -1 during the upgrade"));
> > >
> > > Hmm, if I read this code right, this error is going to be thrown by
> > > the checkpointer while finishing a checkpoint.  Fortunately, the
> > > checkpoint record has already been written, but I don't know what
> > > would happen if this is thrown while trying to write the shutdown
> > > checkpoint.  Probably nothing terribly good.
> > >
> > > I don't think this is useful.  If the setting is invalid during
> > > binary upgrade, let's prevent it from being set at all right from
> > > the start of the upgrade process.
> >
> > We are already forcing the required setting
> > "max_slot_wal_keep_size=-1" during the upgrade similar to some of the
> > other settings like "full_page_writes". However, the user can provide
> > an option for "max_slot_wal_keep_size" in which case it will be
> > overridden. Now, I think (a) we can ensure that our setting always
> > takes precedence in this case. The other idea is (b) to parse the
> > user-provided options and check if "max_slot_wal_keep_size" has a
> > value different than expected and raise an error accordingly. Or we
> > can simply (c) document the usage of max_slot_wal_keep_size in the
> > upgrade. I am not sure whether it is worth complicating the code for
> > this as the user shouldn't be using such an option during the upgrade.
> > So, I think doing (a) and (c) could be simpler.
> > >
> > >  In InvalidatePossiblyObsoleteSlot() we could have just an Assert()
> > > or elog(PANIC).
> > >
> >
> > Yeah, we can change to either of those.
> 
> This discussion seems like a bit off from my point. I suggested adding a check
> for that setting when IsBinaryUpgraded is true at the GUC level as shown in the
> attached patch. I believe Álvaro made a similar suggestion.  While the error
> message is somewhat succinct, I think it is sufficient given the low possilibility
> of the scenario and the fact that it cannot occur inadvertently.

Thanks for the diff and I think the approach basically works.

One notable behavior of this approach it will reject the GUC setting even if there
are no slots on old cluster or user set the value to a big enough value which
doesn't cause invalidation. The behavior doesn't look bad to me but just mention it
for reference.

Best Regards,
Hou zj

Re: A recent message added to pg_upgade

From

Bharath Rupireddy

Date:

30 October 2023, 07:08:47

On Mon, Oct 30, 2023 at 8:51 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> > This discussion seems like a bit off from my point. I suggested adding
> > a check for that setting when IsBinaryUpgraded is true at the GUC
> > level as shown in the attached patch. I believe Álvaro made a similar
> > suggestion.  While the error message is somewhat succinct, I think it
> > is sufficient given the low possilibility of the scenario and the fact
> > that it cannot occur inadvertently.
> >
>
> I think we can simply change that error message to assert if we want
> to go with the check hook idea of yours. BTW, can we add
> GUC_check_errdetail() with a better message as some of the other check
> function uses? Also, I guess we can add some comments or at least
> refer to the existing comments to explain the reason of such a check.

Will the check_hook approach work correctly? I haven't checked that by
myself, but I see InitializeGUCOptions() getting called before
IsBinaryUpgrade is set to true and the passed-in config options ('c')
are parsed.

If the check_hook approach works correctly, I think we must add a test
hitting the error in check_max_slot_wal_keep_size for the
IsBinaryUpgrade case. And, I agree with Amit to have a detailed
messaging with GUC_check_errmsg/GUC_check_errdetail. Also, IMV,
leaving the error message in InvalidatePossiblyObsoleteSlot() there
(if required with a better wording as discussed initially in this
thread) does no harm. Actually, it acts as another safety net given
that max_slot_wal_keep_size GUC is reloadable via SIGHUP.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Re: A recent message added to pg_upgade

From

Kyotaro Horiguchi

Date:

30 October 2023, 07:40:45

At Mon, 30 Oct 2023 08:51:18 +0530, Amit Kapila <amit.kapila16@gmail.com> wrote in 
> I think we can simply change that error message to assert if we want
> to go with the check hook idea of yours. BTW, can we add
> GUC_check_errdetail() with a better message as some of the other check
> function uses? Also, I guess we can add some comments or at least
> refer to the existing comments to explain the reason of such a check.

Definitely.  I've attached the following changes.

1. Added GUC_check_errdetail() to the check function.

2. Added a comment to the check function (based on my knowledge about
  the feature).

3. Altered the ereport() into Assert() in
   InvalidatePossiblyObsoleteSlot().  I considered removing the
   !SlotIsLogical() condition since pg_upgrade always sets
   max_slot_wal_keep_size to -1, but I left it unchanged.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Attachment

inhibit_m_s_w_k_s_during_upgrade_2.txt

Re: A recent message added to pg_upgade

From

Kyotaro Horiguchi

Date:

30 October 2023, 07:46:42

At Mon, 30 Oct 2023 03:36:41 +0000, "Zhijie Hou (Fujitsu)" <houzj.fnst@fujitsu.com> wrote in 
> Thanks for the diff and I think the approach basically works.
> 
> One notable behavior of this approach it will reject the GUC setting even if there
> are no slots on old cluster or user set the value to a big enough value which
> doesn't cause invalidation. The behavior doesn't look bad to me but just mention it
> for reference.

Indeed.  pg_upgrade anyway sets the variable to -1 irrespective of the
slot's existence, and I see no justification for allowing users to
forcibly change it.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Re: A recent message added to pg_upgade

From

Kyotaro Horiguchi

Date:

30 October 2023, 08:12:15

At Mon, 30 Oct 2023 12:38:47 +0530, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote in 
> Will the check_hook approach work correctly? I haven't checked that by
> myself, but I see InitializeGUCOptions() getting called before
> IsBinaryUpgrade is set to true and the passed-in config options ('c')
> are parsed.

I'm not sure about the wanted behavior exactly, but the fast you
pointed doesn't matter because the check is required after parsing the
command line options. On the other hand, I'm not sure about the
behavior that a setting in postgresql.conf is rejected.

> If the check_hook approach works correctly, I think we must add a test
> hitting the error in check_max_slot_wal_keep_size for the
> IsBinaryUpgrade case. And, I agree with Amit to have a detailed
> messaging with GUC_check_errmsg/GUC_check_errdetail. Also, IMV,
> leaving the error message in InvalidatePossiblyObsoleteSlot() there
> (if required with a better wording as discussed initially in this
> thread) does no harm. Actually, it acts as another safety net given
> that max_slot_wal_keep_size GUC is reloadable via SIGHUP.

The error message, which is deemed impossible, adds an additional
message translation. In another thread, we are discussing the
reduction of translatable messages. Therefore, I suggest using elog()
for the condition at the very least. Whether it should be elog() or
Assert() remains open for discussion, as I don't have a firm stance on
it.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Re: A recent message added to pg_upgade

From

Bharath Rupireddy

Date:

30 October 2023, 09:01:56

On Mon, Oct 30, 2023 at 1:42 PM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
>
> At Mon, 30 Oct 2023 12:38:47 +0530, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote in
> > Will the check_hook approach work correctly? I haven't checked that by
> > myself, but I see InitializeGUCOptions() getting called before
> > IsBinaryUpgrade is set to true and the passed-in config options ('c')
> > are parsed.
>
> I'm not sure about the wanted behavior exactly, but the fast you
> pointed doesn't matter because the check is required after parsing the
> command line options. On the other hand, I'm not sure about the
> behavior that a setting in postgresql.conf is rejected.

Yeah. The check_hook is called even after the param is specified in
postgresql.conf during the upgrade, so I see no problem there.

> > If the check_hook approach works correctly, I think we must add a test
> > hitting the error in check_max_slot_wal_keep_size for the
> > IsBinaryUpgrade case. And, I agree with Amit to have a detailed
> > messaging with GUC_check_errmsg/GUC_check_errdetail. Also, IMV,
> > leaving the error message in InvalidatePossiblyObsoleteSlot() there
> > (if required with a better wording as discussed initially in this
> > thread) does no harm. Actually, it acts as another safety net given
> > that max_slot_wal_keep_size GUC is reloadable via SIGHUP.
>
> The error message, which is deemed impossible, adds an additional
> message translation. In another thread, we are discussing the
> reduction of translatable messages. Therefore, I suggest using elog()
> for the condition at the very least. Whether it should be elog() or
> Assert() remains open for discussion, as I don't have a firm stance on
> it.

I get it. I agree to go with just the assert because the GUC
check_hook kinda tightens the screws against setting
max_slot_wal_keep_size to a value other than -1 during the binary
upgrade,

A few comments on your inhibit_m_s_w_k_s_during_upgrade_2.txt:

1.



--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

RE: A recent message added to pg_upgade

From

"Hayato Kuroda (Fujitsu)"

Date:

30 October 2023, 09:03:24

Dear Bharath,

> Will the check_hook approach work correctly?

I tested by using the first version and worked well (rejected). Please see the
log which recorded the output and log. Below lines were copied from server
log and found that max_slot_wal_keep_size must not be >= 0.

```
waiting for server to start....2023-10-30 08:53:32.529 GMT [6903] FATAL:  invalid value for parameter
"max_slot_wal_keep_size":1
 
 stopped waiting
pg_ctl: could not start serve
```

> I haven't checked that by
> myself, but I see InitializeGUCOptions() getting called before
> IsBinaryUpgrade is set to true and the passed-in config options ('c')
> are parsed.

I thought the key point was that user-defined options are aligned after the "-b".
User-defined options are set after the '-b'  option, so check_hook could work
as we expected. Thought?

Best Regards,
Hayato Kuroda
FUJITSU LIMITED

Attachment

Re: A recent message added to pg_upgade

From

Bharath Rupireddy

Date:

30 October 2023, 09:25:01

On Mon, Oct 30, 2023 at 2:31 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
>

Never mind. Previous message was accidentally sent before I finished
writing my comments.

> Yeah. The check_hook is called even after the param is specified in
> postgresql.conf during the upgrade, so I see no problem there.
>
> > The error message, which is deemed impossible, adds an additional
> > message translation. In another thread, we are discussing the
> > reduction of translatable messages. Therefore, I suggest using elog()
> > for the condition at the very least. Whether it should be elog() or
> > Assert() remains open for discussion, as I don't have a firm stance on
> > it.
>
> I get it. I agree to go with just the assert because the GUC
> check_hook kinda tightens the screws against setting
> max_slot_wal_keep_size to a value other than -1 during the binary
> upgrade,

 A few comments on your inhibit_m_s_w_k_s_during_upgrade_2.txt:

1.
+ * All WAL files on the publisher node must be retained during an upgrade to
+ * maintain connections from downstreams.  While pg_upgrade explicitly sets

It's not just the publisher, anyone using logical slots. Also, no
downstream please. If you want, you can repurpose the comment that's
added by 29d0a77f.

    /*
     * Use max_slot_wal_keep_size as -1 to prevent the WAL removal by the
     * checkpointer process.  If WALs required by logical replication slots
     * are removed, the slots are unusable.  This setting prevents the
     * invalidation of slots during the upgrade. We set this option when
     * cluster is PG17 or later because logical replication slots can only be
     * migrated since then. Besides, max_slot_wal_keep_size is added in PG13.
     */

2.
 At present, only logical slots really require
+         * this.

Can we remove the above comment as the code with SlotIsLogical(s)
explains it all?

3.
+        GUC_check_errdetail("\"max_slot_wal_keep_size\" must be set
to -1 during the upgrade.");
+        return false;

How about we be explicit like the following which helps users reason
about this restriction instead of them looking at the comments/docs?

            GUC_check_errcode(ERRCODE_INVALID_PARAMETER_VALUE);
            GUC_check_errmsg(""\"max_slot_wal_keep_size\" must be set
to -1 when in binary upgrade mode");
            GUC_check_errdetail("A value of -1 prevents the removal of
WAL required for logical slots upgrade.");
            return false;

4. I think a test case to hit the error in the check hook in
003_logical_slots.pl will help a lot here - not only covers the code
but also helps demonstrate how one can reach the error.

5. I think the check_hook is better defined in xlog.c the place where
it's actually being declared and in action. IMV, there's no reason for
it to be in slot.c as it doesn't deal with any slot related
variables/structs. This also avoids an unnecessary "utils/guc_hooks.h"
inclusion in slot.c.
+bool
+check_max_slot_wal_keep_size(int *newval, void **extra, GucSource source)
+{

5. A possible problem with this check_hook approach is that it doesn't
let anyone setting max_slot_wal_keep_size to a value other than -1
during pg_ugprade even if someone doesn't have logical slots or
doesn't want to upgrade logical slots in which case the WAL file
growth during pg_upgrade may be huge (transiently) unless the
pg_resetwal step of pg_upgrade removes it at the end.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Re: A recent message added to pg_upgade

From

Kyotaro Horiguchi

Date:

31 October 2023, 08:49:32

At Mon, 30 Oct 2023 14:55:01 +0530, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote in 
> > I get it. I agree to go with just the assert because the GUC
> > check_hook kinda tightens the screws against setting
> > max_slot_wal_keep_size to a value other than -1 during the binary
> > upgrade,

Thanks for being on the same page.

>  A few comments on your inhibit_m_s_w_k_s_during_upgrade_2.txt:
> 
> 1.
> + * All WAL files on the publisher node must be retained during an upgrade to
> + * maintain connections from downstreams.  While pg_upgrade explicitly sets
> 
> It's not just the publisher, anyone using logical slots. Also, no
> downstream please. If you want, you can repurpose the comment that's
> added by 29d0a77f.
> 
>     /*
>      * Use max_slot_wal_keep_size as -1 to prevent the WAL removal by the
>      * checkpointer process.  If WALs required by logical replication slots
>      * are removed, the slots are unusable.  This setting prevents the
>      * invalidation of slots during the upgrade. We set this option when
>      * cluster is PG17 or later because logical replication slots can only be
>      * migrated since then. Besides, max_slot_wal_keep_size is added in PG13.
>      */

It is helpful. Thanks!

> 2.
>  At present, only logical slots really require
> +         * this.
> 
> Can we remove the above comment as the code with SlotIsLogical(s)
> explains it all?

max_slot_wal_keep_size affects both logical and physical
slots. Therefore, if we are interested only one of the two types of
slots, it's important to clarify the rationale.  Regardless of any
potential exntension to physical slots, I believe it's essential to
clarify the rationale. I couldn't determine from that extensive thread
whether there's a possible extension to physical slots. Could you
inform me if such an extension can happen and, if not, provide the
reason?


> 3.
> +        GUC_check_errdetail("\"max_slot_wal_keep_size\" must be set
> to -1 during the upgrade.");
> +        return false;
> 
> How about we be explicit like the following which helps users reason
> about this restriction instead of them looking at the comments/docs?
> 
>             GUC_check_errcode(ERRCODE_INVALID_PARAMETER_VALUE);
>             GUC_check_errmsg(""\"max_slot_wal_keep_size\" must be set
> to -1 when in binary upgrade mode");
>             GUC_check_errdetail("A value of -1 prevents the removal of
> WAL required for logical slots upgrade.");
>             return false;



I don't quite see the reason to provide such a detailed explanation
just for this error. Additionally, since this check is performed
regardless of the presence or absense of logical slots, I think the
errdetail message might potentially confuse those whosee it. Adding
"binary" looks fine as is and done in the attached.

> 4. I think a test case to hit the error in the check hook in
> 003_logical_slots.pl will help a lot here - not only covers the code
> but also helps demonstrate how one can reach the error.

Yeah, of course. I was planning to add tests once the direction of the
discussion became clear. I will add them in the next version.

> 5. I think the check_hook is better defined in xlog.c the place where
> it's actually being declared and in action. IMV, there's no reason for
> it to be in slot.c as it doesn't deal with any slot related
> variables/structs. This also avoids an unnecessary "utils/guc_hooks.h"
> inclusion in slot.c.
> +bool
> +check_max_slot_wal_keep_size(int *newval, void **extra, GucSource source)
> +{

Sounds reasonable. Moved. I simply moved it to xlog.c, but the
function comment was thoroughly written only for this moved function,
making it somewhat stand out..

> 5. A possible problem with this check_hook approach is that it doesn't
> let anyone setting max_slot_wal_keep_size to a value other than -1
> during pg_ugprade even if someone doesn't have logical slots or
> doesn't want to upgrade logical slots in which case the WAL file
> growth during pg_upgrade may be huge (transiently) unless the
> pg_resetwal step of pg_upgrade removes it at the end.

While I doubt anyone wishes to set the variable to a specific value
during upgrade, think there are individuals who might be reluctant to
edit the config file due to unclear reasons.  While we could consider
an alternative - checking for logical slots during binary upgrade-
it's debatable if the effort is justified. (I haven't verified its
feasibility, however.)


regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Attachment

inhibit_m_s_w_k_s_during_upgrade_3.txt

Re: A recent message added to pg_upgade

From

Bharath Rupireddy

Date:

31 October 2023, 10:30:00

On Tue, Oct 31, 2023 at 2:19 PM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
>
> >             GUC_check_errcode(ERRCODE_INVALID_PARAMETER_VALUE);
> >             GUC_check_errmsg(""\"max_slot_wal_keep_size\" must be set
> > to -1 when in binary upgrade mode");
> >             GUC_check_errdetail("A value of -1 prevents the removal of
> > WAL required for logical slots upgrade.");
> >             return false;
>
> I don't quite see the reason to provide such a detailed explanation
> just for this error. Additionally, since this check is performed
> regardless of the presence or absense of logical slots.

Okay, I get it.

> > 4. I think a test case to hit the error in the check hook in
> > 003_logical_slots.pl will help a lot here - not only covers the code
> > but also helps demonstrate how one can reach the error.
>
> Yeah, of course. I was planning to add tests once the direction of the
> discussion became clear. I will add them in the next version.

Yes, please. The test case to hit the ERROR in
InvalidatePossiblyObsoleteSlot() is important even if the check_hook
approach isn't going anywhere.

> function comment was thoroughly written only for this moved function,
> making it somewhat stand out..

I think that's fine.

> > 5. A possible problem with this check_hook approach is that it doesn't
> > let anyone setting max_slot_wal_keep_size to a value other than -1
> > during pg_ugprade even if someone doesn't have logical slots or
> > doesn't want to upgrade logical slots in which case the WAL file
> > growth during pg_upgrade may be huge (transiently) unless the
> > pg_resetwal step of pg_upgrade removes it at the end.
>
> While I doubt anyone wishes to set the variable to a specific value
> during upgrade, think there are individuals who might be reluctant to
> edit the config file due to unclear reasons.  While we could consider
> an alternative - checking for logical slots during binary upgrade-
> it's debatable if the effort is justified. (I haven't verified its
> feasibility, however.)

Checking for logical slots during binary upgrade doesn't help - what
if there are logical slots present but no upgrade is wanted (via a new
pg_uprade option)? Basically, how will the postgres server know
whether someone wants pg_upgrade of logical slots or not? Can we check
if someone is overriding max_slot_wal_keep_size in pg_upgrade itself
(via pg_settings  query from the server)? If yes, if logical slots
exist and upgrade is wanted, then disallow the upgrade if GUC is set
to value other than -1.

I believe disallowing setting max_slot_wal_keep_size to a value other
than -1 during binary upgrade may have serious consequences as it
impacts WAL retention before the pg_resetwal comes into picture as
part of pg_upgrade.

Or what if we just live with what we have right now? I mean with ERROR
in InvalidatePossiblyObsoleteSlot().

Or what if we just remove ERROR in InvalidatePossiblyObsoleteSlot or
make it an Assert and say do not override max_slot_wal_keep_size in
docs? Even if someone did override, let the pg_upgrade report the slot
as invalidated and let the user delete the slot or decide what to do
with it.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Re: A recent message added to pg_upgade

From

Amit Kapila

Date:

31 October 2023, 11:17:38

On Tue, Oct 31, 2023 at 4:00 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
>
> > > 5. A possible problem with this check_hook approach is that it doesn't
> > > let anyone setting max_slot_wal_keep_size to a value other than -1
> > > during pg_ugprade even if someone doesn't have logical slots or
> > > doesn't want to upgrade logical slots in which case the WAL file
> > > growth during pg_upgrade may be huge (transiently) unless the
> > > pg_resetwal step of pg_upgrade removes it at the end.
> >
> > While I doubt anyone wishes to set the variable to a specific value
> > during upgrade, think there are individuals who might be reluctant to
> > edit the config file due to unclear reasons.  While we could consider
> > an alternative - checking for logical slots during binary upgrade-
> > it's debatable if the effort is justified. (I haven't verified its
> > feasibility, however.)
>
> Checking for logical slots during binary upgrade doesn't help - what
> if there are logical slots present but no upgrade is wanted (via a new
> pg_uprade option)? Basically, how will the postgres server know
> whether someone wants pg_upgrade of logical slots or not? Can we check
> if someone is overriding max_slot_wal_keep_size in pg_upgrade itself
> (via pg_settings  query from the server)? If yes, if logical slots
> exist and upgrade is wanted, then disallow the upgrade if GUC is set
> to value other than -1.
>

I feel we can try to extend the functionality if we really see some
user demand. It is not that we can't do it now but it doesn't seem
prudent to make the functionality/code more complex than really
required.

> I believe disallowing setting max_slot_wal_keep_size to a value other
> than -1 during binary upgrade may have serious consequences as it
> impacts WAL retention before the pg_resetwal comes into picture as
> part of pg_upgrade.
>

I don't think this is completely true because this setting will only
impact if there are active slots and those slots need some WAL which
we want to remove. This setting shouldn't be used as often as you are
imagining.

> Or what if we just live with what we have right now? I mean with ERROR
> in InvalidatePossiblyObsoleteSlot().
>
> Or what if we just remove ERROR in InvalidatePossiblyObsoleteSlot or
> make it an Assert and say do not override max_slot_wal_keep_size in
> docs? Even if someone did override, let the pg_upgrade report the slot
> as invalidated and let the user delete the slot or decide what to do
> with it.
>

The problem is this can happen in the background so it can happen at
the time of shutdown when all the upgrade is complete.

--
With Regards,
Amit Kapila.

RE: A recent message added to pg_upgade

From

"Hayato Kuroda (Fujitsu)"

Date:

31 October 2023, 13:44:07

Dear Horiguchi-san,

Thanks for making the patch!

> > 4. I think a test case to hit the error in the check hook in
> > 003_logical_slots.pl will help a lot here - not only covers the code
> > but also helps demonstrate how one can reach the error.
>
> Yeah, of course. I was planning to add tests once the direction of the
> discussion became clear. I will add them in the next version.

I tried to make the part. Feel free to include it if not yet. We can check the
server log, but I think it may be overkill.

Also, I have one comment.

```
+bool
+check_max_slot_wal_keep_size(int *newval, void **extra, GucSource source)
+{
+    if (IsBinaryUpgrade && *newval != -1)
+    {
+        GUC_check_errdetail("\"max_slot_wal_keep_size\" must be set to -1 during binary upgrade mode.");
+        return false;
+    }
+    return true;
+}
```

Just to confirm - should we check the GucSource? Based on ur requirement, it might
be enough we avoid overwriting while starting the server.
Personally current code is OK because it is simpler, but I want to hear your opinion.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED

Attachment

add_test.patch

Re: A recent message added to pg_upgade

From

Peter Smith

Date:

01 November 2023, 07:08:19

Hi, here are some minor review comments for the v3 patch.

======
src/backend/access/transam/xlog.c

1. check_max_slot_wal_keep_size

+/*
+ * GUC check_hook for max_slot_wal_keep_size
+ *
+ * If WALs required by logical replication slots are removed, the slots are
+ * unusable.  While pg_upgrade sets this variable to -1 via the command line to
+ * attempt to prevent such removal during binary upgrade, there are ways for
+ * users to override it. For the sake of completing the objective, ensure that
+ * this variable remains unchanged.  See InvalidatePossiblyObsoleteSlot() and
+ * start_postmaster() in pg_upgrade for more details.
+ */

I asked ChatGPT to suggest alternative wording for that comment, and
it came up with something that I felt was a slight improvement.

SUGGESTION
...
If WALs needed by logical replication slots are deleted, these slots
become inoperable. During a binary upgrade, pg_upgrade sets this
variable to -1 via the command line in an attempt to prevent such
deletions, but users have ways to override it. To ensure the
successful completion of the upgrade, it's essential to keep this
variable unaltered.
...

~~~

2.
+bool
+check_max_slot_wal_keep_size(int *newval, void **extra, GucSource source)
+{
+ if (IsBinaryUpgrade && *newval != -1)
+ {
+ GUC_check_errdetail("\"max_slot_wal_keep_size\" must be set to -1
during binary upgrade mode.");
+ return false;
+ }
+ return true;
+}

Some of the other GUC_check_errdetail()'s do not include the GUC name
in the translatable message text. Isn't that a preferred style?

SUGGESTION
GUC_check_errdetail("\"%s\" must be set to -1 during binary upgrade mode.",
  "max_slot_wal_keep_size");

======
src/backend/replication/slot.c

3. InvalidatePossiblyObsoleteSlot

- if (*invalidated && SlotIsLogical(s) && IsBinaryUpgrade)
- {
- ereport(ERROR,
- errcode(ERRCODE_INVALID_PARAMETER_VALUE),
- errmsg("replication slots must not be invalidated during the upgrade"),
- errhint("\"max_slot_wal_keep_size\" must be set to -1 during the upgrade"));
- }
+ Assert (!*invalidated || !SlotIsLogical(s) || !IsBinaryUpgrade);

IMO new Assert became trickier to understand than the original condition. YMMV.

SUGGESTION
Assert(!(*invalidated && SlotIsLogical(s) && IsBinaryUpgrade));

======
Kind Regards,
Peter Smith.
Fujitsu Australia

Re: A recent message added to pg_upgade

From

Kyotaro Horiguchi

Date:

02 November 2023, 02:58:34

Thanks you for the comments!

At Wed, 1 Nov 2023 18:08:19 +1100, Peter Smith <smithpb2250@gmail.com> wrote in 
> Hi, here are some minor review comments for the v3 patch.
> 
> ======
> src/backend/access/transam/xlog.c

> I asked ChatGPT to suggest alternative wording for that comment, and
> it came up with something that I felt was a slight improvement.
> 
> SUGGESTION
> ...
> If WALs needed by logical replication slots are deleted, these slots
> become inoperable. During a binary upgrade, pg_upgrade sets this
> variable to -1 via the command line in an attempt to prevent such
> deletions, but users have ways to override it. To ensure the
> successful completion of the upgrade, it's essential to keep this
> variable unaltered.
> ...
> 
> ~~~

ChatGPT seems to tend to generate sentences in a slightly different
from our usual writing. While I tried to retain the original phrasing
in the patch, I don't mind using the suggested version.  Used as is.

> 2.

> + GUC_check_errdetail("\"max_slot_wal_keep_size\" must be set to -1
> during binary upgrade mode.");

> Some of the other GUC_check_errdetail()'s do not include the GUC name
> in the translatable message text. Isn't that a preferred style?

> SUGGESTION
> GUC_check_errdetail("\"%s\" must be set to -1 during binary upgrade mode.",
>   "max_slot_wal_keep_size");

I believe that that style was adopted to minimize translatable
messages by consolidting identical ones that only differ in variable
names.  I see both versions in the tree. I didn't find necessity to
adopt this approach for this specific message, especially since I'm
skeptical about adding new messages that end with "must be set to -1
during binary upgrade mode".  (pg_upgrade sets synchronous_commit,
fsync and full_page_writes to "off".)

However, some unique messages are in this style, so I'm fine with
using that style.  Revised accordingly.

> ======
> src/backend/replication/slot.c
> 
> 3. InvalidatePossiblyObsoleteSlot

> + Assert (!*invalidated || !SlotIsLogical(s) || !IsBinaryUpgrade);
> 
> IMO new Assert became trickier to understand than the original condition. YMMV.
> 
> SUGGESTION
> Assert(!(*invalidated && SlotIsLogical(s) && IsBinaryUpgrade));

Yeah, I also liked that style and considered using it, but I didn't
feel it was too hard to read in this particular case, so I ended up
using the current way.  Just like with the point of other comments,
I'm not particularly attached to this style. Thus if someone find it
difficult to read, I have no issue with changing it. Revised as
suggested.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Attachment

inhibit_m_s_w_k_s_during_upgrade_4.txt

Re: A recent message added to pg_upgade

From

Peter Smith

Date:

02 November 2023, 03:25:53

On Thu, Nov 2, 2023 at 1:58 PM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
>
> Thanks you for the comments!
>
> At Wed, 1 Nov 2023 18:08:19 +1100, Peter Smith <smithpb2250@gmail.com> wrote in
> > Hi, here are some minor review comments for the v3 patch.
> >
> > ======
> > src/backend/access/transam/xlog.c
>
...
> > 2.
>
> > + GUC_check_errdetail("\"max_slot_wal_keep_size\" must be set to -1
> > during binary upgrade mode.");
>
> > Some of the other GUC_check_errdetail()'s do not include the GUC name
> > in the translatable message text. Isn't that a preferred style?
>
> > SUGGESTION
> > GUC_check_errdetail("\"%s\" must be set to -1 during binary upgrade mode.",
> >   "max_slot_wal_keep_size");
>
> I believe that that style was adopted to minimize translatable
> messages by consolidting identical ones that only differ in variable
> names.  I see both versions in the tree. I didn't find necessity to
> adopt this approach for this specific message, especially since I'm
> skeptical about adding new messages that end with "must be set to -1
> during binary upgrade mode".  (pg_upgrade sets synchronous_commit,
> fsync and full_page_writes to "off".)
>
> However, some unique messages are in this style, so I'm fine with
> using that style.  Revised accordingly.
>

Checking this patch yesterday prompted me to create a new thread
questioning the inconsistencies of the "GUC names in messages". In
that thread, Tom Lane replied and gave some background information [1]
about the GUC name embedding versus substitution. In hindsight, I
think your original message was fine as-is, but there seem to be
examples of every kind of style, so whatever you do would have some
precedent.

The patch v4 LGTM.

======
[1] https://www.postgresql.org/message-id/2758485.1698848717%40sss.pgh.pa.us

Kind Regards,
Peter Smith.
Fujitsu Australia

Re: A recent message added to pg_upgade

From

Peter Smith

Date:

02 November 2023, 03:32:07

On Thu, Nov 2, 2023 at 2:25 PM Peter Smith <smithpb2250@gmail.com> wrote:
>
> On Thu, Nov 2, 2023 at 1:58 PM Kyotaro Horiguchi
> <horikyota.ntt@gmail.com> wrote:
> >
> > Thanks you for the comments!
> >
> > At Wed, 1 Nov 2023 18:08:19 +1100, Peter Smith <smithpb2250@gmail.com> wrote in
> > > Hi, here are some minor review comments for the v3 patch.
> > >
> > > ======
> > > src/backend/access/transam/xlog.c
> >
> ...
> > > 2.
> >
> > > + GUC_check_errdetail("\"max_slot_wal_keep_size\" must be set to -1
> > > during binary upgrade mode.");
> >
> > > Some of the other GUC_check_errdetail()'s do not include the GUC name
> > > in the translatable message text. Isn't that a preferred style?
> >
> > > SUGGESTION
> > > GUC_check_errdetail("\"%s\" must be set to -1 during binary upgrade mode.",
> > >   "max_slot_wal_keep_size");
> >
> > I believe that that style was adopted to minimize translatable
> > messages by consolidting identical ones that only differ in variable
> > names.  I see both versions in the tree. I didn't find necessity to
> > adopt this approach for this specific message, especially since I'm
> > skeptical about adding new messages that end with "must be set to -1
> > during binary upgrade mode".  (pg_upgrade sets synchronous_commit,
> > fsync and full_page_writes to "off".)
> >
> > However, some unique messages are in this style, so I'm fine with
> > using that style.  Revised accordingly.
> >
>
> Checking this patch yesterday prompted me to create a new thread
> questioning the inconsistencies of the "GUC names in messages". In
> that thread, Tom Lane replied and gave some background information [1]
> about the GUC name embedding versus substitution. In hindsight, I
> think your original message was fine as-is, but there seem to be
> examples of every kind of style, so whatever you do would have some
> precedent.
>
> The patch v4 LGTM.
>

To clarify, all the current code LGTM, but the patch is still missing
a guc_hook test case, right?

======
Kind Regards,
Peter Smith.
Fujitsu Australia

Re: A recent message added to pg_upgade

From

Michael Paquier

Date:

02 November 2023, 06:02:26

On Thu, Nov 02, 2023 at 02:32:07PM +1100, Peter Smith wrote:
> On Thu, Nov 2, 2023 at 2:25 PM Peter Smith <smithpb2250@gmail.com> wrote:
>> Checking this patch yesterday prompted me to create a new thread
>> questioning the inconsistencies of the "GUC names in messages". In
>> that thread, Tom Lane replied and gave some background information [1]
>> about the GUC name embedding versus substitution. In hindsight, I
>> think your original message was fine as-is, but there seem to be
>> examples of every kind of style, so whatever you do would have some
>> precedent.
>>
>> The patch v4 LGTM.
>
> To clarify, all the current code LGTM, but the patch is still missing
> a guc_hook test case, right?

-        NULL, NULL, NULL
+        check_max_slot_wal_keep_size, NULL, NULL

FWIW, I am +-0 with what you are proposing here.  I don't quite get
why one may want to enforce this specific GUC at upgrade.  Anyway, if
they do, I'd be curious to hear why this is required and this patch
would prevent them to do so.  Actually, this could be a good reason
for making the logical slot handling during pg_upgrade an option
rather than a mandatory thing.
--
Michael

Attachment

signature.asc

Re: A recent message added to pg_upgade

From

Amit Kapila

Date:

02 November 2023, 09:06:09

On Thu, Nov 2, 2023 at 11:32 AM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Thu, Nov 02, 2023 at 02:32:07PM +1100, Peter Smith wrote:
> > On Thu, Nov 2, 2023 at 2:25 PM Peter Smith <smithpb2250@gmail.com> wrote:
> >> Checking this patch yesterday prompted me to create a new thread
> >> questioning the inconsistencies of the "GUC names in messages". In
> >> that thread, Tom Lane replied and gave some background information [1]
> >> about the GUC name embedding versus substitution. In hindsight, I
> >> think your original message was fine as-is, but there seem to be
> >> examples of every kind of style, so whatever you do would have some
> >> precedent.
> >>
> >> The patch v4 LGTM.
> >
> > To clarify, all the current code LGTM, but the patch is still missing
> > a guc_hook test case, right?
>
> -               NULL, NULL, NULL
> +               check_max_slot_wal_keep_size, NULL, NULL
>
> FWIW, I am +-0 with what you are proposing here.  I don't quite get
> why one may want to enforce this specific GUC at upgrade.
>

I also can't think of a good reason to do so but OTOH, I can't imagine
all possible scenarios. As this setting is invalid or can cause
problems, it seems people favor preventing it. Alvaro also voted in
favor of preventing it, so we are considering to proceed with it
unless more people think otherwise.

--
With Regards,
Amit Kapila.

Re: A recent message added to pg_upgade

From

Amit Kapila

Date:

03 November 2023, 02:11:20

On Thu, Nov 2, 2023 at 2:36 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Nov 2, 2023 at 11:32 AM Michael Paquier <michael@paquier.xyz> wrote:
> >
> > On Thu, Nov 02, 2023 at 02:32:07PM +1100, Peter Smith wrote:
> > > On Thu, Nov 2, 2023 at 2:25 PM Peter Smith <smithpb2250@gmail.com> wrote:
> > >> Checking this patch yesterday prompted me to create a new thread
> > >> questioning the inconsistencies of the "GUC names in messages". In
> > >> that thread, Tom Lane replied and gave some background information [1]
> > >> about the GUC name embedding versus substitution. In hindsight, I
> > >> think your original message was fine as-is, but there seem to be
> > >> examples of every kind of style, so whatever you do would have some
> > >> precedent.
> > >>
> > >> The patch v4 LGTM.
> > >
> > > To clarify, all the current code LGTM, but the patch is still missing
> > > a guc_hook test case, right?
> >
> > -               NULL, NULL, NULL
> > +               check_max_slot_wal_keep_size, NULL, NULL
> >
> > FWIW, I am +-0 with what you are proposing here.  I don't quite get
> > why one may want to enforce this specific GUC at upgrade.
> >
>
> I also can't think of a good reason to do so but OTOH, I can't imagine
> all possible scenarios. As this setting is invalid or can cause
> problems, it seems people favor preventing it. Alvaro also voted in
> favor of preventing it, so we are considering to proceed with it
> unless more people think otherwise.
>

Now, that Michael also committed another similar change in commit
7021d3b176, it is better to be consistent in both cases. So, either we
should have check hooks for both parameters or follow another route
where we always forcibly override these parameters (which means the
user-provided values for these parameters will be ignored) in
pg_upgrade and document it. Yet another simple way is to simply
document the current behavior. In the future, if we see users complain
about this or have use cases to use these parameters during an
upgrade, we can accordingly try to adapt the behavior.

Thoughts?

--
With Regards,
Amit Kapila.

Re: A recent message added to pg_upgade

From

Peter Smith

Date:

03 November 2023, 02:33:26

On Fri, Nov 3, 2023 at 1:11 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Nov 2, 2023 at 2:36 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Thu, Nov 2, 2023 at 11:32 AM Michael Paquier <michael@paquier.xyz> wrote:
> > >
> > > On Thu, Nov 02, 2023 at 02:32:07PM +1100, Peter Smith wrote:
> > > > On Thu, Nov 2, 2023 at 2:25 PM Peter Smith <smithpb2250@gmail.com> wrote:
> > > >> Checking this patch yesterday prompted me to create a new thread
> > > >> questioning the inconsistencies of the "GUC names in messages". In
> > > >> that thread, Tom Lane replied and gave some background information [1]
> > > >> about the GUC name embedding versus substitution. In hindsight, I
> > > >> think your original message was fine as-is, but there seem to be
> > > >> examples of every kind of style, so whatever you do would have some
> > > >> precedent.
> > > >>
> > > >> The patch v4 LGTM.
> > > >
> > > > To clarify, all the current code LGTM, but the patch is still missing
> > > > a guc_hook test case, right?
> > >
> > > -               NULL, NULL, NULL
> > > +               check_max_slot_wal_keep_size, NULL, NULL
> > >
> > > FWIW, I am +-0 with what you are proposing here.  I don't quite get
> > > why one may want to enforce this specific GUC at upgrade.
> > >
> >
> > I also can't think of a good reason to do so but OTOH, I can't imagine
> > all possible scenarios. As this setting is invalid or can cause
> > problems, it seems people favor preventing it. Alvaro also voted in
> > favor of preventing it, so we are considering to proceed with it
> > unless more people think otherwise.
> >
>
> Now, that Michael also committed another similar change in commit
> 7021d3b176, it is better to be consistent in both cases. So, either we

I agree. Both patches are setting a special GUC value at the command
line, and both of them don't want the user to somehow override that.
Since the requirements are the same, I felt the implementations
(regardless if they use a guc hook or something else) should also be
done the same way. Yesterday I posted a review comment on the other
thread [1] (#2c) trying to express the same point about consistency.

======
[1] https://www.postgresql.org/message-id/CAHut%2BPsCzt%3DO3_xkyrskaZ3SMxaXoN4L5Z5CgvaGPNx3mXXxOQ%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia

Re: A recent message added to pg_upgade

From

Michael Paquier

Date:

05 November 2023, 00:03:06

On Fri, Nov 03, 2023 at 01:33:26PM +1100, Peter Smith wrote:
> On Fri, Nov 3, 2023 at 1:11 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>> Now, that Michael also committed another similar change in commit
>> 7021d3b176, it is better to be consistent in both cases. So, either we
>
> I agree. Both patches are setting a special GUC value at the command
> line, and both of them don't want the user to somehow override that.
> Since the requirements are the same, I felt the implementations
> (regardless if they use a guc hook or something else) should also be
> done the same way. Yesterday I posted a review comment on the other
> thread [1] (#2c) trying to express the same point about consistency.

Yeah, I certainly agree about consistency in the implementation for
both sides of the coin.

Nevertheless, I'm still +-0 on the GUC hook addition as I am wondering
if there could be a case where one would be interested in enforcing
the state of the GUCs anyway, and we'd prevent entirely that.  Another
thing that we can do for max_logical_replication_workers, rather than
a GUC hook, is to add a check on IsBinaryUpgrade in
ApplyLauncherRegister().  At least that would be consistent with what
we do for autovacuum as the apply worker is just a bgworker.
--
Michael

Attachment

signature.asc

Re: A recent message added to pg_upgade

From

Amit Kapila

Date:

07 November 2023, 02:29:46

On Sun, Nov 5, 2023 at 5:33 AM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Fri, Nov 03, 2023 at 01:33:26PM +1100, Peter Smith wrote:
> > On Fri, Nov 3, 2023 at 1:11 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >> Now, that Michael also committed another similar change in commit
> >> 7021d3b176, it is better to be consistent in both cases. So, either we
> >
> > I agree. Both patches are setting a special GUC value at the command
> > line, and both of them don't want the user to somehow override that.
> > Since the requirements are the same, I felt the implementations
> > (regardless if they use a guc hook or something else) should also be
> > done the same way. Yesterday I posted a review comment on the other
> > thread [1] (#2c) trying to express the same point about consistency.
>
> Yeah, I certainly agree about consistency in the implementation for
> both sides of the coin.
>
> Nevertheless, I'm still +-0 on the GUC hook addition as I am wondering
> if there could be a case where one would be interested in enforcing
> the state of the GUCs anyway, and we'd prevent entirely that.  Another
> thing that we can do for max_logical_replication_workers, rather than
> a GUC hook, is to add a check on IsBinaryUpgrade in
> ApplyLauncherRegister().
>

Do you mean to say that if 'IsBinaryUpgrade' is true then let's not
allow to launch launcher or apply worker? If so, I guess this won't be
any better than prohibiting at an early stage or explicitly overriding
those with internal values and documenting it, at least that way we
can be consistent for both variables (max_logical_replication_workers
and max_slot_wal_keep_size).

--
With Regards,
Amit Kapila.

Re: A recent message added to pg_upgade

From

Michael Paquier

Date:

07 November 2023, 02:41:58

On Tue, Nov 07, 2023 at 07:59:46AM +0530, Amit Kapila wrote:
> Do you mean to say that if 'IsBinaryUpgrade' is true then let's not
> allow to launch launcher or apply worker? If so, I guess this won't be
> any better than prohibiting at an early stage or explicitly overriding
> those with internal values and documenting it, at least that way we
> can be consistent for both variables (max_logical_replication_workers
> and max_slot_wal_keep_size).

Yes, I mean to paint an extra IsBinaryUpgrade before registering the
apply worker launcher.  That would be consistent with what we do for
autovacuum in the postmaster.
--
Michael

Attachment

signature.asc

Re: A recent message added to pg_upgade

From

Amit Kapila

Date:

07 November 2023, 10:46:21

On Tue, Nov 7, 2023 at 8:12 AM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Tue, Nov 07, 2023 at 07:59:46AM +0530, Amit Kapila wrote:
> > Do you mean to say that if 'IsBinaryUpgrade' is true then let's not
> > allow to launch launcher or apply worker? If so, I guess this won't be
> > any better than prohibiting at an early stage or explicitly overriding
> > those with internal values and documenting it, at least that way we
> > can be consistent for both variables (max_logical_replication_workers
> > and max_slot_wal_keep_size).
>
> Yes, I mean to paint an extra IsBinaryUpgrade before registering the
> apply worker launcher.  That would be consistent with what we do for
> autovacuum in the postmaster.
>

But then we don't need the hardcoded value of
max_logical_replication_workers as zero by pg_upgrade. I think doing
IsBinaryUpgrade for slots won't be neat, so we anyway need to keep
using the special value of max_slot_wal_keep_size GUC. Though the
handling for both won't be the same but I guess given the situation,
that seems like a reasonable thing to do. If we follow that then we
can have this special GUC hook only for max_slot_wal_keep_size GUC.

--
With Regards,
Amit Kapila.

Re: A recent message added to pg_upgade

From

Amit Kapila

Date:

09 November 2023, 04:23:07

On Tue, Nov 7, 2023 at 4:16 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Nov 7, 2023 at 8:12 AM Michael Paquier <michael@paquier.xyz> wrote:
> >
> > On Tue, Nov 07, 2023 at 07:59:46AM +0530, Amit Kapila wrote:
> > > Do you mean to say that if 'IsBinaryUpgrade' is true then let's not
> > > allow to launch launcher or apply worker? If so, I guess this won't be
> > > any better than prohibiting at an early stage or explicitly overriding
> > > those with internal values and documenting it, at least that way we
> > > can be consistent for both variables (max_logical_replication_workers
> > > and max_slot_wal_keep_size).
> >
> > Yes, I mean to paint an extra IsBinaryUpgrade before registering the
> > apply worker launcher.  That would be consistent with what we do for
> > autovacuum in the postmaster.
> >
>
> But then we don't need the hardcoded value of
> max_logical_replication_workers as zero by pg_upgrade. I think doing
> IsBinaryUpgrade for slots won't be neat, so we anyway need to keep
> using the special value of max_slot_wal_keep_size GUC. Though the
> handling for both won't be the same but I guess given the situation,
> that seems like a reasonable thing to do. If we follow that then we
> can have this special GUC hook only for max_slot_wal_keep_size GUC.
>

Michael, Horiguchi-San, and others, do you have any thoughts on what
is the best way to proceed?

--
With Regards,
Amit Kapila.

Re: A recent message added to pg_upgade

From

Michael Paquier

Date:

09 November 2023, 04:54:54

On Thu, Nov 09, 2023 at 09:53:07AM +0530, Amit Kapila wrote:
> On Tue, Nov 7, 2023 at 4:16 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>> But then we don't need the hardcoded value of
>> max_logical_replication_workers as zero by pg_upgrade. I think doing
>> IsBinaryUpgrade for slots won't be neat, so we anyway need to keep
>> using the special value of max_slot_wal_keep_size GUC. Though the
>> handling for both won't be the same but I guess given the situation,
>> that seems like a reasonable thing to do. If we follow that then we
>> can have this special GUC hook only for max_slot_wal_keep_size GUC.
>
> Michael, Horiguchi-San, and others, do you have any thoughts on what
> is the best way to proceed?

No problem for me to use a GUC hook for the WAL retention GUCs if you
feel strongly about it at the end, but I'd rather use an approach
based on IsBinaryUpgrade for the logical worker launcher to be
consistent with autovacuum (where there's also an argument to refactor
it to use a bgworker registration, centralizing the checks on
IsBinaryUpgrade for all bgworkers, but that would be material for a
different thread, if there's interest in doing that).

The two situations we are trying to prevent (slot invalidation and
bgworker launch) can be triggered under different contexts, so they
don't have to use the same mechanisms to prevent what should not
happen during an upgrade.
--
Michael

Attachment

signature.asc

Re: A recent message added to pg_upgade

From

Peter Smith

Date:

09 November 2023, 06:04:28

On Thu, Nov 9, 2023 at 3:55 PM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Thu, Nov 09, 2023 at 09:53:07AM +0530, Amit Kapila wrote:
> > On Tue, Nov 7, 2023 at 4:16 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >> But then we don't need the hardcoded value of
> >> max_logical_replication_workers as zero by pg_upgrade. I think doing
> >> IsBinaryUpgrade for slots won't be neat, so we anyway need to keep
> >> using the special value of max_slot_wal_keep_size GUC. Though the
> >> handling for both won't be the same but I guess given the situation,
> >> that seems like a reasonable thing to do. If we follow that then we
> >> can have this special GUC hook only for max_slot_wal_keep_size GUC.
> >
> > Michael, Horiguchi-San, and others, do you have any thoughts on what
> > is the best way to proceed?
>
> No problem for me to use a GUC hook for the WAL retention GUCs if you
> feel strongly about it at the end, but I'd rather use an approach
> based on IsBinaryUpgrade for the logical worker launcher to be
> consistent with autovacuum (where there's also an argument to refactor
> it to use a bgworker registration, centralizing the checks on
> IsBinaryUpgrade for all bgworkers, but that would be material for a
> different thread, if there's interest in doing that).
>
> The two situations we are trying to prevent (slot invalidation and
> bgworker launch) can be triggered under different contexts, so they
> don't have to use the same mechanisms to prevent what should not
> happen during an upgrade.
> --

Having a GUC hook for the "max_slot_wal_keep_size" seemed OK to me. If
the user overrides a GUC value (admittedly, maybe there is no reason
why they would want to) then at least the hook will give an error,
rather than us silently overwriting the user's value with -1.

So, patch v4 LGTM, except it is better to include a test case.

~

Meanwhile, if preventing the apply worker launch is considered better
to be implemented differently in ApplyLauncherRegister, then so be it.

======
Kind Regards,
Peter Smith.
Fujitsu Australia

Re: A recent message added to pg_upgade

From

Kyotaro Horiguchi

Date:

09 November 2023, 06:10:09

At Thu, 9 Nov 2023 09:53:07 +0530, Amit Kapila <amit.kapila16@gmail.com> wrote in 
> Michael, Horiguchi-San, and others, do you have any thoughts on what
> is the best way to proceed?

As I previously mentioned, I believe that if rejection is to be the
course of action, it would be best to proceed with it sooner rather
than later. On the other hand, I am concerned about the need for users
to perform extra steps depending on the source cluster
conrfiguration. Therefore, another possible approach could be to
simply ignore the given settings in the assignment hook rather than
rejecting by the check hook, and forcibuly apply -1.

What do you think about this third approach?

I haven't checked this with pg_upgrade, but a standalone postmaster
would emit the following messages.

> postgres -b -c max_slot_wal_keep_size=-1
> LOG:  "max_slot_wal_keep_size" is foced to set to -1 during binary upgrade mode.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Attachment

inhibit_m_s_w_k_s_during_upgrade_b_1.txt

Re: A recent message added to pg_upgade

From

Amit Kapila

Date:

09 November 2023, 06:30:59

On Thu, Nov 9, 2023 at 11:40 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
>
> At Thu, 9 Nov 2023 09:53:07 +0530, Amit Kapila <amit.kapila16@gmail.com> wrote in
> > Michael, Horiguchi-San, and others, do you have any thoughts on what
> > is the best way to proceed?
>
> As I previously mentioned, I believe that if rejection is to be the
> course of action, it would be best to proceed with it sooner rather
> than later. On the other hand, I am concerned about the need for users
> to perform extra steps depending on the source cluster
> conrfiguration. Therefore, another possible approach could be to
> simply ignore the given settings in the assignment hook rather than
> rejecting by the check hook, and forcibuly apply -1.
>
> What do you think about this third approach?
>

I have also proposed that as one of the alternatives but didn't get
many votes. And, I think if the user is passing a special value of
max_slot_wal_keep_size during the upgrade, it has to be a special
case, and rejecting it upfront doesn't seem unreasonable to me.

--
With Regards,
Amit Kapila.

Re: A recent message added to pg_upgade

From

Kyotaro Horiguchi

Date:

09 November 2023, 06:42:30

At Thu, 9 Nov 2023 12:00:59 +0530, Amit Kapila <amit.kapila16@gmail.com> wrote in 
> I have also proposed that as one of the alternatives but didn't get
> many votes. And, I think if the user is passing a special value of
> max_slot_wal_keep_size during the upgrade, it has to be a special
> case, and rejecting it upfront doesn't seem unreasonable to me.

Oops. Sorry, and understood.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Re: A recent message added to pg_upgade

From

Michael Paquier

Date:

09 November 2023, 07:08:38

On Thu, Nov 09, 2023 at 05:04:28PM +1100, Peter Smith wrote:
> Having a GUC hook for the "max_slot_wal_keep_size" seemed OK to me. If
> the user overrides a GUC value (admittedly, maybe there is no reason
> why they would want to) then at least the hook will give an error,
> rather than us silently overwriting the user's value with -1.
>
> So, patch v4 LGTM, except it is better to include a test case.

Where's this v4?  I may be missing, but it does not seem to be
attached to this thread..
--
Michael

Attachment

signature.asc

Re: A recent message added to pg_upgade

From

Amit Kapila

Date:

09 November 2023, 07:42:54

On Thu, Nov 9, 2023 at 12:38 PM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Thu, Nov 09, 2023 at 05:04:28PM +1100, Peter Smith wrote:
> > Having a GUC hook for the "max_slot_wal_keep_size" seemed OK to me. If
> > the user overrides a GUC value (admittedly, maybe there is no reason
> > why they would want to) then at least the hook will give an error,
> > rather than us silently overwriting the user's value with -1.
> >
> > So, patch v4 LGTM, except it is better to include a test case.
>
> Where's this v4?
>

I think it is in an email[1]. I can take care of this unless we see
some opposition to this idea.

[1] - https://www.postgresql.org/message-id/20231102.115834.1012152975995247837.horikyota.ntt%40gmail.com

--
With Regards,
Amit Kapila.

Re: A recent message added to pg_upgade

From

Michael Paquier

Date:

09 November 2023, 07:52:32

On Thu, Nov 09, 2023 at 01:12:54PM +0530, Amit Kapila wrote:
> I think it is in an email[1].

Noted.

> I can take care of this unless we see some opposition to this idea.

Thanks!
--
Michael

Attachment

signature.asc

RE: A recent message added to pg_upgade

From

"Hayato Kuroda (Fujitsu)"

Date:

09 November 2023, 08:20:40

Dear Horiguchi-san, hackers,

> Thanks you for the comments!

Thanks for updating the patch!
I'm not sure it is intentional, but you might miss my post...I suggested to add a
testcase.

I attached the updated version which is almost the same as Horiguchi-san's one,
but has a test. How do you think? Do you have other idea for testing?


Best Regards,
Hayato Kuroda
FUJITSU LIMITED

Attachment

inhibit_m_s_w_k_s_during_upgrade_5.txt

Re: A recent message added to pg_upgade

From

Alvaro Herrera

Date:

09 November 2023, 10:39:21

On 2023-Nov-02, Kyotaro Horiguchi wrote:

> diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
> index b541be8eec..46833f6ecd 100644
> --- a/src/backend/access/transam/xlog.c
> +++ b/src/backend/access/transam/xlog.c
> @@ -2063,6 +2063,29 @@ check_wal_segment_size(int *newval, void **extra, GucSource source)
>      return true;
>  }
>  
> +/*
> + * GUC check_hook for max_slot_wal_keep_size
> + *
> + * If WALs needed by logical replication slots are deleted, these slots become
> + * inoperable. During a binary upgrade, pg_upgrade sets this variable to -1 via
> + * the command line in an attempt to prevent such deletions, but users have
> + * ways to override it. To ensure the successful completion of the upgrade,
> + * it's essential to keep this variable unaltered.  See
> + * InvalidatePossiblyObsoleteSlot() and start_postmaster() in pg_upgrade for
> + * more details.
> + */
> +bool
> +check_max_slot_wal_keep_size(int *newval, void **extra, GucSource source)
> +{
> +    if (IsBinaryUpgrade && *newval != -1)
> +    {
> +        GUC_check_errdetail("\"%s\" must be set to -1 during binary upgrade mode.",
> +            "max_slot_wal_keep_size");
> +        return false;
> +    }
> +    return true;
> +}

One sentence in that comment reads weird.  I'd do this:

s/To ensure the ... unaltered/This check callback ensures the value is
not overridden by the user/


> diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
> index 99823df3c7..5c3d2b1082 100644
> --- a/src/backend/replication/slot.c
> +++ b/src/backend/replication/slot.c
> @@ -1424,18 +1424,12 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
>          SpinLockRelease(&s->mutex);
>  
>          /*
> -         * The logical replication slots shouldn't be invalidated as
> -         * max_slot_wal_keep_size GUC is set to -1 during the upgrade.
> -         *
> -         * The following is just a sanity check.
> +         * check_max_slot_wal_keep_size() ensures max_slot_wal_keep_size is set
> +         * to -1, so, slot invalidation for logical slots shouldn't happen
> +         * during an upgrade. At present, only logical slots really require
> +         * this.
>           */
> -        if (*invalidated && SlotIsLogical(s) && IsBinaryUpgrade)
> -        {
> -            ereport(ERROR,
> -                    errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> -                    errmsg("replication slots must not be invalidated during the upgrade"),
> -                    errhint("\"max_slot_wal_keep_size\" must be set to -1 during the upgrade"));
> -        }
> +        Assert (!(*invalidated && SlotIsLogical(s) && IsBinaryUpgrade));

I think it's worth adding a comment here, pointing to
check_old_cluster_for_valid_slots() verifying that no
already-invalidated slots exist before the upgrade starts.

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/

Re: A recent message added to pg_upgade

From

Amit Kapila

Date:

09 November 2023, 13:28:28

On Thu, Nov 9, 2023 at 4:09 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>
> On 2023-Nov-02, Kyotaro Horiguchi wrote:
>
> > diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
> > index b541be8eec..46833f6ecd 100644
> > --- a/src/backend/access/transam/xlog.c
> > +++ b/src/backend/access/transam/xlog.c
> > @@ -2063,6 +2063,29 @@ check_wal_segment_size(int *newval, void **extra, GucSource source)
> >       return true;
> >  }
> >
> > +/*
> > + * GUC check_hook for max_slot_wal_keep_size
> > + *
> > + * If WALs needed by logical replication slots are deleted, these slots become
> > + * inoperable. During a binary upgrade, pg_upgrade sets this variable to -1 via
> > + * the command line in an attempt to prevent such deletions, but users have
> > + * ways to override it. To ensure the successful completion of the upgrade,
> > + * it's essential to keep this variable unaltered.  See
> > + * InvalidatePossiblyObsoleteSlot() and start_postmaster() in pg_upgrade for
> > + * more details.
> > + */
> > +bool
> > +check_max_slot_wal_keep_size(int *newval, void **extra, GucSource source)
> > +{
> > +     if (IsBinaryUpgrade && *newval != -1)
> > +     {
> > +             GUC_check_errdetail("\"%s\" must be set to -1 during binary upgrade mode.",
> > +                     "max_slot_wal_keep_size");
> > +             return false;
> > +     }
> > +     return true;
> > +}
>
> One sentence in that comment reads weird.  I'd do this:
>
> s/To ensure the ... unaltered/This check callback ensures the value is
> not overridden by the user/
>

These comments appear mostly repetitive to what is already mentioned
in start_postmaster(). So, I have changed those referred to already
written comments, and slightly adjusted the comments at another place.
See attached. Personally, I don't see the need for a test for this, so
removed the same but can add it back if you or others think so.

--
With Regards,
Amit Kapila.

Attachment

inhibit_m_s_w_k_s_during_upgrade_6.patch

Re: A recent message added to pg_upgade

From

Alvaro Herrera

Date:

09 November 2023, 14:24:23

On 2023-Nov-09, Amit Kapila wrote:

> These comments appear mostly repetitive to what is already mentioned
> in start_postmaster(). So, I have changed those referred to already
> written comments, and slightly adjusted the comments at another place.
> See attached.

I'd still rather mention check_old_cluster_for_valid_slots() just above
the Assert() in InvalidatePossiblyObsoleteSlot().  It looks too bare to
me otherwise.

> Personally, I don't see the need for a test for this, so removed the
> same but can add it back if you or others think so.

I'm neutral on having a test for this.  I'm not sure this is easy to
break unintentionally.  OTOH the test is cheap, since it only has to run
pg_upgrade itself and not, say, another initdb.  On the (as Robert says)
third hand, would we have tests for each possible GUC that we'd like not
to be changed during pg_upgrade?  I suspect not, which suggests we don't
want this one either.

-- 
Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"El Maquinismo fue proscrito so pena de cosquilleo hasta la muerte"
(Ijon Tichy en Viajes, Stanislaw Lem)

Re: A recent message added to pg_upgade

From

Michael Paquier

Date:

10 November 2023, 02:20:41

On Thu, Nov 09, 2023 at 04:52:32PM +0900, Michael Paquier wrote:
> Thanks!

Also, please see also a patch about switching the logirep launcher to
rely on IsBinaryUpgrade to prevent its startup.  Any thoughts about
that?
--
Michael

Attachment

Re: A recent message added to pg_upgade

From

Amit Kapila

Date:

10 November 2023, 09:57:25

On Fri, Nov 10, 2023 at 7:50 AM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Thu, Nov 09, 2023 at 04:52:32PM +0900, Michael Paquier wrote:
> > Thanks!
>
> Also, please see also a patch about switching the logirep launcher to
> rely on IsBinaryUpgrade to prevent its startup.  Any thoughts about
> that?
>

Preventing these
+ * processes from starting while upgrading avoids any activity on the new
+ * cluster before the physical files are put in place, which could cause
+ * corruption on the new cluster upgrading to.

I don't think this comment is correct because there won't be any apply
activity on the new cluster as after restoration subscriptions should
be disabled. On the old cluster, I think one problem is that the
origins may move forward after we copy them which can cause data
inconsistency issues. The other is that we may not prefer to generate
additional data and WAL during the upgrade. Also, I am not completely
sure about using the word 'corruption' in this context.

--
With Regards,
Amit Kapila.

Re: A recent message added to pg_upgade

From

Michael Paquier

Date:

11 November 2023, 00:16:22

On Fri, Nov 10, 2023 at 03:27:25PM +0530, Amit Kapila wrote:
> I don't think this comment is correct because there won't be any apply
> activity on the new cluster as after restoration subscriptions should
> be disabled. On the old cluster, I think one problem is that the
> origins may move forward after we copy them which can cause data
> inconsistency issues. The other is that we may not prefer to generate
> additional data and WAL during the upgrade. Also, I am not completely
> sure about using the word 'corruption' in this context.

What is your suggestion here?  Would it be better to just aim for
simplicity and just say that we don't want it to run because "it can
lead to some inconsistent behaviors"?
--
Michael

Attachment

signature.asc

Re: A recent message added to pg_upgade

From

Amit Kapila

Date:

13 November 2023, 03:15:12

On Sat, Nov 11, 2023 at 5:46 AM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Fri, Nov 10, 2023 at 03:27:25PM +0530, Amit Kapila wrote:
> > I don't think this comment is correct because there won't be any apply
> > activity on the new cluster as after restoration subscriptions should
> > be disabled. On the old cluster, I think one problem is that the
> > origins may move forward after we copy them which can cause data
> > inconsistency issues. The other is that we may not prefer to generate
> > additional data and WAL during the upgrade. Also, I am not completely
> > sure about using the word 'corruption' in this context.
>
> What is your suggestion here?  Would it be better to just aim for
> simplicity and just say that we don't want it to run because "it can
> lead to some inconsistent behaviors"?
>

I think we can be specific about logical replication stuff. I have not
done any study on autovacuum behavior related to this, so we can
update about it separately if required. I could think of something
like the following:

-       /* Use -b to disable autovacuum. */
+       /*
+        * Use -b to disable autovacuum and logical replication launcher
+        * (effective in PG17 or later for the latter).
+        *
+        * Logical replication workers can stream data during the
upgrade which can
+        * cause replication origins to move forward after we have copied them.
+        * It can cause the system to request the data which is already present
+        * in the new cluster.
+        */

Now, ideally, such a comment change makes more sense along with the
main patch, so either we can go without this comment change or
probably wait till the main patch is ready and merge just before it or
along with it. I am fine either way.

BTW, it is not clear to me another part of the comment "... for the
latter" in the proposed wording. Is there any typo there or am I
missing something?

--
With Regards,
Amit Kapila.

Re: A recent message added to pg_upgade

From

Michael Paquier

Date:

13 November 2023, 04:49:39

On Mon, Nov 13, 2023 at 08:45:12AM +0530, Amit Kapila wrote:
> I think we can be specific about logical replication stuff. I have not
> done any study on autovacuum behavior related to this, so we can
> update about it separately if required.

Autovacuum, as far as I recall, could decide to do some work before
files are physically copied from the old to the new cluster,
corrupting the new cluster.  Per 76dd09bbec89:

+        *  If we have lost the autovacuum launcher, try to start a new one.
+        *  We don't want autovacuum to run in binary upgrade mode because
+        *  autovacuum might update relfrozenxid for empty tables before
+        *  the physical files are put in place.

> -       /* Use -b to disable autovacuum. */
> +       /*
> +        * Use -b to disable autovacuum and logical replication launcher
> +        * (effective in PG17 or later for the latter).
> +        *
> +        * Logical replication workers can stream data during the
> upgrade which can
> +        * cause replication origins to move forward after we have copied them.
> +        * It can cause the system to request the data which is already present
> +        * in the new cluster.
> +        */
>
> Now, ideally, such a comment change makes more sense along with the
> main patch, so either we can go without this comment change or
> probably wait till the main patch is ready and merge just before it or
> along with it. I am fine either way.

Another location would be to document that stuff directly in
launcher.c where the check for IsBinaryUpgrade would be added.  You
are right that it makes little sense to document that now, so how
about:
1) keeping pg_upgrade.c minimal, say:
-       /* Use -b to disable autovacuum. */
+       /*
+        * Use -b to disable autovacuum and logical replication
+        * launcher (in 17~).
+        */
With a removal of the comment block related to
max_logical_replication_workers=0?
2) Document that in ApplyLauncherRegister() as part of the main patch
for the subscribers?

> BTW, it is not clear to me another part of the comment "... for the
> latter" in the proposed wording. Is there any typo there or am I
> missing something?

The "latter" refers to the logirep launcher here, as -b would affect
it only in 17~ with the patch I sent previously.
--
Michael

Attachment

signature.asc

Re: A recent message added to pg_upgade

From

Amit Kapila

Date:

15 November 2023, 02:28:06

On Mon, Nov 13, 2023 at 10:19 AM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Mon, Nov 13, 2023 at 08:45:12AM +0530, Amit Kapila wrote:
> > I think we can be specific about logical replication stuff. I have not
> > done any study on autovacuum behavior related to this, so we can
> > update about it separately if required.
>
> Autovacuum, as far as I recall, could decide to do some work before
> files are physically copied from the old to the new cluster,
> corrupting the new cluster.  Per 76dd09bbec89:
>
> +        *  If we have lost the autovacuum launcher, try to start a new one.
> +        *  We don't want autovacuum to run in binary upgrade mode because
> +        *  autovacuum might update relfrozenxid for empty tables before
> +        *  the physical files are put in place.
>
> > -       /* Use -b to disable autovacuum. */
> > +       /*
> > +        * Use -b to disable autovacuum and logical replication launcher
> > +        * (effective in PG17 or later for the latter).
> > +        *
> > +        * Logical replication workers can stream data during the
> > upgrade which can
> > +        * cause replication origins to move forward after we have copied them.
> > +        * It can cause the system to request the data which is already present
> > +        * in the new cluster.
> > +        */
> >
> > Now, ideally, such a comment change makes more sense along with the
> > main patch, so either we can go without this comment change or
> > probably wait till the main patch is ready and merge just before it or
> > along with it. I am fine either way.
>
> Another location would be to document that stuff directly in
> launcher.c where the check for IsBinaryUpgrade would be added.  You
> are right that it makes little sense to document that now, so how
> about:
> 1) keeping pg_upgrade.c minimal, say:
> -       /* Use -b to disable autovacuum. */
> +       /*
> +        * Use -b to disable autovacuum and logical replication
> +        * launcher (in 17~).
> +        */
> With a removal of the comment block related to
> max_logical_replication_workers=0?
> 2) Document that in ApplyLauncherRegister() as part of the main patch
> for the subscribers?
>

I am fine with this but there is no harm in doing this before or along
with the main patch. As of now, I don't see any problem but as the
main patch is still under review, so thought we could even wait for
the patch to become "Ready For Committer".

--
With Regards,
Amit Kapila.

Re: A recent message added to pg_upgade

From

Michael Paquier

Date:

15 November 2023, 02:30:41

On Wed, Nov 15, 2023 at 07:58:06AM +0530, Amit Kapila wrote:
> I am fine with this but there is no harm in doing this before or along
> with the main patch. As of now, I don't see any problem but as the
> main patch is still under review, so thought we could even wait for
> the patch to become "Ready For Committer".

WFM to wait until the other patch is ready before doing something
here.
--
Michael

Attachment

signature.asc

Re: A recent message added to pg_upgade

From

Michael Paquier

Date:

10 January 2024, 04:41:03

On Wed, Nov 15, 2023 at 07:58:06AM +0530, Amit Kapila wrote:
> I am fine with this but there is no harm in doing this before or along
> with the main patch. As of now, I don't see any problem but as the
> main patch is still under review, so thought we could even wait for
> the patch to become "Ready For Committer".

My apologies for the delay.

Now that 9a17be1e244a is in the tree, please find attached a patch to
restrict the startup of the launcher using IsBinaryUpgrade in
ApplyLauncherRegister(), with adjustments to the surrounding comments.

Was there anything else you wanted to be covered and/or updated?
--
Michael

Attachment

Re: A recent message added to pg_upgade

From

Amit Kapila

Date:

10 January 2024, 12:32:12

On Wed, Jan 10, 2024 at 10:11 AM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Wed, Nov 15, 2023 at 07:58:06AM +0530, Amit Kapila wrote:
> > I am fine with this but there is no harm in doing this before or along
> > with the main patch. As of now, I don't see any problem but as the
> > main patch is still under review, so thought we could even wait for
> > the patch to become "Ready For Committer".
>
> My apologies for the delay.
>
> Now that 9a17be1e244a is in the tree, please find attached a patch to
> restrict the startup of the launcher using IsBinaryUpgrade in
> ApplyLauncherRegister(), with adjustments to the surrounding comments.
>

- if (max_logical_replication_workers == 0)
+ /*
+ * The logical replication launcher is disabled during binary upgrades,
+ * as logical replication workers can stream data during the upgrade
+ * which can cause replication origins to move forward after we have
+ * copied them.  It can cause the system to request the data which is
+ * already present in the new cluster.
+ */
+ if (max_logical_replication_workers == 0 || IsBinaryUpgrade)

This comment is not very clear to me. The first part of the sentence
can't apply to the new cluster as after the upgrade, subscriptions
will be disabled and the second part talks about requesting the wrong
data in the new cluster. As per my understanding, the problem here is
that, on the old cluster, the origins may move forward after we copy
them and then we copy physical files. Now, in the new cluster when we
try to request the data, it will be already present.

> Was there anything else you wanted to be covered and/or updated?
>

No, only this patch.

--
With Regards,
Amit Kapila.

Re: A recent message added to pg_upgade

From

Michael Paquier

Date:

11 January 2024, 03:37:58

On Wed, Jan 10, 2024 at 06:02:12PM +0530, Amit Kapila wrote:
> - if (max_logical_replication_workers == 0)
> + /*
> + * The logical replication launcher is disabled during binary upgrades,
> + * as logical replication workers can stream data during the upgrade
> + * which can cause replication origins to move forward after we have
> + * copied them.  It can cause the system to request the data which is
> + * already present in the new cluster.
> + */
> + if (max_logical_replication_workers == 0 || IsBinaryUpgrade)
>
> This comment is not very clear to me. The first part of the sentence
> can't apply to the new cluster as after the upgrade, subscriptions
> will be disabled and the second part talks about requesting the wrong
> data in the new cluster. As per my understanding, the problem here is
> that, on the old cluster, the origins may move forward after we copy
> them and then we copy physical files. Now, in the new cluster when we
> try to request the data, it will be already present.

As far as I understand your complaint is about being more precise
about where the workers could run when we do an upgrade.  My patch
covers the reason why it would be a problem, and I agree that it could
be more detailed.

Hence, how about something like that:
"The logical replication launcher is disabled during binary upgrades,
as a logical replication workers running on the cluster upgrading from
could cause replication origins to move forward after they are copied
to the cluster upgrading to, creating potentially conflicts with the
physical files copied over."

If you have a better suggestion, feel free.
--
Michael

Attachment

signature.asc

Re: A recent message added to pg_upgade

From

Amit Kapila

Date:

11 January 2024, 05:55:44

On Thu, Jan 11, 2024 at 9:08 AM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Wed, Jan 10, 2024 at 06:02:12PM +0530, Amit Kapila wrote:
> > - if (max_logical_replication_workers == 0)
> > + /*
> > + * The logical replication launcher is disabled during binary upgrades,
> > + * as logical replication workers can stream data during the upgrade
> > + * which can cause replication origins to move forward after we have
> > + * copied them.  It can cause the system to request the data which is
> > + * already present in the new cluster.
> > + */
> > + if (max_logical_replication_workers == 0 || IsBinaryUpgrade)
> >
> > This comment is not very clear to me. The first part of the sentence
> > can't apply to the new cluster as after the upgrade, subscriptions
> > will be disabled and the second part talks about requesting the wrong
> > data in the new cluster. As per my understanding, the problem here is
> > that, on the old cluster, the origins may move forward after we copy
> > them and then we copy physical files. Now, in the new cluster when we
> > try to request the data, it will be already present.
>
> As far as I understand your complaint is about being more precise
> about where the workers could run when we do an upgrade.  My patch
> covers the reason why it would be a problem, and I agree that it could
> be more detailed.
>
> Hence, how about something like that:
> "The logical replication launcher is disabled during binary upgrades,
> as a logical replication workers running on the cluster upgrading from
> could cause replication origins to move forward after they are copied
> to the cluster upgrading to, creating potentially conflicts with the
> physical files copied over."
>

Looks better. One minor nitpick: /potentially/potential

--
With Regards,
Amit Kapila.

Re: A recent message added to pg_upgade

From

Michael Paquier

Date:

11 January 2024, 06:04:42

On Thu, Jan 11, 2024 at 11:25:44AM +0530, Amit Kapila wrote:
> On Thu, Jan 11, 2024 at 9:08 AM Michael Paquier <michael@paquier.xyz> wrote:
>> Hence, how about something like that:
>> "The logical replication launcher is disabled during binary upgrades,
>> as a logical replication workers running on the cluster upgrading from
>> could cause replication origins to move forward after they are copied
>> to the cluster upgrading to, creating potentially conflicts with the
>> physical files copied over."
>
> Looks better. One minor nitpick: /potentially/potential

Sure, WFM.  Let's wait a bit and see if others have more comments to
offer.
--
Michael

Attachment

signature.asc

Re: A recent message added to pg_upgade

From

Alvaro Herrera

Date:

11 January 2024, 12:04:18

On 2024-Jan-11, Michael Paquier wrote:

> Hence, how about something like that:
> "The logical replication launcher is disabled during binary upgrades,
> as a logical replication workers running on the cluster upgrading from
> could cause replication origins to move forward after they are copied
> to the cluster upgrading to, creating potentially conflicts with the
> physical files copied over." 

"The logical replication launcher is disabled during binary upgrades, to
avoid logical replication workers running on the source cluster.  That
would cause replication origins to move forward after having been copied
to the target cluster, potentially creating conflicts with the copied
data files."

-- 
Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"Linux transformó mi computadora, de una `máquina para hacer cosas',
en un aparato realmente entretenido, sobre el cual cada día aprendo
algo nuevo" (Jaime Salinas)

Re: A recent message added to pg_upgade

From

Tom Lane

Date:

11 January 2024, 15:01:16

Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> "The logical replication launcher is disabled during binary upgrades, to
> avoid logical replication workers running on the source cluster.  That
> would cause replication origins to move forward after having been copied
> to the target cluster, potentially creating conflicts with the copied
> data files."

"avoid logical replication workers running" still seems like shaky
grammar.  Perhaps s/avoid/avoid having/, or write "to prevent logical
replication workers from running ...".

Also perhaps s/would/could/.

Otherwise +1

            regards, tom lane

Re: A recent message added to pg_upgade

From

Michael Paquier

Date:

11 January 2024, 23:37:49

On Thu, Jan 11, 2024 at 10:01:16AM -0500, Tom Lane wrote:
> Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
>> "The logical replication launcher is disabled during binary upgrades, to
>> avoid logical replication workers running on the source cluster.  That
>> would cause replication origins to move forward after having been copied
>> to the target cluster, potentially creating conflicts with the copied
>> data files."
>
> "avoid logical replication workers running" still seems like shaky
> grammar.  Perhaps s/avoid/avoid having/, or write "to prevent logical
> replication workers from running ...".

After sleeping on it, your last suggestion sounds better to me, so
I've incorporated that with Alvaro's wording (also cleaner than what I
have posted), and applied the patch on HEAD.

> Also perhaps s/would/could/.

Yep.
--
Michael

Attachment

signature.asc