Re: [PoC] pg_upgrade: allow to upgrade publisher node - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: [PoC] pg_upgrade: allow to upgrade publisher node
Date
Msg-id CAA4eK1JVg9Kv=-1_Do9K5xWR-pUD6bpS38FOfQZ=smvOHnKErQ@mail.gmail.com
Whole thread Raw
In response to RE: [PoC] pg_upgrade: allow to upgrade publisher node  ("Hayato Kuroda (Fujitsu)" <kuroda.hayato@fujitsu.com>)
Responses Re: [PoC] pg_upgrade: allow to upgrade publisher node
List pgsql-hackers
On Wed, Nov 29, 2023 at 2:56 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
>
> > > >
> > > > Pushed!
> > >
> > > Hi all, the CF entry for this is marked RfC, and CI is trying to apply
> > > the last patch committed. Is there further work that needs to be
> > > re-attached and/or rebased?
> > >
> >
> > No. I have marked it as committed.
> >
>
> I found another failure related with the commit [1]. I think it is caused by the
> autovacuum. I want to propose a patch which disables the feature for old publisher.
>
> More detail, please see below.
>
> # Analysis of the failure
>
> Summary: this failure occurs when the autovacuum starts after the subscription
> is disabled but before doing pg_upgrade.
>
> According to the regress file, it unexpectedly failed the pg_upgrade [2]. There are
> no possibilities for slots are invalidated, so some WALs seemed to be generated
> after disabling the subscriber.
>
> Also, server log caused by oldpub said that autovacuum worker was terminated when
> it stopped. This was occurred after walsender released the logical slots. WAL records
> caused by autovacuum workers could not be consumed by the slots, so that upgrading
> function returned false.
>
> # How to reproduce
>
> I made a small file for reproducing the failure. Please see reproduce.txt. This contains
> changes for launching autovacuum worker very often and for ensuring actual works are
> done. After applying it, I could reproduce the same failure every time.
>
> # How to fix
>
> I think it is sufficient to fix only the test code.
> The easiest way is to disable the autovacuum on old publisher. PSA the patch file.
>

Agreed, for now, we should change the test as you proposed. I'll take
care of that. However, I wonder, if we should also ensure that
autovacuum or any other worker is shut down before walsender processes
the last set of WAL before shutdown. We can analyze more on this and
probably start a separate thread to discuss this point.


--
With Regards,
Amit Kapila.



pgsql-hackers by date:

Previous
From: "Hayato Kuroda (Fujitsu)"
Date:
Subject: RE: Is this a problem in GenericXLogFinish()?
Next
From: Tom Lane
Date:
Subject: Re: proposal: possibility to read dumped table's name from file