RE: Build-farm - intermittent error in 031_column_list.pl - Mailing list pgsql-hackers

From osumi.takamichi@fujitsu.com
Subject RE: Build-farm - intermittent error in 031_column_list.pl
Date
Msg-id TYCPR01MB837380308F399111E27076DDEDD69@TYCPR01MB8373.jpnprd01.prod.outlook.com
In response to Re: Build-farm - intermittent error in 031_column_list.pl  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
On Tuesday, May 24, 2022 9:50 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Sat, May 21, 2022 at 9:03 AM Amit Kapila <amit.kapila16@gmail.com>
> wrote:
> >
> > On Fri, May 20, 2022 at 4:01 PM Tomas Vondra
> > <tomas.vondra@enterprisedb.com> wrote:
> >
> > > Also, we'd probably have to ignore RelationSyncEntry for a while,
> > > which seems quite expensive.
> > >
> >
> > Yet another option could be that we continue using a historic snapshot
> > but ignore publications that are not found for the purpose of
> > computing RelSyncEntry attributes. We won't mark such an entry as
> > valid till all the publications are loaded without anything missing. I
> > think such cases in practice won't be enough to matter. This means we
> > won't publish operations on tables corresponding to that publication
> > till we found such a publication and that seems okay.
> >
> 
> Attached, find the patch to show what I have in mind for this. Today, we have
> received a bug report with a similar symptom [1] and that should also be fixed
> with this. The reported bug should also be fixed with this.
> 
> Thoughts?
Hi,


I agree with this direction.
I think this approach solves the issue fundamentally
and is better than the first approach of adding several
wait_for_catchup calls to the test. With the first approach,
we would still need to take care to avoid the same issue
whenever we write a new (similar) test.
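
To restate my understanding of the proposed behavior, here is a toy model in Python. It is only a sketch and not the patch's code: the class, the function names and the set-based "catalog" are all hypothetical stand-ins for the RelSyncEntry handling in pgoutput.

```python
from dataclasses import dataclass, field


@dataclass
class RelSyncEntry:
    # Hypothetical stand-in for the cache entry discussed in this thread.
    valid: bool = False
    pubnames: list = field(default_factory=list)


def rebuild_entry(entry, subscribed_pubs, catalog):
    """Recompute the entry from the publications visible in the snapshot.

    catalog is a set of publication names visible to the (historic)
    snapshot. Missing publications are skipped instead of raising an
    error, and the entry stays invalid until none are missing.
    """
    found = [name for name in subscribed_pubs if name in catalog]
    entry.pubnames = found
    entry.valid = len(found) == len(subscribed_pubs)
    return entry


def should_publish(entry, pubname):
    # Changes are published only for publications that were actually found.
    return pubname in entry.pubnames


# pub2 is not visible yet: no error, but the entry is not marked valid.
entry = rebuild_entry(RelSyncEntry(), ["pub1", "pub2"], {"pub1"})
# Once pub2 becomes visible, the next rebuild marks the entry valid.
entry_later = rebuild_entry(RelSyncEntry(), ["pub1", "pub2"], {"pub1", "pub2"})
```

In this model an invalid entry is simply rebuilt on the next change, so operations for a publication that is not yet visible are skipped until it is found, as described above.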


I've used the patch to check the following things.
1. The patch applies cleanly, and make check-world passes without failure.
2. HEAD with the patch applied passes all tests in src/test/subscription
   (including 031_column_list.pl), after commenting out the WalSndKeepalive call in WalSndWaitForWal.
3. The bug newly reported in the 'How is this possible "publication does not exist"' thread
   is fixed. FYI, after executing the script's function, I also ran an
   additional insert on the publisher, and it was correctly replicated to the subscriber.

Best Regards,
    Takamichi Osumi

