On Fri, Sep 9, 2022 at 11:31 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Amit Kapila <amit.kapila16@gmail.com> writes:
> > Pushed.
>
> Recently a number of buildfarm animals have failed at the same
> place in src/test/subscription/t/100_bugs.pl [1][2][3][4]:
>
> # Failed test '2x3000 rows in t'
> # at t/100_bugs.pl line 149.
> # got: '9000'
> # expected: '6000'
> # Looks like you failed 1 test of 7.
> [09:30:56] t/100_bugs.pl ......................
>
> This was the last commit to touch that test script. I'm thinking
> maybe it wasn't adjusted quite correctly? On the other hand, since
> I can't find any similar failures before the last 48 hours, maybe
> there is some other more-recent commit to blame. Anyway, something
> is wrong there.

This commit seems innocent, as it changed only how the test waits.
Rather, looking at the logs, the tablesync worker errored out at an
interesting point:

2022-09-09 09:30:19.630 EDT [631b3feb.840:13]
pg_16400_sync_16392_7141371862484106124 ERROR: could not find record
while sending logically-decoded data: missing contrecord at 0/1D4FFF8
2022-09-09 09:30:19.630 EDT [631b3feb.840:14]
pg_16400_sync_16392_7141371862484106124 STATEMENT: START_REPLICATION
SLOT "pg_16400_sync_16392_7141371862484106124" LOGICAL 0/0
(proto_version '3', origin 'any', publication_names '"testpub"')
ERROR: could not find record while sending logically-decoded data:
missing contrecord at 0/1D4FFF8
2022-09-09 09:30:19.631 EDT [631b3feb.26e8:2] ERROR: error while
shutting down streaming COPY: ERROR: could not find record while
sending logically-decoded data: missing contrecord at 0/1D4FFF8

It's likely that commit f6c5edb8abcac04eb3eac6da356e59d399b2bcef
is relevant.

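As a side note on why the failing position looks interesting: assuming
the default WAL page size of 8192 bytes (XLOG_BLCKSZ; this is an
assumption, the animals' configuration is not shown in the log), the
LSN 0/1D4FFF8 sits just before a page boundary, so a record starting
there would have to continue on the next page. A rough sketch of the
arithmetic, with helper names made up for illustration:

```python
# Assumption: default WAL page size (XLOG_BLCKSZ) of 8192 bytes.
XLOG_BLCKSZ = 8192


def parse_lsn(text):
    """Convert an LSN in 'hi/lo' hexadecimal notation to a 64-bit integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def bytes_left_in_page(lsn, page_size=XLOG_BLCKSZ):
    """Number of bytes between the LSN and the end of its WAL page."""
    return page_size - (lsn % page_size)


lsn = parse_lsn("0/1D4FFF8")
print(hex(lsn), bytes_left_in_page(lsn))  # prints: 0x1d4fff8 8
```

With only 8 bytes left on the page, the rest of any record beginning at
that position has to come from a continuation record on the following
page, which would be consistent with the "missing contrecord" message.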
Regards,
--
Masahiko Sawada