Re: subscriptionCheck failures on nightjar - Mailing list pgsql-hackers

From Andres Freund
Subject Re: subscriptionCheck failures on nightjar
Msg-id 20190920170831.aaljabal6lyivre5@alap3.anarazel.de
In response to Re: subscriptionCheck failures on nightjar  (Kuntal Ghosh <kuntalghosh.2007@gmail.com>)
Responses Re: subscriptionCheck failures on nightjar
List pgsql-hackers
Hi,

On 2019-09-19 17:20:15 +0530, Kuntal Ghosh wrote:
> It seems there is a pattern in how the error occurs on different
> systems. Following are the relevant log snippets:
> 
> nightjar:
> sub3 LOG:  received replication command: CREATE_REPLICATION_SLOT
> "sub3_16414_sync_16394" TEMPORARY LOGICAL pgoutput USE_SNAPSHOT
> sub3 LOG:  logical decoding found consistent point at 0/160B578
> sub1 PANIC:  could not open file
> "pg_logical/snapshots/0-160B578.snap": No such file or directory
> 
> dromedary scenario 1:
> sub3_16414_sync_16399 LOG:  received replication command:
> CREATE_REPLICATION_SLOT "sub3_16414_sync_16399" TEMPORARY LOGICAL
> pgoutput USE_SNAPSHOT
> sub3_16414_sync_16399 LOG:  logical decoding found consistent point at 0/15EA694
> sub2 PANIC:  could not open file
> "pg_logical/snapshots/0-15EA694.snap": No such file or directory
> 
> 
> dromedary scenario 2:
> sub3_16414_sync_16399 LOG:  received replication command:
> CREATE_REPLICATION_SLOT "sub3_16414_sync_16399" TEMPORARY LOGICAL
> pgoutput USE_SNAPSHOT
> sub3_16414_sync_16399 LOG:  logical decoding found consistent point at 0/15EA694
> sub1 PANIC:  could not open file
> "pg_logical/snapshots/0-15EA694.snap": No such file or directory
> 
> While subscription 3 is created, it eventually reaches a consistent
> snapshot point and prints the WAL location corresponding to it. It
> seems sub1/sub2 immediately fail to serialize the snapshot to the
> .snap file having the same WAL location.
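
The pattern in the quoted logs is that the file named in each PANIC has the same WAL location as the consistent point logged just before it. A minimal sketch of that correspondence follows, assuming the snapshot file name is the LSN's two hexadecimal halves joined by a dash, as the log lines suggest. The helper names snapshot_path and panic_matches_consistent_point are hypothetical and exist only for this illustration.

```python
def snapshot_path(lsn: str) -> str:
    """Map a textual LSN such as '0/160B578' to the snapshot path seen in the logs.

    Assumption: the file is named <high 32 bits>-<low 32 bits>.snap in
    uppercase hex under pg_logical/snapshots, matching the quoted PANIC lines.
    """
    high, low = lsn.split("/")
    # Round-trip through int so the result is normalized uppercase hex.
    return "pg_logical/snapshots/%X-%X.snap" % (int(high, 16), int(low, 16))


def panic_matches_consistent_point(consistent_lsn: str, panic_file: str) -> bool:
    """Return True if the file named in a PANIC belongs to the given consistent point."""
    return snapshot_path(consistent_lsn) == panic_file


# Values taken from the nightjar and dromedary snippets quoted above.
print(panic_matches_consistent_point("0/160B578", "pg_logical/snapshots/0-160B578.snap"))
print(panic_matches_consistent_point("0/15EA694", "pg_logical/snapshots/0-15EA694.snap"))
```

Both checks succeed for the quoted snippets, which is what makes the failures look like one shared pattern rather than unrelated errors.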

Since a number of people (myself included) have now failed to reproduce
this locally, I propose that we increase the log level during this test on
master, and perhaps expand the set of debugging information. The hope is
that the additional information from the cases encountered on the
buildfarm helps us build a reproducer or, even better, diagnose the issue
directly. If people agree, I'll come up with a patch.

Greetings,

Andres Freund


