On 07/25/2018 08:40 AM, Dimitri Maziuk wrote:
> On 7/25/2018 10:28 AM, Andres Freund wrote:
>>
>> Are you really expecting us to be able to reproduce the problem based on
>> the above description? Our test suites do set up plain replication
>> setups, and the problem doesn't occur there.
>
> I don't, by definition, have a reproducible case: it only happened once
> so far.
Were you using pg_export_snapshot() by any chance?
https://www.postgresql.org/docs/10/static/functions-admin.html#FUNCTIONS-SNAPSHOT-SYNCHRONIZATION
Were there any relevant error messages in the log before the database hung?
>
> If nobody knows what limits the number of files created in
> $PGDATA/pg_logical/snapshots then we'll all have to wait until this
> happens again.
>
> (To somebody else as I'm obviously not turning logical replication back
> on until I know it won't kill my server again.)
Given that it took 3 weeks to manifest itself before, I would say give
it a try and monitor $PGDATA/pg_logical/snapshots. That would help
provide information for getting at the source of the problem. You can
always disable the replication if it looks like it is running away.
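
For the monitoring part, something along these lines might be enough. It is
only a rough sketch: the function names and the threshold are made up for
illustration, and the right alarm level for your setup is something you would
have to pick yourself. It just counts the regular files directly under the
directory and reports whether the count is above the threshold.

```python
import os


def count_snapshot_files(snapshot_dir):
    """Count the regular files directly inside snapshot_dir (no recursion)."""
    return sum(1 for entry in os.scandir(snapshot_dir) if entry.is_file())


def check_once(snapshot_dir, threshold):
    """Return (file_count, exceeded) for one check of snapshot_dir.

    threshold is an arbitrary, user-chosen alarm level, not a PostgreSQL limit.
    """
    count = count_snapshot_files(snapshot_dir)
    return count, count > threshold


def snapshot_dir_from_env():
    """Build $PGDATA/pg_logical/snapshots from the PGDATA environment variable."""
    pgdata = os.environ.get("PGDATA")
    if not pgdata:
        raise RuntimeError("PGDATA is not set")
    return os.path.join(pgdata, "pg_logical", "snapshots")
```

Run from cron as a user that can read the data directory, you could call
check_once(snapshot_dir_from_env(), 1000) and log or mail the count each time,
so there is a record of how fast the directory grows if it happens again.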
>
> Dima
>
--
Adrian Klaver
adrian.klaver@aklaver.com