Thread: SSL SYSCALL error during logical replication

SSL SYSCALL error during logical replication

From
Rijesh TP
Date:
Hi Team,
We are encountering an issue with logical replication on PostgreSQL version 15.10. The error message is as follows:

"ERROR: could not receive data from WAL stream: SSL SYSCALL error: EOF detected
LOG: background worker "logical replication worker" (PID 3286) exited with exit code 1."


Logical replication had been working perfectly on version 15.8, and we successfully completed multiple systems on that version. However, after upgrading to PostgreSQL 15.10, this issue has arisen.
We have tried adjusting various parameters without success.
We have also tried disabling SSL, SELinux, and so on, but we get the same error every time.
Replication fails at some point without any additional errors, which leads us to suspect a bug in logical replication in version 15.10.

Has anyone else reported a similar issue, or is there a known fix for this? Please let us know if additional details are required.

PG Version : postgresql 15.10
OS Details : CentOS 7



Thanks & Regards,
Rijesh

Re: SSL SYSCALL error during logical replication

From
Tom Lane
Date:
Rijesh TP <rijesh.tp@opsveda.com> writes:
> We are encountering an issue with logical replication on PostgreSQL version
> 15.10. The error message is as follows:

> *"ERROR: could not receive data from WAL stream: SSL SYSCALL error: EOF detected
> LOG: background worker "logical replication worker" (PID 3286) exited with exit code 1."*

This seems to be a symptom of something going wrong on the sending
side.  Have you looked into that postmaster's log file to see what
happened?

            regards, tom lane
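
A minimal sketch of the check suggested above, assuming the sending server's log is available as a plain-text file. The function name and the pattern list are illustrative assumptions, not something posted in this thread; the patterns are message fragments PostgreSQL can log on the sending side when a replication connection ends, and the list is not exhaustive.

```python
import re
from typing import Iterable, List

# Message fragments that can appear in the sending server's log when a
# replication connection is dropped. Illustrative only, not exhaustive.
SENDER_PATTERNS = [
    r"terminating walsender process due to replication timeout",
    r"could not send data to client",
    r"connection to client lost",
    r"unexpected EOF on standby connection",
]
_SENDER_RE = re.compile("|".join(SENDER_PATTERNS))


def find_sender_side_events(lines: Iterable[str]) -> List[str]:
    """Return the log lines that contain any of the sender-side patterns."""
    return [line.rstrip("\n") for line in lines if _SENDER_RE.search(line)]


def scan_log_file(path: str) -> List[str]:
    """Scan one log file; 'path' is supplied by the caller."""
    with open(path, encoding="utf-8", errors="replace") as fh:
        return find_sender_side_events(fh)
```

Comparing the timestamps of any matching lines with the time of the "EOF detected" error on the subscriber would show whether the sender closed the connection first.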



Re: SSL SYSCALL error during logical replication

From
Rijesh TP
Date:
Hi Tom,
Thanks for the reply.
On the source DB, we see the following message frequently, but no other errors appear.
We have set all DB parameters, including the timeouts, to high values, but with no luck so far.
Please advise.

user=replicator DEBUG:  failed to increase restart lsn: proposed 13EB/B420DE18, after 13EB/B420DE18, current candidate 13EB/B420DD70, current after 13EB/B420DD70, flushed up to 13EB/B420DBB0
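
As an aid to reading the DEBUG line above: an LSN is a 64-bit WAL position printed as two hexadecimal numbers separated by a slash (high 32 bits, then low 32 bits). The sketch below converts the values copied from that line into integers and prints how far apart they are. The helper name `lsn_to_int` is an assumption made for illustration, and the result does not by itself identify the cause of the disconnect.

```python
def lsn_to_int(lsn: str) -> int:
    """Convert an LSN such as '13EB/B420DE18' to a 64-bit byte position."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


# Values copied from the DEBUG line above.
proposed = lsn_to_int("13EB/B420DE18")
candidate = lsn_to_int("13EB/B420DD70")
flushed = lsn_to_int("13EB/B420DBB0")

print(proposed - candidate)  # 168 bytes between proposed and current candidate
print(proposed - flushed)    # 616 bytes between proposed and flushed position
```

In this sample the flushed position reported for the replication slot is only a few hundred bytes behind the proposed restart LSN.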


Thanks & Regards,
Rijesh

 




On Mon, Dec 9, 2024 at 9:52 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Rijesh TP <rijesh.tp@opsveda.com> writes:
> > We are encountering an issue with logical replication on PostgreSQL version
> > 15.10. The error message is as follows:
>
> > *"ERROR: could not receive data from WAL stream: SSL SYSCALL error: EOF detected
> > LOG: background worker "logical replication worker" (PID 3286) exited with exit code 1."*
>
> This seems to be a symptom of something going wrong on the sending
> side.  Have you looked into that postmaster's log file to see what
> happened?
>
>             regards, tom lane