FATAL: could not send end-of-streaming message to primary: no COPY in progress - Mailing list pgsql-hackers

From Thomas Munro
Subject FATAL: could not send end-of-streaming message to primary: no COPY in progress
Date
Msg-id CAEepm=1fEZ7Dt2r-HEp71BL8Bf+HzLmU6gN-yTgH4u7zhxEcXQ@mail.gmail.com
Responses Re: FATAL: could not send end-of-streaming message to primary: no COPY in progress  (Fujii Masao <masao.fujii@gmail.com>)
List pgsql-hackers
Hi hackers,

If you shut down a primary server, a standby that is streaming from it says:

LOG:  replication terminated by primary server
DETAIL:  End of WAL reached on timeline 1 at 0/14F4B68.
FATAL:  could not send end-of-streaming message to primary: no COPY in progress

Isn't that FATAL ereport a bug?

I haven't worked out the root cause, but the immediate problem seems to
be that libpqrcv_endstreaming calls PQputCopyEnd, which doesn't like the
state that the libpq connection is in, namely PGASYNC_BUSY.  That
state seems to have been established by the call to walrcv_receive
that returned -1 (end of copy).  It doesn't happen in the similar case
of promotion of the remote server.
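
To make the failure mode concrete, here is a toy model of the state check
described above.  It is only a sketch: the Toy* names are hypothetical and
are not libpq symbols, and it assumes (as observed in this report) that the
connection is in a BUSY-like state after the receive call returns -1, and
that the end-of-copy call is accepted only in a COPY_IN or COPY_BOTH state.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the libpq async states named above. */
typedef enum
{
    TOY_ASYNC_BUSY,
    TOY_ASYNC_COPY_IN,
    TOY_ASYNC_COPY_BOTH
} ToyAsyncStatus;

typedef struct
{
    ToyAsyncStatus status;
    const char *error;          /* last error text, or NULL */
} ToyConn;

/*
 * Models the receive call that reports end of copy.  Assumption taken
 * from the report: afterwards the connection is in the BUSY state.
 */
static int
toy_receive_end_of_copy(ToyConn *conn)
{
    conn->status = TOY_ASYNC_BUSY;
    return -1;
}

/*
 * Models the precondition of the end-of-copy call: it is rejected
 * unless a COPY is in progress.  The state after a successful call is
 * deliberately not modelled here.
 */
static int
toy_put_copy_end(ToyConn *conn)
{
    if (conn->status != TOY_ASYNC_COPY_IN &&
        conn->status != TOY_ASYNC_COPY_BOTH)
    {
        conn->error = "no COPY in progress";
        return -1;
    }
    conn->error = NULL;
    return 1;
}
```

In this model, calling toy_receive_end_of_copy and then toy_put_copy_end
on the same connection yields -1 with the error text "no COPY in
progress", which mirrors the sequence that ends in the FATAL message.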

How is clean server shutdown supposed to work?  It looks like
walsender sends COPY 0 and then just hangs up.  Meanwhile, walreceiver
has to distinguish between that case and the new timeline case
which involves a further exchange of messages.  Is an explicit message
at the end of the copy stream saying either "goodbye" or "but wait,
there's more" lacking here?  Or is there some other way that
walreceiver could distinguish between clean shutdown of remote server
(no error necessary), unclean shutdown of remote server, and timeline
negotiation?
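
As a sketch of what the first option might look like on the receiving
side, suppose the stream ended with an explicit trailer.  Everything here
is hypothetical: no such trailer exists in the protocol today, the Toy*
names are invented for illustration, and treating a missing trailer as an
unclean shutdown is just one conservative choice.

```c
/* The three outcomes the walreceiver would like to tell apart. */
typedef enum
{
    TOY_END_CLEAN_SHUTDOWN,     /* no error necessary */
    TOY_END_TIMELINE_SWITCH,    /* further exchange of messages follows */
    TOY_END_UNCLEAN_SHUTDOWN    /* report an error */
} ToyEndReason;

/* Hypothetical explicit message at the end of the copy stream. */
typedef enum
{
    TOY_TRAILER_NONE,           /* connection dropped without a trailer */
    TOY_TRAILER_GOODBYE,        /* "goodbye" */
    TOY_TRAILER_MORE            /* "but wait, there's more" */
} ToyTrailer;

/*
 * Classify how streaming ended.  saw_copy_done is nonzero if the end
 * of the copy stream was received before the connection went away.
 */
static ToyEndReason
toy_classify_end(int saw_copy_done, ToyTrailer trailer)
{
    if (!saw_copy_done)
        return TOY_END_UNCLEAN_SHUTDOWN;

    switch (trailer)
    {
        case TOY_TRAILER_GOODBYE:
            return TOY_END_CLEAN_SHUTDOWN;
        case TOY_TRAILER_MORE:
            return TOY_END_TIMELINE_SWITCH;
        default:
            /* Copy ended but no trailer arrived: assume the worst. */
            return TOY_END_UNCLEAN_SHUTDOWN;
    }
}
```

With something like this, only the TOY_END_TIMELINE_SWITCH case would go
on to send an end-of-streaming message; the clean shutdown case could
simply log and exit without raising FATAL.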

-- 
Thomas Munro
http://www.enterprisedb.com


