Re: [HACKERS] walsender termination error messages worse in v10 - Mailing list pgsql-hackers
| From | Petr Jelinek |
|---|---|
| Subject | Re: [HACKERS] walsender termination error messages worse in v10 |
| Date | |
| Msg-id | 2d132a85-1043-ba5f-3e23-131eb279df14@2ndquadrant.com Whole thread Raw |
| In response to | [HACKERS] walsender termination error messages worse in v10 (Andres Freund <andres@anarazel.de>) |
| Responses |
Re: [HACKERS] walsender termination error messages worse in v10
|
| List | pgsql-hackers |
Hi Andres,
On 02/06/17 20:51, Andres Freund wrote:
> Hi,
>
> commit 7c4f52409a8c7d85ed169bbbc1f6092274d03920
> Author: Peter Eisentraut <peter_e@gmx.net>
> Date: 2017-03-23 08:36:36 -0400
>
> Logical replication support for initial data copy
>
> made walreceiver emit worse messages in v10 than before when the master
> gets shut down. Before 10 you'll get something like:
>
> 2017-06-02 11:46:07.173 PDT [16143][] LOG: replication terminated by primary server
> 2017-06-02 11:46:07.173 PDT [16143][] DETAIL: End of WAL reached on timeline 1 at 0/1B7FB8C8.
> 2017-06-02 11:46:07.173 PDT [16143][] FATAL: could not send end-of-streaming message to primary: no COPY in
progress
>
> the last bit is a bit superflous, but it still kinda makes sense. Now
> you get:
> 2017-06-02 11:47:09.362 PDT [17061][] FATAL: unexpected result after CommandComplete: server closed the connection
unexpectedly
> This probably means the server terminated abnormally
> before or while processing the request.
>
> the reason is that
>
> static int
> libpqrcv_receive(char **buffer, pgsocket *wait_fd)
> {
>
> previously did:
> if (rawlen == -1) /* end-of-streaming or error */
> {
> PGresult *res;
>
> res = PQgetResult(streamConn);
> if (PQresultStatus(res) == PGRES_COMMAND_OK ||
> PQresultStatus(res) == PGRES_COPY_IN)
> {
> PQclear(res);
> return -1;
> }
> else
> {
> PQclear(res);
> ereport(ERROR,
> (errmsg("could not receive data from WAL stream: %s",
> PQerrorMessage(streamConn))));
> }
> }
>
> and now does
>
> if (rawlen == -1) /* end-of-streaming or error */
> {
> PGresult *res;
>
> res = PQgetResult(conn->streamConn);
> if (PQresultStatus(res) == PGRES_COMMAND_OK)
> {
> PQclear(res);
>
> /* Verify that there are no more results */
> res = PQgetResult(conn->streamConn);
> if (res != NULL)
> ereport(ERROR,
> (errmsg("unexpected result after CommandComplete: %s",
> PQerrorMessage(conn->streamConn))));
> return -1;
> }
> else if (PQresultStatus(res) == PGRES_COPY_IN)
> {
> PQclear(res);
> return -1;
> }
> else
> {
> PQclear(res);
> ereport(ERROR,
> (errmsg("could not receive data from WAL stream: %s",
> pchomp(PQerrorMessage(conn->streamConn)))));
> }
> }
>
> note the new /* Verify that there are no more results */ bit.
>
> I don't understand why the new block is there, nor does the commit
> message explain it.
>
Hmm, that particular change can actually be reverted. It was needed for
one those custom replication commands which were replaced by normal
query support. I have missed it during the rewrite.
-- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training &
Services
pgsql-hackers by date: