Re: Retry Cached Remote Connections for postgres_fdw in case remote backend gets killed/goes away - Mailing list pgsql-hackers

From Bharath Rupireddy
Subject Re: Retry Cached Remote Connections for postgres_fdw in case remote backend gets killed/goes away
Date
Msg-id CALj2ACXb85AHaUg=qZ0sBNR2nye2Q2NuYV14kBL1oXZud=tOJg@mail.gmail.com
Whole thread Raw
In response to Re: Retry Cached Remote Connections for postgres_fdw in case remote backend gets killed/goes away  (Fujii Masao <masao.fujii@oss.nttdata.com>)
Responses Re: Retry Cached Remote Connections for postgres_fdw in case remote backend gets killed/goes away  (Fujii Masao <masao.fujii@oss.nttdata.com>)
List pgsql-hackers
Thanks for the comments.

On Tue, Sep 29, 2020 at 7:30 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
>
> +1 to add debug3 message there. But this message doesn't seem to
> match with what the error actually happened. What about something like
> "could not start remote transaction on connection %p", instead?
>

Looks better. Changed.

>
> Also maybe it's better to append PQerrorMessage(entry->conn)?
>

Added. Now the log looks like [1].

>
> +-- Generate a connection to remote. Local backend will cache it.
> +SELECT * FROM ft1 LIMIT 1;
>
> The result of this query would not be stable. Probably you need to,
> for example, add ORDER BY or replace * with 1, etc.
>

Changed to SELECT 1 FROM ft1 LIMIT 1;

>
> +-- Retrieve pid of remote backend with application name fdw_retry_check
> +-- and kill it intentionally here. Note that, local backend still has
> +-- the remote connection/backend info in it's cache.
> +SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE
> +backend_type = 'client backend' AND application_name = 'fdw_retry_check';
>
> Isn't this test fragile because there is no gurantee that the target backend
> has exited just after calling pg_terminate_backend()?
>

I think this is okay, because pg_terminate_backend() sends SIGTERM to
the backend, and upon receiving SIGTERM the signal handler die() will
be called and since there is no query being executed on the backend by
the time SIGTERM is received, it will exit immediately. Thoughts?

pqsignal(SIGTERM, die); /* cancel current query and exit */

And also, pg_terminate_backend() returns true in case the backend is
killed successfully, otherwise it returns false.     PG_RETURN_BOOL(r
== SIGNAL_BACKEND_SUCCESS);

Attaching v6 patch, please review it further.

[1]
DEBUG:  starting remote transaction on connection 0x55cd393a66e0
DEBUG:  could not start remote transaction on connection 0x55cd393a66e0
DETAIL:  server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
DEBUG:  closing connection 0x55cd393a66e0 to reestablish a new one
DEBUG:  new postgres_fdw connection 0x55cd393a66e0 for server
"foreign_server" (user mapping oid 16398, userid 10)
DEBUG:  starting remote transaction on connection 0x55cd393a66e0
DEBUG:  closing remote transaction on connection 0x55cd393a66e0


With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

Attachment

pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Planner making bad choice in alternative subplan decision
Next
From: Fujii Masao
Date:
Subject: Re: New statistics for tuning WAL buffer size