Re: pgsql: postgres_fdw: reestablish new connection if cached one is detect - Mailing list pgsql-committers

From Fujii Masao
Subject Re: pgsql: postgres_fdw: reestablish new connection if cached one is detect
Date
Msg-id 14cd32c8-9532-47b8-77d4-b81eb3b192ea@oss.nttdata.com
In response to Re: pgsql: postgres_fdw: reestablish new connection if cached one is detect  (Michael Paquier <michael@paquier.xyz>)
Responses Re: pgsql: postgres_fdw: reestablish new connection if cached one is detect  (Fujii Masao <masao.fujii@oss.nttdata.com>)
List pgsql-committers

On 2020/10/07 11:13, Michael Paquier wrote:
> Hi Fujii-san,
> 
> On Tue, Oct 06, 2020 at 01:52:55AM +0000, Fujii Masao wrote:
>> postgres_fdw: reestablish new connection if cached one is detected as broken.
>>
>> In postgres_fdw, once remote connections are established, they are cached
>> and re-used for subsequent queries and transactions. There can be some
>> cases where those cached connections are unavailable, for example,
>> by the restart of remote server. In these cases, previously an error was
>> reported and the query accessing to remote server failed if new remote
>> transaction failed to start because the cached connection was broken.
>>
>> This commit improves postgres_fdw so that new connection is remade
>> if broken connection is detected when starting new remote transaction.
>> This is useful to avoid unnecessary failure of queries when connection is
>> broken but can be reestablished.
> 
> lorikeet is reporting that the test introduced by this commit is
> unstable:
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=lorikeet&dt=2020-10-06%2008%3A28%3A36

Thanks for letting me know this!

> 
> Some details:
>   BEGIN;
>   SELECT 1 FROM ft1 LIMIT 1;
> - ?column?
> -----------
> -        1
> -(1 row)
> -
> +ERROR:  could not receive data from server: Software caused connection abort
> +CONTEXT:  remote SQL command: START TRANSACTION ISOLATION LEVEL REPEATABLE READ

This error means that a new connection was successfully reestablished
after the cached connection was terminated, and that the above connection
error then occurred when issuing the "START TRANSACTION" command on that
new connection. There seem to be no suspicious relevant log messages in the
log file, so I'm not yet sure why this error happened.
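
To make that sequence concrete, here is a minimal Python sketch of the
reconnect path. It is only a simulation, not the actual postgres_fdw C code:
FakeRemote, BrokenConnection and get_connection are hypothetical names made
up for illustration, and the sketch assumes the reconnect is attempted just
once, so a failure on the freshly made connection goes straight to the caller
(which is what the lorikeet output looks like).

```python
class BrokenConnection(Exception):
    """Raised by the simulated remote when the link is unusable."""


class FakeRemote:
    """Hypothetical stand-in for a cached remote server connection."""

    def __init__(self, fail_on_start=False):
        self.fail_on_start = fail_on_start
        self.bad = False  # set once the connection is known to be broken

    def start_transaction(self):
        # Simulates issuing START TRANSACTION on the remote side.
        if self.fail_on_start:
            self.bad = True
            raise BrokenConnection("could not receive data from server")
        return "START TRANSACTION"


def get_connection(cache, key, connect):
    """Return a connection with a transaction started, reconnecting at most once."""
    conn = cache.get(key)
    if conn is None:
        conn = cache[key] = connect()
    try:
        conn.start_transaction()
        return conn
    except BrokenConnection:
        if not conn.bad:
            raise  # not a broken-connection case, so report it as before
    # The cached connection is broken: replace it with a new one.
    conn = cache[key] = connect()
    # Assumption: no second retry. An error here reaches the caller.
    conn.start_transaction()
    return conn


# Case 1: the cached connection is broken, the new one works.
cache = {"loopback": FakeRemote(fail_on_start=True)}
good = get_connection(cache, "loopback", lambda: FakeRemote())

# Case 2: the new connection fails as well, so the error is reported.
cache2 = {"loopback": FakeRemote(fail_on_start=True)}
try:
    get_connection(cache2, "loopback", lambda: FakeRemote(fail_on_start=True))
    second_failure_reported = False
except BrokenConnection:
    second_failure_reported = True
```

Case 2 corresponds to what lorikeet showed: the reconnect itself went fine,
but the first command on the new connection failed.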

Per the previous discussion at [1], lorikeet sometimes seems to cause
connection-related failures in the regression tests. So the cause of the error
that we faced today may also be lorikeet itself.

[1]
https://www.postgresql.org/message-id/CA+hUKGL3Son9iAeqgjPbXCpU_6hhZhw9X24uNO14mOC4bG0cCA@mail.gmail.com

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION


