RE: [Proposal] Add foreign-server health checks infrastructure - Mailing list pgsql-hackers

From kuroda.hayato@fujitsu.com
Subject RE: [Proposal] Add foreign-server health checks infrastructure
Msg-id TYAPR01MB5866FC683843ED8BD09505FEF53D9@TYAPR01MB5866.jpnprd01.prod.outlook.com
In response to Re: [Proposal] Add foreign-server health checks infrastructure  (Fujii Masao <masao.fujii@oss.nttdata.com>)
Responses RE: [Proposal] Add foreign-server health checks infrastructure
List pgsql-hackers
Dear Fujii-san,

Thank you for the quick review! I have attached a new version.
I found that the previous patches had the wrong name. Sorry.

> The connection check timer is re-scheduled repeatedly even while the backend is
> in idle state or is running a local transaction that doesn't access to any foreign
> servers. I'm not sure if it's really worth checking the connections even in those
> states. Even without the periodic connection checks, if the connections are closed
> in those states, subsequent GetConnection() will detect that closed connection
> and re-establish the connection when starting remote transaction. Thought?

Indeed. We can now control the timer in the FDW layer, so I added a
disable_timeout() call at the bottom of pgfdw_xact_callback().
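
As a rough illustration of the intended timer lifecycle (not the patch itself), here is a standalone sketch. All names in it (CheckTimer, timer_arm, timer_fire, remote_xact_begin, xact_end_callback) are hypothetical stand-ins and not PostgreSQL APIs; the 1000 ms interval is also just an assumed value. It only models the idea that the timer is armed when a remote transaction starts and disarmed at the end of the transaction callback, so it no longer fires while the backend is idle.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a per-backend health-check timer (not PostgreSQL code). */
typedef struct CheckTimer
{
    bool armed;        /* is the periodic check currently scheduled? */
    int  interval_ms;  /* assumed check interval */
    int  fire_count;   /* number of checks actually performed */
} CheckTimer;

static void
timer_arm(CheckTimer *t, int interval_ms)
{
    assert(interval_ms > 0);
    t->armed = true;
    t->interval_ms = interval_ms;
}

static void
timer_disarm(CheckTimer *t)
{
    t->armed = false;
}

/* Models the timeout handler: it checks only while the timer is armed. */
static void
timer_fire(CheckTimer *t)
{
    if (!t->armed)
        return;         /* idle or local-only transaction: nothing to check */
    t->fire_count++;    /* the timer stays armed, i.e. it is rescheduled */
}

/* Models starting a remote transaction: this is where the timer gets armed. */
static void
remote_xact_begin(CheckTimer *t)
{
    if (!t->armed)
        timer_arm(t, 1000);
}

/* Models the end of the transaction callback: disarm as the last step. */
static void
xact_end_callback(CheckTimer *t)
{
    /* ... commit or abort of remote transactions would happen here ... */
    timer_disarm(t);
}
```

With this shape, a backend that is idle or runs only local transactions never pays for the periodic check, and a closed connection is still noticed by the next GetConnection().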

> When a closed connection is detected in idle-in-transaction state and SIGINT is
> raised, nothing happens because there is no query running to be canceled by
> SIGINT. Also in this case the connection check timer gets disabled. So we can still
> execute queries that don't access to foreign servers, in the same transaction, and
> then the transaction commit fails. Is this expected behavior?

That is not ideal, but I'm not sure what a good solution would be. I changed the
code so that the timer is rescheduled when a lost connection is detected. However,
if the queries in the transaction are quite short, the SIGINT may still not be caught.

> When I shutdowned the foreign server while the local backend is in
> idle-in-transaction state, the connection check timer was triggered and detected
> the closed connection. Then when I executed COMMIT command, I got the
> following WARNING message. Is this a bug?
> 
>      WARNING:  leaked hash_seq_search scan for hash table 0x7fd2ca878f20

Fixed. It was caused by hash_seq_term() not being called when the checker
detected a lost connection.
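
To make the cause concrete, here is a small standalone model of the rule, not the actual patch or dynahash code. The names seq_init, seq_search, seq_term, ConnEntry and find_dead_connection are hypothetical stand-ins for hash_seq_init(), hash_seq_search(), hash_seq_term() and the connection cache entries. The assumption it illustrates is that a scan which runs until the search returns NULL is deregistered automatically, while a scan abandoned early must be terminated explicitly or it is reported as leaked at transaction end.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define N_CONNS 3

/* Hypothetical connection cache entry (not the real ConnCacheEntry). */
typedef struct ConnEntry
{
    int  id;
    bool alive;
} ConnEntry;

/* Hypothetical scan state, standing in for HASH_SEQ_STATUS. */
typedef struct SeqStatus
{
    ConnEntry *entries;
    int        pos;
    bool       registered;
} SeqStatus;

/* Models the list of active scans that is checked at transaction end. */
static int open_scans = 0;

static void
seq_init(SeqStatus *st, ConnEntry *entries)
{
    st->entries = entries;
    st->pos = 0;
    st->registered = true;
    open_scans++;
}

static void
seq_term(SeqStatus *st)
{
    assert(st->registered);
    st->registered = false;
    open_scans--;
}

static ConnEntry *
seq_search(SeqStatus *st)
{
    if (st->pos < N_CONNS)
        return &st->entries[st->pos++];
    seq_term(st);       /* a completed scan deregisters itself */
    return NULL;
}

/*
 * Return the id of the first dead connection, or -1 if all are alive.
 * call_term = false reproduces the bug: the early return abandons the scan.
 */
static int
find_dead_connection(ConnEntry *entries, bool call_term)
{
    SeqStatus  st;
    ConnEntry *e;

    seq_init(&st, entries);
    while ((e = seq_search(&st)) != NULL)
    {
        if (!e->alive)
        {
            if (call_term)
                seq_term(&st);  /* required when leaving the loop early */
            return e->id;
        }
    }
    return -1;
}

/* Number of scans that would be reported as leaked at transaction end. */
static int
leaked_scans(void)
{
    return open_scans;
}
```

In this model, calling find_dead_connection() with call_term set to false leaves one scan registered, which corresponds to the situation that produced the WARNING at COMMIT.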

Best Regards,
Hayato Kuroda
FUJITSU LIMITED


