Re: Retry Cached Remote Connections for postgres_fdw in case remote backend gets killed/goes away - Mailing list pgsql-hackers
From | Ashutosh Bapat |
---|---|
Subject | Re: Retry Cached Remote Connections for postgres_fdw in case remote backend gets killed/goes away |
Date | |
Msg-id | CAExHW5uxHdLk9o9HUrQFPzd9qkUh36buAdAA=hTCpXdv9ZU6HA@mail.gmail.com Whole thread Raw |
In response to | Re: Retry Cached Remote Connections for postgres_fdw in case remote backend gets killed/goes away (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>) |
Responses |
Re: Retry Cached Remote Connections for postgres_fdw in case remote backend gets killed/goes away
|
List | pgsql-hackers |
Has this been added to CF, possibly next CF? On Tue, Jul 14, 2020 at 7:27 AM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > > > > You could get a backend's PID using PQbackendPID and then kill it by > > calling pg_terminate_backend() to kill the remote backend to automate > > scenario of remote backend being killed. > > > > I already added the test case in v2 patch itself(added one more test > case in v3 patch), using the similar approach. > > > > > > For, one of the Ashutosh's review comments from [3] "the fact that the > > > same connection may be used by multiple plan nodes", I tried to have > > > few use cases where there exist joins on two foreign tables on the > > > same remote server, in a single query, so essentially, the same > > > connection was used for multiple plan nodes. In this case we avoid > > > retrying for the second GetConnection() request for the second foreign > > > table, with the check entry->xact_depth <= 0 , xact_depth after the > > > first GetConnection() and the first remote query will become 1 and we > > > don't hit the retry logic and seems like we are safe here. Please add > > > If I'm missing something here. > > > > > > Request the community to consider the patch for further review if the > > > overall idea seems beneficial. > > > > I think this idea will be generally useful if your work on dropping > > stale connection uses idle_connection_timeout or something like that > > on the remote server. > > Assuming we use idle_connection_timeout or some other means(as it is > not yet finalized, I will try to respond in that mail chain) to drop > stale/idle connections from the local backend, I think we have two > options 1) deleting that cached entry from the connection cache > entirely using disconnect_pg_server() and hash table remove. This > frees up some space and we don't have to deal with the connection > invalidations and don't have to bother on resetting cached entry's > other parameters such as xact_depth, have_prep_stmt etc. 2) or we > could just drop the connections using disconnect_pg_server(), retain > the hash entry, reset other parameters, and deal with the > invalidations. This is like, we maintain unnecessary info in the > cached entry, where we actually don't have a connection at all and > keep holding some space for cached entry. > > IMO, if we go with option 1, then it will be good. > > Anyways, this connection retry feature will not have any dependency on > the idle_connection_timeout or dropping stale connection feature, > because there can be many reasons where remote backends go away/get > killed. > > If I'm not sidetracking - if we use something like > idle_session_timeout [1] on the remote server, this retry feature will > be very useful. > > > > > About the patch. It seems we could just catch the error from > > begin_remote_xact() in GetConnection() and retry connection if the > > error is "bad connection". Retrying using PQreset() might be better > > than calling PQConnect* always. > > > > Thanks for the comment, it made life easier. Added the patch with the > changes. Please take a look at the v3 patch and let me know if still > something needs to be done. > > [1] - https://www.postgresql.org/message-id/763A0689-F189-459E-946F-F0EC4458980B%40hotmail.com > > With Regards, > Bharath Rupireddy. > EnterpriseDB: http://www.enterprisedb.com -- Best Wishes, Ashutosh Bapat
pgsql-hackers by date: