Thread: postgres_fdw - cached connection leaks if the associated user mapping/foreign server is dropped
postgres_fdw - cached connection leaks if the associated user mapping/foreign server is dropped
From
Bharath Rupireddy
Date:
Hi, As discussed in [1], in postgres_fdw the cached connections to remote servers can stay until the lifetime of the local session without getting a chance to disconnect (connection leak), if the underlying user mapping or foreign server is dropped in another session. Here are few scenarios how this can happen: Use case 1: 1) Run a foreign query in session 1 with server 1 and user mapping 1 2) Drop user mapping 1 in another session 2, an invalidation message gets generated which will have to be processed by all the sessions 3) Run the foreign query again in session 1, at the start of txn the cached entry gets invalidated via pgfdw_inval_callback() (as part of invalidation message processing). Whatever may be the type of foreign query (select, update, explain, delete, insert, analyze etc.), upon next call to GetUserMapping() from postgres_fdw.c, the cache lookup fails(with ERROR: user mapping not found for "XXXX") since the user mapping 1 has been dropped in session 2 and the query will also fail before reaching GetConnection() where the connections associated with the invalidated entries would have got disconnected. So, the connection associated with invalidated entry would remain until the local session exits. Use case 2: 1) Run a foreign query in session 1 with server 1 and user mapping 1 2) Try to drop foreign server 1, then we would not be allowed because of dependency. Use CASCADE so that dependent objects i.e. user mapping 1 and foreign tables get dropped along with foreign server 1. 3) Run the foreign query again in session 1, at the start of txn, the cached entry gets invalidated via pgfdw_inval_callback() and the query fails because there is no foreign table. Note that the remote connection remains open in session 1 until the local session exits. To solve the above connection leak problem, it looks like the right place to close all the invalid connections is pgfdw_xact_callback(), once registered, which gets called at the end of every txn in the current session(by then all the sub txns also would have been finished). Note that if there are too many invalidated entries, then the following txn has to close all of them, but that's okay than having leaked connections and it's a one time job for the following one txn. Attaching a patch for the same. Thoughts? [1] - https://www.postgresql.org/message-id/flat/CALj2ACUOQYs%2BssxkxRvZ6Ja5%2Bsfc6a-s_0e-Nv2A591hEyOgiw%40mail.gmail.com With Regards, Bharath Rupireddy. EnterpriseDB: http://www.enterprisedb.com
Attachment
Re: postgres_fdw - cached connection leaks if the associated user mapping/foreign server is dropped
From
Zhihong Yu
Date:
Is the following sequence possible ?
In pgfdw_inval_callback():
entry->invalidated = true;
+ have_invalid_connections = true;
+ have_invalid_connections = true;
Then:
+ /* We are done closing all the invalidated connections so reset. */
+ have_invalid_connections = false;
+ have_invalid_connections = false;
At which time, there is still at least one invalid connection but the global flag is off.
On Mon, Dec 14, 2020 at 6:39 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote:
Hi,
As discussed in [1], in postgres_fdw the cached connections to remote
servers can stay until the lifetime of the local session without
getting a chance to disconnect (connection leak), if the underlying
user mapping or foreign server is dropped in another session. Here are
few scenarios how this can happen:
Use case 1:
1) Run a foreign query in session 1 with server 1 and user mapping 1
2) Drop user mapping 1 in another session 2, an invalidation message
gets generated which will have to be processed by all the sessions
3) Run the foreign query again in session 1, at the start of txn the
cached entry gets invalidated via pgfdw_inval_callback() (as part of
invalidation message processing). Whatever may be the type of foreign
query (select, update, explain, delete, insert, analyze etc.), upon
next call to GetUserMapping() from postgres_fdw.c, the cache lookup
fails(with ERROR: user mapping not found for "XXXX") since the user
mapping 1 has been dropped in session 2 and the query will also fail
before reaching GetConnection() where the connections associated with
the invalidated entries would have got disconnected.
So, the connection associated with invalidated entry would remain
until the local session exits.
Use case 2:
1) Run a foreign query in session 1 with server 1 and user mapping 1
2) Try to drop foreign server 1, then we would not be allowed because
of dependency. Use CASCADE so that dependent objects i.e. user mapping
1 and foreign tables get dropped along with foreign server 1.
3) Run the foreign query again in session 1, at the start of txn, the
cached entry gets invalidated via pgfdw_inval_callback() and the query
fails because there is no foreign table.
Note that the remote connection remains open in session 1 until the
local session exits.
To solve the above connection leak problem, it looks like the right
place to close all the invalid connections is pgfdw_xact_callback(),
once registered, which gets called at the end of every txn in the
current session(by then all the sub txns also would have been
finished). Note that if there are too many invalidated entries, then
the following txn has to close all of them, but that's okay than
having leaked connections and it's a one time job for the following
one txn.
Attaching a patch for the same.
Thoughts?
[1] - https://www.postgresql.org/message-id/flat/CALj2ACUOQYs%2BssxkxRvZ6Ja5%2Bsfc6a-s_0e-Nv2A591hEyOgiw%40mail.gmail.com
With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com
Re: postgres_fdw - cached connection leaks if the associated user mapping/foreign server is dropped
From
Bharath Rupireddy
Date:
On Tue, Dec 15, 2020 at 8:25 AM Zhihong Yu <zyu@yugabyte.com> wrote: > Is the following sequence possible ? > In pgfdw_inval_callback(): > > entry->invalidated = true; > + have_invalid_connections = true; > > At which time the loop in pgfdw_xact_callback() is already running (but past the above entry). > Then: > > + /* We are done closing all the invalidated connections so reset. */ > + have_invalid_connections = false; > > At which time, there is still at least one invalid connection but the global flag is off. It's not possible, as this backend specific code doesn't run in multiple threads. We can not have pgfdw_inval_callback() and pgfdw_xact_callback() running at the same time, so we are safe there. With Regards, Bharath Rupireddy. EnterpriseDB: http://www.enterprisedb.com
RE: postgres_fdw - cached connection leaks if the associated user mapping/foreign server is dropped
From
"Hou, Zhijie"
Date:
Hi I have an issue about the existing testcase. """ -- Test that alteration of server options causes reconnection SELECT c3, c4 FROM ft1 ORDER BY c3, c1 LIMIT 1; -- shouldwork ALTER SERVER loopback OPTIONS (SET dbname 'no such database'); SELECT c3, c4 FROM ft1 ORDER BY c3, c1 LIMIT 1; -- should fail DO $d$ BEGIN EXECUTE $$ALTER SERVER loopback OPTIONS (SET dbname '$$||current_database()||$$')$$; END; $d$; SELECT c3, c4 FROM ft1 ORDER BY c3, c1 LIMIT 1; -- should work again """ IMO, the above case is designed to test the following code[1]: With the patch, it seems the following code[1] will not work for this case, right? (It seems the connection will be disconnect in pgfdw_xact_callback) I do not know does it matter, or should we add a testcase to cover that? [1] /* * If the connection needs to be remade due to invalidation, disconnect as * soon as we're out of all transactions. */ if (entry->conn != NULL && entry->invalidated && entry->xact_depth == 0) { elog(DEBUG3, "closing connection %p for option changes to take effect", entry->conn); disconnect_pg_server(entry); } Best regards, houzj
Re: postgres_fdw - cached connection leaks if the associated user mapping/foreign server is dropped
From
Bharath Rupireddy
Date:
On Fri, Dec 18, 2020 at 6:39 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Fri, Dec 18, 2020 at 5:06 PM Hou, Zhijie <houzj.fnst@cn.fujitsu.com> wrote: > > I have an issue about the existing testcase. > > > > """ > > -- Test that alteration of server options causes reconnection > > SELECT c3, c4 FROM ft1 ORDER BY c3, c1 LIMIT 1; -- should work > > ALTER SERVER loopback OPTIONS (SET dbname 'no such database'); > > SELECT c3, c4 FROM ft1 ORDER BY c3, c1 LIMIT 1; -- should fail > > DO $d$ > > BEGIN > > EXECUTE $$ALTER SERVER loopback > > OPTIONS (SET dbname '$$||current_database()||$$')$$; > > END; > > $d$; > > SELECT c3, c4 FROM ft1 ORDER BY c3, c1 LIMIT 1; -- should work again > > """ > > > > IMO, the above case is designed to test the following code[1]: > > With the patch, it seems the following code[1] will not work for this case, right? > > (It seems the connection will be disconnect in pgfdw_xact_callback) > > > > I do not know does it matter, or should we add a testcase to cover that? > > > > [1] /* > > * If the connection needs to be remade due to invalidation, disconnect as > > * soon as we're out of all transactions. > > */ > > if (entry->conn != NULL && entry->invalidated && entry->xact_depth == 0) > > { > > elog(DEBUG3, "closing connection %p for option changes to take effect", > > entry->conn); > > disconnect_pg_server(entry); > > } > > Yes you are right. With the patch if the server is altered in the same > session in which the connection exists, the connection gets closed at > the end of that alter query txn, not at the beginning of the next > foreign tbl query. So, that part of the code in GetConnection() > doesn't get hit. Having said that, this code gets hit when the alter > query is run in another session and the connection in the current > session gets disconnected at the start of the next foreign tbl query. > > If we want to cover it with a test case, we might have to alter the > foreign server in another session. I'm not sure whether we can open > another session from the current psql session and run a sql command. > > With Regards, > Bharath Rupireddy. > EnterpriseDB: http://www.enterprisedb.com
Re: postgres_fdw - cached connection leaks if the associated user mapping/foreign server is dropped
From
Bharath Rupireddy
Date:
On Fri, Dec 18, 2020 at 6:46 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > On Fri, Dec 18, 2020 at 6:39 PM Bharath Rupireddy > <bharath.rupireddyforpostgres@gmail.com> wrote: > > > > On Fri, Dec 18, 2020 at 5:06 PM Hou, Zhijie <houzj.fnst@cn.fujitsu.com> wrote: > > > I have an issue about the existing testcase. > > > > > > """ > > > -- Test that alteration of server options causes reconnection > > > SELECT c3, c4 FROM ft1 ORDER BY c3, c1 LIMIT 1; -- should work > > > ALTER SERVER loopback OPTIONS (SET dbname 'no such database'); > > > SELECT c3, c4 FROM ft1 ORDER BY c3, c1 LIMIT 1; -- should fail > > > DO $d$ > > > BEGIN > > > EXECUTE $$ALTER SERVER loopback > > > OPTIONS (SET dbname '$$||current_database()||$$')$$; > > > END; > > > $d$; > > > SELECT c3, c4 FROM ft1 ORDER BY c3, c1 LIMIT 1; -- should work again > > > """ > > > > > > IMO, the above case is designed to test the following code[1]: > > > With the patch, it seems the following code[1] will not work for this case, right? > > > (It seems the connection will be disconnect in pgfdw_xact_callback) > > > > > > I do not know does it matter, or should we add a testcase to cover that? > > > > > > [1] /* > > > * If the connection needs to be remade due to invalidation, disconnect as > > > * soon as we're out of all transactions. > > > */ > > > if (entry->conn != NULL && entry->invalidated && entry->xact_depth == 0) > > > { > > > elog(DEBUG3, "closing connection %p for option changes to take effect", > > > entry->conn); > > > disconnect_pg_server(entry); > > > } > > > > Yes you are right. With the patch if the server is altered in the same > > session in which the connection exists, the connection gets closed at > > the end of that alter query txn, not at the beginning of the next > > foreign tbl query. So, that part of the code in GetConnection() > > doesn't get hit. Having said that, this code gets hit when the alter > > query is run in another session and the connection in the current > > session gets disconnected at the start of the next foreign tbl query. > > > > If we want to cover it with a test case, we might have to alter the > > foreign server in another session. I'm not sure whether we can open > > another session from the current psql session and run a sql command. I further checked on how we can add/move the test case( that is altering server options in a different session and see if the connection gets disconnected at the start of the next foreign query in the current session ) to cover the above code. Upon some initial analysis, it looks like it is possible to add that under src/test/isolation tests. Another way could be to add it using the TAP framework under contrib/postgres_fdw. Having said that, currently these two places don't have any postgres_fdw related tests, we will be the first ones to add. I'm not quite sure whether that's okay or is there any better way of doing it. Thoughts? With Regards, Bharath Rupireddy. EnterpriseDB: http://www.enterprisedb.com
Re: postgres_fdw - cached connection leaks if the associated user mapping/foreign server is dropped
From
Fujii Masao
Date:
On 2020/12/15 11:38, Bharath Rupireddy wrote: > Hi, > > As discussed in [1], in postgres_fdw the cached connections to remote > servers can stay until the lifetime of the local session without > getting a chance to disconnect (connection leak), if the underlying > user mapping or foreign server is dropped in another session. Here are > few scenarios how this can happen: > > Use case 1: > 1) Run a foreign query in session 1 with server 1 and user mapping 1 > 2) Drop user mapping 1 in another session 2, an invalidation message > gets generated which will have to be processed by all the sessions > 3) Run the foreign query again in session 1, at the start of txn the > cached entry gets invalidated via pgfdw_inval_callback() (as part of > invalidation message processing). Whatever may be the type of foreign > query (select, update, explain, delete, insert, analyze etc.), upon > next call to GetUserMapping() from postgres_fdw.c, the cache lookup > fails(with ERROR: user mapping not found for "XXXX") since the user > mapping 1 has been dropped in session 2 and the query will also fail > before reaching GetConnection() where the connections associated with > the invalidated entries would have got disconnected. > > So, the connection associated with invalidated entry would remain > until the local session exits. > > Use case 2: > 1) Run a foreign query in session 1 with server 1 and user mapping 1 > 2) Try to drop foreign server 1, then we would not be allowed because > of dependency. Use CASCADE so that dependent objects i.e. user mapping > 1 and foreign tables get dropped along with foreign server 1. > 3) Run the foreign query again in session 1, at the start of txn, the > cached entry gets invalidated via pgfdw_inval_callback() and the query > fails because there is no foreign table. > > Note that the remote connection remains open in session 1 until the > local session exits. > > To solve the above connection leak problem, it looks like the right > place to close all the invalid connections is pgfdw_xact_callback(), > once registered, which gets called at the end of every txn in the > current session(by then all the sub txns also would have been > finished). Note that if there are too many invalidated entries, then > the following txn has to close all of them, but that's okay than > having leaked connections and it's a one time job for the following > one txn. > > Attaching a patch for the same. > > Thoughts? Thanks for making the patch! I agree to make pgfdw_xact_callback() close the connection when entry->invalidated == true. But I think that it's better to get rid of have_invalid_connections flag and make pgfdw_inval_callback() close the connection immediately if entry->xact_depth == 0, to avoid unnecessary scan of the hashtable during COMMIT of transaction not accessing to foreign servers. Attached is the POC patch that I'm thinking. Thought? Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
Attachment
Re: postgres_fdw - cached connection leaks if the associated user mapping/foreign server is dropped
From
Bharath Rupireddy
Date:
On Wed, Dec 23, 2020 at 7:31 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote: > I agree to make pgfdw_xact_callback() close the connection when > entry->invalidated == true. But I think that it's better to get rid of > have_invalid_connections flag and make pgfdw_inval_callback() close > the connection immediately if entry->xact_depth == 0, to avoid unnecessary > scan of the hashtable during COMMIT of transaction not accessing to > foreign servers. Attached is the POC patch that I'm thinking. Thought? We could do that way as well. It seems okay to me. Now the disconnect code is spread in pgfdw_inval_callback() and pgfdw_xact_callback(). There's also less burden on pgfdw_xact_callback() to close a lot of connections on a single commit. The behaviour is like this - If entry->xact_depth == 0, then the entries wouldn't have got any connection in the current txn so they can be immediately closed in pgfdw_inval_callback() and pgfdw_xact_callback() can exit immediately as xact_got_connection is false. If entry->xact_depth > 0 which means that probably pgfdw_inval_callback() came from a sub txn, we would have got a connection in the txn i.e. xact_got_connection becomes true due to which it can get invalidated in pgfdw_inval_callback() and get closed in pgfdw_xact_callback() at the end of the txn. And there's no chance of entry->xact_depth > 0 and xact_got_connection false. I think we need to change the comment before pgfdw_inval_callback() a bit: * After a change to a pg_foreign_server or pg_user_mapping catalog entry, * mark connections depending on that entry as needing to be remade. * We can't immediately destroy them, since they might be in the midst of * a transaction, but we'll remake them at the next opportunity. to * After a change to a pg_foreign_server or pg_user_mapping catalog entry, * try to close the cached connections associated with them if they are not in the * midst of a transaction otherwise mark them as invalidated. We will destroy the * invalidated connections in pgfdw_xact_callback() at the end of the main xact. * Closed connections will be remade at the next opportunity. Any thoughts on the earlier point in [1] related to a test case(which becomes unnecessary with this patch) coverage? [1] - https://www.postgresql.org/message-id/CALj2ACXymb%3Dd4KeOq%2Bjnh_E6yThn%2Bcf1CDRhtK_crkj0_hVimQ%40mail.gmail.com With Regards, Bharath Rupireddy. EnterpriseDB: http://www.enterprisedb.com
Re: postgres_fdw - cached connection leaks if the associated user mapping/foreign server is dropped
From
Fujii Masao
Date:
On 2020/12/23 23:40, Bharath Rupireddy wrote: > On Wed, Dec 23, 2020 at 7:31 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote: >> I agree to make pgfdw_xact_callback() close the connection when >> entry->invalidated == true. But I think that it's better to get rid of >> have_invalid_connections flag and make pgfdw_inval_callback() close >> the connection immediately if entry->xact_depth == 0, to avoid unnecessary >> scan of the hashtable during COMMIT of transaction not accessing to >> foreign servers. Attached is the POC patch that I'm thinking. Thought? > > We could do that way as well. It seems okay to me. Now the disconnect > code is spread in pgfdw_inval_callback() and pgfdw_xact_callback(). > There's also less burden on pgfdw_xact_callback() to close a lot of > connections on a single commit. The behaviour is like this - If > entry->xact_depth == 0, then the entries wouldn't have got any > connection in the current txn so they can be immediately closed in > pgfdw_inval_callback() and pgfdw_xact_callback() can exit immediately > as xact_got_connection is false. If entry->xact_depth > 0 which means > that probably pgfdw_inval_callback() came from a sub txn, we would > have got a connection in the txn i.e. xact_got_connection becomes true > due to which it can get invalidated in pgfdw_inval_callback() and get > closed in pgfdw_xact_callback() at the end of the txn. > > And there's no chance of entry->xact_depth > 0 and xact_got_connection false. > > I think we need to change the comment before pgfdw_inval_callback() a bit: > > * After a change to a pg_foreign_server or pg_user_mapping catalog entry, > * mark connections depending on that entry as needing to be remade. > * We can't immediately destroy them, since they might be in the midst of > * a transaction, but we'll remake them at the next opportunity. > > to > > * After a change to a pg_foreign_server or pg_user_mapping catalog entry, > * try to close the cached connections associated with them if they > are not in the > * midst of a transaction otherwise mark them as invalidated. We will > destroy the > * invalidated connections in pgfdw_xact_callback() at the end of the main xact. > * Closed connections will be remade at the next opportunity. Yes, I agree that we need to update that comment. > > Any thoughts on the earlier point in [1] related to a test case(which > becomes unnecessary with this patch) coverage? > ISTM that we can leave that test as it is because it's still useful to test the case where the cached connection is discarded in pgfdw_inval_callback(). Thought? By applying the patch, probably we can get rid of the code to discard the invalidated cached connection in GetConnection(). But at least for the back branches, I'd like to leave the code as it is so that we can make sure that the invalidated cached connection doesn't exist when getting new connection. Maybe we can improve that in the master, but I'm not sure if it's really worth doing that against the gain. Thought? Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
Re: postgres_fdw - cached connection leaks if the associated user mapping/foreign server is dropped
From
Bharath Rupireddy
Date:
On Thu, Dec 24, 2020 at 7:21 AM Fujii Masao <masao.fujii@oss.nttdata.com> wrote: > On 2020/12/23 23:40, Bharath Rupireddy wrote: > > On Wed, Dec 23, 2020 at 7:31 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote: > >> I agree to make pgfdw_xact_callback() close the connection when > >> entry->invalidated == true. But I think that it's better to get rid of > >> have_invalid_connections flag and make pgfdw_inval_callback() close > >> the connection immediately if entry->xact_depth == 0, to avoid unnecessary > >> scan of the hashtable during COMMIT of transaction not accessing to > >> foreign servers. Attached is the POC patch that I'm thinking. Thought? > > > > We could do that way as well. It seems okay to me. Now the disconnect > > code is spread in pgfdw_inval_callback() and pgfdw_xact_callback(). > > There's also less burden on pgfdw_xact_callback() to close a lot of > > connections on a single commit. The behaviour is like this - If > > entry->xact_depth == 0, then the entries wouldn't have got any > > connection in the current txn so they can be immediately closed in > > pgfdw_inval_callback() and pgfdw_xact_callback() can exit immediately > > as xact_got_connection is false. If entry->xact_depth > 0 which means > > that probably pgfdw_inval_callback() came from a sub txn, we would > > have got a connection in the txn i.e. xact_got_connection becomes true > > due to which it can get invalidated in pgfdw_inval_callback() and get > > closed in pgfdw_xact_callback() at the end of the txn. > > > > And there's no chance of entry->xact_depth > 0 and xact_got_connection false. > > > > I think we need to change the comment before pgfdw_inval_callback() a bit: > > > > * After a change to a pg_foreign_server or pg_user_mapping catalog entry, > > * mark connections depending on that entry as needing to be remade. > > * We can't immediately destroy them, since they might be in the midst of > > * a transaction, but we'll remake them at the next opportunity. > > > > to > > > > * After a change to a pg_foreign_server or pg_user_mapping catalog entry, > > * try to close the cached connections associated with them if they > > are not in the > > * midst of a transaction otherwise mark them as invalidated. We will > > destroy the > > * invalidated connections in pgfdw_xact_callback() at the end of the main xact. > > * Closed connections will be remade at the next opportunity. > > Yes, I agree that we need to update that comment. > > > > > Any thoughts on the earlier point in [1] related to a test case(which > > becomes unnecessary with this patch) coverage? > > > > ISTM that we can leave that test as it is because it's still useful to test > the case where the cached connection is discarded in pgfdw_inval_callback(). > Thought? Yes, that test case covers the code this patch adds i.e. closing the connection in pgfdw_inval_callback() while committing alter foreign server stmt. > By applying the patch, probably we can get rid of the code to discard > the invalidated cached connection in GetConnection(). But at least for > the back branches, I'd like to leave the code as it is so that we can make > sure that the invalidated cached connection doesn't exist when getting > new connection. Maybe we can improve that in the master, but I'm not > sure if it's really worth doing that against the gain. Thought? +1 to keep that code as is even after this patch is applied(at least it works as an assertion that we don't have any leftover invalid connections). I'm not quite sure, we may need that in some cases, say if we don't hit disconnect_pg_server() code in pgfdw_xact_callback() and pgfdw_inval_callback() due to some errors in between. I can not think of an exact use case though. Attaching v2 patch that adds the comments and the other test case that covers disconnecting code in pgfdw_xact_callback. Please review it. With Regards, Bharath Rupireddy. EnterpriseDB: http://www.enterprisedb.com
Attachment
Re: postgres_fdw - cached connection leaks if the associated user mapping/foreign server is dropped
From
Fujii Masao
Date:
On 2020/12/24 15:42, Bharath Rupireddy wrote: > On Thu, Dec 24, 2020 at 7:21 AM Fujii Masao <masao.fujii@oss.nttdata.com> wrote: >> On 2020/12/23 23:40, Bharath Rupireddy wrote: >>> On Wed, Dec 23, 2020 at 7:31 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote: >>>> I agree to make pgfdw_xact_callback() close the connection when >>>> entry->invalidated == true. But I think that it's better to get rid of >>>> have_invalid_connections flag and make pgfdw_inval_callback() close >>>> the connection immediately if entry->xact_depth == 0, to avoid unnecessary >>>> scan of the hashtable during COMMIT of transaction not accessing to >>>> foreign servers. Attached is the POC patch that I'm thinking. Thought? >>> >>> We could do that way as well. It seems okay to me. Now the disconnect >>> code is spread in pgfdw_inval_callback() and pgfdw_xact_callback(). >>> There's also less burden on pgfdw_xact_callback() to close a lot of >>> connections on a single commit. The behaviour is like this - If >>> entry->xact_depth == 0, then the entries wouldn't have got any >>> connection in the current txn so they can be immediately closed in >>> pgfdw_inval_callback() and pgfdw_xact_callback() can exit immediately >>> as xact_got_connection is false. If entry->xact_depth > 0 which means >>> that probably pgfdw_inval_callback() came from a sub txn, we would >>> have got a connection in the txn i.e. xact_got_connection becomes true >>> due to which it can get invalidated in pgfdw_inval_callback() and get >>> closed in pgfdw_xact_callback() at the end of the txn. >>> >>> And there's no chance of entry->xact_depth > 0 and xact_got_connection false. >>> >>> I think we need to change the comment before pgfdw_inval_callback() a bit: >>> >>> * After a change to a pg_foreign_server or pg_user_mapping catalog entry, >>> * mark connections depending on that entry as needing to be remade. >>> * We can't immediately destroy them, since they might be in the midst of >>> * a transaction, but we'll remake them at the next opportunity. >>> >>> to >>> >>> * After a change to a pg_foreign_server or pg_user_mapping catalog entry, >>> * try to close the cached connections associated with them if they >>> are not in the >>> * midst of a transaction otherwise mark them as invalidated. We will >>> destroy the >>> * invalidated connections in pgfdw_xact_callback() at the end of the main xact. >>> * Closed connections will be remade at the next opportunity. >> >> Yes, I agree that we need to update that comment. >> >>> >>> Any thoughts on the earlier point in [1] related to a test case(which >>> becomes unnecessary with this patch) coverage? >>> >> >> ISTM that we can leave that test as it is because it's still useful to test >> the case where the cached connection is discarded in pgfdw_inval_callback(). >> Thought? > > Yes, that test case covers the code this patch adds i.e. closing the > connection in pgfdw_inval_callback() while committing alter foreign > server stmt. > >> By applying the patch, probably we can get rid of the code to discard >> the invalidated cached connection in GetConnection(). But at least for >> the back branches, I'd like to leave the code as it is so that we can make >> sure that the invalidated cached connection doesn't exist when getting >> new connection. Maybe we can improve that in the master, but I'm not >> sure if it's really worth doing that against the gain. Thought? > > +1 to keep that code as is even after this patch is applied(at least > it works as an assertion that we don't have any leftover invalid > connections). I'm not quite sure, we may need that in some cases, say > if we don't hit disconnect_pg_server() code in pgfdw_xact_callback() > and pgfdw_inval_callback() due to some errors in between. I can not > think of an exact use case though. > > Attaching v2 patch that adds the comments and the other test case that > covers disconnecting code in pgfdw_xact_callback. Please review it. Thanks for updating the patch! It basically looks good to me except the following minor things. + * After a change to a pg_foreign_server or pg_user_mapping catalog entry, try + * to close the cached connections associated with them if they are not in the + * midst of a transaction otherwise mark them as invalid. We will destroy the + * invalidated connections in pgfdw_xact_callback() at the end of the main + * xact. Closed connections will be remade at the next opportunity. Even when we're in the midst of transaction, if that transaction has not used the cached connections yet, we close them immediately. So, to make the comment more precise, what about updating the comment as follows? --------------------- After a change to a pg_foreign_server or pg_user_mapping catalog entry, close connections depending on that entry immediately if current transaction has not used those connections yet. Otherwise, mark those connections as invalid and then make pgfdw_xact_callback() close them at the end of current transaction, since they cannot be closed in the midst of a transaction using them. Closed connections will be remade at the next opportunity if necessary. --------------------- + /* + * Close the connection if it's not in midst of a xact. Otherwise + * mark it as invalid so that it can be disconnected at the end of + * main xact in pgfdw_xact_callback(). + */ Because of the same reason as the above, what about updating this comment as follows? --------------------- Close the connection immediately if it's not used yet in this transaction. Otherwise mark it as invalid so that pgfdw_xact_callback() can close it at the end of this transaction. --------------------- Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
Re: postgres_fdw - cached connection leaks if the associated user mapping/foreign server is dropped
From
Bharath Rupireddy
Date:
On Thu, Dec 24, 2020 at 7:43 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote: > Even when we're in the midst of transaction, if that transaction has not used > the cached connections yet, we close them immediately. So, to make the > comment more precise, what about updating the comment as follows? > > --------------------- > After a change to a pg_foreign_server or pg_user_mapping catalog entry, > close connections depending on that entry immediately if current > transaction has not used those connections yet. Otherwise, mark those > connections as invalid and then make pgfdw_xact_callback() close them > at the end of current transaction, since they cannot be closed in the midst > of a transaction using them. Closed connections will be remade at the next > opportunity if necessary. > --------------------- Done. > + /* > + * Close the connection if it's not in midst of a xact. Otherwise > + * mark it as invalid so that it can be disconnected at the end of > + * main xact in pgfdw_xact_callback(). > + */ > > Because of the same reason as the above, what about updating this comment > as follows? > > --------------------- > Close the connection immediately if it's not used yet in this transaction. > Otherwise mark it as invalid so that pgfdw_xact_callback() can close it > at the end of this transaction. > --------------------- Done. Attaching v3 patch. Please have a look. Thanks! With Regards, Bharath Rupireddy. EnterpriseDB: http://www.enterprisedb.com
Attachment
Re: postgres_fdw - cached connection leaks if the associated user mapping/foreign server is dropped
From
Fujii Masao
Date:
On 2020/12/24 23:30, Bharath Rupireddy wrote: > On Thu, Dec 24, 2020 at 7:43 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote: >> Even when we're in the midst of transaction, if that transaction has not used >> the cached connections yet, we close them immediately. So, to make the >> comment more precise, what about updating the comment as follows? >> >> --------------------- >> After a change to a pg_foreign_server or pg_user_mapping catalog entry, >> close connections depending on that entry immediately if current >> transaction has not used those connections yet. Otherwise, mark those >> connections as invalid and then make pgfdw_xact_callback() close them >> at the end of current transaction, since they cannot be closed in the midst >> of a transaction using them. Closed connections will be remade at the next >> opportunity if necessary. >> --------------------- > > Done. > >> + /* >> + * Close the connection if it's not in midst of a xact. Otherwise >> + * mark it as invalid so that it can be disconnected at the end of >> + * main xact in pgfdw_xact_callback(). >> + */ >> >> Because of the same reason as the above, what about updating this comment >> as follows? >> >> --------------------- >> Close the connection immediately if it's not used yet in this transaction. >> Otherwise mark it as invalid so that pgfdw_xact_callback() can close it >> at the end of this transaction. >> --------------------- > > Done. > > Attaching v3 patch. Please have a look. Thanks! Thanks a lot! Barring any objection, I will commit this version. Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
Re: postgres_fdw - cached connection leaks if the associated user mapping/foreign server is dropped
From
Fujii Masao
Date:
On 2020/12/24 23:45, Fujii Masao wrote: > > > On 2020/12/24 23:30, Bharath Rupireddy wrote: >> On Thu, Dec 24, 2020 at 7:43 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote: >>> Even when we're in the midst of transaction, if that transaction has not used >>> the cached connections yet, we close them immediately. So, to make the >>> comment more precise, what about updating the comment as follows? >>> >>> --------------------- >>> After a change to a pg_foreign_server or pg_user_mapping catalog entry, >>> close connections depending on that entry immediately if current >>> transaction has not used those connections yet. Otherwise, mark those >>> connections as invalid and then make pgfdw_xact_callback() close them >>> at the end of current transaction, since they cannot be closed in the midst >>> of a transaction using them. Closed connections will be remade at the next >>> opportunity if necessary. >>> --------------------- >> >> Done. >> >>> + /* >>> + * Close the connection if it's not in midst of a xact. Otherwise >>> + * mark it as invalid so that it can be disconnected at the end of >>> + * main xact in pgfdw_xact_callback(). >>> + */ >>> >>> Because of the same reason as the above, what about updating this comment >>> as follows? >>> >>> --------------------- >>> Close the connection immediately if it's not used yet in this transaction. >>> Otherwise mark it as invalid so that pgfdw_xact_callback() can close it >>> at the end of this transaction. >>> --------------------- >> >> Done. >> >> Attaching v3 patch. Please have a look. Thanks! > > Thanks a lot! Barring any objection, I will commit this version. Pushed. Thanks! Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION