Hi,
I tested the v19 new feature CREATE SUBSCRIPTION ... SERVER yesterday, and found an issue: once the old server becomes
broken,the subscription cannot be recovered by switching it to a good server.
Here is a repro:
```
evantest=# CREATE EXTENSION postgres_fdw;
CREATE EXTENSION
# Create two servers
evantest=# CREATE SERVER old_srv FOREIGN DATA WRAPPER postgres_fdw
evantest-# OPTIONS (host '127.0.0.1', dbname 'postgres', port '5432');
CREATE SERVER
evantest=# CREATE SERVER new_srv FOREIGN DATA WRAPPER postgres_fdw
evantest-# OPTIONS (host '127.0.0.1', dbname 'postgres', port '5432');
CREATE SERVER
evantest=# CREATE USER MAPPING FOR CURRENT_USER SERVER old_srv
evantest-# OPTIONS (user 'dummy', password 'dummy');
CREATE USER MAPPING
evantest=# CREATE USER MAPPING FOR CURRENT_USER SERVER new_srv
evantest-# OPTIONS (user 'dummy', password 'dummy');
CREATE USER MAPPING
# Add old_server to a subscription
evantest=# CREATE SUBSCRIPTION sub_bug SERVER old_srv
evantest-# PUBLICATION pub_does_not_matter
evantest-# WITH (connect = false, slot_name = NONE);
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and alter the
subscriptionto refresh publications.
CREATE SUBSCRIPTION
# Break old_srv
evantest=# DROP USER MAPPING FOR CURRENT_USER SERVER old_srv;
DROP USER MAPPING
# Fail to switch to a good server because old_srv is broken
evantest=# ALTER SUBSCRIPTION sub_bug SERVER new_srv;
ERROR: user mapping not found for user "chaol", server "old_srv"
evantest=# ALTER SUBSCRIPTION sub_bug CONNECTION 'host=127.0.0.1 dbname=postgres port=5432 user=dummy password=dummy';
ERROR: user mapping not found for user "chaol", server "old_srv"
```
In AlterSubscription(), this comment made me think this is a bug:
```
/*
* Skip ACL checks on the subscription's foreign server, if any. If
* changing the server (or replacing it with a raw connection), then the
* old one will be removed anyway. If changing something unrelated,
* there's no need to do an additional ACL check here; that will be done
* by the subscription worker anyway.
*/
sub = GetSubscription(subid, false, false);
```
The comment explicitly says to skip ACL checks on the old server because it will be removed anyway.
But after looking into GetSubscription(), I found that when the aclcheck parameter is false, it still calls
ForeignServerConnectionString().I think that is the root cause of the bug.
To fix this, I worked out a solution that stores the server OID in Subscription, and only calls
ForeignServerConnectionString()lazilywhen sub->conninfo is actually needed. I also added a new test case to cover this
scenario.Without the fix, the new test fails.
See attached patch for details.
Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/