Re: Simplify backend terminate and wait logic in postgres_fdw test - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Simplify backend terminate and wait logic in postgres_fdw test
Date
Msg-id 3908056.1620142689@sss.pgh.pa.us
Whole thread Raw
In response to Re: Simplify backend terminate and wait logic in postgres_fdw test  (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
Responses Re: Simplify backend terminate and wait logic in postgres_fdw test
Re: Simplify backend terminate and wait logic in postgres_fdw test
List pgsql-hackers
Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> writes:
> On Tue, May 4, 2021 at 4:12 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> The buildfarm is showing that one of these test queries is not stable
>> under CLOBBER_CACHE_ALWAYS:

> I can reproduce the issue with the failing case. Issue is that the
> backend pid will be null in the pg_stat_activity because of the cache
> invalidation that happens at the beginning of the query and hence
> pg_terminate_backend returns null on null input.

No, that's nonsense: if it were happening that way, the query would
return one row with a NULL result, but actually it's returning no
rows.  What's actually happening, it seems, is that because
pgfdw_inval_callback is constantly getting called due to cache
flushes, we invariably drop remote connections immediately during
transaction termination (cf pgfdw_xact_callback).  Thus, by the time
we inspect pg_stat_activity, there is no remote session to terminate.

I don't like your patch because what it effectively does is mask
whether termination happened or not; if there were a bug there
causing that not to happen, the test would still appear to pass.

I think the most expedient fix, if we want to keep this test, is
just to transiently disable debug_invalidate_system_caches_always.
(That option wasn't available before v14, but fortunately we
don't need a fix for the back branches.)

I believe the attached will do the trick, but I'm running the test
with debug_invalidate_system_caches_always turned on to verify
that.  Should be done in an hour or so...

            regards, tom lane

diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 8e1cc69508..6f533c745d 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -9204,6 +9204,12 @@ WARNING:  there is no transaction in progress
 -- Change application_name of remote connection to special one
 -- so that we can easily terminate the connection later.
 ALTER SERVER loopback OPTIONS (application_name 'fdw_retry_check');
+-- If debug_invalidate_system_caches_always is active, it results in
+-- dropping remote connections after every transaction, making it
+-- impossible to test termination meaningfully.  So turn that off
+-- for this test.
+SET debug_invalidate_system_caches_always = 0;
+-- Make sure we have a remote connection.
 SELECT 1 FROM ft1 LIMIT 1;
  ?column?
 ----------
@@ -9227,9 +9233,8 @@ SELECT 1 FROM ft1 LIMIT 1;
         1
 (1 row)

--- If the query detects the broken connection when starting new remote
--- subtransaction, it doesn't reestablish new connection and should fail.
--- The text of the error might vary across platforms, so don't show it.
+-- If we detect the broken connection when starting a new remote
+-- subtransaction, we should fail instead of establishing a new connection.
 -- Terminate the remote connection and wait for the termination to complete.
 SELECT pg_terminate_backend(pid, 180000) FROM pg_stat_activity
     WHERE application_name = 'fdw_retry_check';
@@ -9239,11 +9244,13 @@ SELECT pg_terminate_backend(pid, 180000) FROM pg_stat_activity
 (1 row)

 SAVEPOINT s;
+-- The text of the error might vary across platforms, so only show SQLSTATE.
 \set VERBOSITY sqlstate
 SELECT 1 FROM ft1 LIMIT 1;    -- should fail
 ERROR:  08006
 \set VERBOSITY default
 COMMIT;
+RESET debug_invalidate_system_caches_always;
 -- =============================================================================
 -- test connection invalidation cases and postgres_fdw_get_connections function
 -- =============================================================================
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql
index dcd36a9753..000e2534fc 100644
--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -2795,6 +2795,14 @@ ROLLBACK;
 -- Change application_name of remote connection to special one
 -- so that we can easily terminate the connection later.
 ALTER SERVER loopback OPTIONS (application_name 'fdw_retry_check');
+
+-- If debug_invalidate_system_caches_always is active, it results in
+-- dropping remote connections after every transaction, making it
+-- impossible to test termination meaningfully.  So turn that off
+-- for this test.
+SET debug_invalidate_system_caches_always = 0;
+
+-- Make sure we have a remote connection.
 SELECT 1 FROM ft1 LIMIT 1;

 -- Terminate the remote connection and wait for the termination to complete.
@@ -2806,18 +2814,20 @@ SELECT pg_terminate_backend(pid, 180000) FROM pg_stat_activity
 BEGIN;
 SELECT 1 FROM ft1 LIMIT 1;

--- If the query detects the broken connection when starting new remote
--- subtransaction, it doesn't reestablish new connection and should fail.
--- The text of the error might vary across platforms, so don't show it.
+-- If we detect the broken connection when starting a new remote
+-- subtransaction, we should fail instead of establishing a new connection.
 -- Terminate the remote connection and wait for the termination to complete.
 SELECT pg_terminate_backend(pid, 180000) FROM pg_stat_activity
     WHERE application_name = 'fdw_retry_check';
 SAVEPOINT s;
+-- The text of the error might vary across platforms, so only show SQLSTATE.
 \set VERBOSITY sqlstate
 SELECT 1 FROM ft1 LIMIT 1;    -- should fail
 \set VERBOSITY default
 COMMIT;

+RESET debug_invalidate_system_caches_always;
+
 -- =============================================================================
 -- test connection invalidation cases and postgres_fdw_get_connections function
 -- =============================================================================

pgsql-hackers by date:

Previous
From: Alvaro Herrera
Date:
Subject: Re: Unresolved repliaction hang and stop problem.
Next
From: Dilip Kumar
Date:
Subject: Re: .ready and .done files considered harmful