Re: Simplify backend terminate and wait logic in postgres_fdw test - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Simplify backend terminate and wait logic in postgres_fdw test
Msg-id 3854538.1620081771@sss.pgh.pa.us
In response to Re: Simplify backend terminate and wait logic in postgres_fdw test  (Michael Paquier <michael@paquier.xyz>)
Responses Re: Simplify backend terminate and wait logic in postgres_fdw test
List pgsql-hackers
Michael Paquier <michael@paquier.xyz> writes:
> On Tue, Apr 13, 2021 at 04:39:58PM +0900, Michael Paquier wrote:
>> Looks fine to me.  Let's wait a bit first to see if Fujii-san has any
>> objections to this cleanup as that's his code originally, from
>> 32a9c0bd.

> And hearing nothing, done.  The tests of postgres_fdw are getting much
> faster for me now, from basically 6s to 4s (actually that's roughly
> 1.8s of gain as pg_wait_until_termination waits at least 100ms,
> twice), so that's a nice gain.

The buildfarm is showing that one of these test queries is not stable
under CLOBBER_CACHE_ALWAYS:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=hyrax&dt=2021-05-01%2007%3A44%3A47

of which the relevant part is:

diff -U3 /home/buildfarm/buildroot/HEAD/pgsql.build/contrib/postgres_fdw/expected/postgres_fdw.out /home/buildfarm/buildroot/HEAD/pgsql.build/contrib/postgres_fdw/results/postgres_fdw.out
--- /home/buildfarm/buildroot/HEAD/pgsql.build/contrib/postgres_fdw/expected/postgres_fdw.out    2021-05-01 03:44:45.022300613-0400
+++ /home/buildfarm/buildroot/HEAD/pgsql.build/contrib/postgres_fdw/results/postgres_fdw.out    2021-05-03 09:11:24.051379288-0400
@@ -9215,8 +9215,7 @@
     WHERE application_name = 'fdw_retry_check';
  pg_terminate_backend
 ----------------------
- t
-(1 row)
+(0 rows)

 -- This query should detect the broken connection when starting new remote
 -- transaction, reestablish new connection, and then succeed.

I can reproduce that locally by setting

alter system set debug_invalidate_system_caches_always = 1;

and running "make installcheck" in contrib/postgres_fdw.
(It takes a good long time to run the whole test script
though, so you might want to see if running just these few
queries is enough.)
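
For anyone trying this, a minimal sketch of the setup follows.  It assumes
a build in which debug_invalidate_system_caches_always is available and a
superuser session; the pg_reload_conf() calls and the reset step are
additions for illustration, not part of the recipe above.

```sql
-- Turn on cache clobbering cluster-wide (stored in postgresql.auto.conf).
ALTER SYSTEM SET debug_invalidate_system_caches_always = 1;
-- Assumption: a configuration reload is enough for sessions to pick it up.
SELECT pg_reload_conf();

-- ... now run "make installcheck" in contrib/postgres_fdw from a shell ...

-- Put things back afterwards, since everything is very slow with this on.
ALTER SYSTEM RESET debug_invalidate_system_caches_always;
SELECT pg_reload_conf();
```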

There's no evidence of distress in the postmaster log,
so I suspect this might just be a timing instability,
e.g. remote process already gone before local process
looks.  If so, it's probably hopeless to make this
test stable as-is.  Perhaps we should just take it out.
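
If the test is thought worth keeping, one conceivable way to hide the race
from the expected output is to discard the function's result, so that
"zero rows" and "one row" print the same thing.  This is only a sketch and
has not been checked against the buildfarm; the 180000 ms timeout argument
is an assumption, not something taken from the diff above.

```sql
-- Terminate the remote backend, if it is still there, without printing
-- a result set.  The only output is the "DO" command tag, whether or not
-- a matching row was found in pg_stat_activity.
DO $$
BEGIN
    PERFORM pg_terminate_backend(pid, 180000)  -- timeout is an assumption
    FROM pg_stat_activity
    WHERE application_name = 'fdw_retry_check';
END
$$;
```

That would not remove the timing dependency itself; it would only keep the
row count out of the regression output.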

            regards, tom lane


