Re: wrong fds used for refilenodes after pg_upgrade relfilenode changes Reply-To: - Mailing list pgsql-hackers

From Thomas Munro
Subject Re: wrong fds used for refilenodes after pg_upgrade relfilenode changes Reply-To:
Msg-id CA+hUKGLFH=5Wg0cX_oXXwySV3eN9iKkHYtyej3haPyFc7w2JRw@mail.gmail.com
In response to Re: wrong fds used for refilenodes after pg_upgrade relfilenode changes Reply-To:  (Thomas Munro <thomas.munro@gmail.com>)
Responses Re: wrong fds used for refilenodes after pg_upgrade relfilenode changes Reply-To:
List pgsql-hackers
On Sat, May 7, 2022 at 9:37 PM Thomas Munro <thomas.munro@gmail.com> wrote:
> So far "grison" failed.  I think it's probably just that the test
> forgot to wait for replay of CREATE EXTENSION before using pg_prewarm
> on the standby, hence "ERROR:  function pg_prewarm(oid) does not exist
> at character 12".  I'll wait for more animals to report before I try
> to fix that tomorrow.

That one was addressed by commit a22652e.  Unfortunately two new kinds
of failure showed up:

Chipmunk, another little early model Raspberry Pi:

error running SQL: 'psql:<stdin>:1: ERROR:  source database
"conflict_db_template" is being accessed by other users
DETAIL:  There is 1 other session using the database.'
while running 'psql -XAtq -d port=57394 host=/tmp/luyJopPv9L
dbname='postgres' -f - -v ON_ERROR_STOP=1' with sql 'CREATE DATABASE
conflict_db TEMPLATE conflict_db_template OID = 50001;' at
/home/pgbfarm/buildroot/HEAD/pgsql.build/src/test/recovery/../../../src/test/perl/PostgreSQL/Test/Cluster.pm
line 1836.

I think that might imply that when you do two
$node_primary->safe_psql() calls in a row, the backend running the
second one might still see the ghost of the first backend in the
database, even though the first psql process has exited.  Hmmm.
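
To make that hypothesis concrete, here is a minimal sketch of the kind of
wait that would sidestep such a race: poll until no other session is
attached to the template database before issuing CREATE DATABASE.  This
is only an illustration, not what the test currently does.  The names
wait_until() and make_fake_session_counter() are hypothetical, and the
counter is a stand-in for a real query against pg_stat_activity.

```python
import time
from typing import Callable


def wait_until(condition: Callable[[], bool],
               timeout_s: float = 10.0,
               interval_s: float = 0.1) -> bool:
    """Poll condition() until it returns True or timeout_s elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


def make_fake_session_counter(lingering_polls: int) -> Callable[[], int]:
    """Hypothetical stand-in for a query such as
        SELECT count(*) FROM pg_stat_activity
        WHERE datname = 'conflict_db_template';
    It reports one lingering session for the first `lingering_polls`
    calls, then zero, imitating a backend that is slow to exit."""
    remaining = [lingering_polls]

    def count_sessions() -> int:
        if remaining[0] > 0:
            remaining[0] -= 1
            return 1
        return 0

    return count_sessions


# Assumption: the old backend disappears after a few polls.
count_sessions = make_fake_session_counter(lingering_polls=3)
template_is_idle = wait_until(lambda: count_sessions() == 0,
                              timeout_s=5.0, interval_s=0.01)
```

Only once template_is_idle is true would the test go on to run CREATE
DATABASE ... TEMPLATE conflict_db_template.  Whether a wait like this is
the right fix, or just hides a real problem, is a separate question.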

Skink, the valgrind animal, also failed.  After the first 8 tests, it times out:

[07:18:26.237](14.827s) ok 8 - standby: post move contents as expected
[07:18:42.877](16.641s) Bail out!  aborting wait: program timed out
stream contents: >><<
pattern searched for: (?^m:warmed_buffers)

That'd be in the perl routine cause_eviction(), while waiting for a
pg_prewarm query to return, but I'm not yet sure what's going on (I
have to suspect it's in the perl scripting rather than anything in the
server, given the location of the failure).  Will try to repro
locally.


