Re: Implement waiting for wal lsn replay: reloaded - Mailing list pgsql-hackers

From Alexander Korotkov
Subject Re: Implement waiting for wal lsn replay: reloaded
Date
Msg-id CAPpHfdudx4kzV58T0LNZyj+HdYA8bmsgkB+j5x6d9fxCWpdRMA@mail.gmail.com
In response to Re: Implement waiting for wal lsn replay: reloaded  (Xuneng Zhou <xunengzhou@gmail.com>)
Responses Re: Implement waiting for wal lsn replay: reloaded
List pgsql-hackers
On Tue, Jan 6, 2026 at 9:29 AM Xuneng Zhou <xunengzhou@gmail.com> wrote:
> On Tue, Jan 6, 2026 at 1:43 PM Thomas Munro <thomas.munro@gmail.com> wrote:
> > Could this be causing the recent flapping failures on CI/macOS in
> > recovery/031_recovery_conflict?  I didn't have time to dig personally
> > but f30848cb looks relevant:
> >
> > Waiting for replication conn standby's replay_lsn to pass 0/03467F58 on primary
> > error running SQL: 'psql:<stdin>:1: ERROR:  canceling statement due to
> > conflict with recovery
> > DETAIL:  User was or might have been using tablespace that must be dropped.'
> > while running 'psql --no-psqlrc --no-align --tuples-only --quiet
> > --dbname port=25195
> > host=/var/folders/g9/7rkt8rt1241bwwhd3_s8ndp40000gn/T/LqcCJnsueI
> > dbname='postgres' --file - --variable ON_ERROR_STOP=1' with sql 'WAIT
> > FOR LSN '0/03467F58' WITH (MODE 'standby_replay', timeout '180s',
> > no_throw);' at /Users/admin/pgsql/src/test/perl/PostgreSQL/Test/Cluster.pm
> > line 2300.
> >
> > https://cirrus-ci.com/task/5771274900733952
> >
> > The master branch in time-descending order, macOS tasks only:
> >
> >      task_id      | substring |  status
> > ------------------+-----------+-----------
> >  6460882231754752 | c970bdc0  | FAILED
> >  5771274900733952 | 6ca8506e  | FAILED
> >  6217757068361728 | 63ed3bc7  | FAILED
> >  5980650261446656 | ae283736  | FAILED
> >  6585898394976256 | 5f13999a  | COMPLETED
> >  4527474786172928 | 7f9acc9b  | COMPLETED
> >  4826100842364928 | e8d4e94a  | COMPLETED
> >  4540563027918848 | b9ee5f2d  | FAILED
> >  6358528648019968 | c5af141c  | FAILED
> >  5998005284765696 | e212a0f8  | COMPLETED
> >  6488580526178304 | b85d5dc0  | FAILED
> >  5034091344560128 | 7dc95cc3  | ABORTED
> >  5688692477526016 | bb048e31  | COMPLETED
> >  5481187977723904 | d351063e  | COMPLETED
> >  5101831568752640 | f30848cb  | COMPLETED <-- the change
> >  6395317408497664 | 3f33b63d  | COMPLETED
> >  6741325208354816 | 877ae5db  | COMPLETED
> >  4594007789010944 | de746e0d  | COMPLETED
> >  6497208998035456 | 461b8cc9  | COMPLETED
>
> Thanks for raising this issue. I think it is related to f30848cb after
> some analysis. I'll prepare a follow-up patch to fix it.

Sorry, I mistakenly referenced this report in commit [1].  I thought
it was related, but it appears not to be.  [1] is actually related to
a report I got from Ruikai Peng off-list.

Regarding the present failure, could it happen before ExecWaitStmt()
calls PopActiveSnapshot() and InvalidateCatalogSnapshot()?  If so, we
should make an effort to release these snapshots earlier.

1. https://git.postgresql.org/pg/commitdiff/bf308639bfcfa38541e24733e074184153a8ab7f

------
Regards,
Alexander Korotkov
Supabase


