Re: Introduce XID age and inactive timeout based replication slot invalidation - Mailing list pgsql-hackers

From Bertrand Drouvot
Subject Re: Introduce XID age and inactive timeout based replication slot invalidation
Date
Msg-id ZfFT7tgWpqx7oZko@ip-10-97-1-34.eu-west-3.compute.internal
In response to Re: Introduce XID age and inactive timeout based replication slot invalidation  (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
Responses Re: Introduce XID age and inactive timeout based replication slot invalidation
List pgsql-hackers
Hi,

On Tue, Mar 12, 2024 at 09:19:35PM +0530, Bharath Rupireddy wrote:
> On Tue, Mar 12, 2024 at 9:11 PM Bertrand Drouvot
> <bertranddrouvot.pg@gmail.com> wrote:
> >
> > > AFAIR, we don't prevent similar invalidations due to
> > > 'max_slot_wal_keep_size' for sync slots,
> >
> > Right, we'd invalidate them on the standby should the standby sync slot restart_lsn
> > exceed the limit.
> 
> Right. Help me understand this a bit - is the wal_removed invalidation
> going to conflict with recovery on the standby?

I don't think so, as it's not directly related to recovery. The slot will
be invalidated on the standby though.

> Per the discussion upthread, I'm trying to understand what
> invalidation reasons will exactly cause conflict with recovery? Is it
> just rows_removed and wal_level_insufficient invalidations? 

Yes, those are the ones added in be87200efd.

See the error messages on a standby:

== wal removal

postgres=#  SELECT * FROM pg_logical_slot_get_changes('lsub4_slot', NULL, NULL, 'include-xids', '0');
ERROR:  can no longer get changes from replication slot "lsub4_slot"
DETAIL:  This slot has been invalidated because it exceeded the maximum reserved size.

== wal level

postgres=# select conflict_reason from pg_replication_slots where slot_name = 'lsub5_slot';
    conflict_reason
------------------------
 wal_level_insufficient
(1 row)

postgres=#  SELECT * FROM pg_logical_slot_get_changes('lsub5_slot', NULL, NULL, 'include-xids', '0');
ERROR:  can no longer get changes from replication slot "lsub5_slot"
DETAIL:  This slot has been invalidated because it was conflicting with recovery.

== rows removal

postgres=# select conflict_reason from pg_replication_slots where slot_name = 'lsub6_slot';
 conflict_reason
-----------------
 rows_removed
(1 row)

postgres=#  SELECT * FROM pg_logical_slot_get_changes('lsub6_slot', NULL, NULL, 'include-xids', '0');
ERROR:  can no longer get changes from replication slot "lsub6_slot"
DETAIL:  This slot has been invalidated because it was conflicting with recovery.

As you can see, only the wal level and rows removal cases mention a conflict with
recovery.
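
To put the three cases side by side, here is a rough Python sketch. It is not
PostgreSQL code: the dict and function names are made up for illustration, and
the DETAIL strings are simply copied from the sessions above. It only checks
which reasons produce a DETAIL that mentions recovery.

```python
# Illustrative summary of the three standby sessions shown above.
# DETAIL_BY_REASON and mentions_recovery_conflict are hypothetical names,
# not PostgreSQL identifiers.
DETAIL_BY_REASON = {
    "wal_removed": (
        "This slot has been invalidated because it exceeded "
        "the maximum reserved size."
    ),
    "wal_level_insufficient": (
        "This slot has been invalidated because it was "
        "conflicting with recovery."
    ),
    "rows_removed": (
        "This slot has been invalidated because it was "
        "conflicting with recovery."
    ),
}


def mentions_recovery_conflict(reason: str) -> bool:
    """Return True if the DETAIL shown for this reason mentions recovery."""
    return "conflicting with recovery" in DETAIL_BY_REASON[reason]


if __name__ == "__main__":
    # Print each reason next to whether its DETAIL mentions recovery.
    for reason in DETAIL_BY_REASON:
        print(f"{reason}: {mentions_recovery_conflict(reason)}")
```

Only wal_removed comes out as not mentioning recovery, which is the mismatch
the question below is about.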

So, are we already "wrong" to mention "wal_removed" in conflict_reason?

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com


