Thread: Introduce XID age and inactive timeout based replication slot invalidation
Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
Hi,

Replication slots in postgres prevent the removal of resources they still require even when no connection is using them (i.e., when they are inactive). This consumes storage because neither the required WAL nor the required rows in user tables/system catalogs can be removed by VACUUM as long as a replication slot needs them. In extreme cases this can lead to transaction ID wraparound.

Currently postgres has the ability to invalidate inactive replication slots based on the amount of WAL (set via the max_slot_wal_keep_size GUC) that would need to be retained for the slots in case they become active again. However, the wraparound issue isn't effectively covered by max_slot_wal_keep_size - one can't tell postgres to invalidate a replication slot if it is blocking VACUUM. Also, it is often tricky to choose a default value for max_slot_wal_keep_size, because the amount of WAL that gets generated and the storage allocated for the database can vary.

Therefore, it is often easier for developers to do the following:
a) set an XID age (age of a slot's xmin or catalog_xmin) of say 1 or 1.5 billion, after which the slots get invalidated.
b) set a timeout of say 1, 2 or 3 days, after which the inactive slots get invalidated.

To implement (a), postgres needs a new GUC called max_slot_xid_age. The checkpointer then invalidates all the slots whose xmin (the oldest transaction that this slot needs the database to retain) or catalog_xmin (the oldest transaction affecting the system catalogs that this slot needs the database to retain) has reached the age specified by this setting.

To implement (b), postgres first needs to track, in the ReplicationSlotPersistentData structure, replication slot metrics such as the time at which the slot became inactive (inactive_at timestamptz) and the total number of times the slot became inactive in its lifetime (inactive_count numeric). It then needs a new timeout GUC called inactive_replication_slot_timeout. Whenever a slot becomes inactive, the current timestamp and the inactive count are stored in the ReplicationSlotPersistentData structure and persisted to disk. The checkpointer then invalidates all the slots that have been lying inactive for about inactive_replication_slot_timeout, measured from inactive_at.

Besides implementing (b), these two new metrics let developers improve their monitoring tools, since the metrics are exposed via the pg_replication_slots system view. For instance, one can build a monitoring tool that signals when replication slots have been lying inactive for a day or so using the inactive_at metric, and/or when a replication slot is becoming inactive too frequently using the inactive_count metric.

I’m attaching the v1 patch set as described below:
0001 - Tracks invalidation_reason in pg_replication_slots. This is needed because slots now have multiple reasons for slot invalidation.
0002 - Tracks the inactive replication slot information inactive_at and inactive_count.
0003 - Adds inactive_timeout based replication slot invalidation.
0004 - Adds XID based replication slot invalidation.

Thoughts?

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
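For illustration only, a rough sketch of how the proposed knobs and metrics might be used; the GUC names (max_slot_xid_age, inactive_replication_slot_timeout) and view columns (inactive_at, inactive_count) are just the ones proposed above and may well change during review:

    -- hypothetical settings; these GUCs exist only in the proposed patch set
    ALTER SYSTEM SET max_slot_xid_age = 1500000000;
    ALTER SYSTEM SET inactive_replication_slot_timeout = '2d';
    SELECT pg_reload_conf();

    -- monitoring sketch: slots that have been inactive for more than a day
    SELECT slot_name, inactive_at, inactive_count
    FROM pg_replication_slots
    WHERE inactive_at IS NOT NULL
      AND inactive_at < now() - interval '1 day';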
Attachment
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Thu, Jan 11, 2024 at 10:48 AM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote:
> [...]
> Thoughts?

Needed a rebase due to c393308b. Please find the attached v2 patch set.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachment
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Sat, Jan 27, 2024 at 1:18 AM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote:
> [...]
> Needed a rebase due to c393308b. Please find the attached v2 patch set.

Needed a rebase due to commit 776621a (conflict in src/test/recovery/meson.build for new TAP test file added). Please find the attached v3 patch set.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachment
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi,

On Thu, Jan 11, 2024 at 10:48:13AM +0530, Bharath Rupireddy wrote:
> [...]

Thanks for the patch and +1 for the idea; I think adding those new "invalidation reasons" makes sense.

> I’m attaching the v1 patch set as described below:
> 0001 - Tracks invalidation_reason in pg_replication_slots. This is
> needed because slots now have multiple reasons for slot invalidation.
> 0002 - Tracks the inactive replication slot information inactive_at and
> inactive_count.
> 0003 - Adds inactive_timeout based replication slot invalidation.
> 0004 - Adds XID based replication slot invalidation.

I think it's better to have the XID one discussed/implemented before the inactive_timeout one: what about changing the 0002, 0003 and 0004 ordering?

0004 -> 0002
0002 -> 0003
0003 -> 0004

As far as 0001 is concerned:

"
This commit renames conflict_reason to
invalidation_reason, and adds the support to show invalidation
reasons for both physical and logical slots.
"

I'm not sure I like the fact that "invalidations" and "conflicts" are merged into a single field. I'd vote to keep conflict_reason as it is and add a new invalidation_reason (and put "conflict" as the value when that is the case). The reason is that I think they are 2 different concepts (though they can be linked) and that it would be easier to check for conflicts (meaning conflict_reason is not NULL).

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Dilip Kumar
Date:
On Thu, Jan 11, 2024 at 10:48 AM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote:
> [...]
> Thoughts?

+1 for the idea, here are some comments on 0002, I will review other patches soon and respond.

1.
+ <entry role="catalog_table_entry"><para role="column_definition"> + <structfield>inactive_at</structfield> <type>timestamptz</type> + </para> + <para> + The time at which the slot became inactive. + <literal>NULL</literal> if the slot is currently actively being + used. + </para></entry> + </row> Maybe we can change the field name to 'last_inactive_at'? or maybe the comment can explain timestampt at which slot was last inactivated. I think since we are already maintaining the inactive_count so better to explicitly say this is the last invaliding time. 2. + /* + * XXX: Can inactive_count of type uint64 ever overflow? It takes + * about a half-billion years for inactive_count to overflow even + * if slot becomes inactive for every 1 millisecond. So, using + * pg_add_u64_overflow might be an overkill. + */ Correct we don't need to use pg_add_u64_overflow for this counter. 3. + + /* Convert to numeric. */ + snprintf(buf, sizeof buf, UINT64_FORMAT, slot_contents.data.inactive_count); + values[i++] = DirectFunctionCall3(numeric_in, + CStringGetDatum(buf), + ObjectIdGetDatum(0), + Int32GetDatum(-1)); What is the purpose of doing this? I mean inactive_count is 8 byte integer and you can define function outparameter as 'int8' which is 8 byte integer. Then you don't need to convert int to string and then to numeric? -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Tue, Feb 6, 2024 at 2:16 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> > Thoughts?
>
> +1 for the idea, here are some comments on 0002, I will review other
> patches soon and respond.

Thanks for looking at it.

> + <structfield>inactive_at</structfield> <type>timestamptz</type>
>
> Maybe we can change the field name to 'last_inactive_at'? Or maybe the
> comment can explain that this is the timestamp at which the slot last
> became inactive. Since we are already maintaining inactive_count, it is
> better to explicitly say this is the last inactive time.

last_inactive_at looks better, so I will use that in the next version of the patch.

> 2.
> + /*
> + * XXX: Can inactive_count of type uint64 ever overflow? It takes
> + * about a half-billion years for inactive_count to overflow even
> + * if slot becomes inactive for every 1 millisecond. So, using
> + * pg_add_u64_overflow might be an overkill.
> + */
>
> Correct, we don't need to use pg_add_u64_overflow for this counter.

Will remove this comment in the next version of the patch.

> + /* Convert to numeric. */
> + snprintf(buf, sizeof buf, UINT64_FORMAT, slot_contents.data.inactive_count);
> + values[i++] = DirectFunctionCall3(numeric_in,
> + CStringGetDatum(buf),
> + ObjectIdGetDatum(0),
> + Int32GetDatum(-1));
>
> What is the purpose of doing this? inactive_count is an 8-byte integer
> and you can define the function's out parameter as 'int8', which is also
> an 8-byte integer. Then you wouldn't need to convert the integer to a
> string and then to numeric?

Nope, it's of type uint64 (int8 is signed, so large values wouldn't fit), and reporting such counters as numeric is the way it's typically done elsewhere - see the code around /* Convert to numeric. */.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Mon, Feb 5, 2024 at 3:15 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > Thanks for the patch and +1 for the idea, I think adding those new > "invalidation reasons" make sense. Thanks for looking at it. > I think it's better to have the XID one being discussed/implemented before the > inactive_timeout one: what about changing the 0002, 0003 and 0004 ordering? > > 0004 -> 0002 > 0002 -> 0003 > 0003 -> 0004 Done that way. > As far 0001: > > " > This commit renames conflict_reason to > invalidation_reason, and adds the support to show invalidation > reasons for both physical and logical slots. > " > > I'm not sure I like the fact that "invalidations" and "conflicts" are merged > into a single field. I'd vote to keep conflict_reason as it is and add a new > invalidation_reason (and put "conflict" as value when it is the case). The reason > is that I think they are 2 different concepts (could be linked though) and that > it would be easier to check for conflicts (means conflict_reason is not NULL). So, do you want conflict_reason for only logical slots, and a separate column for invalidation_reason for both logical and physical slots? Is there any strong reason to have two properties "conflict" and "invalidated" for slots? They both are the same internally, so why confuse the users? -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi,

On Wed, Feb 07, 2024 at 12:22:07AM +0530, Bharath Rupireddy wrote:
> On Mon, Feb 5, 2024 at 3:15 PM Bertrand Drouvot
> <bertranddrouvot.pg@gmail.com> wrote:
> >
> > I'm not sure I like the fact that "invalidations" and "conflicts" are merged
> > into a single field. I'd vote to keep conflict_reason as it is and add a new
> > invalidation_reason (and put "conflict" as value when it is the case). The reason
> > is that I think they are 2 different concepts (could be linked though) and that
> > it would be easier to check for conflicts (means conflict_reason is not NULL).
>
> So, do you want conflict_reason for only logical slots, and a separate
> column for invalidation_reason for both logical and physical slots?

Yes, with "conflict" as the value in case of conflicts (and one would need to refer to conflict_reason to see the reason).

> Is there any strong reason to have two properties "conflict" and
> "invalidated" for slots?

I think "conflict" is an important topic and does contain several reasons. The slot "first" conflicts and that then leads to slot "invalidation".

> They both are the same internally, so why
> confuse the users?

I don't think that would confuse the users; I do think that would make it easier to check for conflicting slots.

I did not look closely at the code, just played a bit with the patch and was able to produce something like:

postgres=# select slot_name,slot_type,active,active_pid,wal_status,invalidation_reason from pg_replication_slots;
  slot_name  | slot_type | active | active_pid | wal_status | invalidation_reason
-------------+-----------+--------+------------+------------+---------------------
 rep1        | physical  | f      |            | reserved   |
 master_slot | physical  | t      |    1482441 | unreserved | wal_removed
(2 rows)

Does that make sense, to have an "active/working" slot "invalidated"?

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Fri, Feb 9, 2024 at 1:12 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote:
>
> I think "conflict" is an important topic and does contain several reasons. The
> slot "first" conflicts and that then leads to slot "invalidation".
>
> > They both are the same internally, so why
> > confuse the users?
>
> I don't think that would confuse the users; I do think that would make it
> easier to check for conflicting slots.

I've added a separate column for invalidation reasons for now. I'll see how others think on this as time goes by.

> I did not look closely at the code, just played a bit with the patch and was able
> to produce something like:
>
> postgres=# select slot_name,slot_type,active,active_pid,wal_status,invalidation_reason from pg_replication_slots;
>   slot_name  | slot_type | active | active_pid | wal_status | invalidation_reason
> -------------+-----------+--------+------------+------------+---------------------
>  rep1        | physical  | f      |            | reserved   |
>  master_slot | physical  | t      |    1482441 | unreserved | wal_removed
> (2 rows)
>
> Does that make sense, to have an "active/working" slot "invalidated"?

Thanks. Can you please provide the steps to generate this error? Are you setting max_slot_wal_keep_size on the primary to generate "wal_removed"?

Attached is the v5 patch set after rebasing and addressing review comments. Please review it further.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachment
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Tue, Feb 20, 2024 at 12:05 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote:
> >> [...] and was able to produce something like:
> >
> > postgres=# select slot_name,slot_type,active,active_pid,wal_status,invalidation_reason from pg_replication_slots;
> >   slot_name  | slot_type | active | active_pid | wal_status | invalidation_reason
> > -------------+-----------+--------+------------+------------+---------------------
> >  rep1        | physical  | f      |            | reserved   |
> >  master_slot | physical  | t      |    1482441 | unreserved | wal_removed
> > (2 rows)
> >
> > Does that make sense, to have an "active/working" slot "invalidated"?
>
> Thanks. Can you please provide the steps to generate this error? Are
> you setting max_slot_wal_keep_size on the primary to generate
> "wal_removed"?

I'm able to reproduce [1] the state [2] where the slot got invalidated first and its wal_status became unreserved, yet the slot keeps serving after the standby comes back online and catches up with the primary by getting the WAL files from the archive. There's a good reason for this state - https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/replication/slotfuncs.c;h=d2fa5e669a32f19989b0d987d3c7329851a1272e;hb=ff9e1e764fcce9a34467d614611a34d4d2a91b50#l351. This intermittent state can only happen for physical slots, not for logical slots, because logical subscribers can't get the missing changes from the WAL stored in the archive.

And the fact seems to be that an invalidated slot can never become normal again, but it can still serve a standby if the standby is able to catch up by fetching the required WAL (this is the WAL the slot couldn't keep for the standby) from elsewhere (the archive via restore_command).

As far as the 0001 patch is concerned, it reports the invalidation_reason as long as slot_contents.data.invalidated != RS_INVAL_NONE. I think this is okay.

Thoughts?

[1]
./initdb -D db17
echo "max_wal_size = 128MB
max_slot_wal_keep_size = 64MB
archive_mode = on
archive_command='cp %p /home/ubuntu/postgres/pg17/bin/archived_wal/%f'" | tee -a db17/postgresql.conf
./pg_ctl -D db17 -l logfile17 start
./psql -d postgres -p 5432 -c "SELECT pg_create_physical_replication_slot('sb_repl_slot', true, false);"
rm -rf sbdata logfilesbdata
./pg_basebackup -D sbdata
echo "port=5433
primary_conninfo='host=localhost port=5432 dbname=postgres user=ubuntu'
primary_slot_name='sb_repl_slot'
restore_command='cp /home/ubuntu/postgres/pg17/bin/archived_wal/%f %p'" | tee -a sbdata/postgresql.conf
touch sbdata/standby.signal
./pg_ctl -D sbdata -l logfilesbdata start
./psql -d postgres -p 5433 -c "SELECT pg_is_in_recovery();"
./pg_ctl -D sbdata -l logfilesbdata stop
./psql -d postgres -p 5432 -c "SELECT pg_logical_emit_message(true, 'mymessage', repeat('aaaa', 10000000));"
./psql -d postgres -p 5432 -c "CHECKPOINT;"
./pg_ctl -D sbdata -l logfilesbdata start
./psql -d postgres -p 5432 -xc "SELECT * FROM pg_replication_slots;"

[2]
postgres=# SELECT * FROM pg_replication_slots;
-[ RECORD 1 ]-------+-------------
slot_name           | sb_repl_slot
plugin              |
slot_type           | physical
datoid              |
database            |
temporary           | f
active              | t
active_pid          | 710667
xmin                |
catalog_xmin        |
restart_lsn         | 0/115D21A0
confirmed_flush_lsn |
wal_status          | unreserved
safe_wal_size       | 77782624
two_phase           | f
conflict_reason     |
failover            | f
synced              | f
invalidation_reason | wal_removed
last_inactive_at    |
inactive_count      | 1

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi,

On Wed, Feb 21, 2024 at 10:55:00AM +0530, Bharath Rupireddy wrote:
> [...]
>
> As far as the 0001 patch is concerned, it reports the
> invalidation_reason as long as slot_contents.data.invalidated !=
> RS_INVAL_NONE. I think this is okay.
>
> Thoughts?

Yeah, looking at the code I agree that looks ok. OTOH, it looks confusing; maybe we should add a few words about it in the doc?

Looking at v5-0001:

+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>invalidation_reason</structfield> <type>text</type>
+ </para>
+ <para>

My initial thought was to put the value "conflict" in this new field in case of conflict (rather than the detailed conflict reason). With the current proposal invalidation_reason could report the same thing as conflict_reason, which sounds weird to me.

Does it make sense to you to use "conflict" as the value in "invalidation_reason" when the slot has "conflict_reason" not NULL?

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Wed, Feb 21, 2024 at 5:55 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote:
>
> > As far as the 0001 patch is concerned, it reports the
> > invalidation_reason as long as slot_contents.data.invalidated !=
> > RS_INVAL_NONE. I think this is okay.
> >
> > Thoughts?
>
> Yeah, looking at the code I agree that looks ok. OTOH, it looks confusing;
> maybe we should add a few words about it in the doc?

I'll think about it.

> Looking at v5-0001:
>
> + <entry role="catalog_table_entry"><para role="column_definition">
> + <structfield>invalidation_reason</structfield> <type>text</type>
> + </para>
> + <para>
>
> My initial thought was to put the value "conflict" in this new field in case of
> conflict (rather than the detailed conflict reason). With the current proposal
> invalidation_reason could report the same thing as conflict_reason, which sounds
> weird to me.
>
> Does it make sense to you to use "conflict" as the value in "invalidation_reason"
> when the slot has "conflict_reason" not NULL?

I'm thinking the other way around - how about we revert https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=007693f2a3ac2ac19affcb03ad43cdb36ccff5b5, that is, put back "conflict" as a boolean and introduce invalidation_reason in text form? So, for logical slots, whenever the "conflict" column is true, the reason is found in the invalidation_reason column. How does that sound? Again, the debate might be "conflict" vs "invalidation", but that looks clean IMHO.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi,

On Wed, Feb 21, 2024 at 08:10:00PM +0530, Bharath Rupireddy wrote:
> On Wed, Feb 21, 2024 at 5:55 PM Bertrand Drouvot
> <bertranddrouvot.pg@gmail.com> wrote:
> > My initial thought was to put the value "conflict" in this new field in case of
> > conflict (rather than the detailed conflict reason). With the current proposal
> > invalidation_reason could report the same thing as conflict_reason, which sounds
> > weird to me.
> >
> > Does it make sense to you to use "conflict" as the value in "invalidation_reason"
> > when the slot has "conflict_reason" not NULL?
>
> I'm thinking the other way around - how about we revert
> https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=007693f2a3ac2ac19affcb03ad43cdb36ccff5b5,
> that is, put back "conflict" as a boolean and introduce
> invalidation_reason in text form? So, for logical slots, whenever the
> "conflict" column is true, the reason is found in the invalidation_reason
> column. How does that sound?

Yeah, I think that looks fine too. We would need more changes though (like taking care of ddd5f4f54a, for example).

CC'ing Amit, Hou-San and Shveta to get their point of view (as the ones behind 007693f2a3 and ddd5f4f54a).

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Thu, Feb 22, 2024 at 1:44 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > > > Does that make sense to you to use "conflict" as value in "invalidation_reason" > > > when the slot has "conflict_reason" not NULL? > > > > I'm thinking the other way around - how about we revert > > https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=007693f2a3ac2ac19affcb03ad43cdb36ccff5b5, > > that is, put in place "conflict" as a boolean and introduce > > invalidation_reason the text form. So, for logical slots, whenever the > > "conflict" column is true, the reason is found in invaldiation_reason > > column? How does it sound? > > Yeah, I think that looks fine too. We would need more change (like take care of > ddd5f4f54a for example). > > CC'ing Amit, Hou-San and Shveta to get their point of view (as the ones behind > 007693f2a3 and ddd5f4f54a). Yeah, let's wait for what others think about it. FWIW, I've had to rebase the patches due to 943f7ae1c. Please see the attached v6 patch set. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Nathan Bossart
Date:
On Wed, Feb 21, 2024 at 08:10:00PM +0530, Bharath Rupireddy wrote: > I'm thinking the other way around - how about we revert > https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=007693f2a3ac2ac19affcb03ad43cdb36ccff5b5, > that is, put in place "conflict" as a boolean and introduce > invalidation_reason the text form. So, for logical slots, whenever the > "conflict" column is true, the reason is found in invaldiation_reason > column? How does it sound? Again the debate might be "conflict" vs > "invalidation", but that looks clean IMHO. Would you ever see "conflict" as false and "invalidation_reason" as non-null for a logical slot? -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Sat, Mar 2, 2024 at 3:41 AM Nathan Bossart <nathandbossart@gmail.com> wrote: > > > [....] how about we revert > > https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=007693f2a3ac2ac19affcb03ad43cdb36ccff5b5, > > Would you ever see "conflict" as false and "invalidation_reason" as > non-null for a logical slot? No. Because both conflict and invalidation_reason are decided based on the invalidation reason i.e. value of slot_contents.data.invalidated. IOW, a logical slot that reports conflict as true must have been invalidated. Do you have any thoughts on reverting 007693f and introducing invalidation_reason? -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Nathan Bossart
Date:
On Sun, Mar 03, 2024 at 11:40:00PM +0530, Bharath Rupireddy wrote: > On Sat, Mar 2, 2024 at 3:41 AM Nathan Bossart <nathandbossart@gmail.com> wrote: >> Would you ever see "conflict" as false and "invalidation_reason" as >> non-null for a logical slot? > > No. Because both conflict and invalidation_reason are decided based on > the invalidation reason i.e. value of slot_contents.data.invalidated. > IOW, a logical slot that reports conflict as true must have been > invalidated. > > Do you have any thoughts on reverting 007693f and introducing > invalidation_reason? Unless I am misinterpreting some details, ISTM we could rename this column to invalidation_reason and use it for both logical and physical slots. I'm not seeing a strong need for another column. Perhaps I am missing something... -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Michael Paquier
Date:
On Sun, Mar 03, 2024 at 03:44:34PM -0600, Nathan Bossart wrote: > On Sun, Mar 03, 2024 at 11:40:00PM +0530, Bharath Rupireddy wrote: >> Do you have any thoughts on reverting 007693f and introducing >> invalidation_reason? > > Unless I am misinterpreting some details, ISTM we could rename this column > to invalidation_reason and use it for both logical and physical slots. I'm > not seeing a strong need for another column. Perhaps I am missing > something... And also, please don't be hasty in taking a decision that would involve a revert of 007693f without informing the committer of this commit about that. I am adding Amit Kapila in CC of this thread for awareness. -- Michael
Attachment
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Sun, Mar 03, 2024 at 03:44:34PM -0600, Nathan Bossart wrote: > On Sun, Mar 03, 2024 at 11:40:00PM +0530, Bharath Rupireddy wrote: > > On Sat, Mar 2, 2024 at 3:41 AM Nathan Bossart <nathandbossart@gmail.com> wrote: > >> Would you ever see "conflict" as false and "invalidation_reason" as > >> non-null for a logical slot? > > > > No. Because both conflict and invalidation_reason are decided based on > > the invalidation reason i.e. value of slot_contents.data.invalidated. > > IOW, a logical slot that reports conflict as true must have been > > invalidated. > > > > Do you have any thoughts on reverting 007693f and introducing > > invalidation_reason? > > Unless I am misinterpreting some details, ISTM we could rename this column > to invalidation_reason and use it for both logical and physical slots. I'm > not seeing a strong need for another column. Yeah having two columns was more for convenience purpose. Without the "conflict" one, a slot conflicting with recovery would be "a logical slot having a non NULL invalidation_reason". I'm also fine with one column if most of you prefer that way. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
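For illustration, under the single-column approach being discussed, checking for conflicting/invalidated logical slots could look roughly like this (invalidation_reason being the proposed column name, not yet committed):

    -- logical slots that have been invalidated (which, on a standby,
    -- includes slots that conflicted with recovery)
    SELECT slot_name, invalidation_reason
    FROM pg_replication_slots
    WHERE slot_type = 'logical'
      AND invalidation_reason IS NOT NULL;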
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Mon, Mar 4, 2024 at 2:11 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > On Sun, Mar 03, 2024 at 03:44:34PM -0600, Nathan Bossart wrote: > > On Sun, Mar 03, 2024 at 11:40:00PM +0530, Bharath Rupireddy wrote: > > > On Sat, Mar 2, 2024 at 3:41 AM Nathan Bossart <nathandbossart@gmail.com> wrote: > > >> Would you ever see "conflict" as false and "invalidation_reason" as > > >> non-null for a logical slot? > > > > > > No. Because both conflict and invalidation_reason are decided based on > > > the invalidation reason i.e. value of slot_contents.data.invalidated. > > > IOW, a logical slot that reports conflict as true must have been > > > invalidated. > > > > > > Do you have any thoughts on reverting 007693f and introducing > > > invalidation_reason? > > > > Unless I am misinterpreting some details, ISTM we could rename this column > > to invalidation_reason and use it for both logical and physical slots. I'm > > not seeing a strong need for another column. > > Yeah having two columns was more for convenience purpose. Without the "conflict" > one, a slot conflicting with recovery would be "a logical slot having a non NULL > invalidation_reason". > > I'm also fine with one column if most of you prefer that way. While we debate on the above, please find the attached v7 patch set after rebasing. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Nathan Bossart
Date:
On Wed, Mar 06, 2024 at 12:50:38AM +0530, Bharath Rupireddy wrote: > On Mon, Mar 4, 2024 at 2:11 PM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: >> On Sun, Mar 03, 2024 at 03:44:34PM -0600, Nathan Bossart wrote: >> > Unless I am misinterpreting some details, ISTM we could rename this column >> > to invalidation_reason and use it for both logical and physical slots. I'm >> > not seeing a strong need for another column. >> >> Yeah having two columns was more for convenience purpose. Without the "conflict" >> one, a slot conflicting with recovery would be "a logical slot having a non NULL >> invalidation_reason". >> >> I'm also fine with one column if most of you prefer that way. > > While we debate on the above, please find the attached v7 patch set > after rebasing. It looks like Bertrand is okay with reusing the same column for both logical and physical slots, which IIUC is what you initially proposed in v1 of the patch set. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Tue, Mar 05, 2024 at 01:44:43PM -0600, Nathan Bossart wrote: > On Wed, Mar 06, 2024 at 12:50:38AM +0530, Bharath Rupireddy wrote: > > On Mon, Mar 4, 2024 at 2:11 PM Bertrand Drouvot > > <bertranddrouvot.pg@gmail.com> wrote: > >> On Sun, Mar 03, 2024 at 03:44:34PM -0600, Nathan Bossart wrote: > >> > Unless I am misinterpreting some details, ISTM we could rename this column > >> > to invalidation_reason and use it for both logical and physical slots. I'm > >> > not seeing a strong need for another column. > >> > >> Yeah having two columns was more for convenience purpose. Without the "conflict" > >> one, a slot conflicting with recovery would be "a logical slot having a non NULL > >> invalidation_reason". > >> > >> I'm also fine with one column if most of you prefer that way. > > > > While we debate on the above, please find the attached v7 patch set > > after rebasing. > > It looks like Bertrand is okay with reusing the same column for both > logical and physical slots Yeah, I'm okay with one column. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Wed, Mar 6, 2024 at 2:42 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > Hi, > > On Tue, Mar 05, 2024 at 01:44:43PM -0600, Nathan Bossart wrote: > > On Wed, Mar 06, 2024 at 12:50:38AM +0530, Bharath Rupireddy wrote: > > > On Mon, Mar 4, 2024 at 2:11 PM Bertrand Drouvot > > > <bertranddrouvot.pg@gmail.com> wrote: > > >> On Sun, Mar 03, 2024 at 03:44:34PM -0600, Nathan Bossart wrote: > > >> > Unless I am misinterpreting some details, ISTM we could rename this column > > >> > to invalidation_reason and use it for both logical and physical slots. I'm > > >> > not seeing a strong need for another column. > > >> > > >> Yeah having two columns was more for convenience purpose. Without the "conflict" > > >> one, a slot conflicting with recovery would be "a logical slot having a non NULL > > >> invalidation_reason". > > >> > > >> I'm also fine with one column if most of you prefer that way. > > > > > > While we debate on the above, please find the attached v7 patch set > > > after rebasing. > > > > It looks like Bertrand is okay with reusing the same column for both > > logical and physical slots > > Yeah, I'm okay with one column. Thanks. v8-0001 is how it looks. Please see the v8 patch set with this change. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Wed, Mar 06, 2024 at 02:46:57PM +0530, Bharath Rupireddy wrote: > On Wed, Mar 6, 2024 at 2:42 PM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > Yeah, I'm okay with one column. > > Thanks. v8-0001 is how it looks. Please see the v8 patch set with this change. Thanks! A few comments: 1 === + The reason for the slot's invalidation. <literal>NULL</literal> if the + slot is currently actively being used. s/currently actively being used/not invalidated/ ? (I mean it could be valid and not being used). 2 === + the slot is marked as invalidated. In case of logical slots, it + represents the reason for the logical slot's conflict with recovery. s/the reason for the logical slot's conflict with recovery./the recovery conflict reason./ ? 3 === @@ -667,13 +667,13 @@ get_old_cluster_logical_slot_infos(DbInfo *dbinfo, bool live_check) * removed. */ res = executeQueryOrDie(conn, "SELECT slot_name, plugin, two_phase, failover, " - "%s as caught_up, conflict_reason IS NOT NULL as invalid " + "%s as caught_up, invalidation_reason IS NOT NULL as invalid " "FROM pg_catalog.pg_replication_slots " "WHERE slot_type = 'logical' AND " "database = current_database() AND " "temporary IS FALSE;", live_check ? "FALSE" : - "(CASE WHEN conflict_reason IS NOT NULL THEN FALSE " + "(CASE WHEN invalidation_reason IS NOT NULL THEN FALSE " Yeah that's fine because there is logical slot filtering here. 4 === -GetSlotInvalidationCause(const char *conflict_reason) +GetSlotInvalidationCause(const char *invalidation_reason) Should we change the comment "Maps a conflict reason" above this function? 5 === -# Check conflict_reason is NULL for physical slot +# Check invalidation_reason is NULL for physical slot $res = $node_primary->safe_psql( 'postgres', qq[ - SELECT conflict_reason is null FROM pg_replication_slots where slot_name = '$primary_slotname';] + SELECT invalidation_reason is null FROM pg_replication_slots where slot_name = '$primary_slotname';] ); I don't think this test is needed anymore: it does not make that much sense since it's done after the primary database initialization and startup. 6 === @@ -680,7 +680,7 @@ ok( $node_standby->poll_query_until( is( $node_standby->safe_psql( 'postgres', q[select bool_or(conflicting) from - (select conflict_reason is not NULL as conflicting + (select invalidation_reason is not NULL as conflicting from pg_replication_slots WHERE slot_type = 'logical')]), 'f', 'Logical slots are reported as non conflicting'); What about? " # Verify slots are reported as valid in pg_replication_slots is( $node_standby->safe_psql( 'postgres', q[select bool_or(invalidated) from (select invalidation_reason is not NULL as invalidated from pg_replication_slots WHERE slot_type = 'logical')]), 'f', 'Logical slots are reported as valid'); " Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Mon, Mar 4, 2024 at 3:14 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
>
> On Sun, Mar 03, 2024 at 11:40:00PM +0530, Bharath Rupireddy wrote:
> > On Sat, Mar 2, 2024 at 3:41 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
> >> Would you ever see "conflict" as false and "invalidation_reason" as
> >> non-null for a logical slot?
> >
> > No. Because both conflict and invalidation_reason are decided based on
> > the invalidation reason i.e. value of slot_contents.data.invalidated.
> > IOW, a logical slot that reports conflict as true must have been
> > invalidated.
> >
> > Do you have any thoughts on reverting 007693f and introducing
> > invalidation_reason?
>
> Unless I am misinterpreting some details, ISTM we could rename this column
> to invalidation_reason and use it for both logical and physical slots. I'm
> not seeing a strong need for another column. Perhaps I am missing
> something...
>

IIUC, the current conflict_reason is primarily used to identify logical slots on the standby that got invalidated due to a recovery-time conflict. On the primary, it will also show logical slots that got invalidated because the corresponding WAL got removed. Is that understanding correct? If so, we are already sort of overloading this column. However, adding more invalidation reasons that won't happen during recovery conflict handling will entirely change the purpose (as per the name we use) of this variable. I think invalidation_reason could describe this column correctly, but OTOH I guess it would lose its original meaning/purpose.

--
With Regards,
Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Wed, Mar 6, 2024 at 2:47 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote:
>
> Thanks. v8-0001 is how it looks. Please see the v8 patch set with this change.
>

@@ -1629,6 +1634,20 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
             }
         }
         break;
+       case RS_INVAL_INACTIVE_TIMEOUT:
+           if (s->data.last_inactive_at > 0)
+           {
+               TimestampTz now;
+
+               Assert(s->data.persistency == RS_PERSISTENT);
+               Assert(s->active_pid == 0);
+
+               now = GetCurrentTimestamp();
+               if (TimestampDifferenceExceeds(s->data.last_inactive_at, now,
+                                              inactive_replication_slot_timeout * 1000))

You might want to consider its interaction with sync slots on the standby. Say there is no activity on the slots in terms of processing changes. Then we won't perform sync of such slots on the standby, showing them as inactive as per your new criteria, whereas the same slots could still be valid on the primary because the walsender is still active. This may be more of a theoretical point, as in a running system there will probably be some activity, but I think this needs some thoughts.

--
With Regards,
Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Wed, Mar 6, 2024 at 4:28 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> IIUC, the current conflict_reason is primarily used to identify
> logical slots on the standby that got invalidated due to a recovery-time
> conflict. On the primary, it will also show logical slots that got
> invalidated because the corresponding WAL got removed. Is that
> understanding correct?

That's right.

> If so, we are already sort of overloading this
> column. However, adding more invalidation reasons that won't
> happen during recovery conflict handling will entirely change the
> purpose (as per the name we use) of this variable. I think
> invalidation_reason could describe this column correctly, but OTOH I
> guess it would lose its original meaning/purpose.

Hm. I get the concern. Are you okay with having invalidation_reason separately for both logical and physical slots? In such a case, logical slots that got invalidated on the standby will have duplicate info in conflict_reason and invalidation_reason - is this fine?

Another idea is to make 'conflict_reason text' a 'conflicting' boolean again (i.e., revert 007693f2a3), and have 'invalidation_reason text' for both logical and physical slots. So, whenever 'conflicting' is true, one can look at invalidation_reason for the reason for the conflict. How does this sound?

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
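For illustration only, a rough sketch of how that two-column shape might be queried; the 'conflicting' and 'invalidation_reason' columns here are just the names proposed above and are not committed:

    -- hypothetical columns as proposed in this message
    SELECT slot_name, conflicting, invalidation_reason
    FROM pg_replication_slots
    WHERE conflicting OR invalidation_reason IS NOT NULL;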
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Wed, Mar 6, 2024 at 4:49 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> You might want to consider its interaction with sync slots on the standby.
> Say there is no activity on the slots in terms of processing changes.
> Then we won't perform sync of such slots on the standby, showing
> them as inactive as per your new criteria, whereas the same slots could still
> be valid on the primary because the walsender is still active. This may be more
> of a theoretical point, as in a running system there will probably be
> some activity, but I think this needs some thoughts.

I believe the xmin and catalog_xmin of the sync slots on the standby keep advancing depending on the slots on the primary, no? If yes, the XID age based invalidation shouldn't be a problem.

I believe there are no walsenders started for the sync slots on the standbys, right? If yes, the inactive timeout based invalidation also shouldn't be a problem, because the inactive timeouts for a slot are tracked only for walsenders - they are the ones that typically hold replication slots for longer durations and for real replication use. We did a similar thing in a recent commit [1].

Is my understanding right? Do you still see any problems with it?

[1] commit 7c3fb505b14e86581b6a052075a294c78c91b123
Author: Amit Kapila <akapila@postgresql.org>
Date: Tue Nov 21 07:59:53 2023 +0530

    Log messages for replication slot acquisition and release.
    .........
    Note that these messages are emitted only for walsenders but not for
    backends. This is because walsenders are the ones that typically hold
    replication slots for longer durations, unlike backends which hold them
    for executing replication related functions.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
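As a quick way to sanity-check that on a standby (illustrative only; synced, active and active_pid are the existing pg_replication_slots columns shown in the output earlier in this thread):

    -- per the explanation above, synced slots on a standby are not acquired
    -- by a walsender, so active_pid is expected to be NULL while they are idle
    SELECT slot_name, synced, active, active_pid
    FROM pg_replication_slots
    WHERE synced;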
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Fri, Mar 8, 2024 at 8:08 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote:
>
> On Wed, Mar 6, 2024 at 4:28 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > IIUC, the current conflict_reason is primarily used to identify
> > logical slots on the standby that got invalidated due to a recovery-time
> > conflict. On the primary, it will also show logical slots that got
> > invalidated because the corresponding WAL got removed. Is that
> > understanding correct?
>
> That's right.
>
> > If so, we are already sort of overloading this
> > column. However, adding more invalidation reasons that won't
> > happen during recovery conflict handling will entirely change the
> > purpose (as per the name we use) of this variable. I think
> > invalidation_reason could describe this column correctly, but OTOH I
> > guess it would lose its original meaning/purpose.
>
> Hm. I get the concern. Are you okay with having invalidation_reason
> separately for both logical and physical slots? In such a case,
> logical slots that got invalidated on the standby will have duplicate
> info in conflict_reason and invalidation_reason - is this fine?
>

If we have duplicate information in two columns, that could be confusing for users. BTW, doesn't the recovery conflict occur only because of the rows_removed and wal_level_insufficient reasons? The wal_removed reason or the new reasons you are proposing can't happen because of a recovery conflict. Am I missing something here?

> Another idea is to make 'conflict_reason text' a 'conflicting'
> boolean again (i.e., revert 007693f2a3), and have 'invalidation_reason
> text' for both logical and physical slots. So, whenever 'conflicting'
> is true, one can look at invalidation_reason for the reason for the
> conflict. How does this sound?
>

So, does this mean that conflicting will only be true for some of the reasons (say wal_level_insufficient, rows_removed, wal_removed) and for logical slots but not for others? I think that will also not eliminate the duplicate information, as the user could have deduced it from a single column.

--
With Regards,
Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Fri, Mar 8, 2024 at 10:42 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Wed, Mar 6, 2024 at 4:49 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > You might want to consider its interaction with sync slots on standby. > > Say, there is no activity on slots in terms of processing the changes > > for slots. Now, we won't perform sync of such slots on standby showing > > them inactive as per your new criteria where as same slots could still > > be valid on primary as the walsender is still active. This may be more > > of a theoretical point as in running system there will probably be > > some activity but I think this needs some thougths. > > I believe the xmin and catalog_xmin of the sync slots on the standby > keep advancing depending on the slots on the primary, no? If yes, the > XID age based invalidation shouldn't be a problem. > > I believe there are no walsenders started for the sync slots on the > standbys, right? If yes, the inactive timeout based invalidation also > shouldn't be a problem. Because, the inactive timeouts for a slot are > tracked only for walsenders because they are the ones that typically > hold replication slots for longer durations and for real replication > use. We did a similar thing in a recent commit [1]. > > Is my understanding right? > Yes, your understanding is correct. I wanted us to consider having new parameters like 'inactive_replication_slot_timeout' at the slot level instead of as a GUC. I think this new parameter doesn't seem to be similar to 'max_slot_wal_keep_size', which leads to truncation of WAL globally and then invalidates the appropriate slots. OTOH, the 'inactive_replication_slot_timeout' doesn't appear to have a similar global effect. The other thing we should consider is: what if the checkpoint happens at an interval greater than 'inactive_replication_slot_timeout'? Shall we consider doing it via some other background process, or do we think the checkpointer is the best we can have? > Do you still see any problems with it? > Sorry, I haven't done any detailed review yet, so I can't say with confidence whether there is any problem or not w.r.t. sync slots. -- With Regards, Amit Kapila.
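To make the timing concern concrete, a configuration sketch (checkpoint_timeout is an existing GUC; inactive_replication_slot_timeout is only the GUC proposed in this patch set, so the second statement assumes the patch is applied): with settings like these, a slot could sit inactive well past the proposed timeout and still only be invalidated at the next checkpoint, i.e. with up to checkpoint_timeout of lag.

-- existing GUC
ALTER SYSTEM SET checkpoint_timeout = '30min';
-- proposed GUC from this patch set (not in released PostgreSQL)
ALTER SYSTEM SET inactive_replication_slot_timeout = '10min';
SELECT pg_reload_conf();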
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Wed, Mar 6, 2024 at 2:47 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > Thanks. v8-0001 is how it looks. Please see the v8 patch set with this change. > Commit message says: "Currently postgres has the ability to invalidate inactive replication slots based on the amount of WAL (set via max_slot_wal_keep_size GUC) that will be needed for the slots in case they become active. However, choosing a default value for max_slot_wal_keep_size is tricky. Because the amount of WAL a customer generates, and their allocated storage will vary greatly in production, making it difficult to pin down a one-size-fits-all value. It is often easy for developers to set an XID age (age of slot's xmin or catalog_xmin) of say 1 or 1.5 billion, after which the slots get invalidated." I don't see how it will be easier for the user to choose the default value of 'max_slot_xid_age' compared to 'max_slot_wal_keep_size'. But, I agree similar to 'max_slot_wal_keep_size', 'max_slot_xid_age' can be another parameter to allow vacuum to proceed removing the rows which otherwise it wouldn't have been as those would be required by some slot. Now, if this understanding is correct, we should probably make this invalidation happen by (auto)vacuum after computing the age based on this new parameter. -- With Regards, Amit Kapila.
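For reference, the XID age that such a parameter would be compared against can already be observed from the catalog (a minimal monitoring sketch; age() measures distance from the next transaction ID, and max_slot_xid_age is only the GUC proposed in this thread):

-- Slots whose xmin/catalog_xmin horizons are holding back vacuum,
-- ordered by how old those horizons have become.
SELECT slot_name,
       age(xmin)         AS xmin_age,
       age(catalog_xmin) AS catalog_xmin_age
FROM pg_replication_slots
ORDER BY greatest(age(xmin), age(catalog_xmin)) DESC NULLS LAST;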
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Nathan Bossart
Date:
On Mon, Mar 11, 2024 at 04:09:27PM +0530, Amit Kapila wrote: > I don't see how it will be easier for the user to choose the default > value of 'max_slot_xid_age' compared to 'max_slot_wal_keep_size'. But, > I agree similar to 'max_slot_wal_keep_size', 'max_slot_xid_age' can be > another parameter to allow vacuum to proceed removing the rows which > otherwise it wouldn't have been as those would be required by some > slot. Yeah, the idea is to help prevent transaction ID wraparound, so I would expect max_slot_xid_age to ordinarily be set relatively high, i.e., 1.5B+. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Fri, Mar 08, 2024 at 10:42:20PM +0530, Bharath Rupireddy wrote: > On Wed, Mar 6, 2024 at 4:49 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > You might want to consider its interaction with sync slots on standby. > > Say, there is no activity on slots in terms of processing the changes > > for slots. Now, we won't perform sync of such slots on standby showing > > them inactive as per your new criteria where as same slots could still > > be valid on primary as the walsender is still active. This may be more > > of a theoretical point as in running system there will probably be > > some activity but I think this needs some thougths. > > I believe the xmin and catalog_xmin of the sync slots on the standby > keep advancing depending on the slots on the primary, no? If yes, the > XID age based invalidation shouldn't be a problem. > > I believe there are no walsenders started for the sync slots on the > standbys, right? If yes, the inactive timeout based invalidation also > shouldn't be a problem. Because, the inactive timeouts for a slot are > tracked only for walsenders because they are the ones that typically > hold replication slots for longer durations and for real replication > use. We did a similar thing in a recent commit [1]. > > Is my understanding right? Do you still see any problems with it? Would that make sense to "simply" discard/prevent those kind of invalidations for "synced" slot on standby? I mean, do they make sense given the fact that those slots are not usable until the standby is promoted? Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Tue, Mar 12, 2024 at 1:24 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > On Fri, Mar 08, 2024 at 10:42:20PM +0530, Bharath Rupireddy wrote: > > On Wed, Mar 6, 2024 at 4:49 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > You might want to consider its interaction with sync slots on standby. > > > Say, there is no activity on slots in terms of processing the changes > > > for slots. Now, we won't perform sync of such slots on standby showing > > > them inactive as per your new criteria where as same slots could still > > > be valid on primary as the walsender is still active. This may be more > > > of a theoretical point as in running system there will probably be > > > some activity but I think this needs some thougths. > > > > I believe the xmin and catalog_xmin of the sync slots on the standby > > keep advancing depending on the slots on the primary, no? If yes, the > > XID age based invalidation shouldn't be a problem. > > > > I believe there are no walsenders started for the sync slots on the > > standbys, right? If yes, the inactive timeout based invalidation also > > shouldn't be a problem. Because, the inactive timeouts for a slot are > > tracked only for walsenders because they are the ones that typically > > hold replication slots for longer durations and for real replication > > use. We did a similar thing in a recent commit [1]. > > > > Is my understanding right? Do you still see any problems with it? > > Would that make sense to "simply" discard/prevent those kind of invalidations > for "synced" slot on standby? I mean, do they make sense given the fact that > those slots are not usable until the standby is promoted? > AFAIR, we don't prevent similar invalidations due to 'max_slot_wal_keep_size' for sync slots, so why to prevent it for these new parameters? This will unnecessarily create inconsistency in the invalidation behavior. -- With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Mon, Mar 11, 2024 at 11:26 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > Hm. I get the concern. Are you okay with having inavlidation_reason > > separately for both logical and physical slots? In such a case, > > logical slots that got invalidated on the standby will have duplicate > > info in conflict_reason and invalidation_reason, is this fine? > > > > If we have duplicate information in two columns that could be > confusing for users. BTW, isn't the recovery conflict occur only > because of rows_removed and wal_level_insufficient reasons? The > wal_removed or the new reasons you are proposing can't happen because > of recovery conflict. Am, I missing something here? My understanding aligns with yours that the rows_removed and wal_level_insufficient invalidations can occur only upon recovery conflict. FWIW, a test named 'synchronized slot has been invalidated' in 040_standby_failover_slots_sync.pl inappropriately uses conflict_reason = 'wal_removed' for a logical slot on the standby. As per the above understanding, it's inappropriate to use conflict_reason here because wal_removed invalidation doesn't conflict with recovery. > > Another idea is to make 'conflict_reason text' as a 'conflicting > > boolean' again (revert 007693f2a3), and have 'invalidation_reason > > text' for both logical and physical slots. So, whenever 'conflicting' > > is true, one can look at invalidation_reason for the reason for > > conflict. How does this sound? > > > > So, does this mean that conflicting will only be true for some of the > reasons (say wal_level_insufficient, rows_removed, wal_removed) and > logical slots but not for others? I think that will also not eliminate > the duplicate information as user could have deduced that from single > column. So, how about we turn conflict_reason to only report the reasons that actually cause conflict with recovery for logical slots, something like below, and then have invalidation_cause as a generic column for all sorts of invalidation reasons for both logical and physical slots? ReplicationSlotInvalidationCause cause = slot_contents.data.invalidated; if (slot_contents.data.database == InvalidOid || cause == RS_INVAL_NONE || (cause != RS_INVAL_HORIZON && cause != RS_INVAL_WAL_LEVEL)) { nulls[i++] = true; } else { Assert(cause == RS_INVAL_HORIZON || cause == RS_INVAL_WAL_LEVEL); values[i++] = CStringGetTextDatum(SlotInvalidationCauses[cause]); } -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Tue, Mar 12, 2024 at 05:51:43PM +0530, Amit Kapila wrote: > On Tue, Mar 12, 2024 at 1:24 PM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > > > On Fri, Mar 08, 2024 at 10:42:20PM +0530, Bharath Rupireddy wrote: > > > On Wed, Mar 6, 2024 at 4:49 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > You might want to consider its interaction with sync slots on standby. > > > > Say, there is no activity on slots in terms of processing the changes > > > > for slots. Now, we won't perform sync of such slots on standby showing > > > > them inactive as per your new criteria where as same slots could still > > > > be valid on primary as the walsender is still active. This may be more > > > > of a theoretical point as in running system there will probably be > > > > some activity but I think this needs some thougths. > > > > > > I believe the xmin and catalog_xmin of the sync slots on the standby > > > keep advancing depending on the slots on the primary, no? If yes, the > > > XID age based invalidation shouldn't be a problem. > > > > > > I believe there are no walsenders started for the sync slots on the > > > standbys, right? If yes, the inactive timeout based invalidation also > > > shouldn't be a problem. Because, the inactive timeouts for a slot are > > > tracked only for walsenders because they are the ones that typically > > > hold replication slots for longer durations and for real replication > > > use. We did a similar thing in a recent commit [1]. > > > > > > Is my understanding right? Do you still see any problems with it? > > > > Would that make sense to "simply" discard/prevent those kind of invalidations > > for "synced" slot on standby? I mean, do they make sense given the fact that > > those slots are not usable until the standby is promoted? > > > > AFAIR, we don't prevent similar invalidations due to > 'max_slot_wal_keep_size' for sync slots, Right, we'd invalidate them on the standby should the standby sync slot restart_lsn exceeds the limit. > so why to prevent it for > these new parameters? This will unnecessarily create inconsistency in > the invalidation behavior. Yeah, but I think wal removal has a direct impact on the slot usuability which is probably not the case with the new XID and Timeout ones. That's why I thought about handling them differently (but I'm also fine if that's not the case). Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Tue, Mar 12, 2024 at 5:51 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > Would that make sense to "simply" discard/prevent those kind of invalidations > > for "synced" slot on standby? I mean, do they make sense given the fact that > > those slots are not usable until the standby is promoted? > > AFAIR, we don't prevent similar invalidations due to > 'max_slot_wal_keep_size' for sync slots, so why to prevent it for > these new parameters? This will unnecessarily create inconsistency in > the invalidation behavior. Right. +1 to keep the behaviour consistent for all invalidations. However, an assertion that inactive_timeout isn't set for synced slots on the standby isn't a bad idea because we rely on the fact that walsenders aren't started for synced slots. Again, I think it misses the consistency in the invalidation behaviour. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Tue, Mar 12, 2024 at 9:11 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > > AFAIR, we don't prevent similar invalidations due to > > 'max_slot_wal_keep_size' for sync slots, > > Right, we'd invalidate them on the standby should the standby sync slot restart_lsn > exceeds the limit. Right. Help me understand this a bit - is the wal_removed invalidation going to conflict with recovery on the standby? Per the discussion upthread, I'm trying to understand what invalidation reasons will exactly cause conflict with recovery? Is it just rows_removed and wal_level_insufficient invalidations? My understanding on the conflict with recovery and invalidation reason has been a bit off track. Perhaps, we need to clarify these two things in the docs for the end users as well? -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Mon, Mar 11, 2024 at 3:44 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > Yes, your understanding is correct. I wanted us to consider having new > parameters like 'inactive_replication_slot_timeout' to be at > slot-level instead of GUC. I think this new parameter doesn't seem to > be the similar as 'max_slot_wal_keep_size' which leads to truncation > of WAL at global and then invalidates the appropriate slots. OTOH, the > 'inactive_replication_slot_timeout' doesn't appear to have a similar > global effect. last_inactive_at is tracked for each slot using which slots get invalidated based on inactive_replication_slot_timeout. It's like max_slot_wal_keep_size invalidating slots based on restart_lsn. In a way, both are similar, right? > The other thing we should consider is what if the > checkpoint happens at a timeout greater than > 'inactive_replication_slot_timeout'? In such a case, the slots get invalidated upon the next checkpoint, as by then (current checkpoint time - last_inactive_at) will be greater than inactive_replication_slot_timeout. > Shall, we consider doing it via > some other background process or do we think checkpointer is the best > we can have? The same problem exists if we do it with some other background process. I think the checkpointer is best because it already invalidates slots for wal_removed cause, and flushes all replication slots to disk. Moving this new invalidation functionality into some other background process such as autovacuum will not only burden that process' work but also mix up the unique functionality of that background process. Having said above, I'm open to ideas from others as I'm not so sure if there's any issue with checkpointer invalidating the slots for new reasons. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
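A rough monitoring query along the same lines (a sketch that assumes the last_inactive_at column added by this patch set; it only approximates what the checkpointer would check, since the actual invalidation happens during checkpoints):

-- Slots that have been inactive longer than some threshold and are
-- therefore candidates for invalidation at the next checkpoint.
SELECT slot_name, active, last_inactive_at,
       now() - last_inactive_at AS inactive_for
FROM pg_replication_slots
WHERE NOT active
  AND last_inactive_at < now() - interval '1 day';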
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Mon, Mar 11, 2024 at 4:09 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > I don't see how it will be easier for the user to choose the default > value of 'max_slot_xid_age' compared to 'max_slot_wal_keep_size'. But, > I agree similar to 'max_slot_wal_keep_size', 'max_slot_xid_age' can be > another parameter to allow vacuum to proceed removing the rows which > otherwise it wouldn't have been as those would be required by some > slot. Now, if this understanding is correct, we should probably make > this invalidation happen by (auto)vacuum after computing the age based > on this new parameter. Currently, the patch computes the XID age in the checkpointer using the next XID (gets from ReadNextFullTransactionId()) and slot's xmin and catalog_xmin. I think the checkpointer is best because it already invalidates slots for wal_removed cause, and flushes all replication slots to disk. Moving this new invalidation functionality into some other background process such as autovacuum will not only burden that process' work but also mix up the unique functionality of that background process. Having said above, I'm open to ideas from others as I'm not so sure if there's any issue with checkpointer invalidating the slots for new reasons. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Tue, Mar 12, 2024 at 8:55 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Mon, Mar 11, 2024 at 11:26 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > Hm. I get the concern. Are you okay with having inavlidation_reason > > > separately for both logical and physical slots? In such a case, > > > logical slots that got invalidated on the standby will have duplicate > > > info in conflict_reason and invalidation_reason, is this fine? > > > > > > > If we have duplicate information in two columns that could be > > confusing for users. BTW, isn't the recovery conflict occur only > > because of rows_removed and wal_level_insufficient reasons? The > > wal_removed or the new reasons you are proposing can't happen because > > of recovery conflict. Am, I missing something here? > > My understanding aligns with yours that the rows_removed and > wal_level_insufficient invalidations can occur only upon recovery > conflict. > > FWIW, a test named 'synchronized slot has been invalidated' in > 040_standby_failover_slots_sync.pl inappropriately uses > conflict_reason = 'wal_removed' logical slot on standby. As per the > above understanding, it's inappropriate to use conflict_reason here > because wal_removed invalidation doesn't conflict with recovery. > > > > Another idea is to make 'conflict_reason text' as a 'conflicting > > > boolean' again (revert 007693f2a3), and have 'invalidation_reason > > > text' for both logical and physical slots. So, whenever 'conflicting' > > > is true, one can look at invalidation_reason for the reason for > > > conflict. How does this sound? > > > > > > > So, does this mean that conflicting will only be true for some of the > > reasons (say wal_level_insufficient, rows_removed, wal_removed) and > > logical slots but not for others? I think that will also not eliminate > > the duplicate information as user could have deduced that from single > > column. > > So, how about we turn conflict_reason to only report the reasons that > actually cause conflict with recovery for logical slots, something > like below, and then have invalidation_cause as a generic column for > all sorts of invalidation reasons for both logical and physical slots? > If our above understanding is correct then coflict_reason will be a subset of invalidation_reason. If so, whatever way we arrange this information, there will be some sort of duplicity unless we just have one column 'invalidation_reason' and update the docs to interpret it correctly for conflicts. -- With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Tue, Mar 12, 2024 at 9:11 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > On Tue, Mar 12, 2024 at 05:51:43PM +0530, Amit Kapila wrote: > > On Tue, Mar 12, 2024 at 1:24 PM Bertrand Drouvot > > <bertranddrouvot.pg@gmail.com> wrote: > > > so why to prevent it for > > these new parameters? This will unnecessarily create inconsistency in > > the invalidation behavior. > > Yeah, but I think wal removal has a direct impact on the slot usuability which > is probably not the case with the new XID and Timeout ones. > BTW, doesn't the XID based parameter 'max_slot_xid_age' have similarity with 'max_slot_wal_keep_size'? I think it will impact the rows we remove based on xid horizons. Don't we need to consider it while vacuum computes the xid horizons in ComputeXidHorizons(), similar to what we do for WAL w.r.t 'max_slot_wal_keep_size'? -- With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Tue, Mar 12, 2024 at 10:10 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Mon, Mar 11, 2024 at 3:44 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > Yes, your understanding is correct. I wanted us to consider having new > > parameters like 'inactive_replication_slot_timeout' to be at > > slot-level instead of GUC. I think this new parameter doesn't seem to > > be the similar as 'max_slot_wal_keep_size' which leads to truncation > > of WAL at global and then invalidates the appropriate slots. OTOH, the > > 'inactive_replication_slot_timeout' doesn't appear to have a similar > > global effect. > > last_inactive_at is tracked for each slot using which slots get > invalidated based on inactive_replication_slot_timeout. It's like > max_slot_wal_keep_size invalidating slots based on restart_lsn. In a > way, both are similar, right? > There is some similarity but 'max_slot_wal_keep_size' leads to truncation of WAL which in turn leads to invalidation of slots. Here, I am also trying to be cautious in adding a GUC unless it is required or having a slot-level parameter doesn't serve the need. Having said that, I see that there is an argument that we should follow the path of 'max_slot_wal_keep_size' GUC and there is some value to it but still I think avoiding a new GUC for inactivity in the slot would outweigh. -- With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
shveta malik
Date:
On Wed, Mar 6, 2024 at 2:47 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Wed, Mar 6, 2024 at 2:42 PM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > > > Hi, > > > > On Tue, Mar 05, 2024 at 01:44:43PM -0600, Nathan Bossart wrote: > > > On Wed, Mar 06, 2024 at 12:50:38AM +0530, Bharath Rupireddy wrote: > > > > On Mon, Mar 4, 2024 at 2:11 PM Bertrand Drouvot > > > > <bertranddrouvot.pg@gmail.com> wrote: > > > >> On Sun, Mar 03, 2024 at 03:44:34PM -0600, Nathan Bossart wrote: > > > >> > Unless I am misinterpreting some details, ISTM we could rename this column > > > >> > to invalidation_reason and use it for both logical and physical slots. I'm > > > >> > not seeing a strong need for another column. > > > >> > > > >> Yeah having two columns was more for convenience purpose. Without the "conflict" > > > >> one, a slot conflicting with recovery would be "a logical slot having a non NULL > > > >> invalidation_reason". > > > >> > > > >> I'm also fine with one column if most of you prefer that way. > > > > > > > > While we debate on the above, please find the attached v7 patch set > > > > after rebasing. > > > > > > It looks like Bertrand is okay with reusing the same column for both > > > logical and physical slots > > > > Yeah, I'm okay with one column. > > Thanks. v8-0001 is how it looks. Please see the v8 patch set with this change. JFYI, the patch does not apply to the head. There is a conflict in multiple files. thanks Shveta
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
shveta malik
Date:
> JFYI, the patch does not apply to the head. There is a conflict in > multiple files. For review purposes, I applied v8 to the March 6 code-base. I have yet to review in detail, please find my initial thoughts: 1) I found that 'inactive_replication_slot_timeout' works only if there was any walsender ever started for that slot . The logic is under 'am_walsender' check. Is this intentional? If I create a slot and use only pg_logical_slot_get_changes or pg_replication_slot_advance on it, it never gets invalidated due to timeout. While, when I set 'max_slot_xid_age' or say 'max_slot_wal_keep_size' to a lower value, the said slot is invalidated correctly with 'xid_aged' and 'wal_removed' reasons respectively. Example: With inactive_replication_slot_timeout=1min, test1_3 is the slot for which there is no walsender and only advance and get_changes SQL functions were called; test1_4 is the one for which pg_recvlogical was run for a second. test1_3 | 785 | | reserved | | t | | test1_4 | 798 | | lost | inactive_timeout | t | 2024-03-13 11:52:41.58446+05:30 | And when inactive_replication_slot_timeout=0 and max_slot_xid_age=10 test1_3 | 785 | | lost | xid_aged | t | | test1_4 | 798 | | lost | inactive_timeout | t | 2024-03-13 11:52:41.58446+05:30 | 2) The msg for patch 3 says: -------------- a) when replication slots is lying inactive for a day or so using last_inactive_at metric, b) when a replication slot is becoming inactive too frequently using last_inactive_at metric. -------------- I think in b, you want to refer to inactive_count instead of last_inactive_at? 3) I do not see invalidation_reason updated for 2 new reasons in system-views.sgml thanks Shveta
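A minimal way to reproduce the SQL-only usage described in point 1 (a sketch; the functions are standard, the test_decoding output plugin from contrib is assumed to be available, and the point is that no walsender ever acquires the slot, so, per the behaviour described above, the proposed inactive timeout would never invalidate it):

-- Create a logical slot and drive it purely through SQL functions;
-- no walsender is involved for this slot.
SELECT pg_create_logical_replication_slot('test1_3', 'test_decoding');
SELECT * FROM pg_logical_slot_get_changes('test1_3', NULL, NULL);
SELECT pg_replication_slot_advance('test1_3', pg_current_wal_lsn());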
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Tue, Mar 12, 2024 at 09:19:35PM +0530, Bharath Rupireddy wrote: > On Tue, Mar 12, 2024 at 9:11 PM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > > > > AFAIR, we don't prevent similar invalidations due to > > > 'max_slot_wal_keep_size' for sync slots, > > > > Right, we'd invalidate them on the standby should the standby sync slot restart_lsn > > exceeds the limit. > > Right. Help me understand this a bit - is the wal_removed invalidation > going to conflict with recovery on the standby? I don't think so, as it's not directly related to recovery. The slot will be invalided on the standby though. > Per the discussion upthread, I'm trying to understand what > invalidation reasons will exactly cause conflict with recovery? Is it > just rows_removed and wal_level_insufficient invalidations? Yes, that's the ones added in be87200efd. See the error messages on a standby: == wal removal postgres=# SELECT * FROM pg_logical_slot_get_changes('lsub4_slot', NULL, NULL, 'include-xids', '0'); ERROR: can no longer get changes from replication slot "lsub4_slot" DETAIL: This slot has been invalidated because it exceeded the maximum reserved size. == wal level postgres=# select conflict_reason from pg_replication_slots where slot_name = 'lsub5_slot';; conflict_reason ------------------------ wal_level_insufficient (1 row) postgres=# SELECT * FROM pg_logical_slot_get_changes('lsub5_slot', NULL, NULL, 'include-xids', '0'); ERROR: can no longer get changes from replication slot "lsub5_slot" DETAIL: This slot has been invalidated because it was conflicting with recovery. == rows removal postgres=# select conflict_reason from pg_replication_slots where slot_name = 'lsub6_slot';; conflict_reason ----------------- rows_removed (1 row) postgres=# SELECT * FROM pg_logical_slot_get_changes('lsub6_slot', NULL, NULL, 'include-xids', '0'); ERROR: can no longer get changes from replication slot "lsub6_slot" DETAIL: This slot has been invalidated because it was conflicting with recovery. As you can see, only wal level and rows removal are mentioning conflict with recovery. So, are we already "wrong" mentioning "wal_removed" in conflict_reason? Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
shveta malik
Date:
On Fri, Mar 8, 2024 at 10:42 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Wed, Mar 6, 2024 at 4:49 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > You might want to consider its interaction with sync slots on standby. > > Say, there is no activity on slots in terms of processing the changes > > for slots. Now, we won't perform sync of such slots on standby showing > > them inactive as per your new criteria where as same slots could still > > be valid on primary as the walsender is still active. This may be more > > of a theoretical point as in running system there will probably be > > some activity but I think this needs some thougths. > > I believe the xmin and catalog_xmin of the sync slots on the standby > keep advancing depending on the slots on the primary, no? If yes, the > XID age based invalidation shouldn't be a problem. If the user has not enabled slot-sync worker and is relying on the SQL function pg_sync_replication_slots(), then the xmin and catalog_xmin of synced slots may not keep on advancing. These will be advanced only on next run of function. But meanwhile the synced slots may be invalidated due to 'xid_aged'. Then the next time, when user runs pg_sync_replication_slots() again, the invalidated slots will be dropped and will be recreated by this SQL function (provided they are valid on primary and are invalidated on standby alone). I am not stating that it is a problem, but we need to think if this is what we want. Secondly, the behaviour is not same with 'inactive_timeout' invalidation. Synced slots are immune to 'inactive_timeout' invalidation as this invalidation happens only in walsender, while these are not immune to 'xid_aged' invalidation. So again, needs some thoughts here. > I believe there are no walsenders started for the sync slots on the > standbys, right? If yes, the inactive timeout based invalidation also > shouldn't be a problem. Because, the inactive timeouts for a slot are > tracked only for walsenders because they are the ones that typically > hold replication slots for longer durations and for real replication > use. We did a similar thing in a recent commit [1]. > > Is my understanding right? Do you still see any problems with it? I have explained the situation above for us to think over it better. thanks Shveta
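For context, the manual path being referred to looks roughly like this on a suitably configured standby (a sketch; pg_sync_replication_slots() comes from the slot-sync work, while invalidation_reason and the 'xid_aged' value are only what this patch set proposes):

-- On the standby: sync slots on demand, then check whether any synced
-- slot got invalidated in between (e.g. with the proposed 'xid_aged').
SELECT pg_sync_replication_slots();
SELECT slot_name, synced, invalidation_reason
FROM pg_replication_slots
WHERE synced;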
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Wed, Mar 13, 2024 at 9:21 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > So, how about we turn conflict_reason to only report the reasons that > > actually cause conflict with recovery for logical slots, something > > like below, and then have invalidation_cause as a generic column for > > all sorts of invalidation reasons for both logical and physical slots? > > If our above understanding is correct then coflict_reason will be a > subset of invalidation_reason. If so, whatever way we arrange this > information, there will be some sort of duplicity unless we just have > one column 'invalidation_reason' and update the docs to interpret it > correctly for conflicts. Yes, there will be some sort of duplicity if we emit conflict_reason as a text field. However, I still think the better way is to turn conflict_reason text to conflict boolean and set it to true only on rows_removed and wal_level_insufficient invalidations. When conflict boolean is true, one (including all the tests that we've added recently) can look for invalidation_reason text field for the reason. This sounds reasonable to me as opposed to we just mentioning in the docs that "if invalidation_reason is rows_removed or wal_level_insufficient it's the reason for conflict with recovery". Thoughts? -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
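Under that proposal, a query against pg_replication_slots would look something like the sketch below (illustrating the intended shape of the view, not committed columns):

-- 'conflicting' would be true only for rows_removed / wal_level_insufficient;
-- 'invalidation_reason' would carry the cause for every kind of invalidation,
-- for both logical and physical slots.
SELECT slot_name, slot_type, conflicting, invalidation_reason
FROM pg_replication_slots
WHERE invalidation_reason IS NOT NULL;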
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Wed, Mar 13, 2024 at 12:51 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > See the error messages on a standby: > > == wal removal > > postgres=# SELECT * FROM pg_logical_slot_get_changes('lsub4_slot', NULL, NULL, 'include-xids', '0'); > ERROR: can no longer get changes from replication slot "lsub4_slot" > DETAIL: This slot has been invalidated because it exceeded the maximum reserved size. > > == wal level > > postgres=# select conflict_reason from pg_replication_slots where slot_name = 'lsub5_slot';; > conflict_reason > ------------------------ > wal_level_insufficient > (1 row) > > postgres=# SELECT * FROM pg_logical_slot_get_changes('lsub5_slot', NULL, NULL, 'include-xids', '0'); > ERROR: can no longer get changes from replication slot "lsub5_slot" > DETAIL: This slot has been invalidated because it was conflicting with recovery. > > == rows removal > > postgres=# select conflict_reason from pg_replication_slots where slot_name = 'lsub6_slot';; > conflict_reason > ----------------- > rows_removed > (1 row) > > postgres=# SELECT * FROM pg_logical_slot_get_changes('lsub6_slot', NULL, NULL, 'include-xids', '0'); > ERROR: can no longer get changes from replication slot "lsub6_slot" > DETAIL: This slot has been invalidated because it was conflicting with recovery. > > As you can see, only wal level and rows removal are mentioning conflict with > recovery. > > So, are we already "wrong" mentioning "wal_removed" in conflict_reason? It looks like yes. So, how about we fix it the way proposed here - https://www.postgresql.org/message-id/CALj2ACVd_dizYQiZwwUfsb%2BhG-fhGYo_kEDq0wn_vNwQvOrZHg%40mail.gmail.com? -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Wed, Mar 13, 2024 at 11:13 AM shveta malik <shveta.malik@gmail.com> wrote: > > > Thanks. v8-0001 is how it looks. Please see the v8 patch set with this change. > > JFYI, the patch does not apply to the head. There is a conflict in > multiple files. Thanks for looking into this. I noticed that the v8 patches needed rebase. Before I go do anything with the patches, I'm trying to gain consensus on the design. Following is the summary of design choices we've discussed so far: 1) conflict_reason vs invalidation_reason. 2) When to compute the XID age? 3) Where to do the invalidations? Is it in the checkpointer or autovacuum or some other process? 4) Interaction of these new invalidations with sync slots on the standby. I hope to get on to these one after the other. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Wed, Mar 13, 2024 at 9:24 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Wed, Mar 13, 2024 at 9:21 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > So, how about we turn conflict_reason to only report the reasons that > > > actually cause conflict with recovery for logical slots, something > > > like below, and then have invalidation_cause as a generic column for > > > all sorts of invalidation reasons for both logical and physical slots? > > > > If our above understanding is correct then coflict_reason will be a > > subset of invalidation_reason. If so, whatever way we arrange this > > information, there will be some sort of duplicity unless we just have > > one column 'invalidation_reason' and update the docs to interpret it > > correctly for conflicts. > > Yes, there will be some sort of duplicity if we emit conflict_reason > as a text field. However, I still think the better way is to turn > conflict_reason text to conflict boolean and set it to true only on > rows_removed and wal_level_insufficient invalidations. When conflict > boolean is true, one (including all the tests that we've added > recently) can look for invalidation_reason text field for the reason. > This sounds reasonable to me as opposed to we just mentioning in the > docs that "if invalidation_reason is rows_removed or > wal_level_insufficient it's the reason for conflict with recovery". > Fair point. I think we can go either way. Bertrand, Nathan, and others, do you have an opinion on this matter? -- With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Wed, Mar 13, 2024 at 10:16 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Wed, Mar 13, 2024 at 11:13 AM shveta malik <shveta.malik@gmail.com> wrote: > > > > > Thanks. v8-0001 is how it looks. Please see the v8 patch set with this change. > > > > JFYI, the patch does not apply to the head. There is a conflict in > > multiple files. > > Thanks for looking into this. I noticed that the v8 patches needed > rebase. Before I go do anything with the patches, I'm trying to gain > consensus on the design. Following is the summary of design choices > we've discussed so far: > 1) conflict_reason vs invalidation_reason. > 2) When to compute the XID age? > I feel we should focus on two things (a) one is to introduce a new column invalidation_reason, and (b) let's try to first complete invalidation due to timeout. We can look into XID stuff if time permits, remember, we don't have ample time left. With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Thu, Mar 14, 2024 at 12:24 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Wed, Mar 13, 2024 at 9:24 PM Bharath Rupireddy > > > > Yes, there will be some sort of duplicity if we emit conflict_reason > > as a text field. However, I still think the better way is to turn > > conflict_reason text to conflict boolean and set it to true only on > > rows_removed and wal_level_insufficient invalidations. When conflict > > boolean is true, one (including all the tests that we've added > > recently) can look for invalidation_reason text field for the reason. > > This sounds reasonable to me as opposed to we just mentioning in the > > docs that "if invalidation_reason is rows_removed or > > wal_level_insufficient it's the reason for conflict with recovery". > > > Fair point. I think we can go either way. Bertrand, Nathan, and > others, do you have an opinion on this matter? While we wait to hear from others on this, I'm attaching the v9 patch set implementing the above idea (check 0001 patch). Please have a look. I'll come back to the other review comments soon. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
shveta malik
Date:
On Thu, Mar 14, 2024 at 7:58 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Thu, Mar 14, 2024 at 12:24 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Wed, Mar 13, 2024 at 9:24 PM Bharath Rupireddy > > > > > > Yes, there will be some sort of duplicity if we emit conflict_reason > > > as a text field. However, I still think the better way is to turn > > > conflict_reason text to conflict boolean and set it to true only on > > > rows_removed and wal_level_insufficient invalidations. When conflict > > > boolean is true, one (including all the tests that we've added > > > recently) can look for invalidation_reason text field for the reason. > > > This sounds reasonable to me as opposed to we just mentioning in the > > > docs that "if invalidation_reason is rows_removed or > > > wal_level_insufficient it's the reason for conflict with recovery". +1 on maintaining both conflicting and invalidation_reason > > Fair point. I think we can go either way. Bertrand, Nathan, and > > others, do you have an opinion on this matter? > > While we wait to hear from others on this, I'm attaching the v9 patch > set implementing the above idea (check 0001 patch). Please have a > look. I'll come back to the other review comments soon. Thanks for the patch. JFYI, patch09 does not apply to HEAD, some recent commit caused the conflict. Some trivial comments on patch001 (yet to review other patches) 1) info.c: - "%s as caught_up, conflict_reason IS NOT NULL as invalid " + "%s as caught_up, invalidation_reason IS NOT NULL as invalid " Can we revert back to 'conflicting as invalid' since it is a query for logical slots only. 2) 040_standby_failover_slots_sync.pl: - q{SELECT conflict_reason IS NULL AND synced AND NOT temporary FROM pg_replication_slots WHERE slot_name = 'lsub1_slot';} + q{SELECT invalidation_reason IS NULL AND synced AND NOT temporary FROM pg_replication_slots WHERE slot_name = 'lsub1_slot';} Here too, can we have 'NOT conflicting' instead of ' invalidation_reason IS NULL' as it is a logical slot test. thanks Shveta
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Wed, Mar 13, 2024 at 9:38 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > BTW, is XID the based parameter 'max_slot_xid_age' not have similarity > with 'max_slot_wal_keep_size'? I think it will impact the rows we > removed based on xid horizons. Don't we need to consider it while > vacuum computing the xid horizons in ComputeXidHorizons() similar to > what we do for WAL w.r.t 'max_slot_wal_keep_size'? I'm having a hard time understanding why we'd need something up there in ComputeXidHorizons(). Can you elaborate it a bit please? What's proposed with max_slot_xid_age is that during checkpoint we look at slot's xmin and catalog_xmin, and the current system txn id. Then, if the XID age of (xmin, catalog_xmin) and current_xid crosses max_slot_xid_age, we invalidate the slot. Let me illustrate how all this works: 1. Setup a primary and standby with hot_standby_feedback set to on on standby. For instance, check my scripts at [1]. 2. Stop the standby to make the slot inactive on the primary. Check the slot is holding xmin of 738. ./pg_ctl -D sbdata -l logfilesbdata stop postgres=# SELECT * FROM pg_replication_slots; -[ RECORD 1 ]-------+------------- slot_name | sb_repl_slot plugin | slot_type | physical datoid | database | temporary | f active | f active_pid | xmin | 738 catalog_xmin | restart_lsn | 0/3000000 confirmed_flush_lsn | wal_status | reserved safe_wal_size | two_phase | f conflict_reason | failover | f synced | f 3. Start consuming the XIDs on the primary with the following script for instance ./psql -d postgres -p 5432 DROP TABLE tab_int; CREATE TABLE tab_int (a int); do $$ begin for i in 1..268435 loop -- use an exception block so that each iteration eats an XID begin insert into tab_int values (i); exception when division_by_zero then null; end; end loop; end$$; 4. Make some dead rows in the table. update tab_int set a = a+1; delete from tab_int where a%4=0; postgres=# SELECT n_dead_tup, n_tup_ins, n_tup_upd, n_tup_del FROM pg_stat_user_tables WHERE relname = 'tab_int'; -[ RECORD 1 ]------ n_dead_tup | 335544 n_tup_ins | 268435 n_tup_upd | 268435 n_tup_del | 67109 5. Try vacuuming to delete the dead rows, observe 'tuples: 0 removed, 536870 remain, 335544 are dead but not yet removable'. The dead rows can't be removed because the inactive slot is holding an xmin, see 'removable cutoff: 738, which was 268441 XIDs old when operation ended'. postgres=# vacuum verbose tab_int; INFO: vacuuming "postgres.public.tab_int" INFO: finished vacuuming "postgres.public.tab_int": index scans: 0 pages: 0 removed, 2376 remain, 2376 scanned (100.00% of total) tuples: 0 removed, 536870 remain, 335544 are dead but not yet removable removable cutoff: 738, which was 268441 XIDs old when operation ended frozen: 0 pages from table (0.00% of total) had 0 tuples frozen index scan not needed: 0 pages from table (0.00% of total) had 0 dead item identifiers removed avg read rate: 0.000 MB/s, avg write rate: 0.000 MB/s buffer usage: 4759 hits, 0 misses, 0 dirtied WAL usage: 0 records, 0 full page images, 0 bytes system usage: CPU: user: 0.07 s, system: 0.00 s, elapsed: 0.07 s VACUUM 6. Now, repeat the above steps but with setting max_slot_xid_age = 200000 on the primary. 7. Do a checkpoint to invalidate the slot. 
postgres=# checkpoint; CHECKPOINT postgres=# SELECT * FROM pg_replication_slots; -[ RECORD 1 ]-------+------------- slot_name | sb_repl_slot plugin | slot_type | physical datoid | database | temporary | f active | f active_pid | xmin | 738 catalog_xmin | restart_lsn | 0/3000000 confirmed_flush_lsn | wal_status | lost safe_wal_size | two_phase | f conflicting | failover | f synced | f invalidation_reason | xid_aged 8. And, then vacuum the table, observe 'tuples: 335544 removed, 201326 remain, 0 are dead but not yet removable'. postgres=# vacuum verbose tab_int; INFO: vacuuming "postgres.public.tab_int" INFO: finished vacuuming "postgres.public.tab_int": index scans: 0 pages: 0 removed, 2376 remain, 2376 scanned (100.00% of total) tuples: 335544 removed, 201326 remain, 0 are dead but not yet removable removable cutoff: 269179, which was 0 XIDs old when operation ended new relfrozenxid: 269179, which is 268441 XIDs ahead of previous value frozen: 1189 pages from table (50.04% of total) had 201326 tuples frozen index scan not needed: 0 pages from table (0.00% of total) had 0 dead item identifiers removed avg read rate: 0.000 MB/s, avg write rate: 193.100 MB/s buffer usage: 4760 hits, 0 misses, 2381 dirtied WAL usage: 5942 records, 2378 full page images, 8343275 bytes system usage: CPU: user: 0.09 s, system: 0.00 s, elapsed: 0.09 s VACUUM [1] cd /home/ubuntu/postgres/pg17/bin ./pg_ctl -D db17 -l logfile17 stop rm -rf db17 logfile17 rm -rf /home/ubuntu/postgres/pg17/bin/archived_wal mkdir /home/ubuntu/postgres/pg17/bin/archived_wal ./initdb -D db17 echo "archive_mode = on archive_command='cp %p /home/ubuntu/postgres/pg17/bin/archived_wal/%f'" | tee -a db17/postgresql.conf ./pg_ctl -D db17 -l logfile17 start ./psql -d postgres -p 5432 -c "SELECT pg_create_physical_replication_slot('sb_repl_slot', true, false);" rm -rf sbdata logfilesbdata ./pg_basebackup -D sbdata echo "port=5433 primary_conninfo='host=localhost port=5432 dbname=postgres user=ubuntu' primary_slot_name='sb_repl_slot' restore_command='cp /home/ubuntu/postgres/pg17/bin/archived_wal/%f %p' hot_standby_feedback = on" | tee -a sbdata/postgresql.conf touch sbdata/standby.signal ./pg_ctl -D sbdata -l logfilesbdata start ./psql -d postgres -p 5433 -c "SELECT pg_is_in_recovery();" -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
shveta malik
Date:
On Thu, Mar 14, 2024 at 7:58 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > While we wait to hear from others on this, I'm attaching the v9 patch > set implementing the above idea (check 0001 patch). Please have a > look. I'll come back to the other review comments soon. > patch002: 1) I would like to understand the purpose of 'inactive_count'? Is it only for users for monitoring purposes? We are not using it anywhere internally. I shutdown the instance 5 times and found that 'inactive_count' became 5 for all the slots created on that instance. Is this intentional? I mean we can not really use them if the instance is down. I felt it should increment the inactive_count only if during the span of instance, they were actually inactive i.e. no streaming or replication happening through them. 2) slot.c: + case RS_INVAL_XID_AGE: + { + if (TransactionIdIsNormal(s->data.xmin)) + { + .......... + } + if (TransactionIdIsNormal(s->data.catalog_xmin)) + { + .......... + } + } Can we optimize this code? It has duplicate code for processing s->data.catalog_xmin and s->data.xmin. Can we create a sub-function for this purpose and call it twice here? thanks Shveta
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Fri, Mar 15, 2024 at 10:15 AM shveta malik <shveta.malik@gmail.com> wrote: > > > > > wal_level_insufficient it's the reason for conflict with recovery". > > +1 on maintaining both conflicting and invalidation_reason Thanks. > Thanks for the patch. JFYI, patch09 does not apply to HEAD, some > recent commit caused the conflict. Yep, the conflict is in src/test/recovery/meson.build and is because of e6927270cd18d535b77cbe79c55c6584351524be. > Some trivial comments on patch001 (yet to review other patches) Thanks for looking into this. > 1) > info.c: > > - "%s as caught_up, conflict_reason IS NOT NULL as invalid " > + "%s as caught_up, invalidation_reason IS NOT NULL as invalid " > > Can we revert back to 'conflicting as invalid' since it is a query for > logical slots only. I guess, no. There the intention is to check for invalid logical slots not just for the conflicting ones. The logical slots can get invalidated due to other reasons as well. > 2) > 040_standby_failover_slots_sync.pl: > > - q{SELECT conflict_reason IS NULL AND synced AND NOT temporary FROM > pg_replication_slots WHERE slot_name = 'lsub1_slot';} > + q{SELECT invalidation_reason IS NULL AND synced AND NOT temporary > FROM pg_replication_slots WHERE slot_name = 'lsub1_slot';} > > Here too, can we have 'NOT conflicting' instead of ' > invalidation_reason IS NULL' as it is a logical slot test. I guess no. The tests are ensuring the slot on the standby isn't invalidated. In general, one needs to use the 'conflicting' column from pg_replication_slots when the intention is to look for reasons for conflicts, otherwise use the 'invalidation_reason' column for invalidations. Please see the attached v10 patch set after resolving the merge conflict and fixing an indentation warning in the TAP test file. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Nathan Bossart
Date:
On Thu, Mar 14, 2024 at 12:24:00PM +0530, Amit Kapila wrote: > On Wed, Mar 13, 2024 at 9:24 PM Bharath Rupireddy > <bharath.rupireddyforpostgres@gmail.com> wrote: >> On Wed, Mar 13, 2024 at 9:21 AM Amit Kapila <amit.kapila16@gmail.com> wrote: >> > > So, how about we turn conflict_reason to only report the reasons that >> > > actually cause conflict with recovery for logical slots, something >> > > like below, and then have invalidation_cause as a generic column for >> > > all sorts of invalidation reasons for both logical and physical slots? >> > >> > If our above understanding is correct then coflict_reason will be a >> > subset of invalidation_reason. If so, whatever way we arrange this >> > information, there will be some sort of duplicity unless we just have >> > one column 'invalidation_reason' and update the docs to interpret it >> > correctly for conflicts. >> >> Yes, there will be some sort of duplicity if we emit conflict_reason >> as a text field. However, I still think the better way is to turn >> conflict_reason text to conflict boolean and set it to true only on >> rows_removed and wal_level_insufficient invalidations. When conflict >> boolean is true, one (including all the tests that we've added >> recently) can look for invalidation_reason text field for the reason. >> This sounds reasonable to me as opposed to we just mentioning in the >> docs that "if invalidation_reason is rows_removed or >> wal_level_insufficient it's the reason for conflict with recovery". > > Fair point. I think we can go either way. Bertrand, Nathan, and > others, do you have an opinion on this matter? WFM -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Thu, Mar 14, 2024 at 12:24:00PM +0530, Amit Kapila wrote: > On Wed, Mar 13, 2024 at 9:24 PM Bharath Rupireddy > <bharath.rupireddyforpostgres@gmail.com> wrote: > > > > On Wed, Mar 13, 2024 at 9:21 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > So, how about we turn conflict_reason to only report the reasons that > > > > actually cause conflict with recovery for logical slots, something > > > > like below, and then have invalidation_cause as a generic column for > > > > all sorts of invalidation reasons for both logical and physical slots? > > > > > > If our above understanding is correct then coflict_reason will be a > > > subset of invalidation_reason. If so, whatever way we arrange this > > > information, there will be some sort of duplicity unless we just have > > > one column 'invalidation_reason' and update the docs to interpret it > > > correctly for conflicts. > > > > Yes, there will be some sort of duplicity if we emit conflict_reason > > as a text field. However, I still think the better way is to turn > > conflict_reason text to conflict boolean and set it to true only on > > rows_removed and wal_level_insufficient invalidations. When conflict > > boolean is true, one (including all the tests that we've added > > recently) can look for invalidation_reason text field for the reason. > > This sounds reasonable to me as opposed to we just mentioning in the > > docs that "if invalidation_reason is rows_removed or > > wal_level_insufficient it's the reason for conflict with recovery". > > > > Fair point. I think we can go either way. Bertrand, Nathan, and > others, do you have an opinion on this matter? Sounds like a good approach to me and one will be able to quickly identify if a conflict occured. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Fri, Mar 15, 2024 at 12:49 PM shveta malik <shveta.malik@gmail.com> wrote: > > patch002: > > 1) > I would like to understand the purpose of 'inactive_count'? Is it only > for users for monitoring purposes? We are not using it anywhere > internally. inactive_count metric helps detect unstable replication slots connections that have a lot of disconnections. It's not used for the inactive_timeout based slot invalidation mechanism. > I shutdown the instance 5 times and found that 'inactive_count' became > 5 for all the slots created on that instance. Is this intentional? Yes, it's incremented on shutdown (and for that matter upon every slot release) for all the slots that are tied to walsenders. > I mean we can not really use them if the instance is down. I felt it > should increment the inactive_count only if during the span of > instance, they were actually inactive i.e. no streaming or replication > happening through them. inactive_count is persisted to disk- upon clean shutdown, so, once the slots become active again, one gets to see the metric and deduce some info on disconnections. Having said that, I'm okay to hear from others on the inactive_count metric being added. > 2) > slot.c: > + case RS_INVAL_XID_AGE: > > Can we optimize this code? It has duplicate code for processing > s->data.catalog_xmin and s->data.xmin. Can we create a sub-function > for this purpose and call it twice here? Good idea. Done that way. > 2) > The msg for patch 3 says: > -------------- > a) when replication slots is lying inactive for a day or so using > last_inactive_at metric, > b) when a replication slot is becoming inactive too frequently using > last_inactive_at metric. > -------------- > I think in b, you want to refer to inactive_count instead of last_inactive_at? Right. Changed. > 3) > I do not see invalidation_reason updated for 2 new reasons in system-views.sgml Nice catch. Added them now. I've also responded to Bertrand's comments here. On Wed, Mar 6, 2024 at 3:56 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > A few comments: > > 1 === > > + The reason for the slot's invalidation. <literal>NULL</literal> if the > + slot is currently actively being used. > > s/currently actively being used/not invalidated/ ? (I mean it could be valid > and not being used). Changed. > 3 === > > res = executeQueryOrDie(conn, "SELECT slot_name, plugin, two_phase, failover, " > - "%s as caught_up, conflict_reason IS NOT NULL as invalid " > + "%s as caught_up, invalidation_reason IS NOT NULL as invalid " > "FROM pg_catalog.pg_replication_slots " > - "(CASE WHEN conflict_reason IS NOT NULL THEN FALSE " > + "(CASE WHEN invalidation_reason IS NOT NULL THEN FALSE " > > Yeah that's fine because there is logical slot filtering here. Right. And, we really are looking for invalid slots there, so use of invalidation_reason is much more correct than conflicting. > 4 === > > -GetSlotInvalidationCause(const char *conflict_reason) > +GetSlotInvalidationCause(const char *invalidation_reason) > > Should we change the comment "Maps a conflict reason" above this function? Changed. 
> 5 === > > -# Check conflict_reason is NULL for physical slot > +# Check invalidation_reason is NULL for physical slot > $res = $node_primary->safe_psql( > 'postgres', qq[ > - SELECT conflict_reason is null FROM pg_replication_slots where slot_name = '$primary_slotname';] > + SELECT invalidation_reason is null FROM pg_replication_slots where slot_name = '$primary_slotname';] > ); > > > I don't think this test is needed anymore: it does not make that much sense since > it's done after the primary database initialization and startup. It is now turned into a test verifying 'conflicting boolean' is null for the physical slot. Isn't that okay? > 6 === > > 'Logical slots are reported as non conflicting'); > > What about? > > " > # Verify slots are reported as valid in pg_replication_slots > 'Logical slots are reported as valid'); > " Changed. Please see the attached v11 patch set with all the above review comments addressed. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
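For what it's worth, the "sub-function" discussed above could end up looking roughly like this; a minimal sketch only, where the function name SlotXidAgeExceedsLimit and the max_slot_xid_age GUC are taken from this thread's proposal rather than from committed code:

#include "postgres.h"
#include "access/transam.h"

/* proposed GUC from this thread; assumed to be a plain integer age, 0 = disabled */
extern int	max_slot_xid_age;

/*
 * Sketch only: report whether the given slot horizon (xmin or catalog_xmin)
 * is older than max_slot_xid_age.  Intended to be called once for xmin and
 * once for catalog_xmin, as suggested above.
 */
static bool
SlotXidAgeExceedsLimit(TransactionId xid)
{
	int32		age;

	/* an unset horizon can never age out */
	if (!TransactionIdIsNormal(xid))
		return false;

	/* modulo-2^32 distance from the next XID, like the age() SQL function */
	age = (int32) (ReadNextTransactionId() - xid);

	return age > max_slot_xid_age;
}

InvalidatePossiblyObsoleteSlot() could then call such a helper once for s->data.xmin and once for s->data.catalog_xmin, which is the shape of the de-duplication asked for above.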
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Fri, Mar 15, 2024 at 10:45 AM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Wed, Mar 13, 2024 at 9:38 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > BTW, is XID the based parameter 'max_slot_xid_age' not have similarity > > with 'max_slot_wal_keep_size'? I think it will impact the rows we > > removed based on xid horizons. Don't we need to consider it while > > vacuum computing the xid horizons in ComputeXidHorizons() similar to > > what we do for WAL w.r.t 'max_slot_wal_keep_size'? > > I'm having a hard time understanding why we'd need something up there > in ComputeXidHorizons(). Can you elaborate it a bit please? > > What's proposed with max_slot_xid_age is that during checkpoint we > look at slot's xmin and catalog_xmin, and the current system txn id. > Then, if the XID age of (xmin, catalog_xmin) and current_xid crosses > max_slot_xid_age, we invalidate the slot. > I can see that in your patch (in function InvalidatePossiblyObsoleteSlot()). As per my understanding, we need something similar for slot xids in ComputeXidHorizons() as we are doing for WAL in KeepLogSeg(). In KeepLogSeg(), we compute the minimum LSN location required by slots and then adjust it for 'max_slot_wal_keep_size'. On similar lines, currently in ComputeXidHorizons(), we compute the minimum xid required by slots (procArray->replication_slot_xmin and procArray->replication_slot_catalog_xmin) but then don't adjust it for 'max_slot_xid_age'. I could be missing something here, but it is better to keep discussing this and, in the meantime, move ahead with the other parameter 'inactive_replication_slot_timeout', which in my view can be kept at the slot level instead of as a GUC; OTOH, we need to see the arguments on both sides and then decide which makes more sense. -- With Regards, Amit Kapila.
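To make the analogy with KeepLogSeg() a bit more concrete, here is a rough sketch of the kind of clamping being suggested; this is not from the posted patches, and both the exact integration point inside ComputeXidHorizons() and the use of the proposed max_slot_xid_age GUC are assumptions:

/*
 * Sketch only: clamp the slot-provided xmin the same way KeepLogSeg()
 * clamps the slot-required LSN with max_slot_wal_keep_size, so that vacuum
 * never honours a slot horizon older than the configured age.
 */
if (max_slot_xid_age > 0 && TransactionIdIsNormal(slot_xmin))
{
	TransactionId limit_xid = ReadNextTransactionId();

	/* back up by max_slot_xid_age, staying within normal XID space */
	limit_xid -= max_slot_xid_age;
	if (!TransactionIdIsNormal(limit_xid))
		limit_xid = FirstNormalTransactionId;

	/* don't let an over-aged slot hold the horizon back any further */
	if (TransactionIdPrecedes(slot_xmin, limit_xid))
		slot_xmin = limit_xid;
}

The same treatment would presumably apply to slot_catalog_xmin; how that interacts with the invalidation done by the checkpointer is what the rest of this exchange is about.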
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Sat, Mar 16, 2024 at 3:55 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > procArray->replication_slot_catalog_xmin) but then don't adjust it for > 'max_slot_xid_age'. I could be missing something in this but it is > better to keep discussing this and try to move with another parameter > 'inactive_replication_slot_timeout' which according to me can be kept > at slot level instead of a GUC but OTOH we need to see the arguments > on both side and then decide which makes more sense. Hm. Are you suggesting inactive_timeout to be a slot level parameter similar to 'failover' property added recently by c393308b69d229b664391ac583b9e07418d411b6 and 73292404370c9900a96e2bebdc7144f7010339cf? With this approach, one can set inactive_timeout while creating the slot either via pg_create_physical_replication_slot() or pg_create_logical_replication_slot() or CREATE_REPLICATION_SLOT or ALTER_REPLICATION_SLOT command, and postgres tracks the last_inactive_at for every slot based on which the slot gets invalidated. If this understanding is right, I can go ahead and work towards it. Alternatively, we can go the route of making GUC a list of key-value pairs of {slot_name, inactive_timeout}, but this kind of GUC for setting slot level parameters is going to be the first of its kind, so I'd prefer the above approach. Thoughts? -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Sun, Mar 17, 2024 at 2:03 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Sat, Mar 16, 2024 at 3:55 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > procArray->replication_slot_catalog_xmin) but then don't adjust it for > > 'max_slot_xid_age'. I could be missing something in this but it is > > better to keep discussing this and try to move with another parameter > > 'inactive_replication_slot_timeout' which according to me can be kept > > at slot level instead of a GUC but OTOH we need to see the arguments > > on both side and then decide which makes more sense. > > Hm. Are you suggesting inactive_timeout to be a slot level parameter > similar to 'failover' property added recently by > c393308b69d229b664391ac583b9e07418d411b6 and > 73292404370c9900a96e2bebdc7144f7010339cf? With this approach, one can > set inactive_timeout while creating the slot either via > pg_create_physical_replication_slot() or > pg_create_logical_replication_slot() or CREATE_REPLICATION_SLOT or > ALTER_REPLICATION_SLOT command, and postgres tracks the > last_inactive_at for every slot based on which the slot gets > invalidated. If this understanding is right, I can go ahead and work > towards it. > Yeah, I have something like that in mind. You can prepare the patch but it would be good if others involved in this thread can also share their opinion. > Alternatively, we can go the route of making GUC a list of key-value > pairs of {slot_name, inactive_timeout}, but this kind of GUC for > setting slot level parameters is going to be the first of its kind, so > I'd prefer the above approach. > I would prefer a slot-level parameter in this case rather than a GUC. -- With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Sat, Mar 16, 2024 at 3:55 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > What's proposed with max_slot_xid_age is that during checkpoint we > > look at slot's xmin and catalog_xmin, and the current system txn id. > > Then, if the XID age of (xmin, catalog_xmin) and current_xid crosses > > max_slot_xid_age, we invalidate the slot. > > > > I can see that in your patch (in function > InvalidatePossiblyObsoleteSlot()). As per my understanding, we need > something similar for slot xids in ComputeXidHorizons() as we are > doing WAL in KeepLogSeg(). In KeepLogSeg(), we compute the minimum LSN > location required by slots and then adjust it for > 'max_slot_wal_keep_size'. On similar lines, currently in > ComputeXidHorizons(), we compute the minimum xid required by slots > (procArray->replication_slot_xmin and > procArray->replication_slot_catalog_xmin) but then don't adjust it for > 'max_slot_xid_age'. I could be missing something in this but it is > better to keep discussing this

After invalidating slots because of max_slot_xid_age, the procArray->replication_slot_xmin and procArray->replication_slot_catalog_xmin are recomputed immediately in InvalidateObsoleteReplicationSlots->ReplicationSlotsComputeRequiredXmin->ProcArraySetReplicationSlotXmin. And, later the XID horizons in ComputeXidHorizons are computed before the vacuum on each table via GetOldestNonRemovableTransactionId. Aren't these enough? Do you want the XID horizons recomputed immediately, something like the below?

/* Invalidate replication slots based on xmin or catalog_xmin age */
if (max_slot_xid_age > 0)
{
    if (InvalidateObsoleteReplicationSlots(RS_INVAL_XID_AGE, 0, InvalidOid,
                                           InvalidTransactionId))
    {
        ComputeXidHorizonsResult horizons;

        /*
         * Some slots have been invalidated; update the XID horizons
         * as a side-effect.
         */
        ComputeXidHorizons(&horizons);
    }
}

-- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Mon, Mar 18, 2024 at 9:58 AM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Sat, Mar 16, 2024 at 3:55 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > What's proposed with max_slot_xid_age is that during checkpoint we > > > look at slot's xmin and catalog_xmin, and the current system txn id. > > > Then, if the XID age of (xmin, catalog_xmin) and current_xid crosses > > > max_slot_xid_age, we invalidate the slot. > > > > > > > I can see that in your patch (in function > > InvalidatePossiblyObsoleteSlot()). As per my understanding, we need > > something similar for slot xids in ComputeXidHorizons() as we are > > doing WAL in KeepLogSeg(). In KeepLogSeg(), we compute the minimum LSN > > location required by slots and then adjust it for > > 'max_slot_wal_keep_size'. On similar lines, currently in > > ComputeXidHorizons(), we compute the minimum xid required by slots > > (procArray->replication_slot_xmin and > > procArray->replication_slot_catalog_xmin) but then don't adjust it for > > 'max_slot_xid_age'. I could be missing something in this but it is > > better to keep discussing this > > After invalidating slots because of max_slot_xid_age, the > procArray->replication_slot_xmin and > procArray->replication_slot_catalog_xmin are recomputed immediately in > InvalidateObsoleteReplicationSlots->ReplicationSlotsComputeRequiredXmin->ProcArraySetReplicationSlotXmin. > And, later the XID horizons in ComputeXidHorizons are computed before > the vacuum on each table via GetOldestNonRemovableTransactionId. > Aren't these enough? IIUC, this will be delayed by one vacuum cycle rather than being done when the slot's xmin age is crossed and it can be invalidated. > Do you want the XID horizons recomputed immediately, something like the below? I haven't thought of the exact logic but we can try to mimic the handling similar to WAL. -- With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Sat, Mar 16, 2024 at 09:29:01AM +0530, Bharath Rupireddy wrote: > I've also responded to Bertrand's comments here. Thanks! > > On Wed, Mar 6, 2024 at 3:56 PM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > > > 5 === > > > > -# Check conflict_reason is NULL for physical slot > > +# Check invalidation_reason is NULL for physical slot > > $res = $node_primary->safe_psql( > > 'postgres', qq[ > > - SELECT conflict_reason is null FROM pg_replication_slots where slot_name = '$primary_slotname';] > > + SELECT invalidation_reason is null FROM pg_replication_slots where slot_name = '$primary_slotname';] > > ); > > > > > > I don't think this test is needed anymore: it does not make that much sense since > > it's done after the primary database initialization and startup. > > It is now turned into a test verifying 'conflicting boolean' is null > for the physical slot. Isn't that okay? Yeah makes more sense now, thanks! Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Mon, Mar 18, 2024 at 08:50:56AM +0530, Amit Kapila wrote: > On Sun, Mar 17, 2024 at 2:03 PM Bharath Rupireddy > <bharath.rupireddyforpostgres@gmail.com> wrote: > > > > On Sat, Mar 16, 2024 at 3:55 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > procArray->replication_slot_catalog_xmin) but then don't adjust it for > > > 'max_slot_xid_age'. I could be missing something in this but it is > > > better to keep discussing this and try to move with another parameter > > > 'inactive_replication_slot_timeout' which according to me can be kept > > > at slot level instead of a GUC but OTOH we need to see the arguments > > > on both side and then decide which makes more sense. > > > > Hm. Are you suggesting inactive_timeout to be a slot level parameter > > similar to 'failover' property added recently by > > c393308b69d229b664391ac583b9e07418d411b6 and > > 73292404370c9900a96e2bebdc7144f7010339cf? With this approach, one can > > set inactive_timeout while creating the slot either via > > pg_create_physical_replication_slot() or > > pg_create_logical_replication_slot() or CREATE_REPLICATION_SLOT or > > ALTER_REPLICATION_SLOT command, and postgres tracks the > > last_inactive_at for every slot based on which the slot gets > > invalidated. If this understanding is right, I can go ahead and work > > towards it. > > > > Yeah, I have something like that in mind. You can prepare the patch > but it would be good if others involved in this thread can also share > their opinion. I think it makes sense to put the inactive_timeout granularity at the slot level (as the activity could vary a lot, say between one slot linked to a subscription and one linked to some plugins). As far as max_slot_xid_age is concerned, I've the feeling that a new GUC is good enough. > > Alternatively, we can go the route of making GUC a list of key-value > > pairs of {slot_name, inactive_timeout}, but this kind of GUC for > > setting slot level parameters is going to be the first of its kind, so > > I'd prefer the above approach. > > > > I would prefer a slot-level parameter in this case rather than a GUC. Yeah, same here. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
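For reference, the GUC route for max_slot_xid_age would amount to a single entry along these lines; a sketch of a guc_tables.c ConfigureNamesInt entry only, where the category, limits, and description text are assumptions:

/* Sketch only: proposed max_slot_xid_age GUC, 0 disables the check. */
{
	{"max_slot_xid_age", PGC_SIGHUP, REPLICATION_SENDING,
		gettext_noop("Age of a slot's xmin or catalog_xmin at which the slot is invalidated."),
		NULL
	},
	&max_slot_xid_age,
	0, 0, 2000000000,
	NULL, NULL, NULL
},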
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Sat, Mar 16, 2024 at 09:29:01AM +0530, Bharath Rupireddy wrote: > Please see the attached v11 patch set with all the above review > comments addressed. Thanks! Looking at 0001:

1 ===

+ True if this logical slot conflicted with recovery (and so is now
+ invalidated). When this column is true, check

Worth adding back the physical slot mention "Always NULL for physical slots."?

2 ===

@@ -1023,9 +1023,10 @@ CREATE VIEW pg_replication_slots AS
             L.wal_status,
             L.safe_wal_size,
             L.two_phase,
-            L.conflict_reason,
+            L.conflicting,
             L.failover,
-            L.synced
+            L.synced,
+            L.invalidation_reason

What about making invalidation_reason close to conflict_reason?

3 ===

- * Maps a conflict reason for a replication slot to
+ * Maps a invalidation reason for a replication slot to

s/a invalidation/an invalidation/?

4 ===

While at it, shouldn't we also rename "conflict" to say "invalidation_cause" in InvalidatePossiblyObsoleteSlot()?

5 ===

+ * rows_removed and wal_level_insufficient are only two reasons

s/are only two/are the only two/?

Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Thu, Mar 14, 2024 at 12:27:26PM +0530, Amit Kapila wrote: > On Wed, Mar 13, 2024 at 10:16 PM Bharath Rupireddy > <bharath.rupireddyforpostgres@gmail.com> wrote: > > > > On Wed, Mar 13, 2024 at 11:13 AM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > > Thanks. v8-0001 is how it looks. Please see the v8 patch set with this change. > > > > > > JFYI, the patch does not apply to the head. There is a conflict in > > > multiple files. > > > > Thanks for looking into this. I noticed that the v8 patches needed > > rebase. Before I go do anything with the patches, I'm trying to gain > > consensus on the design. Following is the summary of design choices > > we've discussed so far: > > 1) conflict_reason vs invalidation_reason. > > 2) When to compute the XID age? > > > > I feel we should focus on two things (a) one is to introduce a new > column invalidation_reason, and (b) let's try to first complete > invalidation due to timeout. We can look into XID stuff if time > permits, remember, we don't have ample time left. Agree. While it makes sense to invalidate slots for wal removal in CreateCheckPoint() (because this is the place where wal is removed), I'm not sure this is the right place for the 2 new cases.

Let's focus on the timeout one as proposed above (probably the simplest one): as it is purely related to time and activity, what about invalidating such slots when:

- their usage resumes
- pg_get_replication_slots() is called

The idea is to invalidate the slot when one resumes activity on it or wants to get information about it (and among other things wants to know if the slot is valid or not). Thoughts? Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Mon, Mar 18, 2024 at 8:19 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > On Thu, Mar 14, 2024 at 12:27:26PM +0530, Amit Kapila wrote: > > On Wed, Mar 13, 2024 at 10:16 PM Bharath Rupireddy > > <bharath.rupireddyforpostgres@gmail.com> wrote: > > > > > > On Wed, Mar 13, 2024 at 11:13 AM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > > > > Thanks. v8-0001 is how it looks. Please see the v8 patch set with this change. > > > > > > > > JFYI, the patch does not apply to the head. There is a conflict in > > > > multiple files. > > > > > > Thanks for looking into this. I noticed that the v8 patches needed > > > rebase. Before I go do anything with the patches, I'm trying to gain > > > consensus on the design. Following is the summary of design choices > > > we've discussed so far: > > > 1) conflict_reason vs invalidation_reason. > > > 2) When to compute the XID age? > > > > > > > I feel we should focus on two things (a) one is to introduce a new > > column invalidation_reason, and (b) let's try to first complete > > invalidation due to timeout. We can look into XID stuff if time > > permits, remember, we don't have ample time left. > > Agree. While it makes sense to invalidate slots for wal removal in > CreateCheckPoint() (because this is the place where wal is removed), I 'm not > sure this is the right place for the 2 new cases. > > Let's focus on the timeout one as proposed above (as probably the simplest one): > as this one is purely related to time and activity what about to invalidate them > when?: > > - their usage resume > - in pg_get_replication_slots() > > The idea is to invalidate the slot when one resumes activity on it or wants to > get information about it (and among other things wants to know if the slot is > valid or not). > Trying to invalidate at those two places makes sense to me but we still need to cover the cases where it takes very long to resume the slot activity and the dangling slot cases where the activity is never resumed. How about, apart from the above two places, also trying to invalidate in CheckPointReplicationSlots(), where we are traversing all the slots? This could prevent invalid slots from being marked as dirty. BTW, how will the user use 'inactive_count' to know whether a replication slot is becoming inactive too frequently? The patch just keeps incrementing this counter; one will never know how many times the slot became inactive in the last 'n' minutes unless there is some monitoring tool that keeps capturing this counter from time to time and calculates the frequency in some way. Even if this is useful, it is not clear to me whether we need to store 'inactive_count' in the slot's persistent data. I understand it could be a metric required by the user but wouldn't it be better to track this via pg_stat_replication_slots such that we don't need to store this in the slot's persistent data? If this understanding is correct, I would say let's remove 'inactive_count' as well from the main patch and discuss it separately. -- With Regards, Amit Kapila.
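To illustrate the CheckPointReplicationSlots() route, the traversal could boil down to a loop like the following; a sketch only, where last_inactive_at, inactive_timeout and the helper name are taken from this thread's proposal (not existing code) and the actual invalidation bookkeeping is elided:

#include "postgres.h"
#include "replication/slot.h"
#include "utils/timestamp.h"

/*
 * Sketch only: walk all in-use slots during checkpoint and invalidate the
 * ones that have stayed inactive longer than their inactive_timeout.
 */
static void
InvalidateInactiveReplicationSlots(void)
{
	TimestampTz now = GetCurrentTimestamp();

	LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
	for (int i = 0; i < max_replication_slots; i++)
	{
		ReplicationSlot *s = &ReplicationSlotCtl->replication_slots[i];

		if (!s->in_use || s->active_pid != 0)
			continue;			/* unused, or currently acquired by someone */

		/* last_inactive_at and inactive_timeout are fields proposed here */
		if (s->data.inactive_timeout > 0 &&
			s->data.last_inactive_at != 0 &&
			TimestampDifferenceExceeds(s->data.last_inactive_at, now,
									   s->data.inactive_timeout * 1000 /* assumed seconds */ ))
		{
			/* mark the slot invalidated; details elided in this sketch */
		}
	}
	LWLockRelease(ReplicationSlotControlLock);
}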
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Tue, Mar 19, 2024 at 10:56:25AM +0530, Amit Kapila wrote: > On Mon, Mar 18, 2024 at 8:19 PM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > Agree. While it makes sense to invalidate slots for wal removal in > > CreateCheckPoint() (because this is the place where wal is removed), I 'm not > > sure this is the right place for the 2 new cases. > > > > Let's focus on the timeout one as proposed above (as probably the simplest one): > > as this one is purely related to time and activity what about to invalidate them > > when?: > > > > - their usage resume > > - in pg_get_replication_slots() > > > > The idea is to invalidate the slot when one resumes activity on it or wants to > > get information about it (and among other things wants to know if the slot is > > valid or not). > > > > Trying to invalidate at those two places makes sense to me but we > still need to cover the cases where it takes very long to resume the > slot activity and the dangling slot cases where the activity is never > resumed. I understand it's better to have the slot reflecting its real status internally, but is it a real issue if that's not the case until the activity on it is resumed? (just asking, not saying we should not) > How about apart from the above two places, trying to > invalidate in CheckPointReplicationSlots() where we are traversing all > the slots? I think that's a good place but there is still a window of time (that could also be "large" depending on the activity and the checkpoint frequency) during which the slot is not known as invalid internally. But yeah, at least we know that we'll mark it as invalid at some point... BTW:

if (am_walsender)
{
+       if (slot->data.persistency == RS_PERSISTENT)
+       {
+               SpinLockAcquire(&slot->mutex);
+               slot->data.last_inactive_at = GetCurrentTimestamp();
+               slot->data.inactive_count++;
+               SpinLockRelease(&slot->mutex);

I'm also feeling the same concern as Shveta mentioned in [1]: that a "normal" backend using pg_logical_slot_get_changes() or friends would not set the last_inactive_at. [1]: https://www.postgresql.org/message-id/CAJpy0uD64X%3D2ENmbHaRiWTKeQawr-rbGoy_GdhQQLVXzUSKTMg%40mail.gmail.com Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
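One way to address that concern would be to move the bookkeeping out of the am_walsender branch and into the common release path. A minimal sketch, where last_inactive_at and inactive_count are the fields proposed in this thread and the helper name is made up for illustration:

#include "postgres.h"
#include "replication/slot.h"
#include "storage/spin.h"
#include "utils/timestamp.h"

/*
 * Sketch only: helper meant to be called from ReplicationSlotRelease() for
 * every process, so that a backend using pg_logical_slot_get_changes() and
 * friends also refreshes the inactivity bookkeeping, not just walsenders.
 */
static void
ReplicationSlotSetInactiveSince(ReplicationSlot *slot)
{
	if (slot->data.persistency != RS_PERSISTENT)
		return;

	SpinLockAcquire(&slot->mutex);
	slot->data.last_inactive_at = GetCurrentTimestamp();
	slot->data.inactive_count++;
	/* make sure the new values reach disk with the next slot flush */
	slot->just_dirtied = true;
	slot->dirty = true;
	SpinLockRelease(&slot->mutex);
}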
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Tue, Mar 19, 2024 at 3:11 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > On Tue, Mar 19, 2024 at 10:56:25AM +0530, Amit Kapila wrote: > > On Mon, Mar 18, 2024 at 8:19 PM Bertrand Drouvot > > <bertranddrouvot.pg@gmail.com> wrote: > > > Agree. While it makes sense to invalidate slots for wal removal in > > > CreateCheckPoint() (because this is the place where wal is removed), I 'm not > > > sure this is the right place for the 2 new cases. > > > > > > Let's focus on the timeout one as proposed above (as probably the simplest one): > > > as this one is purely related to time and activity what about to invalidate them > > > when?: > > > > > > - their usage resume > > > - in pg_get_replication_slots() > > > > > > The idea is to invalidate the slot when one resumes activity on it or wants to > > > get information about it (and among other things wants to know if the slot is > > > valid or not). > > > > > > > Trying to invalidate at those two places makes sense to me but we > > still need to cover the cases where it takes very long to resume the > > slot activity and the dangling slot cases where the activity is never > > resumed. > > I understand it's better to have the slot reflecting its real status internally > but it is a real issue if that's not the case until the activity on it is resumed? > (just asking, not saying we should not) > Sorry, I didn't understand your point. Can you try to explain by example? -- With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Tue, Mar 19, 2024 at 04:20:35PM +0530, Amit Kapila wrote: > On Tue, Mar 19, 2024 at 3:11 PM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > > > On Tue, Mar 19, 2024 at 10:56:25AM +0530, Amit Kapila wrote: > > > On Mon, Mar 18, 2024 at 8:19 PM Bertrand Drouvot > > > <bertranddrouvot.pg@gmail.com> wrote: > > > > Agree. While it makes sense to invalidate slots for wal removal in > > > > CreateCheckPoint() (because this is the place where wal is removed), I 'm not > > > > sure this is the right place for the 2 new cases. > > > > > > > > Let's focus on the timeout one as proposed above (as probably the simplest one): > > > > as this one is purely related to time and activity what about to invalidate them > > > > when?: > > > > > > > > - their usage resume > > > > - in pg_get_replication_slots() > > > > > > > > The idea is to invalidate the slot when one resumes activity on it or wants to > > > > get information about it (and among other things wants to know if the slot is > > > > valid or not). > > > > > > > > > > Trying to invalidate at those two places makes sense to me but we > > > still need to cover the cases where it takes very long to resume the > > > slot activity and the dangling slot cases where the activity is never > > > resumed. > > > > I understand it's better to have the slot reflecting its real status internally > > but it is a real issue if that's not the case until the activity on it is resumed? > > (just asking, not saying we should not) > > > > Sorry, I didn't understand your point. Can you try to explain by example? Sorry if that was not clear, let me try to rephrase it first: what issue to you see if the invalidation of such a slot occurs only when its usage resume or when pg_get_replication_slots() is triggered? I understand that this could lead to the slot not being invalidated (maybe forever) but is that an issue for an inactive slot? Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Mon, Mar 18, 2024 at 3:02 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > > > Hm. Are you suggesting inactive_timeout to be a slot level parameter > > > similar to 'failover' property added recently by > > > c393308b69d229b664391ac583b9e07418d411b6 and > > > 73292404370c9900a96e2bebdc7144f7010339cf? > > > > Yeah, I have something like that in mind. You can prepare the patch > > but it would be good if others involved in this thread can also share > > their opinion. > > I think it makes sense to put the inactive_timeout granularity at the slot > level (as the activity could vary a lot say between one slot linked to a > subcription and one linked to some plugins). As far max_slot_xid_age I've the > feeling that a new GUC is good enough.

Well, here I'm implementing the above idea. The attached v12 patches mainly have the following changes:

1. inactive_timeout is now slot-level, that is, one can set it while creating the slot either via SQL functions or via replication commands or via subscription.
2. last_inactive_at and inactive_timeout are now tracked in the on-disk replication slot data structure.
3. last_inactive_at is now set even for non-walsenders whenever the slot is released, as opposed to initial versions of the patches setting it only for walsenders.
4. slot's inactive_timeout parameter is now migrated to the new cluster with pg_upgrade.
5. slot's inactive_timeout parameter is now synced to the standby when failover is enabled for the slot.
6. Test cases are added to cover most of the above cases including new invalidation mechanisms.

Following are some open points:

1. Where to do inactive_timeout invalidation exactly if not the checkpointer.
2. Where to do XID age invalidation exactly if not the checkpointer.
3. How to go about recomputing XID horizons based on max_slot_xid_age. Do the slot's horizons need to be adjusted in ComputeXidHorizons()?
4. New invalidation mechanisms' interaction with the slot sync feature.
5. Review comments on 0001 from Bertrand.

Please see the attached v12 patches. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
- v12-0001-Track-invalidation_reason-in-pg_replication_slot.patch
- v12-0002-Track-last_inactive_at-for-replication-slots.patch
- v12-0003-Allow-setting-inactive_timeout-for-replication-s.patch
- v12-0004-Allow-setting-inactive_timeout-in-the-replicatio.patch
- v12-0005-Add-inactive_timeout-option-to-subscriptions.patch
- v12-0006-Add-inactive_timeout-based-replication-slot-inva.patch
- v12-0007-Add-XID-age-based-replication-slot-invalidation.patch
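For clarity, point 2 above (the v12 on-disk tracking) would amount to something like the following additions to ReplicationSlotPersistentData; this is a sketch of the shape only, the field names come from this thread and the exact types and units in the posted patches may differ:

typedef struct ReplicationSlotPersistentData
{
	/*
	 * ... existing fields such as name, database, persistency, xmin,
	 * catalog_xmin, restart_lsn, invalidated, two_phase, failover ...
	 */

	/*
	 * Proposed in this thread (v12 shape): when the slot last became
	 * inactive, and the per-slot timeout (assumed to be in seconds,
	 * 0 = disabled) after which an inactive slot may be invalidated.
	 */
	TimestampTz last_inactive_at;
	int			inactive_timeout;
} ReplicationSlotPersistentData;

Since this changes the on-disk slot format, the slot file version (SLOT_VERSION in slot.c) would presumably need to be bumped as well.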
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Tue, Mar 19, 2024 at 6:12 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > On Tue, Mar 19, 2024 at 04:20:35PM +0530, Amit Kapila wrote: > > On Tue, Mar 19, 2024 at 3:11 PM Bertrand Drouvot > > <bertranddrouvot.pg@gmail.com> wrote: > > > > > > On Tue, Mar 19, 2024 at 10:56:25AM +0530, Amit Kapila wrote: > > > > On Mon, Mar 18, 2024 at 8:19 PM Bertrand Drouvot > > > > <bertranddrouvot.pg@gmail.com> wrote: > > > > > Agree. While it makes sense to invalidate slots for wal removal in > > > > > CreateCheckPoint() (because this is the place where wal is removed), I 'm not > > > > > sure this is the right place for the 2 new cases. > > > > > > > > > > Let's focus on the timeout one as proposed above (as probably the simplest one): > > > > > as this one is purely related to time and activity what about to invalidate them > > > > > when?: > > > > > > > > > > - their usage resume > > > > > - in pg_get_replication_slots() > > > > > > > > > > The idea is to invalidate the slot when one resumes activity on it or wants to > > > > > get information about it (and among other things wants to know if the slot is > > > > > valid or not). > > > > > > > > > > > > > Trying to invalidate at those two places makes sense to me but we > > > > still need to cover the cases where it takes very long to resume the > > > > slot activity and the dangling slot cases where the activity is never > > > > resumed. > > > > > > I understand it's better to have the slot reflecting its real status internally > > > but it is a real issue if that's not the case until the activity on it is resumed? > > > (just asking, not saying we should not) > > > > > > > Sorry, I didn't understand your point. Can you try to explain by example? > > Sorry if that was not clear, let me try to rephrase it first: what issue to you > see if the invalidation of such a slot occurs only when its usage resume or > when pg_get_replication_slots() is triggered? I understand that this could lead > to the slot not being invalidated (maybe forever) but is that an issue for an > inactive slot? > It has the risk of preventing WAL and row removal. I think this is the primary reason we are at the first place planning to have such a parameter. So, we should have some way to invalidate it even when the walsender/backend process doesn't use it again. -- With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Wed, Mar 20, 2024 at 12:49 AM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > > Following are some open points: > > 1. Where to do inactive_timeout invalidation exactly if not the checkpointer. > I have suggested to do it at the time of CheckpointReplicationSlots() and Bertrand suggested to do it whenever we resume using the slot. I think we should follow both the suggestions. > 2. Where to do XID age invalidation exactly if not the checkpointer. > 3. How to go about recomputing XID horizons based on max_slot_xid_age. > Does the slot's horizon's need to be adjusted in ComputeXidHorizons()? > I suggest postponing the patch for xid based invalidation for a later discussion. > 4. New invalidation mechanisms interaction with slot sync feature. > Yeah, this is important. My initial thoughts are that synced slots shouldn't be invalidated on the standby due to timeout. > 5. Review comments on 0001 from Bertrand. > > Please see the attached v12 patches. > Thanks for quickly updating the patches. -- With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Mon, Mar 18, 2024 at 3:42 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > Hi, > > Looking at 0001: Thanks for reviewing. > 1 === > > + True if this logical slot conflicted with recovery (and so is now > + invalidated). When this column is true, check > > Worth to add back the physical slot mention "Always NULL for physical slots."? Will change. > 2 === > > @@ -1023,9 +1023,10 @@ CREATE VIEW pg_replication_slots AS > L.wal_status, > L.safe_wal_size, > L.two_phase, > - L.conflict_reason, > + L.conflicting, > L.failover, > - L.synced > + L.synced, > + L.invalidation_reason > > What about making invalidation_reason close to conflict_reason? Not required I think. One can pick the required columns in the SELECT clause anyways. > 3 === > > - * Maps a conflict reason for a replication slot to > + * Maps a invalidation reason for a replication slot to > > s/a invalidation/an invalidation/? Will change. > 4 === > > While at it, shouldn't we also rename "conflict" to say "invalidation_cause" in > InvalidatePossiblyObsoleteSlot()? That's inline with our understanding about conflict vs invalidation, and keeps the function generic. Will change. > 5 === > > + * rows_removed and wal_level_insufficient are only two reasons > > s/are only two/are the only two/? Will change.. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Wed, Mar 20, 2024 at 08:58:05AM +0530, Amit Kapila wrote: > On Wed, Mar 20, 2024 at 12:49 AM Bharath Rupireddy > <bharath.rupireddyforpostgres@gmail.com> wrote: > > > > > > Following are some open points: > > > > 1. Where to do inactive_timeout invalidation exactly if not the checkpointer. > > > > I have suggested to do it at the time of CheckpointReplicationSlots() > and Bertrand suggested to do it whenever we resume using the slot. I > think we should follow both the suggestions. Agree. I also think that pg_get_replication_slots() would be a good place, so that queries would return the right invalidation status. > > 4. New invalidation mechanisms interaction with slot sync feature. > > > > Yeah, this is important. My initial thoughts are that synced slots > shouldn't be invalidated on the standby due to timeout. +1 Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Wed, Mar 20, 2024 at 12:48:55AM +0530, Bharath Rupireddy wrote: > On Mon, Mar 18, 2024 at 3:02 PM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > > > > > Hm. Are you suggesting inactive_timeout to be a slot level parameter > > > > similar to 'failover' property added recently by > > > > c393308b69d229b664391ac583b9e07418d411b6 and > > > > 73292404370c9900a96e2bebdc7144f7010339cf? > > > > > > Yeah, I have something like that in mind. You can prepare the patch > > > but it would be good if others involved in this thread can also share > > > their opinion. > > > > I think it makes sense to put the inactive_timeout granularity at the slot > > level (as the activity could vary a lot say between one slot linked to a > > subcription and one linked to some plugins). As far max_slot_xid_age I've the > > feeling that a new GUC is good enough. > > Well, here I'm implementing the above idea. Thanks! > The attached v12 patches > majorly have the following changes: > > 2. last_inactive_at and inactive_timeout are now tracked in on-disk > replication slot data structure. Should last_inactive_at be tracked on disk? Say the engine is down for a period of time > inactive_timeout then the slot will be invalidated after the engine re-start (if no activity before we invalidate the slot). Should the time the engine is down be counted as "inactive" time? I've the feeling it should not, and that we should only take into account inactive time while the engine is up. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Wed, Mar 20, 2024 at 12:48:55AM +0530, Bharath Rupireddy wrote: > On Mon, Mar 18, 2024 at 3:02 PM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > > > > > Hm. Are you suggesting inactive_timeout to be a slot level parameter > > > > similar to 'failover' property added recently by > > > > c393308b69d229b664391ac583b9e07418d411b6 and > > > > 73292404370c9900a96e2bebdc7144f7010339cf? > > > > > > Yeah, I have something like that in mind. You can prepare the patch > > > but it would be good if others involved in this thread can also share > > > their opinion. > > > > I think it makes sense to put the inactive_timeout granularity at the slot > > level (as the activity could vary a lot say between one slot linked to a > > subcription and one linked to some plugins). As far max_slot_xid_age I've the > > feeling that a new GUC is good enough. > > Well, here I'm implementing the above idea. The attached v12 patches > majorly have the following changes: > Regarding v12-0004: "Allow setting inactive_timeout in the replication command", shouldn't we also add an new SQL API say: pg_alter_replication_slot() that would allow to change the timeout property? That would allow users to alter this property without the need to make a replication connection. But the issue is that it would make it inconsistent with the new inactivetimeout in the subscription that is added in "v12-0005". But do we need to display subinactivetimeout in pg_subscription (and even allow it at subscription creation / alter) after all? (I've the feeling there is less such a need as compare to subfailover, subtwophasestate for example). Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Wed, Mar 20, 2024 at 1:04 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > On Wed, Mar 20, 2024 at 08:58:05AM +0530, Amit Kapila wrote: > > On Wed, Mar 20, 2024 at 12:49 AM Bharath Rupireddy > > <bharath.rupireddyforpostgres@gmail.com> wrote: > > > > > > Following are some open points: > > > > > > 1. Where to do inactive_timeout invalidation exactly if not the checkpointer. > > > > > I have suggested to do it at the time of CheckpointReplicationSlots() > > and Bertrand suggested to do it whenever we resume using the slot. I > > think we should follow both the suggestions. > > Agree. I also think that pg_get_replication_slots() would be a good place, so > that queries would return the right invalidation status.

I've addressed the review comments and am attaching the v13 patches with the following changes:

1. Invalidate replication slot due to inactive_timeout:
1.1 In CheckpointReplicationSlots() to help with automatic invalidation.
1.2 In pg_get_replication_slots to help readers see the latest slot information.
1.3 In ReplicationSlotAcquire for walsenders, as typically walsenders are the ones that use slots for longer durations for streaming standbys and logical subscribers.
1.4 In ReplicationSlotAcquire when called from pg_logical_slot_get_changes_guts, to help logical decoding clients by disallowing decoding from invalidated slots.
1.5 In ReplicationSlotAcquire when called from pg_replication_slot_advance, to disallow advancing invalidated slots.
2. Have a new input parameter bool check_for_invalidation for ReplicationSlotAcquire(). When true, check for inactive_timeout invalidation and, if invalidated, error out.
3. Have a new function to just do inactive_timeout invalidation.
4. Do not update last_inactive_at for failover slots on the standby, so as not to invalidate failover slots on the standby.
5. In ReplicationSlotAcquire(), invalidate the slot before making it active.
6. Make last_inactive_at a shared-memory parameter as opposed to an on-disk parameter, so that server downtime is not counted as inactive time.
7. Let the failover slot on the standby and pg_upgraded slots get the inactive_timeout parameter from the primary and the old cluster respectively.

Please see the attached v13 patches. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
- v13-0001-Track-invalidation_reason-in-pg_replication_slot.patch
- v13-0002-Track-last_inactive_at-for-replication-slots-in-.patch
- v13-0003-Allow-setting-inactive_timeout-for-replication-s.patch
- v13-0004-Allow-setting-inactive_timeout-in-the-replicatio.patch
- v13-0005-Add-inactive_timeout-option-to-subscriptions.patch
- v13-0006-Add-inactive_timeout-based-replication-slot-inva.patch
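As a reference for items 1.3-1.5 and 2 above, the acquire-time check could look roughly like this; a sketch only, where the check_for_invalidation parameter, the proposed slot fields and the error wording are assumptions based on the description above, not the exact v13 code:

/*
 * Sketch only: inside ReplicationSlotAcquire(), after the named slot has
 * been found but before it is marked active, optionally invalidate and
 * refuse slots that have exceeded their inactive_timeout.
 */
if (check_for_invalidation &&
	s->data.inactive_timeout > 0 &&
	s->last_inactive_at != 0 &&
	TimestampDifferenceExceeds(s->last_inactive_at,
							   GetCurrentTimestamp(),
							   s->data.inactive_timeout * 1000 /* assumed seconds */ ))
{
	/* mark the slot invalidated first (details elided), then error out */
	ereport(ERROR,
			(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
			 errmsg("cannot acquire invalidated replication slot \"%s\"",
					NameStr(s->data.name)),
			 errdetail("This slot has been invalidated because it was inactive for longer than the configured inactive_timeout.")));
}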
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Wed, Mar 20, 2024 at 7:08 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > Regarding v12-0004: "Allow setting inactive_timeout in the replication command", > shouldn't we also add an new SQL API say: pg_alter_replication_slot() that would > allow to change the timeout property? > > That would allow users to alter this property without the need to make a > replication connection. +1 to add a new SQL function pg_alter_replication_slot(). It helps first create the slots and then later decide the appropriate inactive_timeout. It might grow into altering other slot parameters such as failover (I'm not sure if altering failover property on the primary after a while makes it the right candidate for syncing on the standby). Perhaps, we can add it for altering just inactive_timeout for now and be done with it. FWIW, ALTER_REPLICATION_SLOT was added keeping in mind just the failover property for logical slots, that's why it emits an error "cannot use ALTER_REPLICATION_SLOT with a physical replication slot" > But the issue is that it would make it inconsistent with the new inactivetimeout > in the subscription that is added in "v12-0005". Can you please elaborate what the inconsistency it causes with inactivetimeout? > But do we need to display > subinactivetimeout in pg_subscription (and even allow it at subscription creation > / alter) after all? (I've the feeling there is less such a need as compare to > subfailover, subtwophasestate for example). Maybe we don't need to. One can always trace down to the replication slot associated with the subscription on the publisher, and get to know what the slot's inactive_timeout setting is. However, it looks to me that it avoids one going to the publisher to know the inactive_timeout value for a subscription. Moreover, we are allowing the inactive_timeout to be set via CREATE/ALTER SUBSCRIPTION command, I believe there's nothing wrong if it's also part of the pg_subscription catalog. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Wed, Mar 20, 2024 at 1:51 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > On Wed, Mar 20, 2024 at 12:48:55AM +0530, Bharath Rupireddy wrote: > > > > 2. last_inactive_at and inactive_timeout are now tracked in on-disk > > replication slot data structure. > > Should last_inactive_at be tracked on disk? Say the engine is down for a period > of time > inactive_timeout then the slot will be invalidated after the engine > re-start (if no activity before we invalidate the slot). Should the time the > engine is down be counted as "inactive" time? I've the feeling it should not, and > that we should only take into account inactive time while the engine is up. > Good point. The question is how do we achieve this without persisting the 'last_inactive_at'? Say, 'last_inactive_at' for a particular slot had some valid value before we shut down but it still didn't cross the configured 'inactive_timeout' value, so, we won't be able to invalidate it. Now, after the restart, as we don't know the last_inactive_at's value before the shutdown, we will initialize it with 0 (this is what Bharath seems to have done in the latest v13-0002* patch). After this, even if walsender or backend never acquires the slot, we won't invalidate it. OTOH, if we track 'last_inactive_at' on the disk, after, restart, we could initialize it to the current time if the value is non-zero. Do you have any better ideas? -- With Regards, Amit Kapila.
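A small sketch of that suggestion, assuming last_inactive_at stays on disk and that RestoreSlotFromDisk() in slot.c (the existing startup path that reloads slots) is where this would happen:

/*
 * Sketch only: when reloading a slot at startup, don't count the downtime
 * as inactive time.  If the slot was inactive before the shutdown (non-zero
 * last_inactive_at), restart the clock from "now"; otherwise leave it unset
 * until the slot is first released.
 */
if (slot->data.inactive_timeout > 0 && slot->data.last_inactive_at != 0)
	slot->data.last_inactive_at = GetCurrentTimestamp();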
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Thu, Mar 21, 2024 at 5:19 AM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Wed, Mar 20, 2024 at 7:08 PM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > > > Regarding v12-0004: "Allow setting inactive_timeout in the replication command", > > shouldn't we also add an new SQL API say: pg_alter_replication_slot() that would > > allow to change the timeout property? > > > > That would allow users to alter this property without the need to make a > > replication connection. > > +1 to add a new SQL function pg_alter_replication_slot(). > I also don't see any obvious problem with such an API. However, this is not a good time to invent new APIs. Let's keep the feature simple and then we can extend it in the next version after more discussion and probably by that time we will get some feedback from the field as well. > > It helps > first create the slots and then later decide the appropriate > inactive_timeout. It might grow into altering other slot parameters > such as failover (I'm not sure if altering failover property on the > primary after a while makes it the right candidate for syncing on the > standby). Perhaps, we can add it for altering just inactive_timeout > for now and be done with it. > > FWIW, ALTER_REPLICATION_SLOT was added keeping in mind just the > failover property for logical slots, that's why it emits an error > "cannot use ALTER_REPLICATION_SLOT with a physical replication slot" > > > But the issue is that it would make it inconsistent with the new inactivetimeout > > in the subscription that is added in "v12-0005". > > Can you please elaborate what the inconsistency it causes with inactivetimeout? > I think the inconsistency can arise from the fact that on publisher one can change the inactive_timeout for the slot corresponding to a subscription but the subscriber won't know, so it will still show the old value. If we want we can document this as a limitation and let users be aware of it. However, I feel at this stage, let's not even expose this from the subscription or maybe we can discuss it once/if we are done with other patches. Anyway, if one wants to use this feature with a subscription, she can create a slot first on the publisher with inactive_timeout value and then associate such a slot with a required subscription. -- With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Thu, Mar 21, 2024 at 9:07 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > I also don't see any obvious problem with such an API. However, this > is not a good time to invent new APIs. Let's keep the feature simple > and then we can extend it in the next version after more discussion > and probably by that time we will get some feedback from the field as > well. I couldn't agree more. > > > But the issue is that it would make it inconsistent with the new inactivetimeout > > > in the subscription that is added in "v12-0005". > > > > Can you please elaborate what the inconsistency it causes with inactivetimeout? > > > I think the inconsistency can arise from the fact that on publisher > one can change the inactive_timeout for the slot corresponding to a > subscription but the subscriber won't know, so it will still show the > old value. Understood. > If we want we can document this as a limitation and let > users be aware of it. However, I feel at this stage, let's not even > expose this from the subscription or maybe we can discuss it once/if > we are done with other patches. Anyway, if one wants to use this > feature with a subscription, she can create a slot first on the > publisher with inactive_timeout value and then associate such a slot > with a required subscription. If we are not exposing it via subscription (meaning, we don't consider v13-0004 and v13-0005 patches), I feel we can have a new SQL API pg_alter_replication_slot(int inactive_timeout) for now just altering the inactive_timeout of a given slot. With this approach, one can do either of the following: 1) Create a slot with SQL API with inactive_timeout set, and use it for subscriptions or for streaming standbys. 2) Create a slot with SQL API without inactive_timeout set, use it for subscriptions or for streaming standbys, and set inactive_timeout later via pg_alter_replication_slot() depending on how the slot is consumed 3) Create a subscription with create_slot=true, and set inactive_timeout via pg_alter_replication_slot() depending on how the slot is consumed. This approach seems consistent and minimal to start with. If we agree on this, I'll drop both 0004 and 0005 that are allowing inactive_timeout to be set via replication commands and via create/alter subscription respectively, and implement pg_alter_replication_slot(). FWIW, adding the new SQL API pg_alter_replication_slot() isn't that hard. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
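Just to illustrate how small that SQL API could be, a rough sketch of the backing C function; the function name comes from this discussion, while the signature, permission check and persistence steps are simplified assumptions:

#include "postgres.h"
#include "fmgr.h"
#include "replication/slot.h"
#include "storage/spin.h"

/*
 * Sketch only: pg_alter_replication_slot(slot_name name, inactive_timeout int4)
 * alters just the proposed inactive_timeout property of an existing slot.
 */
Datum
pg_alter_replication_slot(PG_FUNCTION_ARGS)
{
	Name		slotname = PG_GETARG_NAME(0);
	int32		inactive_timeout = PG_GETARG_INT32(1);

	CheckSlotPermissions();

	/* take the slot so nobody can drop or modify it concurrently */
	ReplicationSlotAcquire(NameStr(*slotname), true);

	SpinLockAcquire(&MyReplicationSlot->mutex);
	MyReplicationSlot->data.inactive_timeout = inactive_timeout;
	SpinLockRelease(&MyReplicationSlot->mutex);

	/* persist the change before letting go of the slot */
	ReplicationSlotMarkDirty();
	ReplicationSlotSave();
	ReplicationSlotRelease();

	PG_RETURN_VOID();
}

The catalog entry and user-facing documentation are omitted here; the point is just that altering a single slot property would not require a replication connection.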
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Thu, Mar 21, 2024 at 8:47 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Wed, Mar 20, 2024 at 1:51 PM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > > > On Wed, Mar 20, 2024 at 12:48:55AM +0530, Bharath Rupireddy wrote: > > > > > > 2. last_inactive_at and inactive_timeout are now tracked in on-disk > > > replication slot data structure. > > > > Should last_inactive_at be tracked on disk? Say the engine is down for a period > > of time > inactive_timeout then the slot will be invalidated after the engine > > re-start (if no activity before we invalidate the slot). Should the time the > > engine is down be counted as "inactive" time? I've the feeling it should not, and > > that we should only take into account inactive time while the engine is up. > > > > Good point. The question is how do we achieve this without persisting > the 'last_inactive_at'? Say, 'last_inactive_at' for a particular slot > had some valid value before we shut down but it still didn't cross the > configured 'inactive_timeout' value, so, we won't be able to > invalidate it. Now, after the restart, as we don't know the > last_inactive_at's value before the shutdown, we will initialize it > with 0 (this is what Bharath seems to have done in the latest > v13-0002* patch). After this, even if walsender or backend never > acquires the slot, we won't invalidate it. OTOH, if we track > 'last_inactive_at' on the disk, after, restart, we could initialize it > to the current time if the value is non-zero. Do you have any better > ideas? This sounds reasonable to me at least. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Thu, Mar 21, 2024 at 08:47:18AM +0530, Amit Kapila wrote: > On Wed, Mar 20, 2024 at 1:51 PM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > > > On Wed, Mar 20, 2024 at 12:48:55AM +0530, Bharath Rupireddy wrote: > > > > > > 2. last_inactive_at and inactive_timeout are now tracked in on-disk > > > replication slot data structure. > > > > Should last_inactive_at be tracked on disk? Say the engine is down for a period > > of time > inactive_timeout then the slot will be invalidated after the engine > > re-start (if no activity before we invalidate the slot). Should the time the > > engine is down be counted as "inactive" time? I've the feeling it should not, and > > that we should only take into account inactive time while the engine is up. > > > > Good point. The question is how do we achieve this without persisting > the 'last_inactive_at'? Say, 'last_inactive_at' for a particular slot > had some valid value before we shut down but it still didn't cross the > configured 'inactive_timeout' value, so, we won't be able to > invalidate it. Now, after the restart, as we don't know the > last_inactive_at's value before the shutdown, we will initialize it > with 0 (this is what Bharath seems to have done in the latest > v13-0002* patch). After this, even if walsender or backend never > acquires the slot, we won't invalidate it. OTOH, if we track > 'last_inactive_at' on the disk, after, restart, we could initialize it > to the current time if the value is non-zero. Do you have any better > ideas? > I think that setting last_inactive_at when we restart makes sense if the slot has been active previously. I think the idea is that, because the slot is holding xmin/catalog_xmin, we don't want it to prevent row removal for longer than the timeout. So what about relying on xmin/catalog_xmin instead, like this:

- For physical slots, if xmin is set then set last_inactive_at to the current time at restart (else zero).
- For logical slots, it's not the same, as the catalog_xmin is set at slot creation time. So what about setting last_inactive_at to the current time at restart but also at creation time for logical slots? (Setting it to zero at creation time (as we do in v13) does not look right, given the fact that it's "already" holding a catalog_xmin.)

That way, we'd ensure that we are not holding rows for longer than the timeout and we don't need to persist last_inactive_at. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Thu, Mar 21, 2024 at 10:53:54AM +0530, Bharath Rupireddy wrote: > On Thu, Mar 21, 2024 at 9:07 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > But the issue is that it would make it inconsistent with the new inactivetimeout > > > > in the subscription that is added in "v12-0005". > > > > > > Can you please elaborate what the inconsistency it causes with inactivetimeout? > > > > > I think the inconsistency can arise from the fact that on publisher > > one can change the inactive_timeout for the slot corresponding to a > > subscription but the subscriber won't know, so it will still show the > > old value. Yeah, that was what I had in mind. > > If we want we can document this as a limitation and let > > users be aware of it. However, I feel at this stage, let's not even > > expose this from the subscription or maybe we can discuss it once/if > > we are done with other patches. I agree, it's important to expose it for things like "failover" but I think we can get rid of it for the timeout one. >> Anyway, if one wants to use this > > feature with a subscription, she can create a slot first on the > > publisher with inactive_timeout value and then associate such a slot > > with a required subscription. Right. > > If we are not exposing it via subscription (meaning, we don't consider > v13-0004 and v13-0005 patches), I feel we can have a new SQL API > pg_alter_replication_slot(int inactive_timeout) for now just altering > the inactive_timeout of a given slot. Agree, that seems more "natural" that going through a replication connection. > With this approach, one can do either of the following: > 1) Create a slot with SQL API with inactive_timeout set, and use it > for subscriptions or for streaming standbys. Yes. > 2) Create a slot with SQL API without inactive_timeout set, use it for > subscriptions or for streaming standbys, and set inactive_timeout > later via pg_alter_replication_slot() depending on how the slot is > consumed Yes. > 3) Create a subscription with create_slot=true, and set > inactive_timeout via pg_alter_replication_slot() depending on how the > slot is consumed. Yes. We could also do the above 3 and altering the timeout with a replication connection but the SQL API seems more natural to me. > > This approach seems consistent and minimal to start with. > > If we agree on this, I'll drop both 0004 and 0005 that are allowing > inactive_timeout to be set via replication commands and via > create/alter subscription respectively, and implement > pg_alter_replication_slot(). +1 on this. > FWIW, adding the new SQL API pg_alter_replication_slot() isn't that hard. Also I think we should ensure that one could "only" alter the timeout property for the time being (if not that could lead to the subscription inconsistency mentioned above). Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Thu, Mar 21, 2024 at 11:23 AM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > On Thu, Mar 21, 2024 at 08:47:18AM +0530, Amit Kapila wrote: > > On Wed, Mar 20, 2024 at 1:51 PM Bertrand Drouvot > > <bertranddrouvot.pg@gmail.com> wrote: > > > > > > On Wed, Mar 20, 2024 at 12:48:55AM +0530, Bharath Rupireddy wrote: > > > > > > > > 2. last_inactive_at and inactive_timeout are now tracked in on-disk > > > > replication slot data structure. > > > > > > Should last_inactive_at be tracked on disk? Say the engine is down for a period > > > of time > inactive_timeout then the slot will be invalidated after the engine > > > re-start (if no activity before we invalidate the slot). Should the time the > > > engine is down be counted as "inactive" time? I've the feeling it should not, and > > > that we should only take into account inactive time while the engine is up. > > > > > > > Good point. The question is how do we achieve this without persisting > > the 'last_inactive_at'? Say, 'last_inactive_at' for a particular slot > > had some valid value before we shut down but it still didn't cross the > > configured 'inactive_timeout' value, so, we won't be able to > > invalidate it. Now, after the restart, as we don't know the > > last_inactive_at's value before the shutdown, we will initialize it > > with 0 (this is what Bharath seems to have done in the latest > > v13-0002* patch). After this, even if walsender or backend never > > acquires the slot, we won't invalidate it. OTOH, if we track > > 'last_inactive_at' on the disk, after, restart, we could initialize it > > to the current time if the value is non-zero. Do you have any better > > ideas? > > > > I think that setting last_inactive_at when we restart makes sense if the slot > has been active previously. I think the idea is because it's holding xmin/catalog_xmin > and that we don't want to prevent rows removal longer that the timeout. > > So what about relying on xmin/catalog_xmin instead that way? > That doesn't sound like a great idea because xmin/catalog_xmin values won't tell us before restart whether it was active or not. It could have been inactive for long time before restart but the xmin values could still be valid. What about we always set 'last_inactive_at' at restart (if the slot's inactive_timeout has non-zero value) and reset it as soon as someone acquires that slot? Now, if the slot doesn't get acquired till 'inactive_timeout', checkpointer will invalidate the slot. -- With Regards, Amit Kapila.
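To make that flow concrete, here is a rough sketch of the per-slot check the checkpointer could run. This is not code from the posted patches: the helper name is invented, and it assumes an in-memory last_inactive_at TimestampTz on the slot (per the proposal above) plus an inactive_timeout stored in seconds in ReplicationSlotPersistentData.

static bool
SlotInactiveTimeoutElapsed(ReplicationSlot *slot, TimestampTz now)
{
    TimestampTz last_inactive_at;
    int         inactive_timeout;

    /* Read both fields under the slot's spinlock for a consistent view. */
    SpinLockAcquire(&slot->mutex);
    last_inactive_at = slot->last_inactive_at;
    inactive_timeout = slot->data.inactive_timeout;
    SpinLockRelease(&slot->mutex);

    /* Slot is currently acquired, or timeout-based invalidation is off. */
    if (last_inactive_at == 0 || inactive_timeout == 0)
        return false;

    return TimestampDifferenceExceeds(last_inactive_at, now,
                                      inactive_timeout * 1000);
}

A slot for which this returns true would then presumably be handed to the usual invalidation machinery with a new invalidation cause, much like wal_removed is today.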
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Thu, Mar 21, 2024 at 11:37 AM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > On Thu, Mar 21, 2024 at 10:53:54AM +0530, Bharath Rupireddy wrote: > > On Thu, Mar 21, 2024 at 9:07 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > But the issue is that it would make it inconsistent with the new inactivetimeout > > > > > in the subscription that is added in "v12-0005". > > > > > > > > Can you please elaborate what the inconsistency it causes with inactivetimeout? > > > > > > > I think the inconsistency can arise from the fact that on publisher > > > one can change the inactive_timeout for the slot corresponding to a > > > subscription but the subscriber won't know, so it will still show the > > > old value. > > Yeah, that was what I had in mind. > > > > If we want we can document this as a limitation and let > > > users be aware of it. However, I feel at this stage, let's not even > > > expose this from the subscription or maybe we can discuss it once/if > > > we are done with other patches. > > I agree, it's important to expose it for things like "failover" but I think we > can get rid of it for the timeout one. > > >> Anyway, if one wants to use this > > > feature with a subscription, she can create a slot first on the > > > publisher with inactive_timeout value and then associate such a slot > > > with a required subscription. > > Right. > > > > > If we are not exposing it via subscription (meaning, we don't consider > > v13-0004 and v13-0005 patches), I feel we can have a new SQL API > > pg_alter_replication_slot(int inactive_timeout) for now just altering > > the inactive_timeout of a given slot. > > Agree, that seems more "natural" that going through a replication connection. > > > With this approach, one can do either of the following: > > 1) Create a slot with SQL API with inactive_timeout set, and use it > > for subscriptions or for streaming standbys. > > Yes. > > > 2) Create a slot with SQL API without inactive_timeout set, use it for > > subscriptions or for streaming standbys, and set inactive_timeout > > later via pg_alter_replication_slot() depending on how the slot is > > consumed > > Yes. > > > 3) Create a subscription with create_slot=true, and set > > inactive_timeout via pg_alter_replication_slot() depending on how the > > slot is consumed. > > Yes. > > We could also do the above 3 and altering the timeout with a replication > connection but the SQL API seems more natural to me. > If we want to go with this then I think we should at least ensure that if one specified timeout via CREATE_REPLICATION_SLOT or ALTER_REPLICATION_SLOT that should be honored. -- With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Thu, Mar 21, 2024 at 11:43:54AM +0530, Amit Kapila wrote: > On Thu, Mar 21, 2024 at 11:23 AM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > > > On Thu, Mar 21, 2024 at 08:47:18AM +0530, Amit Kapila wrote: > > > On Wed, Mar 20, 2024 at 1:51 PM Bertrand Drouvot > > > <bertranddrouvot.pg@gmail.com> wrote: > > > > > > > > On Wed, Mar 20, 2024 at 12:48:55AM +0530, Bharath Rupireddy wrote: > > > > > > > > > > 2. last_inactive_at and inactive_timeout are now tracked in on-disk > > > > > replication slot data structure. > > > > > > > > Should last_inactive_at be tracked on disk? Say the engine is down for a period > > > > of time > inactive_timeout then the slot will be invalidated after the engine > > > > re-start (if no activity before we invalidate the slot). Should the time the > > > > engine is down be counted as "inactive" time? I've the feeling it should not, and > > > > that we should only take into account inactive time while the engine is up. > > > > > > > > > > Good point. The question is how do we achieve this without persisting > > > the 'last_inactive_at'? Say, 'last_inactive_at' for a particular slot > > > had some valid value before we shut down but it still didn't cross the > > > configured 'inactive_timeout' value, so, we won't be able to > > > invalidate it. Now, after the restart, as we don't know the > > > last_inactive_at's value before the shutdown, we will initialize it > > > with 0 (this is what Bharath seems to have done in the latest > > > v13-0002* patch). After this, even if walsender or backend never > > > acquires the slot, we won't invalidate it. OTOH, if we track > > > 'last_inactive_at' on the disk, after, restart, we could initialize it > > > to the current time if the value is non-zero. Do you have any better > > > ideas? > > > > > > > I think that setting last_inactive_at when we restart makes sense if the slot > > has been active previously. I think the idea is because it's holding xmin/catalog_xmin > > and that we don't want to prevent rows removal longer that the timeout. > > > > So what about relying on xmin/catalog_xmin instead that way? > > > > That doesn't sound like a great idea because xmin/catalog_xmin values > won't tell us before restart whether it was active or not. It could > have been inactive for long time before restart but the xmin values > could still be valid. Right, the idea here was more like "don't hold xmin/catalog_xmin" for longer than timeout. My concern was that we set catalog_xmin at logical slot creation time. So if we set last_inactive_at to zero at creation time and the slot is not used for a long period of time > timeout, then I think it's not helping there. > What about we always set 'last_inactive_at' at > restart (if the slot's inactive_timeout has non-zero value) and reset > it as soon as someone acquires that slot? Now, if the slot doesn't get > acquired till 'inactive_timeout', checkpointer will invalidate the > slot. Yeah that sounds good to me, but I think we should set last_inactive_at at creation time too, if not: - physical slot could remain valid for long time after creation (which is fine) but the behavior would change at restart. - logical slot would have the "issue" reported above (holding catalog_xmin). Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Thu, Mar 21, 2024 at 11:53:32AM +0530, Amit Kapila wrote: > On Thu, Mar 21, 2024 at 11:37 AM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > > > We could also do the above 3 and altering the timeout with a replication > > connection but the SQL API seems more natural to me. > > > > If we want to go with this then I think we should at least ensure that > if one specified timeout via CREATE_REPLICATION_SLOT or > ALTER_REPLICATION_SLOT that should be honored. Yeah, agree. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Thu, Mar 21, 2024 at 05:05:46AM +0530, Bharath Rupireddy wrote: > On Wed, Mar 20, 2024 at 1:04 PM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > > > On Wed, Mar 20, 2024 at 08:58:05AM +0530, Amit Kapila wrote: > > > On Wed, Mar 20, 2024 at 12:49 AM Bharath Rupireddy > > > <bharath.rupireddyforpostgres@gmail.com> wrote: > > > > > > > > Following are some open points: > > > > > > > > 1. Where to do inactive_timeout invalidation exactly if not the checkpointer. > > > > > > > I have suggested to do it at the time of CheckpointReplicationSlots() > > > and Bertrand suggested to do it whenever we resume using the slot. I > > > think we should follow both the suggestions. > > > > Agree. I also think that pg_get_replication_slots() would be a good place, so > > that queries would return the right invalidation status. > > I've addressed review comments and attaching the v13 patches with the > following changes: Thanks! v13-0001 looks good to me. The only Nit (that I've mentioned up-thread) is that in the pg_replication_slots view, the invalidation_reason is "far away" from the conflicting field. I understand that one could query the fields individually but when describing the view or reading the doc, it seems more appropriate to see them closer. Also as "failover" and "synced" are also new in version 17, there is no risk to break order by "17,18" kind of queries (which are the failover and sync positions). Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Thu, Mar 21, 2024 at 12:40 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > v13-0001 looks good to me. The only Nit (that I've mentioned up-thread) is that > in the pg_replication_slots view, the invalidation_reason is "far away" from the > conflicting field. I understand that one could query the fields individually but > when describing the view or reading the doc, it seems more appropriate to see > them closer. Also as "failover" and "synced" are also new in version 17, there > is no risk to break order by "17,18" kind of queries (which are the failover > and sync positions). Hm, yeah, I can change that in the next version of the patches. Thanks. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Thu, Mar 21, 2024 at 12:15 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > On Thu, Mar 21, 2024 at 11:43:54AM +0530, Amit Kapila wrote: > > On Thu, Mar 21, 2024 at 11:23 AM Bertrand Drouvot > > <bertranddrouvot.pg@gmail.com> wrote: > > > > > > On Thu, Mar 21, 2024 at 08:47:18AM +0530, Amit Kapila wrote: > > > > On Wed, Mar 20, 2024 at 1:51 PM Bertrand Drouvot > > > > <bertranddrouvot.pg@gmail.com> wrote: > > > > > > > > > > On Wed, Mar 20, 2024 at 12:48:55AM +0530, Bharath Rupireddy wrote: > > > > > > > > > > > > 2. last_inactive_at and inactive_timeout are now tracked in on-disk > > > > > > replication slot data structure. > > > > > > > > > > Should last_inactive_at be tracked on disk? Say the engine is down for a period > > > > > of time > inactive_timeout then the slot will be invalidated after the engine > > > > > re-start (if no activity before we invalidate the slot). Should the time the > > > > > engine is down be counted as "inactive" time? I've the feeling it should not, and > > > > > that we should only take into account inactive time while the engine is up. > > > > > > > > > > > > > Good point. The question is how do we achieve this without persisting > > > > the 'last_inactive_at'? Say, 'last_inactive_at' for a particular slot > > > > had some valid value before we shut down but it still didn't cross the > > > > configured 'inactive_timeout' value, so, we won't be able to > > > > invalidate it. Now, after the restart, as we don't know the > > > > last_inactive_at's value before the shutdown, we will initialize it > > > > with 0 (this is what Bharath seems to have done in the latest > > > > v13-0002* patch). After this, even if walsender or backend never > > > > acquires the slot, we won't invalidate it. OTOH, if we track > > > > 'last_inactive_at' on the disk, after, restart, we could initialize it > > > > to the current time if the value is non-zero. Do you have any better > > > > ideas? > > > > > > > > > > I think that setting last_inactive_at when we restart makes sense if the slot > > > has been active previously. I think the idea is because it's holding xmin/catalog_xmin > > > and that we don't want to prevent rows removal longer that the timeout. > > > > > > So what about relying on xmin/catalog_xmin instead that way? > > > > > > > That doesn't sound like a great idea because xmin/catalog_xmin values > > won't tell us before restart whether it was active or not. It could > > have been inactive for long time before restart but the xmin values > > could still be valid. > > Right, the idea here was more like "don't hold xmin/catalog_xmin" for longer > than timeout. > > My concern was that we set catalog_xmin at logical slot creation time. So if we > set last_inactive_at to zero at creation time and the slot is not used for a long > period of time > timeout, then I think it's not helping there. > But, we do call ReplicationSlotRelease() after slot creation. For example, see CreateReplicationSlot(). So wouldn't that take care of the case you are worried about? -- With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Thu, Mar 21, 2024 at 3:20 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > My concern was that we set catalog_xmin at logical slot creation time. So if we > > set last_inactive_at to zero at creation time and the slot is not used for a long > > period of time > timeout, then I think it's not helping there. > > But, we do call ReplicationSlotRelease() after slot creation. For > example, see CreateReplicationSlot(). So wouldn't that take care of > the case you are worried about? Right. That's true even for pg_create_physical_replication_slot and pg_create_logical_replication_slot. AFAICS, setting it to the current timestamp in ReplicationSlotRelease suffices unless I'm missing something. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Thu, Mar 21, 2024 at 2:44 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Thu, Mar 21, 2024 at 12:40 PM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > > > v13-0001 looks good to me. The only Nit (that I've mentioned up-thread) is that > > in the pg_replication_slots view, the invalidation_reason is "far away" from the > > conflicting field. I understand that one could query the fields individually but > > when describing the view or reading the doc, it seems more appropriate to see > > them closer. Also as "failover" and "synced" are also new in version 17, there > > is no risk to break order by "17,18" kind of queries (which are the failover > > and sync positions). > > Hm, yeah, I can change that in the next version of the patches. Thanks. > This makes sense to me. Apart from this, few more comments on 0001. 1. --- a/src/bin/pg_upgrade/info.c +++ b/src/bin/pg_upgrade/info.c @@ -676,13 +676,13 @@ get_old_cluster_logical_slot_infos(DbInfo *dbinfo, bool live_check) * removed. */ res = executeQueryOrDie(conn, "SELECT slot_name, plugin, two_phase, failover, " - "%s as caught_up, conflict_reason IS NOT NULL as invalid " + "%s as caught_up, invalidation_reason IS NOT NULL as invalid " "FROM pg_catalog.pg_replication_slots " "WHERE slot_type = 'logical' AND " "database = current_database() AND " "temporary IS FALSE;", live_check ? "FALSE" : - "(CASE WHEN conflict_reason IS NOT NULL THEN FALSE " + "(CASE WHEN conflicting THEN FALSE " I think here at both places we need to change 'conflict_reason' to 'conflicting'. 2. + <row> + <entry role="catalog_table_entry"><para role="column_definition"> + <structfield>invalidation_reason</structfield> <type>text</type> + </para> + <para> + The reason for the slot's invalidation. It is set for both logical and + physical slots. <literal>NULL</literal> if the slot is not invalidated. + Possible values are: + <itemizedlist spacing="compact"> + <listitem> + <para> + <literal>wal_removed</literal> means that the required WAL has been + removed. + </para> + </listitem> + <listitem> + <para> + <literal>rows_removed</literal> means that the required rows have + been removed. + </para> + </listitem> + <listitem> + <para> + <literal>wal_level_insufficient</literal> means that the + primary doesn't have a <xref linkend="guc-wal-level"/> sufficient to + perform logical decoding. + </para> Can the reasons 'rows_removed' and 'wal_level_insufficient' appear for physical slots? If not, then it is not clear from above text. 3. -# Verify slots are reported as non conflicting in pg_replication_slots +# Verify slots are reported as valid in pg_replication_slots is( $node_standby->safe_psql( 'postgres', q[select bool_or(conflicting) from - (select conflict_reason is not NULL as conflicting - from pg_replication_slots WHERE slot_type = 'logical')]), + (select conflicting from pg_replication_slots + where slot_type = 'logical')]), 'f', - 'Logical slots are reported as non conflicting'); + 'Logical slots are reported as valid'); I don't think we need to change the comment or success message in this test. -- With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Thu, Mar 21, 2024 at 04:13:31PM +0530, Bharath Rupireddy wrote: > On Thu, Mar 21, 2024 at 3:20 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > My concern was that we set catalog_xmin at logical slot creation time. So if we > > > set last_inactive_at to zero at creation time and the slot is not used for a long > > > period of time > timeout, then I think it's not helping there. > > > > But, we do call ReplicationSlotRelease() after slot creation. For > > example, see CreateReplicationSlot(). So wouldn't that take care of > > the case you are worried about? > > Right. That's true even for pg_create_physical_replication_slot and > pg_create_logical_replication_slot. AFAICS, setting it to the current > timestamp in ReplicationSlotRelease suffices unless I'm missing > something. Right, but we have: " if (set_last_inactive_at && slot->data.persistency == RS_PERSISTENT) { /* * There's no point in allowing failover slots to get invalidated * based on slot's inactive_timeout parameter on standby. The failover * slots simply get synced from the primary on the standby. */ if (!(RecoveryInProgress() && slot->data.failover)) { SpinLockAcquire(&slot->mutex); slot->last_inactive_at = GetCurrentTimestamp(); SpinLockRelease(&slot->mutex); } } " while we set set_last_inactive_at to false at creation time so that last_inactive_at is not set to GetCurrentTimestamp(). We should set set_last_inactive_at to true if a timeout is provided during the slot creation. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Thu, Mar 21, 2024 at 4:25 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > This makes sense to me. Apart from this, few more comments on 0001. Thanks for looking into it. > 1. > - "%s as caught_up, conflict_reason IS NOT NULL as invalid " > + "%s as caught_up, invalidation_reason IS NOT NULL as invalid " > live_check ? "FALSE" : > - "(CASE WHEN conflict_reason IS NOT NULL THEN FALSE " > + "(CASE WHEN conflicting THEN FALSE " > > I think here at both places we need to change 'conflict_reason' to > 'conflicting'. Basically, the idea there is to not live_check for invalidated logical slots. It has nothing to do with conflicting. Up until now, conflict_reason is also reporting wal_removed (although wrongly including rows_removed, wal_level_insufficient, the two reasons for conflicts). So, I think invalidation_reason is right for invalid column. Also, I think we need to change conflicting to invalidation_reason for live_check. So, I've changed that to use invalidation_reason for both columns. > 2. > > Can the reasons 'rows_removed' and 'wal_level_insufficient' appear for > physical slots? No. They can only occur for logical slots, check InvalidatePossiblyObsoleteSlot, only the logical slots get invalidated. > If not, then it is not clear from above text. I've stated that "It is set only for logical slots." for rows_removed and wal_level_insufficient. Other reasons can occur for both slots. > 3. > -# Verify slots are reported as non conflicting in pg_replication_slots > +# Verify slots are reported as valid in pg_replication_slots > is( $node_standby->safe_psql( > 'postgres', > q[select bool_or(conflicting) from > - (select conflict_reason is not NULL as conflicting > - from pg_replication_slots WHERE slot_type = 'logical')]), > + (select conflicting from pg_replication_slots > + where slot_type = 'logical')]), > 'f', > - 'Logical slots are reported as non conflicting'); > + 'Logical slots are reported as valid'); > > I don't think we need to change the comment or success message in this test. Yes. There the intention of the test case is to verify logical slots are reported as non conflicting. So, I changed them. Please find the v14-0001 patch for now. I'll post the other patches soon. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Thu, Mar 21, 2024 at 11:21 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > > Please find the v14-0001 patch for now. I'll post the other patches soon. > LGTM. Let's wait for Bertrand to see if he has more comments on 0001 and then I'll push it. -- With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Fri, Mar 22, 2024 at 10:49:17AM +0530, Amit Kapila wrote: > On Thu, Mar 21, 2024 at 11:21 PM Bharath Rupireddy > <bharath.rupireddyforpostgres@gmail.com> wrote: > > > > > > Please find the v14-0001 patch for now. Thanks! > LGTM. Let's wait for Bertrand to see if he has more comments on 0001 > and then I'll push it. LGTM too. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Fri, Mar 22, 2024 at 12:39 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > > > Please find the v14-0001 patch for now. > > Thanks! > > > LGTM. Let's wait for Bertrand to see if he has more comments on 0001 > > and then I'll push it. > > LGTM too. Thanks. Here I'm implementing the following: 0001 Track invalidation_reason in pg_replication_slots 0002 Track last_inactive_at in pg_replication_slots 0003 Allow setting inactive_timeout for replication slots via SQL API 0004 Introduce new SQL function pg_alter_replication_slot 0005 Allow setting inactive_timeout in the replication command 0006 Add inactive_timeout based replication slot invalidation 1. Keep last_inactive_at as a shared memory variable, but always set it at restart if the slot's inactive_timeout has non-zero value and reset it as soon as someone acquires that slot so that if the slot doesn't get acquired till inactive_timeout, checkpointer will invalidate the slot. 2. Ensure with pg_alter_replication_slot one could "only" alter the timeout property for the time being; if not, that could lead to the subscription inconsistency. 3. Have some notes in the CREATE and ALTER SUBSCRIPTION docs about using an existing slot to leverage the inactive_timeout feature. 4. last_inactive_at should also be set to the current time during slot creation because if one creates a slot and does nothing with it then it's the time it starts to be inactive. 5. We don't set last_inactive_at to GetCurrentTimestamp() for failover slots. 6. Leave out the patch that added support for inactive_timeout in subscriptions. Please see the attached v14 patch set. No change in the attached v14-0001 from the previous patch. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
- v14-0001-Track-invalidation_reason-in-pg_replication_slot.patch
- v14-0002-Track-last_inactive_at-in-pg_replication_slots.patch
- v14-0003-Allow-setting-inactive_timeout-for-replication-s.patch
- v14-0004-Introduce-new-SQL-funtion-pg_alter_replication_s.patch
- v14-0005-Allow-setting-inactive_timeout-in-the-replicatio.patch
- v14-0006-Add-inactive_timeout-based-replication-slot-inva.patch
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Fri, Mar 22, 2024 at 01:45:01PM +0530, Bharath Rupireddy wrote: > On Fri, Mar 22, 2024 at 12:39 PM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > > > > > Please find the v14-0001 patch for now. > > > > Thanks! > > > > > LGTM. Let's wait for Bertrand to see if he has more comments on 0001 > > > and then I'll push it. > > > > LGTM too. > > Thanks. Here I'm implementing the following: Thanks! > 0001 Track invalidation_reason in pg_replication_slots > 0002 Track last_inactive_at in pg_replication_slots > 0003 Allow setting inactive_timeout for replication slots via SQL API > 0004 Introduce new SQL funtion pg_alter_replication_slot > 0005 Allow setting inactive_timeout in the replication command > 0006 Add inactive_timeout based replication slot invalidation > > 1. Keep it last_inactive_at as a shared memory variable, but always > set it at restart if the slot's inactive_timeout has non-zero value > and reset it as soon as someone acquires that slot so that if the slot > doesn't get acquired till inactive_timeout, checkpointer will > invalidate the slot. > 4. last_inactive_at should also be set to the current time during slot > creation because if one creates a slot and does nothing with it then > it's the time it starts to be inactive. I did not look at the code yet but just tested the behavior. It works as you describe it but I think this behavior is weird because: - when we create a slot without a timeout then last_inactive_at is set. I think that's fine, but then: - when we restart the engine, then last_inactive_at is gone (as timeout is not set). I think last_inactive_at should be set also at engine restart even if there is no timeout. I don't think we should link both. Changing my mind here on this subject due to the testing. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Fri, Mar 22, 2024 at 2:27 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > On Fri, Mar 22, 2024 at 01:45:01PM +0530, Bharath Rupireddy wrote: > > > 0001 Track invalidation_reason in pg_replication_slots > > 0002 Track last_inactive_at in pg_replication_slots > > 0003 Allow setting inactive_timeout for replication slots via SQL API > > 0004 Introduce new SQL funtion pg_alter_replication_slot > > 0005 Allow setting inactive_timeout in the replication command > > 0006 Add inactive_timeout based replication slot invalidation > > > > 1. Keep it last_inactive_at as a shared memory variable, but always > > set it at restart if the slot's inactive_timeout has non-zero value > > and reset it as soon as someone acquires that slot so that if the slot > > doesn't get acquired till inactive_timeout, checkpointer will > > invalidate the slot. > > 4. last_inactive_at should also be set to the current time during slot > > creation because if one creates a slot and does nothing with it then > > it's the time it starts to be inactive. > > I did not look at the code yet but just tested the behavior. It works as you > describe it but I think this behavior is weird because: > > - when we create a slot without a timeout then last_inactive_at is set. I think > that's fine, but then: > - when we restart the engine, then last_inactive_at is gone (as timeout is not > set). > > I think last_inactive_at should be set also at engine restart even if there is > no timeout. I think it is the opposite. Why do we need to set 'last_inactive_at' when inactive_timeout is not set? BTW, haven't we discussed that we don't need to set 'last_inactive_at' at the time of slot creation as it is sufficient to set it at the time ReplicationSlotRelease()? A few other comments: ================== 1. @@ -1027,7 +1027,8 @@ CREATE VIEW pg_replication_slots AS L.invalidation_reason, L.failover, L.synced, - L.last_inactive_at + L.last_inactive_at, + L.inactive_timeout I think it would be better to keep 'inactive_timeout' ahead of 'last_inactive_at' as that is the primary field. In major versions, we don't have to strictly keep the new fields at the end. In this case, it seems better to keep these two new fields after two_phase so that these are before invalidation_reason where we can show the invalidation due to these fields. 2. void -ReplicationSlotRelease(void) +ReplicationSlotRelease(bool set_last_inactive_at) Why do we need a parameter here? Can't we directly check from the slot whether 'inactive_timeout' has a non-zero value? 3. + /* + * There's no point in allowing failover slots to get invalidated + * based on slot's inactive_timeout parameter on standby. The failover + * slots simply get synced from the primary on the standby. + */ + if (!(RecoveryInProgress() && slot->data.failover)) I think you need to check 'sync' flag instead of 'failover'. Generally, failover marker slots should be invalidated either on primary or standby unless on standby the 'failover' marked slot is synced from the primary. 4. I feel the patches should be arranged like 0003->0001, 0002->0002, 0006->0003. We can leave remaining for the time being till we get these three patches (all three need to be committed as one but it is okay to keep them separate for review) committed. -- With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Fri, Mar 22, 2024 at 01:45:01PM +0530, Bharath Rupireddy wrote: > On Fri, Mar 22, 2024 at 12:39 PM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > > > > > Please find the v14-0001 patch for now. > > > > Thanks! > > > > > LGTM. Let's wait for Bertrand to see if he has more comments on 0001 > > > and then I'll push it. > > > > LGTM too. > > > Please see the attached v14 patch set. No change in the attached > v14-0001 from the previous patch. Looking at v14-0002: 1 === @@ -691,6 +699,13 @@ ReplicationSlotRelease(void) ConditionVariableBroadcast(&slot->active_cv); } + if (slot->data.persistency == RS_PERSISTENT) + { + SpinLockAcquire(&slot->mutex); + slot->last_inactive_at = GetCurrentTimestamp(); + SpinLockRelease(&slot->mutex); + } I'm not sure we should do system calls while we're holding a spinlock. Assign a variable before? 2 === Also, what about moving this here? " if (slot->data.persistency == RS_PERSISTENT) { /* * Mark persistent slot inactive. We're not freeing it, just * disconnecting, but wake up others that may be waiting for it. */ SpinLockAcquire(&slot->mutex); slot->active_pid = 0; SpinLockRelease(&slot->mutex); ConditionVariableBroadcast(&slot->active_cv); } " That would avoid testing twice "slot->data.persistency == RS_PERSISTENT". 3 === @@ -2341,6 +2356,7 @@ RestoreSlotFromDisk(const char *name) slot->in_use = true; slot->active_pid = 0; + slot->last_inactive_at = 0; I think we should put GetCurrentTimestamp() here. It's done in v14-0006 but I think it's better to do it in 0002 (and not taking care of inactive_timeout). 4 === Track last_inactive_at in pg_replication_slots doc/src/sgml/system-views.sgml | 11 +++++++++++ src/backend/catalog/system_views.sql | 3 ++- src/backend/replication/slot.c | 16 ++++++++++++++++ src/backend/replication/slotfuncs.c | 7 ++++++- src/include/catalog/pg_proc.dat | 6 +++--- src/include/replication/slot.h | 3 +++ src/test/regress/expected/rules.out | 5 +++-- 7 files changed, 44 insertions(+), 7 deletions(-) Worth to add some tests too (or we postpone them in future commits because we're confident enough they will follow soon)? 5 === Most of the fields that reflect a time (not duration) in the system views are xxxx_time, so I'm wondering if instead of "last_inactive_at" we should use something like "last_inactive_time"? Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
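Putting comments 1 and 2 together, the release path could end up looking roughly like the following (a sketch based on the RS_PERSISTENT block quoted above, not the actual v14 code):

if (slot->data.persistency == RS_PERSISTENT)
{
    /* Take the timestamp before the spinlock, per comment 1 above. */
    TimestampTz now = GetCurrentTimestamp();

    /*
     * Mark persistent slot inactive. We're not freeing it, just
     * disconnecting, but wake up others that may be waiting for it.
     */
    SpinLockAcquire(&slot->mutex);
    slot->active_pid = 0;
    slot->last_inactive_at = now;
    SpinLockRelease(&slot->mutex);
    ConditionVariableBroadcast(&slot->active_cv);
}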
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Fri, Mar 22, 2024 at 02:59:21PM +0530, Amit Kapila wrote: > On Fri, Mar 22, 2024 at 2:27 PM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > > > On Fri, Mar 22, 2024 at 01:45:01PM +0530, Bharath Rupireddy wrote: > > > > > 0001 Track invalidation_reason in pg_replication_slots > > > 0002 Track last_inactive_at in pg_replication_slots > > > 0003 Allow setting inactive_timeout for replication slots via SQL API > > > 0004 Introduce new SQL funtion pg_alter_replication_slot > > > 0005 Allow setting inactive_timeout in the replication command > > > 0006 Add inactive_timeout based replication slot invalidation > > > > > > 1. Keep it last_inactive_at as a shared memory variable, but always > > > set it at restart if the slot's inactive_timeout has non-zero value > > > and reset it as soon as someone acquires that slot so that if the slot > > > doesn't get acquired till inactive_timeout, checkpointer will > > > invalidate the slot. > > > 4. last_inactive_at should also be set to the current time during slot > > > creation because if one creates a slot and does nothing with it then > > > it's the time it starts to be inactive. > > > > I did not look at the code yet but just tested the behavior. It works as you > > describe it but I think this behavior is weird because: > > > > - when we create a slot without a timeout then last_inactive_at is set. I think > > that's fine, but then: > > - when we restart the engine, then last_inactive_at is gone (as timeout is not > > set). > > > > I think last_inactive_at should be set also at engine restart even if there is > > no timeout. > > I think it is the opposite. Why do we need to set 'last_inactive_at' > when inactive_timeout is not set? I think those are unrelated, one could want to know when a slot has been inactive even if no timeout is set. I understand that for this patch series we have in mind to use them both to invalidate slots but I think that there is use case to not use both in correlation. Also not setting last_inactive_at could give the "false" impression that the slot is active. > BTW, haven't we discussed that we > don't need to set 'last_inactive_at' at the time of slot creation as > it is sufficient to set it at the time ReplicationSlotRelease()? Right. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Fri, Mar 22, 2024 at 3:15 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > On Fri, Mar 22, 2024 at 01:45:01PM +0530, Bharath Rupireddy wrote: > > 1 === > > @@ -691,6 +699,13 @@ ReplicationSlotRelease(void) > ConditionVariableBroadcast(&slot->active_cv); > } > > + if (slot->data.persistency == RS_PERSISTENT) > + { > + SpinLockAcquire(&slot->mutex); > + slot->last_inactive_at = GetCurrentTimestamp(); > + SpinLockRelease(&slot->mutex); > + } > > I'm not sure we should do system calls while we're holding a spinlock. > Assign a variable before? > > 2 === > > Also, what about moving this here? > > " > if (slot->data.persistency == RS_PERSISTENT) > { > /* > * Mark persistent slot inactive. We're not freeing it, just > * disconnecting, but wake up others that may be waiting for it. > */ > SpinLockAcquire(&slot->mutex); > slot->active_pid = 0; > SpinLockRelease(&slot->mutex); > ConditionVariableBroadcast(&slot->active_cv); > } > " > > That would avoid testing twice "slot->data.persistency == RS_PERSISTENT". > That sounds like a good idea. Also, don't we need to consider physical slots where we don't reserve WAL during slot creation? I don't think there is a need to set inactive_at for such slots. If we agree, probably checking restart_lsn should suffice the need to know whether the WAL is reserved or not. > > 5 === > > Most of the fields that reflect a time (not duration) in the system views are > xxxx_time, so I'm wondering if instead of "last_inactive_at" we should use > something like "last_inactive_time"? > How about naming it as last_active_time? This will indicate the time at which the slot was last active. -- With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Fri, Mar 22, 2024 at 3:23 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > On Fri, Mar 22, 2024 at 02:59:21PM +0530, Amit Kapila wrote: > > On Fri, Mar 22, 2024 at 2:27 PM Bertrand Drouvot > > <bertranddrouvot.pg@gmail.com> wrote: > > > > > > On Fri, Mar 22, 2024 at 01:45:01PM +0530, Bharath Rupireddy wrote: > > > > > > > 0001 Track invalidation_reason in pg_replication_slots > > > > 0002 Track last_inactive_at in pg_replication_slots > > > > 0003 Allow setting inactive_timeout for replication slots via SQL API > > > > 0004 Introduce new SQL funtion pg_alter_replication_slot > > > > 0005 Allow setting inactive_timeout in the replication command > > > > 0006 Add inactive_timeout based replication slot invalidation > > > > > > > > 1. Keep it last_inactive_at as a shared memory variable, but always > > > > set it at restart if the slot's inactive_timeout has non-zero value > > > > and reset it as soon as someone acquires that slot so that if the slot > > > > doesn't get acquired till inactive_timeout, checkpointer will > > > > invalidate the slot. > > > > 4. last_inactive_at should also be set to the current time during slot > > > > creation because if one creates a slot and does nothing with it then > > > > it's the time it starts to be inactive. > > > > > > I did not look at the code yet but just tested the behavior. It works as you > > > describe it but I think this behavior is weird because: > > > > > > - when we create a slot without a timeout then last_inactive_at is set. I think > > > that's fine, but then: > > > - when we restart the engine, then last_inactive_at is gone (as timeout is not > > > set). > > > > > > I think last_inactive_at should be set also at engine restart even if there is > > > no timeout. > > > > I think it is the opposite. Why do we need to set 'last_inactive_at' > > when inactive_timeout is not set? > > I think those are unrelated, one could want to know when a slot has been inactive > even if no timeout is set. I understand that for this patch series we have in mind > to use them both to invalidate slots but I think that there is use case to not > use both in correlation. Also not setting last_inactive_at could give the "false" > impression that the slot is active. > I see your point and agree with this. I feel we can commit this part first then, probably that is the reason Bharath has kept it as a separate patch. It would be good add the use case for this patch in the commit message. A minor comment: if (SlotIsLogical(s)) pgstat_acquire_replslot(s); + if (s->data.persistency == RS_PERSISTENT) + { + SpinLockAcquire(&s->mutex); + s->last_inactive_at = 0; + SpinLockRelease(&s->mutex); + } + I think this part of the change needs a comment. -- With Regards, Amit Kapila.
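For that last point, one possible wording of the acquire-side reset could be (sketch only, not the patch's actual comment):

if (s->data.persistency == RS_PERSISTENT)
{
    /*
     * The slot is being acquired, so it is no longer inactive; clear
     * last_inactive_at so that the time the slot spends in active use
     * is not counted towards inactive_timeout.
     */
    SpinLockAcquire(&s->mutex);
    s->last_inactive_at = 0;
    SpinLockRelease(&s->mutex);
}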
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Fri, Mar 22, 2024 at 03:56:23PM +0530, Amit Kapila wrote: > On Fri, Mar 22, 2024 at 3:15 PM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > > > On Fri, Mar 22, 2024 at 01:45:01PM +0530, Bharath Rupireddy wrote: > > > > 1 === > > > > @@ -691,6 +699,13 @@ ReplicationSlotRelease(void) > > ConditionVariableBroadcast(&slot->active_cv); > > } > > > > + if (slot->data.persistency == RS_PERSISTENT) > > + { > > + SpinLockAcquire(&slot->mutex); > > + slot->last_inactive_at = GetCurrentTimestamp(); > > + SpinLockRelease(&slot->mutex); > > + } > > > > I'm not sure we should do system calls while we're holding a spinlock. > > Assign a variable before? > > > > 2 === > > > > Also, what about moving this here? > > > > " > > if (slot->data.persistency == RS_PERSISTENT) > > { > > /* > > * Mark persistent slot inactive. We're not freeing it, just > > * disconnecting, but wake up others that may be waiting for it. > > */ > > SpinLockAcquire(&slot->mutex); > > slot->active_pid = 0; > > SpinLockRelease(&slot->mutex); > > ConditionVariableBroadcast(&slot->active_cv); > > } > > " > > > > That would avoid testing twice "slot->data.persistency == RS_PERSISTENT". > > > > That sounds like a good idea. Also, don't we need to consider physical > slots where we don't reserve WAL during slot creation? I don't think > there is a need to set inactive_at for such slots. If the slot is not active, why shouldn't we set inactive_at? I can understand that such a slots do not present "any risks" but I think we should still set inactive_at (also to not give the false impression that the slot is active). > > 5 === > > > > Most of the fields that reflect a time (not duration) in the system views are > > xxxx_time, so I'm wondering if instead of "last_inactive_at" we should use > > something like "last_inactive_time"? > > > > How about naming it as last_active_time? This will indicate the time > at which the slot was last active. I thought about it too but I think it could be missleading as one could think that it should be updated each time WAL record decoding is happening. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Fri, Mar 22, 2024 at 04:16:19PM +0530, Amit Kapila wrote: > On Fri, Mar 22, 2024 at 3:23 PM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > > > On Fri, Mar 22, 2024 at 02:59:21PM +0530, Amit Kapila wrote: > > > On Fri, Mar 22, 2024 at 2:27 PM Bertrand Drouvot > > > <bertranddrouvot.pg@gmail.com> wrote: > > > > > > > > On Fri, Mar 22, 2024 at 01:45:01PM +0530, Bharath Rupireddy wrote: > > > > > > > > > 0001 Track invalidation_reason in pg_replication_slots > > > > > 0002 Track last_inactive_at in pg_replication_slots > > > > > 0003 Allow setting inactive_timeout for replication slots via SQL API > > > > > 0004 Introduce new SQL funtion pg_alter_replication_slot > > > > > 0005 Allow setting inactive_timeout in the replication command > > > > > 0006 Add inactive_timeout based replication slot invalidation > > > > > > > > > > 1. Keep it last_inactive_at as a shared memory variable, but always > > > > > set it at restart if the slot's inactive_timeout has non-zero value > > > > > and reset it as soon as someone acquires that slot so that if the slot > > > > > doesn't get acquired till inactive_timeout, checkpointer will > > > > > invalidate the slot. > > > > > 4. last_inactive_at should also be set to the current time during slot > > > > > creation because if one creates a slot and does nothing with it then > > > > > it's the time it starts to be inactive. > > > > > > > > I did not look at the code yet but just tested the behavior. It works as you > > > > describe it but I think this behavior is weird because: > > > > > > > > - when we create a slot without a timeout then last_inactive_at is set. I think > > > > that's fine, but then: > > > > - when we restart the engine, then last_inactive_at is gone (as timeout is not > > > > set). > > > > > > > > I think last_inactive_at should be set also at engine restart even if there is > > > > no timeout. > > > > > > I think it is the opposite. Why do we need to set 'last_inactive_at' > > > when inactive_timeout is not set? > > > > I think those are unrelated, one could want to know when a slot has been inactive > > even if no timeout is set. I understand that for this patch series we have in mind > > to use them both to invalidate slots but I think that there is use case to not > > use both in correlation. Also not setting last_inactive_at could give the "false" > > impression that the slot is active. > > > > I see your point and agree with this. I feel we can commit this part > first then, Agree that in this case the current ordering makes sense (as setting last_inactive_at would be completly unrelated to the timeout). Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Ajin Cherian
Date:
On Fri, Mar 22, 2024 at 7:15 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote:
> On Fri, Mar 22, 2024 at 12:39 PM Bertrand Drouvot
> <bertranddrouvot.pg@gmail.com> wrote:
> >
> > > > Please find the v14-0001 patch for now.
> >
> > Thanks!
> >
> > > LGTM. Let's wait for Bertrand to see if he has more comments on 0001
> > > and then I'll push it.
> >
> > LGTM too.
> Thanks. Here I'm implementing the following:
> 0001 Track invalidation_reason in pg_replication_slots
> 0002 Track last_inactive_at in pg_replication_slots
> 0003 Allow setting inactive_timeout for replication slots via SQL API
> 0004 Introduce new SQL function pg_alter_replication_slot
> 0005 Allow setting inactive_timeout in the replication command
> 0006 Add inactive_timeout based replication slot invalidation
> 1. Keep last_inactive_at as a shared memory variable, but always
> set it at restart if the slot's inactive_timeout has non-zero value
> and reset it as soon as someone acquires that slot so that if the slot
> doesn't get acquired till inactive_timeout, checkpointer will
> invalidate the slot.
> 2. Ensure with pg_alter_replication_slot one could "only" alter the
> timeout property for the time being; if not, that could lead to the
> subscription inconsistency.
> 3. Have some notes in the CREATE and ALTER SUBSCRIPTION docs about
> using an existing slot to leverage the inactive_timeout feature.
> 4. last_inactive_at should also be set to the current time during slot
> creation because if one creates a slot and does nothing with it then
> it's the time it starts to be inactive.
> 5. We don't set last_inactive_at to GetCurrentTimestamp() for failover slots.
> 6. Leave out the patch that added support for inactive_timeout in subscriptions.
> Please see the attached v14 patch set. No change in the attached
> v14-0001 from the previous patch.
Some comments:
1. In patch 0005:
In ReplicationSlotAlter():
+ lock_acquired = false;
if (MyReplicationSlot->data.failover != failover)
{
SpinLockAcquire(&MyReplicationSlot->mutex);
+ lock_acquired = true;
MyReplicationSlot->data.failover = failover;
+ }
+
+ if (MyReplicationSlot->data.inactive_timeout != inactive_timeout)
+ {
+ if (!lock_acquired)
+ {
+ SpinLockAcquire(&MyReplicationSlot->mutex);
+ lock_acquired = true;
+ }
+
+ MyReplicationSlot->data.inactive_timeout = inactive_timeout;
+ }
+
+ if (lock_acquired)
+ {
SpinLockRelease(&MyReplicationSlot->mutex);
Can't you make it shorter like below:
lock_acquired = false;
if (MyReplicationSlot->data.failover != failover || MyReplicationSlot->data.inactive_timeout != inactive_timeout) {
SpinLockAcquire(&MyReplicationSlot->mutex);
lock_acquired = true;
}
if (MyReplicationSlot->data.failover != failover) {
MyReplicationSlot->data.failover = failover;
}
if (MyReplicationSlot->data.inactive_timeout != inactive_timeout) {
MyReplicationSlot->data.inactive_timeout = inactive_timeout;
}
if (lock_acquired) {
SpinLockRelease(&MyReplicationSlot->mutex);
ReplicationSlotMarkDirty();
ReplicationSlotSave();
}
2. In patch 0005: why change the walrcv_alter_slot option? It doesn't seem to be used anywhere; is there a use case for it? If required, would the intention be to add this as a Create Subscription option?
regards,
Ajin Cherian
Fujitsu Australia
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Fri, Mar 22, 2024 at 5:30 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > On Fri, Mar 22, 2024 at 03:56:23PM +0530, Amit Kapila wrote: > > On Fri, Mar 22, 2024 at 3:15 PM Bertrand Drouvot > > <bertranddrouvot.pg@gmail.com> wrote: > > > > > > On Fri, Mar 22, 2024 at 01:45:01PM +0530, Bharath Rupireddy wrote: > > > > > > 1 === > > > > > > @@ -691,6 +699,13 @@ ReplicationSlotRelease(void) > > > ConditionVariableBroadcast(&slot->active_cv); > > > } > > > > > > + if (slot->data.persistency == RS_PERSISTENT) > > > + { > > > + SpinLockAcquire(&slot->mutex); > > > + slot->last_inactive_at = GetCurrentTimestamp(); > > > + SpinLockRelease(&slot->mutex); > > > + } > > > > > > I'm not sure we should do system calls while we're holding a spinlock. > > > Assign a variable before? > > > > > > 2 === > > > > > > Also, what about moving this here? > > > > > > " > > > if (slot->data.persistency == RS_PERSISTENT) > > > { > > > /* > > > * Mark persistent slot inactive. We're not freeing it, just > > > * disconnecting, but wake up others that may be waiting for it. > > > */ > > > SpinLockAcquire(&slot->mutex); > > > slot->active_pid = 0; > > > SpinLockRelease(&slot->mutex); > > > ConditionVariableBroadcast(&slot->active_cv); > > > } > > > " > > > > > > That would avoid testing twice "slot->data.persistency == RS_PERSISTENT". > > > > > > > That sounds like a good idea. Also, don't we need to consider physical > > slots where we don't reserve WAL during slot creation? I don't think > > there is a need to set inactive_at for such slots. > > If the slot is not active, why shouldn't we set inactive_at? I can understand > that such a slots do not present "any risks" but I think we should still set > inactive_at (also to not give the false impression that the slot is active). > But OTOH, there is a chance that we will invalidate such slots even though they have never reserved WAL in the first place which doesn't appear to be a good thing. > > > 5 === > > > > > > Most of the fields that reflect a time (not duration) in the system views are > > > xxxx_time, so I'm wondering if instead of "last_inactive_at" we should use > > > something like "last_inactive_time"? > > > > > > > How about naming it as last_active_time? This will indicate the time > > at which the slot was last active. > > I thought about it too but I think it could be missleading as one could think that > it should be updated each time WAL record decoding is happening. > Fair enough. -- With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Fri, Mar 22, 2024 at 06:02:11PM +0530, Amit Kapila wrote: > On Fri, Mar 22, 2024 at 5:30 PM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > On Fri, Mar 22, 2024 at 03:56:23PM +0530, Amit Kapila wrote: > > > > > > > > That would avoid testing twice "slot->data.persistency == RS_PERSISTENT". > > > > > > > > > > That sounds like a good idea. Also, don't we need to consider physical > > > slots where we don't reserve WAL during slot creation? I don't think > > > there is a need to set inactive_at for such slots. > > > > If the slot is not active, why shouldn't we set inactive_at? I can understand > > that such a slots do not present "any risks" but I think we should still set > > inactive_at (also to not give the false impression that the slot is active). > > > > But OTOH, there is a chance that we will invalidate such slots even > though they have never reserved WAL in the first place which doesn't > appear to be a good thing. That's right, but I don't think that is a problem. I think we should treat inactive_at as an independent field (as if the timeout one did not exist at all) and just focus on its meaning (slot being inactive). If one sets a timeout (> 0) and gets an invalidation then I think it works as designed (even if the slot does not present any "risk" as it does not hold any rows or WAL). Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Fri, Mar 22, 2024 at 3:15 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > Looking at v14-0002: Thanks for reviewing. I agree that 0002 with last_inactive_at can go independently and be of use on its own in addition to helping implement inactive_timeout based invalidation. > 1 === > > @@ -691,6 +699,13 @@ ReplicationSlotRelease(void) > ConditionVariableBroadcast(&slot->active_cv); > } > > + if (slot->data.persistency == RS_PERSISTENT) > + { > + SpinLockAcquire(&slot->mutex); > + slot->last_inactive_at = GetCurrentTimestamp(); > + SpinLockRelease(&slot->mutex); > + } > > I'm not sure we should do system calls while we're holding a spinlock. > Assign a variable before? Can do that. Then, the last_inactive_at = current_timestamp + mutex acquire time. But, that shouldn't be a problem than doing system calls while holding the mutex. So, done that way. > 2 === > > Also, what about moving this here? > > " > if (slot->data.persistency == RS_PERSISTENT) > { > /* > * Mark persistent slot inactive. We're not freeing it, just > * disconnecting, but wake up others that may be waiting for it. > */ > SpinLockAcquire(&slot->mutex); > slot->active_pid = 0; > SpinLockRelease(&slot->mutex); > ConditionVariableBroadcast(&slot->active_cv); > } > " > > That would avoid testing twice "slot->data.persistency == RS_PERSISTENT". Ugh. Done that now. > 3 === > > @@ -2341,6 +2356,7 @@ RestoreSlotFromDisk(const char *name) > > slot->in_use = true; > slot->active_pid = 0; > + slot->last_inactive_at = 0; > > I think we should put GetCurrentTimestamp() here. It's done in v14-0006 but I > think it's better to do it in 0002 (and not taking care of inactive_timeout). Done. > 4 === > > Track last_inactive_at in pg_replication_slots > > doc/src/sgml/system-views.sgml | 11 +++++++++++ > src/backend/catalog/system_views.sql | 3 ++- > src/backend/replication/slot.c | 16 ++++++++++++++++ > src/backend/replication/slotfuncs.c | 7 ++++++- > src/include/catalog/pg_proc.dat | 6 +++--- > src/include/replication/slot.h | 3 +++ > src/test/regress/expected/rules.out | 5 +++-- > 7 files changed, 44 insertions(+), 7 deletions(-) > > Worth to add some tests too (or we postpone them in future commits because we're > confident enough they will follow soon)? Yes. Added some tests in a new TAP test file named src/test/recovery/t/043_replslot_misc.pl. This new file can be used to add miscellaneous replication tests in future as well. I couldn't find a better place in existing test files - tried having the new tests for physical slots in t/001_stream_rep.pl and I didn't find a right place for logical slots. > 5 === > > Most of the fields that reflect a time (not duration) in the system views are > xxxx_time, so I'm wondering if instead of "last_inactive_at" we should use > something like "last_inactive_time"? Yeah, I can see that. So, I changed it to last_inactive_time. I agree with treating last_inactive_time as a separate property of the slot having its own use in addition to helping implement inactive_timeout based invalidation. I think it can go separately. I tried to address the review comments received for this patch alone and attached v15-0001. I'll post other patches soon. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
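To make review comments 1 and 2 above concrete, here is a minimal sketch of the consolidated release path they are asking for. It assumes the patch's last_inactive_at field and is illustrative only, not the exact hunk that ended up in v15:

    if (slot->data.persistency == RS_PERSISTENT)
    {
        /* Read the clock before taking the spinlock; no system call under the lock. */
        TimestampTz now = GetCurrentTimestamp();

        /*
         * Mark persistent slot inactive.  We're not freeing it, just
         * disconnecting, but wake up others that may be waiting for it.
         */
        SpinLockAcquire(&slot->mutex);
        slot->active_pid = 0;
        slot->last_inactive_at = now;
        SpinLockRelease(&slot->mutex);
        ConditionVariableBroadcast(&slot->active_cv);
    }

With this shape the persistency test is done only once, and the only cost is that last_inactive_at may lag the actual release by the time it takes to acquire the mutex, which is the trade-off discussed above.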
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Fri, Mar 22, 2024 at 7:17 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > On Fri, Mar 22, 2024 at 06:02:11PM +0530, Amit Kapila wrote: > > On Fri, Mar 22, 2024 at 5:30 PM Bertrand Drouvot > > <bertranddrouvot.pg@gmail.com> wrote: > > > On Fri, Mar 22, 2024 at 03:56:23PM +0530, Amit Kapila wrote: > > > > > > > > > > That would avoid testing twice "slot->data.persistency == RS_PERSISTENT". > > > > > > > > > > > > > That sounds like a good idea. Also, don't we need to consider physical > > > > slots where we don't reserve WAL during slot creation? I don't think > > > > there is a need to set inactive_at for such slots. > > > > > > If the slot is not active, why shouldn't we set inactive_at? I can understand > > > that such a slots do not present "any risks" but I think we should still set > > > inactive_at (also to not give the false impression that the slot is active). > > > > > > > But OTOH, there is a chance that we will invalidate such slots even > > though they have never reserved WAL in the first place which doesn't > > appear to be a good thing. > > That's right but I don't think it is not a good thing. I think we should treat > inactive_at as an independent field (like if the timeout one does not exist at > all) and just focus on its meaning (slot being inactive). If one sets a timeout > (> 0) and gets an invalidation then I think it works as designed (even if the > slot does not present any "risk" as it does not hold any rows or WAL). > Fair point. -- With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Sat, Mar 23, 2024 at 3:02 AM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Fri, Mar 22, 2024 at 3:15 PM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > > > > > Worth to add some tests too (or we postpone them in future commits because we're > > confident enough they will follow soon)? > > Yes. Added some tests in a new TAP test file named > src/test/recovery/t/043_replslot_misc.pl. This new file can be used to > add miscellaneous replication tests in future as well. I couldn't find > a better place in existing test files - tried having the new tests for > physical slots in t/001_stream_rep.pl and I didn't find a right place > for logical slots. > How about adding the test in 019_replslot_limit? It is not a direct fit but I feel later we can even add 'invalid_timeout' related tests in this file which will use last_inactive_time feature. It is also possible that some of the tests added by the 'invalid_timeout' feature will obviate the need for some of these tests. Review of v15 ============== 1. @@ -1026,7 +1026,8 @@ CREATE VIEW pg_replication_slots AS L.conflicting, L.invalidation_reason, L.failover, - L.synced + L.synced, + L.last_inactive_time FROM pg_get_replication_slots() AS L As mentioned previously, let's keep these new fields before conflicting and after two_phase. 2. +# Get last_inactive_time value after slot's creation. Note that the slot is still +# inactive unless it's used by the standby below. +my $last_inactive_time_1 = $primary->safe_psql('postgres', + qq(SELECT last_inactive_time FROM pg_replication_slots WHERE slot_name = '$sb_slot' AND last_inactive_time IS NOT NULL;) +); We should check $last_inactive_time_1 to be a valid value and add a similar check for logical slots. 3. BTW, why don't we set last_inactive_time for temporary slots (RS_TEMPORARY) as well? Don't we even invalidate temporary slots? If so, then I think we should set last_inactive_time for those as well and later allow them to be invalidated based on timeout parameter. -- With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Sat, Mar 23, 2024 at 11:27 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > How about adding the test in 019_replslot_limit? It is not a direct > fit but I feel later we can even add 'invalid_timeout' related tests > in this file which will use last_inactive_time feature. I'm thinking the other way. Now, the new TAP file 043_replslot_misc.pl can have last_inactive_time tests, and later invalid_timeout ones too. This way 019_replslot_limit.pl is not cluttered. > It is also > possible that some of the tests added by the 'invalid_timeout' feature > will obviate the need for some of these tests. Might be. But I prefer to keep both of these tests separate, though in the same file 043_replslot_misc.pl, because we cover some corner cases where the last_inactive_time is set upon loading the slot from disk. > Review of v15 > ============== > 1. > @@ -1026,7 +1026,8 @@ CREATE VIEW pg_replication_slots AS > L.conflicting, > L.invalidation_reason, > L.failover, > - L.synced > + L.synced, > + L.last_inactive_time > FROM pg_get_replication_slots() AS L > > As mentioned previously, let's keep these new fields before > conflicting and after two_phase. Sorry, I missed that comment (out of a flood of comments, really :)). Now, done that way. > 2. > +# Get last_inactive_time value after slot's creation. Note that the > slot is still > +# inactive unless it's used by the standby below. > +my $last_inactive_time_1 = $primary->safe_psql('postgres', > + qq(SELECT last_inactive_time FROM pg_replication_slots WHERE > slot_name = '$sb_slot' AND last_inactive_time IS NOT NULL;) > +); > > We should check $last_inactive_time_1 to be a valid value and add a > similar check for logical slots. That's taken care of by the type cast we do, right? Isn't that enough? is( $primary->safe_psql( 'postgres', qq[SELECT last_inactive_time > '$last_inactive_time'::timestamptz FROM pg_replication_slots WHERE slot_name = '$sb_slot' AND last_inactive_time IS NOT NULL;] ), 't', 'last inactive time for an inactive physical slot is updated correctly'); For instance, setting last_inactive_time_1 to an invalid value fails with the following error: error running SQL: 'psql:<stdin>:1: ERROR: invalid input syntax for type timestamp with time zone: "foo" LINE 1: SELECT last_inactive_time > 'foo'::timestamptz FROM pg_repli... > 3. BTW, why don't we set last_inactive_time for temporary slots > (RS_TEMPORARY) as well? Don't we even invalidate temporary slots? If > so, then I think we should set last_inactive_time for those as well > and later allow them to be invalidated based on timeout parameter. WFM. Done that way. Please see the attached v16 patch. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Sat, Mar 23, 2024 at 01:11:50PM +0530, Bharath Rupireddy wrote: > On Sat, Mar 23, 2024 at 11:27 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > How about adding the test in 019_replslot_limit? It is not a direct > > fit but I feel later we can even add 'invalid_timeout' related tests > > in this file which will use last_inactive_time feature. > > I'm thinking the other way. Now, the new TAP file 043_replslot_misc.pl > can have last_inactive_time tests, and later invalid_timeout ones too. > This way 019_replslot_limit.pl is not cluttered. I share the same opinion as Amit: I think 019_replslot_limit would be a better place, because I see the timeout as another kind of limit. > > > It is also > > possible that some of the tests added by the 'invalid_timeout' feature > > will obviate the need for some of these tests. > > Might be. But, I prefer to keep both these tests separate but in the > same file 043_replslot_misc.pl. Because we cover some corner cases the > last_inactive_time is set upon loading the slot from disk. Right, but I think that this test does not necessarily have to be in the same .pl as the one testing the timeout. It could be added to one of the existing .pl files, like 001_stream_rep.pl for example. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Sat, Mar 23, 2024 at 2:34 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > > > How about adding the test in 019_replslot_limit? It is not a direct > > > fit but I feel later we can even add 'invalid_timeout' related tests > > > in this file which will use last_inactive_time feature. > > > > I'm thinking the other way. Now, the new TAP file 043_replslot_misc.pl > > can have last_inactive_time tests, and later invalid_timeout ones too. > > This way 019_replslot_limit.pl is not cluttered. > > I share the same opinion as Amit: I think 019_replslot_limit would be a better > place, because I see the timeout as another kind of limit. Hm. Done that way. Please see the attached v17 patch. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Sat, Mar 23, 2024 at 1:12 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Sat, Mar 23, 2024 at 11:27 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > 2. > > +# Get last_inactive_time value after slot's creation. Note that the > > slot is still > > +# inactive unless it's used by the standby below. > > +my $last_inactive_time_1 = $primary->safe_psql('postgres', > > + qq(SELECT last_inactive_time FROM pg_replication_slots WHERE > > slot_name = '$sb_slot' AND last_inactive_time IS NOT NULL;) > > +); > > > > We should check $last_inactive_time_1 to be a valid value and add a > > similar check for logical slots. > > That's taken care by the type cast we do, right? Isn't that enough? > > is( $primary->safe_psql( > 'postgres', > qq[SELECT last_inactive_time > > '$last_inactive_time'::timestamptz FROM pg_replication_slots WHERE > slot_name = '$sb_slot' AND last_inactive_time IS NOT NULL;] > ), > 't', > 'last inactive time for an inactive physical slot is updated correctly'); > > For instance, setting last_inactive_time_1 to an invalid value fails > with the following error: > > error running SQL: 'psql:<stdin>:1: ERROR: invalid input syntax for > type timestamp with time zone: "foo" > LINE 1: SELECT last_inactive_time > 'foo'::timestamptz FROM pg_repli... > It would be found at a later point. It would be probably better to verify immediately after the test that fetches the last_inactive_time value. -- With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Sun, Mar 24, 2024 at 10:40 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > For instance, setting last_inactive_time_1 to an invalid value fails > > with the following error: > > > > error running SQL: 'psql:<stdin>:1: ERROR: invalid input syntax for > > type timestamp with time zone: "foo" > > LINE 1: SELECT last_inactive_time > 'foo'::timestamptz FROM pg_repli... > > > > It would be found at a later point. It would be probably better to > verify immediately after the test that fetches the last_inactive_time > value. Agree. I've added a few more checks explicitly to verify the last_inactive_time is sane with the following: qq[SELECT '$last_inactive_time'::timestamptz > to_timestamp(0) AND '$last_inactive_time'::timestamptz > '$slot_creation_time'::timestamptz;] I've attached the v18 patch set here. I've also addressed earlier review comments from Amit, Ajin Cherian. Note that I've added new invalidation mechanism tests in a separate TAP test file just because I don't want to clutter or bloat any of the existing files and spread tests for physical slots and logical slots into separate existing TAP files. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
- v18-0001-Track-last_inactive_time-in-pg_replication_slots.patch
- v18-0002-Allow-setting-inactive_timeout-for-replication-s.patch
- v18-0003-Introduce-new-SQL-funtion-pg_alter_replication_s.patch
- v18-0004-Allow-setting-inactive_timeout-in-the-replicatio.patch
- v18-0005-Add-inactive_timeout-based-replication-slot-inva.patch
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Sun, Mar 24, 2024 at 3:05 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Sun, Mar 24, 2024 at 10:40 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > For instance, setting last_inactive_time_1 to an invalid value fails > > > with the following error: > > > > > > error running SQL: 'psql:<stdin>:1: ERROR: invalid input syntax for > > > type timestamp with time zone: "foo" > > > LINE 1: SELECT last_inactive_time > 'foo'::timestamptz FROM pg_repli... > > > > > > > It would be found at a later point. It would be probably better to > > verify immediately after the test that fetches the last_inactive_time > > value. > > Agree. I've added a few more checks explicitly to verify the > last_inactive_time is sane with the following: > > qq[SELECT '$last_inactive_time'::timestamptz > to_timestamp(0) > AND '$last_inactive_time'::timestamptz > > '$slot_creation_time'::timestamptz;] > Such a test looks reasonable, but shall we add an 'equal to' in the second part of the test (like '$last_inactive_time'::timestamptz >= '$slot_creation_time'::timestamptz;)? This is just to be sure that even if the test ran fast enough to give the same time, the test shouldn't fail. I think it won't matter for correctness as well. -- With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Mon, Mar 25, 2024 at 9:48 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > Such a test looks reasonable but shall we add equal to in the second > part of the test (like '$last_inactive_time'::timestamptz >= > > '$slot_creation_time'::timestamptz;). This is just to be sure that even if the test ran fast enough to give the sametime, the test shouldn't fail. I think it won't matter for correctness as well. > Apart from this, I have made minor changes in the comments. See and let me know what you think of attached. -- With Regards, Amit Kapila.
Attachment
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
shveta malik
Date:
On Sun, Mar 24, 2024 at 3:06 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > I've attached the v18 patch set here. Thanks for the patches. Please find a few comments: patch 001: -------- 1) slot.h: + /* The time at which this slot become inactive */ + TimestampTz last_inactive_time; become -->became --------- patch 002: 2) slotsync.c: ReplicationSlotCreate(remote_slot->name, true, RS_TEMPORARY, remote_slot->two_phase, remote_slot->failover, - true); + true, 0); + slot->data.inactive_timeout = remote_slot->inactive_timeout; Is there a reason we are not passing 'remote_slot->inactive_timeout' to ReplicationSlotCreate() directly? --------- 3) slotfuncs.c pg_create_logical_replication_slot(): + int inactive_timeout = PG_GETARG_INT32(5); Can we mention here that timeout is in seconds either in comment or rename variable to inactive_timeout_secs? Please do this for create_physical_replication_slot(), create_logical_replication_slot(), pg_create_physical_replication_slot() as well. --------- 4) + int inactive_timeout; /* The amount of time in seconds the slot + * is allowed to be inactive. */ } LogicalSlotInfo; Do we need to mention "before getting invalidated" like other places (in last patch)? ---------- 5) Same at these two places. "before getting invalidated" to be added in the last patch, otherwise the info is incomplete. + + /* The amount of time in seconds the slot is allowed to be inactive */ + int inactive_timeout; } ReplicationSlotPersistentData; + * inactive_timeout: The amount of time in seconds the slot is allowed to be + * inactive. */ void ReplicationSlotCreate(const char *name, bool db_specific, Same here. "before getting invalidated" ? -------- Reviewing more.. thanks Shveta
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
shveta malik
Date:
On Mon, Mar 25, 2024 at 10:33 AM shveta malik <shveta.malik@gmail.com> wrote: > > On Sun, Mar 24, 2024 at 3:06 PM Bharath Rupireddy > <bharath.rupireddyforpostgres@gmail.com> wrote: > > > > I've attached the v18 patch set here. > I have a question. Don't we allow creating subscriptions on an existing slot with a non-null 'inactive_timeout' set, such that the slot's 'inactive_timeout' is retained even after subscription creation? I tried this:

===================
--On publisher, create slot with 120sec inactive_timeout:
SELECT * FROM pg_create_logical_replication_slot('logical_slot1', 'pgoutput', false, true, true, 120);

--On subscriber, create sub using logical_slot1
create subscription mysubnew1_1 connection 'dbname=newdb1 host=localhost user=shveta port=5433' publication mypubnew1_1 WITH (failover = true, create_slot=false, slot_name='logical_slot1');

--Before creating sub, pg_replication_slots output:
   slot_name   | failover | synced | active | temp | conf |               lat                | inactive_timeout
---------------+----------+--------+--------+------+------+----------------------------------+------------------
 logical_slot1 | t        | f      | f      | f    | f    | 2024-03-25 11:11:55.375736+05:30 | 120

--After creating sub, pg_replication_slots output (inactive_timeout is 0 now):
   slot_name   | failover | synced | active | temp | conf | lat | inactive_timeout
---------------+----------+--------+--------+------+------+-----+------------------
 logical_slot1 | t        | f      | t      | f    | f    |     | 0
===================

In CreateSubscription, we call 'walrcv_alter_slot()' / 'ReplicationSlotAlter()' when create_slot is false. This call ends up setting inactive_timeout from 120sec to 0. Is it intentional? thanks Shveta
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Mon, Mar 25, 2024 at 10:28 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Mon, Mar 25, 2024 at 9:48 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > Such a test looks reasonable but shall we add equal to in the second > > part of the test (like '$last_inactive_time'::timestamptz >= > > > '$slot_creation_time'::timestamptz;). This is just to be sure that even if the test ran fast enough to give the sametime, the test shouldn't fail. I think it won't matter for correctness as well. Agree. I added that in v19 patch. I was having that concern in my mind. That's the reason I wasn't capturing current_time something like below for the same worry that current_timestamp might be the same (or nearly the same) as the slot creation time. That's why I ended up capturing current_timestamp in a separate query than clubbing it up with pg_create_physical_replication_slot. SELECT current_timestamp FROM pg_create_physical_replication_slot('foo'); > Apart from this, I have made minor changes in the comments. See and > let me know what you think of attached. LGTM. I've merged the diff into v19 patch. Please find the attached v19 patch. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
shveta malik
Date:
On Mon, Mar 25, 2024 at 11:53 AM shveta malik <shveta.malik@gmail.com> wrote: > > On Mon, Mar 25, 2024 at 10:33 AM shveta malik <shveta.malik@gmail.com> wrote: > > > > On Sun, Mar 24, 2024 at 3:06 PM Bharath Rupireddy > > <bharath.rupireddyforpostgres@gmail.com> wrote: > > > > > > I've attached the v18 patch set here. I have one concern: for synced slots on standby, how do we disallow invalidation due to inactive-timeout immediately after promotion? For synced slots, last_inactive_time and inactive_timeout are both set. Let's say I bring down primary for promotion of standby and then promote standby, there are chances that it may end up invalidating synced slots (considering standby is not brought down during promotion and thus inactive_timeout may already be past 'last_inactive_time'). I tried with a smaller inactive_timeout:

--Shutdown primary to prepare for planned promotion.

--On standby, one synced slot with last_inactive_time (lat) as 12:21
   slot_name   | failover | synced | active | temp | conf | res |               lat                | inactive_timeout
---------------+----------+--------+--------+------+------+-----+----------------------------------+------------------
 logical_slot1 | t        | t      | f      | f    | f    |     | 2024-03-25 12:21:09.020757+05:30 | 60

--wait for some time, now the time is 12:24
postgres=# select now();
               now
----------------------------------
 2024-03-25 12:24:17.616716+05:30

-- promote immediately:
./pg_ctl -D ../../standbydb/ promote -w

--on promoted standby:
postgres=# select pg_is_in_recovery();
 pg_is_in_recovery
-------------------
 f

--synced slot is invalidated immediately on promotion.
   slot_name   | failover | synced | active | temp | conf |       res        |               lat                | inactive_timeout
---------------+----------+--------+--------+------+------+------------------+----------------------------------+------------------
 logical_slot1 | t        | t      | f      | f    | f    | inactive_timeout | 2024-03-25 12:21:09.020757+05:30 |

thanks Shveta
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Mon, Mar 25, 2024 at 12:43 PM shveta malik <shveta.malik@gmail.com> wrote: > > On Mon, Mar 25, 2024 at 11:53 AM shveta malik <shveta.malik@gmail.com> wrote: > > > > On Mon, Mar 25, 2024 at 10:33 AM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > On Sun, Mar 24, 2024 at 3:06 PM Bharath Rupireddy > > > <bharath.rupireddyforpostgres@gmail.com> wrote: > > > > > > > > I've attached the v18 patch set here. > > I have one concern, for synced slots on standby, how do we disallow > invalidation due to inactive-timeout immediately after promotion? > > For synced slots, last_inactive_time and inactive_timeout are both > set. Let's say I bring down primary for promotion of standby and then > promote standby, there are chances that it may end up invalidating > synced slots (considering standby is not brought down during promotion > and thus inactive_timeout may already be past 'last_inactive_time'). > This raises the question of whether we need to set 'last_inactive_time' for synced slots on the standby at all. -- With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Mon, Mar 25, 2024 at 12:25:21PM +0530, Bharath Rupireddy wrote: > On Mon, Mar 25, 2024 at 10:28 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Mon, Mar 25, 2024 at 9:48 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > > Such a test looks reasonable but shall we add equal to in the second > > > part of the test (like '$last_inactive_time'::timestamptz >= > > > > '$slot_creation_time'::timestamptz;). This is just to be sure that even if the test ran fast enough to give the sametime, the test shouldn't fail. I think it won't matter for correctness as well. > > Agree. I added that in v19 patch. I was having that concern in my > mind. That's the reason I wasn't capturing current_time something like > below for the same worry that current_timestamp might be the same (or > nearly the same) as the slot creation time. That's why I ended up > capturing current_timestamp in a separate query than clubbing it up > with pg_create_physical_replication_slot. > > SELECT current_timestamp FROM pg_create_physical_replication_slot('foo'); > > > Apart from this, I have made minor changes in the comments. See and > > let me know what you think of attached. > Thanks! v19-0001 LGTM, just one Nit comment for 019_replslot_limit.pl: The code for "Get last_inactive_time value after the slot's creation" and "Check that the captured time is sane" is somehow duplicated: is it worth creating 2 functions? Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Mon, Mar 25, 2024 at 12:59:52PM +0530, Amit Kapila wrote: > On Mon, Mar 25, 2024 at 12:43 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > On Mon, Mar 25, 2024 at 11:53 AM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > On Mon, Mar 25, 2024 at 10:33 AM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > > > On Sun, Mar 24, 2024 at 3:06 PM Bharath Rupireddy > > > > <bharath.rupireddyforpostgres@gmail.com> wrote: > > > > > > > > > > I've attached the v18 patch set here. > > > > I have one concern, for synced slots on standby, how do we disallow > > invalidation due to inactive-timeout immediately after promotion? > > > > For synced slots, last_inactive_time and inactive_timeout are both > > set. Yeah, and I can see last_inactive_time is moving on the standby (while not the case on the primary), probably due to the sync worker slot acquisition/release which does not seem right. > Let's say I bring down primary for promotion of standby and then > > promote standby, there are chances that it may end up invalidating > > synced slots (considering standby is not brought down during promotion > > and thus inactive_timeout may already be past 'last_inactive_time'). > > > > This raises the question of whether we need to set > 'last_inactive_time' synced slots on the standby? Yeah, I think that last_inactive_time should stay at 0 on synced slots on the standby because such slots are not usable anyway (until the standby gets promoted). So, I think that last_inactive_time does not make sense if the slot never had the chance to be active. OTOH I think the timeout invalidation (if any) should be synced from primary. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
shveta malik
Date:
On Mon, Mar 25, 2024 at 1:37 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > Hi, > > On Mon, Mar 25, 2024 at 12:59:52PM +0530, Amit Kapila wrote: > > On Mon, Mar 25, 2024 at 12:43 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > On Mon, Mar 25, 2024 at 11:53 AM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > > > On Mon, Mar 25, 2024 at 10:33 AM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > > > > > On Sun, Mar 24, 2024 at 3:06 PM Bharath Rupireddy > > > > > <bharath.rupireddyforpostgres@gmail.com> wrote: > > > > > > > > > > > > I've attached the v18 patch set here. > > > > > > I have one concern, for synced slots on standby, how do we disallow > > > invalidation due to inactive-timeout immediately after promotion? > > > > > > For synced slots, last_inactive_time and inactive_timeout are both > > > set. > > Yeah, and I can see last_inactive_time is moving on the standby (while not the > case on the primary), probably due to the sync worker slot acquisition/release > which does not seem right. > > > Let's say I bring down primary for promotion of standby and then > > > promote standby, there are chances that it may end up invalidating > > > synced slots (considering standby is not brought down during promotion > > > and thus inactive_timeout may already be past 'last_inactive_time'). > > > > > > > This raises the question of whether we need to set > > 'last_inactive_time' synced slots on the standby? > > Yeah, I think that last_inactive_time should stay at 0 on synced slots on the > standby because such slots are not usable anyway (until the standby gets promoted). > > So, I think that last_inactive_time does not make sense if the slot never had > the chance to be active. > > OTOH I think the timeout invalidation (if any) should be synced from primary. Yes, I too feel that last_inactive_time makes sense only when the slot is available to be used. Synced slots are not available to be used until the standby is promoted, and thus setting last_inactive_time can be skipped for synced slots. But once a slot on the primary is invalidated due to inactive-timeout, that invalidation should be synced to the standby (which is happening currently). thanks Shveta
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Mon, Mar 25, 2024 at 02:07:21PM +0530, shveta malik wrote: > On Mon, Mar 25, 2024 at 1:37 PM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > > > Hi, > > > > On Mon, Mar 25, 2024 at 12:59:52PM +0530, Amit Kapila wrote: > > > On Mon, Mar 25, 2024 at 12:43 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > > > On Mon, Mar 25, 2024 at 11:53 AM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > > > > > On Mon, Mar 25, 2024 at 10:33 AM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > > > > > > > On Sun, Mar 24, 2024 at 3:06 PM Bharath Rupireddy > > > > > > <bharath.rupireddyforpostgres@gmail.com> wrote: > > > > > > > > > > > > > > I've attached the v18 patch set here. > > > > > > > > I have one concern, for synced slots on standby, how do we disallow > > > > invalidation due to inactive-timeout immediately after promotion? > > > > > > > > For synced slots, last_inactive_time and inactive_timeout are both > > > > set. > > > > Yeah, and I can see last_inactive_time is moving on the standby (while not the > > case on the primary), probably due to the sync worker slot acquisition/release > > which does not seem right. > > > > > Let's say I bring down primary for promotion of standby and then > > > > promote standby, there are chances that it may end up invalidating > > > > synced slots (considering standby is not brought down during promotion > > > > and thus inactive_timeout may already be past 'last_inactive_time'). > > > > > > > > > > This raises the question of whether we need to set > > > 'last_inactive_time' synced slots on the standby? > > > > Yeah, I think that last_inactive_time should stay at 0 on synced slots on the > > standby because such slots are not usable anyway (until the standby gets promoted). > > > > So, I think that last_inactive_time does not make sense if the slot never had > > the chance to be active. > > > > OTOH I think the timeout invalidation (if any) should be synced from primary. > > Yes, even I feel that last_inactive_time makes sense only when the > slot is available to be used. Synced slots are not available to be > used until standby is promoted and thus last_inactive_time can be > skipped to be set for synced_slots. But once primay is invalidated due > to inactive-timeout, that invalidation should be synced to standby > (which is happening currently). > yeah, syncing the invalidation and always keeping last_inactive_time to zero for synced slots looks right to me. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
shveta malik
Date:
On Mon, Mar 25, 2024 at 1:37 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > Hi, > > Yeah, and I can see last_inactive_time is moving on the standby (while not the > case on the primary), probably due to the sync worker slot acquisition/release > which does not seem right. > Yes, you are right, last_inactive_time keeps on moving for synced slots on standby. Once I disabled the slot-sync worker, it stays constant. Then it only changes if I call pg_sync_replication_slots(). On a different note, I noticed that we allow altering inactive_timeout for synced slots on standby, and then overwrite it with the primary's value in the next sync cycle. Steps:

====================
--Check pg_replication_slots for synced slot on standby, inactive_timeout is 120
   slot_name   | failover | synced | active | inactive_timeout
---------------+----------+--------+--------+------------------
 logical_slot1 | t        | t      | f      | 120

--Alter on standby
SELECT 'alter' FROM pg_alter_replication_slot('logical_slot1', 900);

--Check pg_replication_slots:
   slot_name   | failover | synced | active | inactive_timeout
---------------+----------+--------+--------+------------------
 logical_slot1 | t        | t      | f      | 900

--Run sync function
SELECT pg_sync_replication_slots();

--check again, inactive_timeout is set back to primary's value.
   slot_name   | failover | synced | active | inactive_timeout
---------------+----------+--------+--------+------------------
 logical_slot1 | t        | t      | f      | 120
====================

I feel altering a synced slot's inactive_timeout should be prohibited on the standby. It should always be in sync with the primary. Thoughts? I am listing the concerns raised by me: 1) create-subscription with create_slot=false overwriting inactive_timeout of existing slot ([1]) 2) last_inactive_time set for synced slots may result in invalidation of slot on promotion. ([2]) 3) alter replication slot to alter inactive_timeout for synced slots on standby, should this be allowed? [1]: https://www.postgresql.org/message-id/CAJpy0uAqBi%2BGbNn2ngJ-A_Z905CD3ss896bqY2ACUjGiF1Gkng%40mail.gmail.com [2]: https://www.postgresql.org/message-id/CAJpy0uCLu%2BmqAwAMum%3DpXE9YYsy0BE7hOSw_Wno5vjwpFY%3D63g%40mail.gmail.com thanks Shveta
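As an illustration of how concern (3) could be enforced, here is a rough sketch of a guard in ReplicationSlotAlter(); the placement and the error wording are made up for this example and are not taken from the posted patches:

    /*
     * Hypothetical guard: reject altering a slot that is being synced from
     * the primary while the server is still a standby.  Assumes it runs in
     * ReplicationSlotAlter() once MyReplicationSlot has been acquired.
     */
    if (RecoveryInProgress() && MyReplicationSlot->data.synced)
        ereport(ERROR,
                errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
                errmsg("cannot alter replication slot \"%s\"", name),
                errdetail("This slot is being synchronized from the primary server."));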
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Mon, Mar 25, 2024 at 02:39:50PM +0530, shveta malik wrote: > I am listing the concerns raised by me: > 3) alter replication slot to alter inactive_timout for synced slots on > standby, should this be allowed? I don't think it should be allowed. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Mon, Mar 25, 2024 at 1:37 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > > > I have one concern, for synced slots on standby, how do we disallow > > > invalidation due to inactive-timeout immediately after promotion? > > > > > > For synced slots, last_inactive_time and inactive_timeout are both > > > set. > > Yeah, and I can see last_inactive_time is moving on the standby (while not the > case on the primary), probably due to the sync worker slot acquisition/release > which does not seem right. > > > Let's say I bring down primary for promotion of standby and then > > > promote standby, there are chances that it may end up invalidating > > > synced slots (considering standby is not brought down during promotion > > > and thus inactive_timeout may already be past 'last_inactive_time'). > > > > > > > This raises the question of whether we need to set > > 'last_inactive_time' synced slots on the standby? > > Yeah, I think that last_inactive_time should stay at 0 on synced slots on the > standby because such slots are not usable anyway (until the standby gets promoted). > > So, I think that last_inactive_time does not make sense if the slot never had > the chance to be active. Right. Done that way i.e. not setting the last_inactive_time for slots both while releasing the slot and restoring from the disk. Also, I've added a TAP function to check if the captured times are sane per Bertrand's review comment. Please see the attached v20 patch. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
shveta malik
Date:
On Mon, Mar 25, 2024 at 3:31 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > Right. Done that way i.e. not setting the last_inactive_time for slots > both while releasing the slot and restoring from the disk. > > Also, I've added a TAP function to check if the captured times are > sane per Bertrand's review comment. > > Please see the attached v20 patch. Thanks for the patch. The issue of unnecessary invalidation of synced slots on promotion is resolved in this patch. thanks Shveta
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Mon, Mar 25, 2024 at 3:31 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > Right. Done that way i.e. not setting the last_inactive_time for slots > both while releasing the slot and restoring from the disk. > > Also, I've added a TAP function to check if the captured times are > sane per Bertrand's review comment. > > Please see the attached v20 patch. > Pushed, after minor changes. -- With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Mon, Mar 25, 2024 at 2:40 PM shveta malik <shveta.malik@gmail.com> wrote: > > On Mon, Mar 25, 2024 at 1:37 PM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > > > Hi, > > > > Yeah, and I can see last_inactive_time is moving on the standby (while not the > > case on the primary), probably due to the sync worker slot acquisition/release > > which does not seem right. > > > > Yes, you are right, last_inactive_time keeps on moving for synced > slots on standby. Once I disabled slot-sync worker, then it is > constant. Then it only changes if I call pg_sync_replication_slots(). > > On a different note, I noticed that we allow altering > inactive_timeout for synced-slots on standby. And again overwrite it > with the primary's value in the next sync cycle. Steps: > > ==================== > --Check pg_replication_slots for synced slot on standby, inactive_timeout is 120 > slot_name | failover | synced | active | inactive_timeout > ---------------+----------+--------+--------+------------------ > logical_slot1 | t | t | f | 120 > > --Alter on standby > SELECT 'alter' FROM pg_alter_replication_slot('logical_slot1', 900); > I think we should keep pg_alter_replication_slot() as the last priority among the remaining patches for this release. Let's try to first finish the primary functionality of inactive_timeout patch. Otherwise, I agree that the problem reported by you should be fixed. -- With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
shveta malik
Date:
On Mon, Mar 25, 2024 at 5:10 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > I think we should keep pg_alter_replication_slot() as the last > priority among the remaining patches for this release. Let's try to > first finish the primary functionality of inactive_timeout patch. > Otherwise, I agree that the problem reported by you should be fixed. Noted. Will focus on v18-002 patch now. I was debugging the flow and just noticed that RecoveryInProgress() always returns 'true' during StartupReplicationSlots()-->RestoreSlotFromDisk() (even on primary) as 'xlogctl->SharedRecoveryState' is always 'RECOVERY_STATE_CRASH' at that time. The 'xlogctl->SharedRecoveryState' is changed to 'RECOVERY_STATE_DONE' on primary and to 'RECOVERY_STATE_ARCHIVE' on standby at a later stage in StartupXLOG() (after we are done loading slots). The impact of this is, the condition in RestoreSlotFromDisk() in v20-001: if (!(RecoveryInProgress() && slot->data.synced)) slot->last_inactive_time = GetCurrentTimestamp(); is merely equivalent to: if (!slot->data.synced) slot->last_inactive_time = GetCurrentTimestamp(); Thus on primary, after restart, last_inactive_at is set correctly, while on promoted standby (new primary), last_inactive_at is always NULL after restart for the synced slots. thanks Shveta
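Given that observation, the restore-time stamping might as well not consult RecoveryInProgress() at all; a minimal sketch of the equivalent RestoreSlotFromDisk() hunk follows (field name as in the patch, illustrative only, not the committed code):

    /*
     * Sketch: SharedRecoveryState is still RECOVERY_STATE_CRASH when slots
     * are restored, so RecoveryInProgress() adds no information here and the
     * synced flag alone decides whether the slot gets stamped.
     */
    slot->in_use = true;
    slot->active_pid = 0;
    if (!slot->data.synced)
        slot->last_inactive_time = GetCurrentTimestamp();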
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Nathan Bossart
Date:
I apologize that I haven't been able to keep up with this thread for a while, but I'm happy to see the continued interest in $SUBJECT. On Sun, Mar 24, 2024 at 03:05:44PM +0530, Bharath Rupireddy wrote: > This commit particularly lets one specify the inactive_timeout for > a slot via SQL functions pg_create_physical_replication_slot and > pg_create_logical_replication_slot. Off-list, Bharath brought to my attention that the current proposal was to set the timeout at the slot level. While I think that is an entirely reasonable thing to support, the main use-case I have in mind for this feature is for an administrator that wants to prevent inactive slots from causing problems (e.g., transaction ID wraparound) on a server or a number of servers. For that use-case, I think a GUC would be much more convenient. Perhaps there could be a default inactive slot timeout GUC that would be used in the absence of a slot-level setting. Thoughts? -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
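One way the two levels could coexist, sketched purely as an illustration (the GUC name replication_slot_inactive_timeout and the helper are made up for this example and are not part of any posted patch):

    /* Hypothetical server-wide default, in seconds; 0 disables the behavior. */
    int         replication_slot_inactive_timeout = 0;

    /*
     * Effective timeout for a slot: the per-slot setting (the patch's
     * ReplicationSlotPersistentData.inactive_timeout) wins, and the GUC is
     * only a fallback when the slot has no setting of its own.
     */
    static int
    effective_inactive_timeout(ReplicationSlot *slot)
    {
        if (slot->data.inactive_timeout > 0)
            return slot->data.inactive_timeout;
        return replication_slot_inactive_timeout;
    }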
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
shveta malik
Date:
On Mon, Mar 25, 2024 at 12:43 PM shveta malik <shveta.malik@gmail.com> wrote: > > I have one concern, for synced slots on standby, how do we disallow > invalidation due to inactive-timeout immediately after promotion? > > For synced slots, last_inactive_time and inactive_timeout are both > set. Let's say I bring down primary for promotion of standby and then > promote standby, there are chances that it may end up invalidating > synced slots (considering standby is not brought down during promotion > and thus inactive_timeout may already be past 'last_inactive_time'). > On standby, if we decide to maintain valid last_inactive_time for synced slots, then invalidation is correctly restricted in InvalidateSlotForInactiveTimeout() for synced slots using the check: if (RecoveryInProgress() && slot->data.synced) return false; But immediately after promotion, we cannot rely on the above check, and thus there is a possibility of synced slot invalidation. To maintain consistent behavior regarding the setting of last_inactive_time for synced slots, similar to user slots, one potential solution to prevent this invalidation issue is to update the last_inactive_time of all synced slots within the ShutDownSlotSync() function during FinishWalRecovery(). This approach ensures that promotion doesn't immediately invalidate slots, and henceforth, we possess a correct last_inactive_time as a basis for invalidation going forward. This will be equivalent to updating last_inactive_time during restart (but without an actual restart during promotion). A plus point of maintaining last_inactive_time for synced slots is that it can tell the user when the sync was last attempted on that particular slot by the background slot sync worker or the SQL function. Thoughts? thanks Shveta
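A rough sketch of that idea follows; it mirrors how slot.c walks the shared slot array and uses the patch's last_inactive_time field, but it is illustrative only, not the change that was eventually posted:

    /*
     * Sketch: stamp every synced slot just before the slot sync machinery
     * shuts down during promotion, so inactive_timeout is measured from the
     * promotion time rather than from a stale (or unset) value.
     */
    static void
    update_synced_slots_last_inactive_time(void)
    {
        TimestampTz now = GetCurrentTimestamp();

        LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);

        for (int i = 0; i < max_replication_slots; i++)
        {
            ReplicationSlot *s = &ReplicationSlotCtl->replication_slots[i];

            if (!s->in_use || !s->data.synced)
                continue;

            SpinLockAcquire(&s->mutex);
            s->last_inactive_time = now;
            SpinLockRelease(&s->mutex);
        }

        LWLockRelease(ReplicationSlotControlLock);
    }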
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Tue, Mar 26, 2024 at 1:24 AM Nathan Bossart <nathandbossart@gmail.com> wrote: > > > On Sun, Mar 24, 2024 at 03:05:44PM +0530, Bharath Rupireddy wrote: > > This commit particularly lets one specify the inactive_timeout for > > a slot via SQL functions pg_create_physical_replication_slot and > > pg_create_logical_replication_slot. > > Off-list, Bharath brought to my attention that the current proposal was to > set the timeout at the slot level. While I think that is an entirely > reasonable thing to support, the main use-case I have in mind for this > feature is for an administrator that wants to prevent inactive slots from > causing problems (e.g., transaction ID wraparound) on a server or a number > of servers. For that use-case, I think a GUC would be much more > convenient. Perhaps there could be a default inactive slot timeout GUC > that would be used in the absence of a slot-level setting. Thoughts? > Yeah, that is a valid point. One of the reasons for keeping it at slot level was to allow different subscribers/output plugins to have a different setting for invalid_timeout for their respective slots based on their usage. Now, having it as a GUC also has some valid use cases as pointed out by you but I am not sure having both at slot level and at GUC level is required. I was a bit inclined to have it at slot level for now and then based on some field usage report we can later add GUC as well. -- With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Tue, Mar 26, 2024 at 9:30 AM shveta malik <shveta.malik@gmail.com> wrote: > > On Mon, Mar 25, 2024 at 12:43 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > I have one concern, for synced slots on standby, how do we disallow > > invalidation due to inactive-timeout immediately after promotion? > > > > For synced slots, last_inactive_time and inactive_timeout are both > > set. Let's say I bring down primary for promotion of standby and then > > promote standby, there are chances that it may end up invalidating > > synced slots (considering standby is not brought down during promotion > > and thus inactive_timeout may already be past 'last_inactive_time'). > > > > On standby, if we decide to maintain valid last_inactive_time for > synced slots, then invalidation is correctly restricted in > InvalidateSlotForInactiveTimeout() for synced slots using the check: > > if (RecoveryInProgress() && slot->data.synced) > return false; > > But immediately after promotion, we can not rely on the above check > and thus possibility of synced slots invalidation is there. To > maintain consistent behavior regarding the setting of > last_inactive_time for synced slots, similar to user slots, one > potential solution to prevent this invalidation issue is to update the > last_inactive_time of all synced slots within the ShutDownSlotSync() > function during FinishWalRecovery(). This approach ensures that > promotion doesn't immediately invalidate slots, and henceforth, we > possess a correct last_inactive_time as a basis for invalidation going > forward. This will be equivalent to updating last_inactive_time during > restart (but without actual restart during promotion). > The plus point of maintaining last_inactive_time for synced slots > could be, this can provide data to the user on when last time the sync > was attempted on that particular slot by background slot sync worker > or SQl function. Thoughts? Please find the attached v21 patch implementing the above idea. It also has changes for renaming last_inactive_time to inactive_since. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Tue, Mar 26, 2024 at 09:30:32AM +0530, shveta malik wrote: > On Mon, Mar 25, 2024 at 12:43 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > I have one concern, for synced slots on standby, how do we disallow > > invalidation due to inactive-timeout immediately after promotion? > > > > For synced slots, last_inactive_time and inactive_timeout are both > > set. Let's say I bring down primary for promotion of standby and then > > promote standby, there are chances that it may end up invalidating > > synced slots (considering standby is not brought down during promotion > > and thus inactive_timeout may already be past 'last_inactive_time'). > > > > On standby, if we decide to maintain valid last_inactive_time for > synced slots, then invalidation is correctly restricted in > InvalidateSlotForInactiveTimeout() for synced slots using the check: > > if (RecoveryInProgress() && slot->data.synced) > return false; Right. > But immediately after promotion, we can not rely on the above check > and thus possibility of synced slots invalidation is there. To > maintain consistent behavior regarding the setting of > last_inactive_time for synced slots, similar to user slots, one > potential solution to prevent this invalidation issue is to update the > last_inactive_time of all synced slots within the ShutDownSlotSync() > function during FinishWalRecovery(). This approach ensures that > promotion doesn't immediately invalidate slots, and henceforth, we > possess a correct last_inactive_time as a basis for invalidation going > forward. This will be equivalent to updating last_inactive_time during > restart (but without actual restart during promotion). > The plus point of maintaining last_inactive_time for synced slots > could be, this can provide data to the user on when last time the sync > was attempted on that particular slot by background slot sync worker > or SQl function. Thoughts? Yeah, another plus point is that if the primary is down then one could look at the synced "inactive_since" on the standby to get an idea of it (depends on the last sync though). The issue that I can see with your proposal is: what if one synced the slots manually (with pg_sync_replication_slots()) but does not use the sync worker? Then I think ShutDownSlotSync() is not going to help in that case. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Sun, Mar 24, 2024 at 3:05 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > I've attached the v18 patch set here. I've also addressed earlier > review comments from Amit, Ajin Cherian. Note that I've added new > invalidation mechanism tests in a separate TAP test file just because > I don't want to clutter or bloat any of the existing files and spread > tests for physical slots and logical slots into separate existing TAP > files. > Review comments on v18_0002 and v18_0005 ======================================= 1. ReplicationSlotCreate(const char *name, bool db_specific, ReplicationSlotPersistency persistency, - bool two_phase, bool failover, bool synced) + bool two_phase, bool failover, bool synced, + int inactive_timeout) { ReplicationSlot *slot = NULL; int i; @@ -345,6 +348,18 @@ ReplicationSlotCreate(const char *name, bool db_specific, errmsg("cannot enable failover for a temporary replication slot")); } + if (inactive_timeout > 0) + { + /* + * Do not allow users to set inactive_timeout for temporary slots, + * because temporary slots will not be saved to the disk. + */ + if (persistency == RS_TEMPORARY) + ereport(ERROR, + errcode(ERRCODE_FEATURE_NOT_SUPPORTED), + errmsg("cannot set inactive_timeout for a temporary replication slot")); + } We have decided to update inactive_since for temporary slots. So, unless there is some reason, we should allow inactive_timeout to also be set for temporary slots. 2. --- a/src/backend/catalog/system_views.sql +++ b/src/backend/catalog/system_views.sql @@ -1024,6 +1024,7 @@ CREATE VIEW pg_replication_slots AS L.safe_wal_size, L.two_phase, L.last_inactive_time, + L.inactive_timeout, Shall we keep inactive_timeout before last_inactive_time/inactive_since? I don't have any strong reason to propose that way apart from that the former is provided by the user. 3. @@ -287,6 +288,13 @@ pg_get_replication_slots(PG_FUNCTION_ARGS) slot_contents = *slot; SpinLockRelease(&slot->mutex); + /* + * Here's an opportunity to invalidate inactive replication slots + * based on timeout, so let's do it. + */ + if (InvalidateReplicationSlotForInactiveTimeout(slot, false, true, true)) + invalidated = true; I don't think we should try to invalidate the slots in pg_get_replication_slots. This function's purpose is to get the current information on slots and has no intention to perform any work for slots. Any error due to invalidation won't be what the user would be expecting here. 4. +static bool +InvalidateSlotForInactiveTimeout(ReplicationSlot *slot, + bool need_control_lock, + bool need_mutex) { ... ... + if (need_control_lock) + LWLockAcquire(ReplicationSlotControlLock, LW_SHARED); + + Assert(LWLockHeldByMeInMode(ReplicationSlotControlLock, LW_SHARED)); + + /* + * Check if the slot needs to be invalidated due to inactive_timeout. We + * do this with the spinlock held to avoid race conditions -- for example + * the restart_lsn could move forward, or the slot could be dropped. + */ + if (need_mutex) + SpinLockAcquire(&slot->mutex); ... I find this combination of parameters a bit strange. Because, say if need_mutex is false and need_control_lock is true then that means this function will acquire LWlock after acquiring spinlock which is unacceptable. Now, this may not happen in practice as the callers won't pass such a combination but still, this functionality should be improved. -- With Regards, Amit Kapila.
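For point 4, one ordering-safe shape would be to let the caller say only whether it already holds ReplicationSlotControlLock and to always take (and release) the spinlock inside, so an LWLock is never requested while a spinlock is held. The sketch below is illustrative only; the field and function names follow the patch under review, and TimestampDifferenceExceeds() is used for the actual timeout test:

    /*
     * Sketch: does this slot's inactivity exceed its configured timeout?
     * The clock is read before taking the spinlock, and the control lock
     * (if requested) is always taken before the spinlock.
     */
    static bool
    slot_inactive_timeout_expired(ReplicationSlot *slot, bool need_control_lock)
    {
        TimestampTz now = GetCurrentTimestamp();
        bool        expired = false;

        if (need_control_lock)
            LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);

        SpinLockAcquire(&slot->mutex);
        if (slot->data.inactive_timeout > 0 &&
            slot->inactive_since > 0 &&
            TimestampDifferenceExceeds(slot->inactive_since, now,
                                       slot->data.inactive_timeout * 1000))
            expired = true;
        SpinLockRelease(&slot->mutex);

        if (need_control_lock)
            LWLockRelease(ReplicationSlotControlLock);

        return expired;
    }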
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Tue, Mar 26, 2024 at 05:55:11AM +0000, Bertrand Drouvot wrote: > Hi, > > On Tue, Mar 26, 2024 at 09:30:32AM +0530, shveta malik wrote: > > On Mon, Mar 25, 2024 at 12:43 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > I have one concern, for synced slots on standby, how do we disallow > > > invalidation due to inactive-timeout immediately after promotion? > > > > > > For synced slots, last_inactive_time and inactive_timeout are both > > > set. Let's say I bring down primary for promotion of standby and then > > > promote standby, there are chances that it may end up invalidating > > > synced slots (considering standby is not brought down during promotion > > > and thus inactive_timeout may already be past 'last_inactive_time'). > > > > > > > On standby, if we decide to maintain valid last_inactive_time for > > synced slots, then invalidation is correctly restricted in > > InvalidateSlotForInactiveTimeout() for synced slots using the check: > > > > if (RecoveryInProgress() && slot->data.synced) > > return false; > > Right. > > > But immediately after promotion, we can not rely on the above check > > and thus possibility of synced slots invalidation is there. To > > maintain consistent behavior regarding the setting of > > last_inactive_time for synced slots, similar to user slots, one > > potential solution to prevent this invalidation issue is to update the > > last_inactive_time of all synced slots within the ShutDownSlotSync() > > function during FinishWalRecovery(). This approach ensures that > > promotion doesn't immediately invalidate slots, and henceforth, we > > possess a correct last_inactive_time as a basis for invalidation going > > forward. This will be equivalent to updating last_inactive_time during > > restart (but without actual restart during promotion). > > The plus point of maintaining last_inactive_time for synced slots > > could be, this can provide data to the user on when last time the sync > > was attempted on that particular slot by background slot sync worker > > or SQl function. Thoughts? > > Yeah, another plus point is that if the primary is down then one could look > at the synced "active_since" on the standby to get an idea of it (depends of the > last sync though). > > The issue that I can see with your proposal is: what if one synced the slots > manually (with pg_sync_replication_slots()) but does not use the sync worker? > Then I think ShutDownSlotSync() is not going to help in that case. It looks like ShutDownSlotSync() is always called (even if sync_replication_slots = off), so that sounds ok to me (I should have checked the code, I was under the impression ShutDownSlotSync() was not called if sync_replication_slots = off). Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
shveta malik
Date:
On Tue, Mar 26, 2024 at 11:36 AM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > > > The issue that I can see with your proposal is: what if one synced the slots > > manually (with pg_sync_replication_slots()) but does not use the sync worker? > > Then I think ShutDownSlotSync() is not going to help in that case. > > It looks like ShutDownSlotSync() is always called (even if sync_replication_slots = off), > so that sounds ok to me (I should have checked the code, I was under the impression > ShutDownSlotSync() was not called if sync_replication_slots = off). Right, it is called irrespective of sync_replication_slots. thanks Shveta
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
shveta malik
Date:
On Tue, Mar 26, 2024 at 11:08 AM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Tue, Mar 26, 2024 at 9:30 AM shveta malik <shveta.malik@gmail.com> wrote: > > > > On Mon, Mar 25, 2024 at 12:43 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > I have one concern, for synced slots on standby, how do we disallow > > > invalidation due to inactive-timeout immediately after promotion? > > > > > > For synced slots, last_inactive_time and inactive_timeout are both > > > set. Let's say I bring down primary for promotion of standby and then > > > promote standby, there are chances that it may end up invalidating > > > synced slots (considering standby is not brought down during promotion > > > and thus inactive_timeout may already be past 'last_inactive_time'). > > > > > > > On standby, if we decide to maintain valid last_inactive_time for > > synced slots, then invalidation is correctly restricted in > > InvalidateSlotForInactiveTimeout() for synced slots using the check: > > > > if (RecoveryInProgress() && slot->data.synced) > > return false; > > > > But immediately after promotion, we can not rely on the above check > > and thus possibility of synced slots invalidation is there. To > > maintain consistent behavior regarding the setting of > > last_inactive_time for synced slots, similar to user slots, one > > potential solution to prevent this invalidation issue is to update the > > last_inactive_time of all synced slots within the ShutDownSlotSync() > > function during FinishWalRecovery(). This approach ensures that > > promotion doesn't immediately invalidate slots, and henceforth, we > > possess a correct last_inactive_time as a basis for invalidation going > > forward. This will be equivalent to updating last_inactive_time during > > restart (but without actual restart during promotion). > > The plus point of maintaining last_inactive_time for synced slots > > could be, this can provide data to the user on when last time the sync > > was attempted on that particular slot by background slot sync worker > > or SQl function. Thoughts? > > Please find the attached v21 patch implementing the above idea. It > also has changes for renaming last_inactive_time to inactive_since. > Thanks for the patch. I have tested this patch alone, and it does what it says. One additional thing which I noticed is that now it sets inactive_since for temp slots as well, but that idea looks fine to me. I could not test the 'invalidation on promotion bug' with this change, as that needed rebasing of the rest of the patches. A few trivial things: 1) Commit msg: ensures the value is set to current timestamp during the shutdown to help correctly interpret the time if the standby gets promoted without a restart. shutdown --> shutdown of slot sync worker (as it was not clear if it is instance shutdown or something else) 2) 'The time since the slot has became inactive'. has became-->has become or just became Please check it in all the files. There are multiple places. thanks Shveta
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Tue, Mar 26, 2024 at 11:07:51AM +0530, Bharath Rupireddy wrote: > On Tue, Mar 26, 2024 at 9:30 AM shveta malik <shveta.malik@gmail.com> wrote: > > But immediately after promotion, we can not rely on the above check > > and thus possibility of synced slots invalidation is there. To > > maintain consistent behavior regarding the setting of > > last_inactive_time for synced slots, similar to user slots, one > > potential solution to prevent this invalidation issue is to update the > > last_inactive_time of all synced slots within the ShutDownSlotSync() > > function during FinishWalRecovery(). This approach ensures that > > promotion doesn't immediately invalidate slots, and henceforth, we > > possess a correct last_inactive_time as a basis for invalidation going > > forward. This will be equivalent to updating last_inactive_time during > > restart (but without actual restart during promotion). > > The plus point of maintaining last_inactive_time for synced slots > > could be, this can provide data to the user on when last time the sync > > was attempted on that particular slot by background slot sync worker > > or SQl function. Thoughts? > > Please find the attached v21 patch implementing the above idea. It > also has changes for renaming last_inactive_time to inactive_since. Thanks! A few comments: 1 === One trailing whitespace: Applying: Fix review comments for slot's last_inactive_time property .git/rebase-apply/patch:433: trailing whitespace. # got a valid inactive_since value representing the last slot sync time. warning: 1 line adds whitespace errors. 2 === It looks like inactive_since is set to the current timestamp on the standby each time the sync worker does a cycle: primary: postgres=# select slot_name,inactive_since from pg_replication_slots where failover = 't'; slot_name | inactive_since -------------+------------------------------- lsub27_slot | 2024-03-26 07:39:19.745517+00 lsub28_slot | 2024-03-26 07:40:24.953826+00 standby: postgres=# select slot_name,inactive_since from pg_replication_slots where failover = 't'; slot_name | inactive_since -------------+------------------------------- lsub27_slot | 2024-03-26 07:43:56.387324+00 lsub28_slot | 2024-03-26 07:43:56.387338+00 I don't think that should be the case. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Tue, Mar 26, 2024 at 1:15 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > 2 === > > It looks like inactive_since is set to the current timestamp on the standby > each time the sync worker does a cycle: > > primary: > > postgres=# select slot_name,inactive_since from pg_replication_slots where failover = 't'; > slot_name | inactive_since > -------------+------------------------------- > lsub27_slot | 2024-03-26 07:39:19.745517+00 > lsub28_slot | 2024-03-26 07:40:24.953826+00 > > standby: > > postgres=# select slot_name,inactive_since from pg_replication_slots where failover = 't'; > slot_name | inactive_since > -------------+------------------------------- > lsub27_slot | 2024-03-26 07:43:56.387324+00 > lsub28_slot | 2024-03-26 07:43:56.387338+00 > > I don't think that should be the case. > But why? This is exactly what we discussed in another thread where we agreed to update inactive_since even for sync slots. In each sync cycle, we acquire/release the slot, so the inactive_since gets updated. See synchronize_one_slot(). -- With Regards, Amit Kapila.
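[A minimal sketch of the release path being referred to above, simplified from the patch under discussion and not the exact code: releasing the slot is what stamps inactive_since, so every sync cycle that acquires and releases a synced slot refreshes the value.]

    /* Take the timestamp before the spinlock to avoid a system call under it. */
    TimestampTz now = GetCurrentTimestamp();

    SpinLockAcquire(&slot->mutex);
    slot->active_pid = 0;          /* the slot is no longer acquired ... */
    slot->inactive_since = now;    /* ... so it became inactive "now" */
    SpinLockRelease(&slot->mutex);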
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Tue, Mar 26, 2024 at 01:37:21PM +0530, Amit Kapila wrote: > On Tue, Mar 26, 2024 at 1:15 PM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > > > 2 === > > > > It looks like inactive_since is set to the current timestamp on the standby > > each time the sync worker does a cycle: > > > > primary: > > > > postgres=# select slot_name,inactive_since from pg_replication_slots where failover = 't'; > > slot_name | inactive_since > > -------------+------------------------------- > > lsub27_slot | 2024-03-26 07:39:19.745517+00 > > lsub28_slot | 2024-03-26 07:40:24.953826+00 > > > > standby: > > > > postgres=# select slot_name,inactive_since from pg_replication_slots where failover = 't'; > > slot_name | inactive_since > > -------------+------------------------------- > > lsub27_slot | 2024-03-26 07:43:56.387324+00 > > lsub28_slot | 2024-03-26 07:43:56.387338+00 > > > > I don't think that should be the case. > > > > But why? This is exactly what we discussed in another thread where we > agreed to update inactive_since even for sync slots. Hum, I thought we agreed to "sync" it and to "update it to current time" only at promotion time. I don't think updating inactive_since to current time during each cycle makes sense (I mean I understand the use case: being able to say when slots have been sync, but if this is what we want then we should consider an extra view or an extra field but not relying on the inactive_since one). If the primary goes down, not updating inactive_since to the current time could also provide benefit such as knowing the inactive_since of the primary slots (from the standby) the last time it has been synced. If we update it to the current time then this information is lost. > In each sync > cycle, we acquire/release the slot, so the inactive_since gets > updated. See synchronize_one_slot(). Right, and I think we should put an extra condition if in recovery. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
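[For clarity, a sketch of what such an in-recovery condition could look like in the release path; this is one possible shape, not the patch itself.]

    TimestampTz now = 0;

    /* Skip stamping inactive_since for synced slots while in recovery. */
    if (!(RecoveryInProgress() && slot->data.synced))
        now = GetCurrentTimestamp();

    SpinLockAcquire(&slot->mutex);
    if (now > 0)
        slot->inactive_since = now;   /* otherwise keep the value synced from the primary */
    SpinLockRelease(&slot->mutex);

[With this shape, a synced slot on the standby keeps whatever inactive_since was copied from the primary and only gets a locally generated value once it is no longer a synced slot in recovery.]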
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Tue, Mar 26, 2024 at 11:26 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > Review comments on v18_0002 and v18_0005 > ======================================= > > 1. > We have decided to update inactive_since for temporary slots. So, > unless there is some reason, we should allow inactive_timeout to also > be set for temporary slots. WFM. A temporary slot that's inactive for a long time before even the server isn't shutdown can utilize this inactive_timeout based invalidation mechanism. And, I'd also vote for we being consistent for temporary and synced slots. > L.last_inactive_time, > + L.inactive_timeout, > > Shall we keep inactive_timeout before > last_inactive_time/inactive_since? I don't have any strong reason to > propose that way apart from that the former is provided by the user. Done. > + if (InvalidateReplicationSlotForInactiveTimeout(slot, false, true, true)) > + invalidated = true; > > I don't think we should try to invalidate the slots in > pg_get_replication_slots. This function's purpose is to get the > current information on slots and has no intention to perform any work > for slots. Any error due to invalidation won't be what the user would > be expecting here. Agree. Removed. > 4. > +static bool > +InvalidateSlotForInactiveTimeout(ReplicationSlot *slot, > + bool need_control_lock, > + bool need_mutex) > { > ... > ... > + if (need_control_lock) > + LWLockAcquire(ReplicationSlotControlLock, LW_SHARED); > + > + Assert(LWLockHeldByMeInMode(ReplicationSlotControlLock, LW_SHARED)); > + > + /* > + * Check if the slot needs to be invalidated due to inactive_timeout. We > + * do this with the spinlock held to avoid race conditions -- for example > + * the restart_lsn could move forward, or the slot could be dropped. > + */ > + if (need_mutex) > + SpinLockAcquire(&slot->mutex); > ... > > I find this combination of parameters a bit strange. Because, say if > need_mutex is false and need_control_lock is true then that means this > function will acquire LWlock after acquiring spinlock which is > unacceptable. Now, this may not happen in practice as the callers > won't pass such a combination but still, this functionality should be > improved. Right. Either we need two locks or not. So, changed it to use just one bool need_locks, upon set both control lock and spin lock are acquired and released. On Mon, Mar 25, 2024 at 10:33 AM shveta malik <shveta.malik@gmail.com> wrote: > > patch 002: > > 2) > slotsync.c: > > ReplicationSlotCreate(remote_slot->name, true, RS_TEMPORARY, > remote_slot->two_phase, > remote_slot->failover, > - true); > + true, 0); > > + slot->data.inactive_timeout = remote_slot->inactive_timeout; > > Is there a reason we are not passing 'remote_slot->inactive_timeout' > to ReplicationSlotCreate() directly? The slot there gets created temporarily for which we were not supporting inactive_timeout being set. But, in the latest v22 patch we are supporting, so passing the remote_slot->inactive_timeout directly. > 3) > slotfuncs.c > pg_create_logical_replication_slot(): > + int inactive_timeout = PG_GETARG_INT32(5); > > Can we mention here that timeout is in seconds either in comment or > rename variable to inactive_timeout_secs? > > Please do this for create_physical_replication_slot(), > create_logical_replication_slot(), > pg_create_physical_replication_slot() as well. Added /* in seconds */ next the variable declaration. > --------- > 4) > + int inactive_timeout; /* The amount of time in seconds the slot > + * is allowed to be inactive. 
*/ > } LogicalSlotInfo; > > Do we need to mention "before getting invalided" like other places > (in last patch)? Done. > 5) > Same at these two places. "before getting invalided" to be added in > the last patch otherwise the info is incompleted. > > + > + /* The amount of time in seconds the slot is allowed to be inactive */ > + int inactive_timeout; > } ReplicationSlotPersistentData; > > > + * inactive_timeout: The amount of time in seconds the slot is allowed to be > + * inactive. > */ > void > ReplicationSlotCreate(const char *name, bool db_specific, > Same here. "before getting invalidated" ? Done. On Tue, Mar 26, 2024 at 12:04 PM shveta malik <shveta.malik@gmail.com> wrote: > > > Please find the attached v21 patch implementing the above idea. It > > also has changes for renaming last_inactive_time to inactive_since. > > Thanks for the patch. I have tested this patch alone, and it does what > it says. One additional thing which I noticed is that now it sets > inactive_since for temp slots as well, but that idea looks fine to me. Right. Let's be consistent by treating all slots the same. > I could not test 'invalidation on promotion bug' with this change, as > that needed rebasing of the rest of the patches. Please use the v22 patch set. > Few trivial things: > > 1) > Commti msg: > > ensures the value is set to current timestamp during the > shutdown to help correctly interpret the time if the standby gets > promoted without a restart. > > shutdown --> shutdown of slot sync worker (as it was not clear if it > is instance shutdown or something else) Changed it to "shutdown of slot sync machinery" to be consistent with the comments. > 2) > 'The time since the slot has became inactive'. > > has became-->has become > or just became > > Please check it in all the files. There are multiple places. Fixed. Please see the attached v23 patches. I've addressed all the review comments received so far from Amit and Shveta. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
shveta malik
Date:
On Tue, Mar 26, 2024 at 2:27 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > > > > 1) > > Commti msg: > > > > ensures the value is set to current timestamp during the > > shutdown to help correctly interpret the time if the standby gets > > promoted without a restart. > > > > shutdown --> shutdown of slot sync worker (as it was not clear if it > > is instance shutdown or something else) > > Changed it to "shutdown of slot sync machinery" to be consistent with > the comments. Thanks for addressing the comments. Just to give more clarity here (so that you take a informed decision), I am not sure if we actually shut down slot-sync machinery. We only shot down slot sync worker. Slot-sync machinery can still be used using 'pg_sync_replication_slots' SQL function. I can easily reproduce the scenario where SQL function and reset_synced_slots_info() are going in parallel where the latter hits 'Assert(s->active_pid == 0)' due to the fact that parallel SQL sync function is active on that slot. thanks Shveta
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Tue, Mar 26, 2024 at 02:27:17PM +0530, Bharath Rupireddy wrote: > Please use the v22 patch set. Thanks! 1 === +reset_synced_slots_info(void) I'm not sure "reset" is the right word, what about slot_sync_shutdown_update()? 2 === + for (int i = 0; i < max_replication_slots; i++) + { + ReplicationSlot *s = &ReplicationSlotCtl->replication_slots[i]; + + /* Check if it is a synchronized slot */ + if (s->in_use && s->data.synced) + { + TimestampTz now; + + Assert(SlotIsLogical(s)); + Assert(s->active_pid == 0); + + /* + * Set the time since the slot has become inactive after shutting + * down slot sync machinery. This helps correctly interpret the + * time if the standby gets promoted without a restart. We get the + * current time beforehand to avoid a system call while holding + * the lock. + */ + now = GetCurrentTimestamp(); What about moving "now = GetCurrentTimestamp()" outside of the for loop? (it would be less costly and probably good enough). Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
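[To illustrate the second point, a sketch of the same loop with the timestamp taken once before scanning the slot array; the function name and the surrounding locking follow the quoted hunk and are assumptions, not the actual v22 code.]

    static void
    reset_synced_slots_info(void)
    {
        /* One system call for the whole pass, instead of one per synced slot. */
        TimestampTz now = GetCurrentTimestamp();

        LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);

        for (int i = 0; i < max_replication_slots; i++)
        {
            ReplicationSlot *s = &ReplicationSlotCtl->replication_slots[i];

            /* Only synchronized slots are touched here. */
            if (s->in_use && s->data.synced)
            {
                Assert(SlotIsLogical(s));
                Assert(s->active_pid == 0);

                SpinLockAcquire(&s->mutex);
                s->inactive_since = now;
                SpinLockRelease(&s->mutex);
            }
        }

        LWLockRelease(ReplicationSlotControlLock);
    }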
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
shveta malik
Date:
On Tue, Mar 26, 2024 at 1:54 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > Hi, > > On Tue, Mar 26, 2024 at 01:37:21PM +0530, Amit Kapila wrote: > > On Tue, Mar 26, 2024 at 1:15 PM Bertrand Drouvot > > <bertranddrouvot.pg@gmail.com> wrote: > > > > > > 2 === > > > > > > It looks like inactive_since is set to the current timestamp on the standby > > > each time the sync worker does a cycle: > > > > > > primary: > > > > > > postgres=# select slot_name,inactive_since from pg_replication_slots where failover = 't'; > > > slot_name | inactive_since > > > -------------+------------------------------- > > > lsub27_slot | 2024-03-26 07:39:19.745517+00 > > > lsub28_slot | 2024-03-26 07:40:24.953826+00 > > > > > > standby: > > > > > > postgres=# select slot_name,inactive_since from pg_replication_slots where failover = 't'; > > > slot_name | inactive_since > > > -------------+------------------------------- > > > lsub27_slot | 2024-03-26 07:43:56.387324+00 > > > lsub28_slot | 2024-03-26 07:43:56.387338+00 > > > > > > I don't think that should be the case. > > > > > > > But why? This is exactly what we discussed in another thread where we > > agreed to update inactive_since even for sync slots. > > Hum, I thought we agreed to "sync" it and to "update it to current time" > only at promotion time. I think there may have been some misunderstanding here. But now if I rethink this, I am fine with 'inactive_since' getting synced from primary to standby. But if we do that, we need to add docs stating "inactive_since" represents primary's inactivity and not standby's slots inactivity for synced slots. The reason for this clarification is that the synced slot might be generated much later, yet 'inactive_since' is synced from the primary, potentially indicating a time considerably earlier than when the synced slot was actually created. Another approach could be that "inactive_since" for synced slot actually gives its own inactivity data rather than giving primary's slot data. We update inactive_since on standby only at 3 occasions: 1) at the time of creation of the synced slot. 2) during standby restart. 3) during promotion of standby. I have attached a sample patch for this idea as.txt file. I am fine with any of these approaches. One gives data synced from primary for synced slots, while another gives actual inactivity data of synced slots. thanks Shveta
Attachment
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Tue, Mar 26, 2024 at 03:17:36PM +0530, shveta malik wrote: > On Tue, Mar 26, 2024 at 1:54 PM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > > > Hi, > > > > On Tue, Mar 26, 2024 at 01:37:21PM +0530, Amit Kapila wrote: > > > On Tue, Mar 26, 2024 at 1:15 PM Bertrand Drouvot > > > <bertranddrouvot.pg@gmail.com> wrote: > > > > > > > > 2 === > > > > > > > > It looks like inactive_since is set to the current timestamp on the standby > > > > each time the sync worker does a cycle: > > > > > > > > primary: > > > > > > > > postgres=# select slot_name,inactive_since from pg_replication_slots where failover = 't'; > > > > slot_name | inactive_since > > > > -------------+------------------------------- > > > > lsub27_slot | 2024-03-26 07:39:19.745517+00 > > > > lsub28_slot | 2024-03-26 07:40:24.953826+00 > > > > > > > > standby: > > > > > > > > postgres=# select slot_name,inactive_since from pg_replication_slots where failover = 't'; > > > > slot_name | inactive_since > > > > -------------+------------------------------- > > > > lsub27_slot | 2024-03-26 07:43:56.387324+00 > > > > lsub28_slot | 2024-03-26 07:43:56.387338+00 > > > > > > > > I don't think that should be the case. > > > > > > > > > > But why? This is exactly what we discussed in another thread where we > > > agreed to update inactive_since even for sync slots. > > > > Hum, I thought we agreed to "sync" it and to "update it to current time" > > only at promotion time. > > I think there may have been some misunderstanding here. Indeed ;-) > But now if I > rethink this, I am fine with 'inactive_since' getting synced from > primary to standby. But if we do that, we need to add docs stating > "inactive_since" represents primary's inactivity and not standby's > slots inactivity for synced slots. Yeah sure. > The reason for this clarification > is that the synced slot might be generated much later, yet > 'inactive_since' is synced from the primary, potentially indicating a > time considerably earlier than when the synced slot was actually > created. Right. > Another approach could be that "inactive_since" for synced slot > actually gives its own inactivity data rather than giving primary's > slot data. We update inactive_since on standby only at 3 occasions: > 1) at the time of creation of the synced slot. > 2) during standby restart. > 3) during promotion of standby. > > I have attached a sample patch for this idea as.txt file. Thanks! > I am fine with any of these approaches. One gives data synced from > primary for synced slots, while another gives actual inactivity data > of synced slots. What about another approach?: inactive_since gives data synced from primary for synced slots and another dedicated field (could be added later...) could represent what you suggest as the other option. Another cons of updating inactive_since at the current time during each slot sync cycle is that calling GetCurrentTimestamp() very frequently (during each sync cycle of very active slots) could be too costly. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Tue, Mar 26, 2024 at 3:12 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > On Tue, Mar 26, 2024 at 02:27:17PM +0530, Bharath Rupireddy wrote: > > Please use the v22 patch set. > > Thanks! > > 1 === > > +reset_synced_slots_info(void) > > I'm not sure "reset" is the right word, what about slot_sync_shutdown_update()? > *shutdown_update() sounds generic. How about update_synced_slots_inactive_time()? I think it is a bit longer but conveys the meaning. -- With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Ajin Cherian
Date:
On Tue, Mar 26, 2024 at 7:57 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote:
> Please see the attached v23 patches. I've addressed all the review
> comments received so far from Amit and Shveta.
In patch 0003:
+        SpinLockAcquire(&slot->mutex);
+    }
+
+    Assert(LWLockHeldByMeInMode(ReplicationSlotControlLock, LW_SHARED));
+
+    if (slot->inactive_since > 0 &&
+        slot->data.inactive_timeout > 0)
+    {
+        TimestampTz now;
+
+        /* inactive_since is only tracked for inactive slots */
+        Assert(slot->active_pid == 0);
+
+        now = GetCurrentTimestamp();
+        if (TimestampDifferenceExceeds(slot->inactive_since, now,
+                                       slot->data.inactive_timeout * 1000))
+            inavidation_cause = RS_INVAL_INACTIVE_TIMEOUT;
+    }
+
+    if (need_locks)
+    {
+        SpinLockRelease(&slot->mutex);
Here, GetCurrentTimestamp() is still called with the spinlock held. Maybe take the timestamp prior to acquiring the spinlock.
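[One possible shape of that reordering, sketched here using the identifiers from the quoted hunk; this is a simplification, not the actual patch text.]

    TimestampTz now = GetCurrentTimestamp();   /* no system call while holding the spinlock */

    if (need_locks)
        SpinLockAcquire(&slot->mutex);

    Assert(LWLockHeldByMeInMode(ReplicationSlotControlLock, LW_SHARED));

    if (slot->inactive_since > 0 &&
        slot->data.inactive_timeout > 0)
    {
        /* inactive_since is only tracked for inactive slots */
        Assert(slot->active_pid == 0);

        if (TimestampDifferenceExceeds(slot->inactive_since, now,
                                       slot->data.inactive_timeout * 1000))
            invalidation_cause = RS_INVAL_INACTIVE_TIMEOUT;
    }

    if (need_locks)
        SpinLockRelease(&slot->mutex);

[The trade-off is an unconditional GetCurrentTimestamp() call even when the timeout check does not apply.]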
regards,
Ajin Cherian
Fujitsu Australia
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
shveta malik
Date:
On Tue, Mar 26, 2024 at 3:50 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > Hi, > > > I think there may have been some misunderstanding here. > > Indeed ;-) > > > But now if I > > rethink this, I am fine with 'inactive_since' getting synced from > > primary to standby. But if we do that, we need to add docs stating > > "inactive_since" represents primary's inactivity and not standby's > > slots inactivity for synced slots. > > Yeah sure. > > > The reason for this clarification > > is that the synced slot might be generated much later, yet > > 'inactive_since' is synced from the primary, potentially indicating a > > time considerably earlier than when the synced slot was actually > > created. > > Right. > > > Another approach could be that "inactive_since" for synced slot > > actually gives its own inactivity data rather than giving primary's > > slot data. We update inactive_since on standby only at 3 occasions: > > 1) at the time of creation of the synced slot. > > 2) during standby restart. > > 3) during promotion of standby. > > > > I have attached a sample patch for this idea as.txt file. > > Thanks! > > > I am fine with any of these approaches. One gives data synced from > > primary for synced slots, while another gives actual inactivity data > > of synced slots. > > What about another approach?: inactive_since gives data synced from primary for > synced slots and another dedicated field (could be added later...) could > represent what you suggest as the other option. Yes, okay with me. I think there is some confusion here as well. In my second approach above, I have not suggested anything related to sync-worker. We can think on that later if we really need another field which give us sync time. In my second approach, I have tried to avoid updating inactive_since for synced slots during sync process. We update that field during creation of synced slot so that inactive_since reflects correct info even for synced slots (rather than copying from primary). Please have a look at my patch and let me know your thoughts. I am fine with copying it from primary as well and documenting this behaviour. > Another cons of updating inactive_since at the current time during each slot > sync cycle is that calling GetCurrentTimestamp() very frequently > (during each sync cycle of very active slots) could be too costly. Right. thanks Shveta
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Tue, Mar 26, 2024 at 4:18 PM shveta malik <shveta.malik@gmail.com> wrote: > > > What about another approach?: inactive_since gives data synced from primary for > > synced slots and another dedicated field (could be added later...) could > > represent what you suggest as the other option. > > Yes, okay with me. I think there is some confusion here as well. In my > second approach above, I have not suggested anything related to > sync-worker. We can think on that later if we really need another > field which give us sync time. In my second approach, I have tried to > avoid updating inactive_since for synced slots during sync process. We > update that field during creation of synced slot so that > inactive_since reflects correct info even for synced slots (rather > than copying from primary). Please have a look at my patch and let me > know your thoughts. I am fine with copying it from primary as well and > documenting this behaviour. I took a look at your patch. --- a/src/backend/replication/logical/slotsync.c +++ b/src/backend/replication/logical/slotsync.c @@ -628,6 +628,7 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid) SpinLockAcquire(&slot->mutex); slot->effective_catalog_xmin = xmin_horizon; slot->data.catalog_xmin = xmin_horizon; + slot->inactive_since = GetCurrentTimestamp(); SpinLockRelease(&slot->mutex); If we just sync inactive_since value for synced slots while in recovery from the primary, so be it. Why do we need to update it to the current time when the slot is being created? We don't expose slot creation time, no? Aren't we fine if we just sync the value from primary and document that fact? After the promotion, we can reset it to the current time so that it gets its own time. Do you see any issues with it? -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
shveta malik
Date:
On Tue, Mar 26, 2024 at 4:35 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Tue, Mar 26, 2024 at 4:18 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > > What about another approach?: inactive_since gives data synced from primary for > > > synced slots and another dedicated field (could be added later...) could > > > represent what you suggest as the other option. > > > > Yes, okay with me. I think there is some confusion here as well. In my > > second approach above, I have not suggested anything related to > > sync-worker. We can think on that later if we really need another > > field which give us sync time. In my second approach, I have tried to > > avoid updating inactive_since for synced slots during sync process. We > > update that field during creation of synced slot so that > > inactive_since reflects correct info even for synced slots (rather > > than copying from primary). Please have a look at my patch and let me > > know your thoughts. I am fine with copying it from primary as well and > > documenting this behaviour. > > I took a look at your patch. > > --- a/src/backend/replication/logical/slotsync.c > +++ b/src/backend/replication/logical/slotsync.c > @@ -628,6 +628,7 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid > remote_dbid) > SpinLockAcquire(&slot->mutex); > slot->effective_catalog_xmin = xmin_horizon; > slot->data.catalog_xmin = xmin_horizon; > + slot->inactive_since = GetCurrentTimestamp(); > SpinLockRelease(&slot->mutex); > > If we just sync inactive_since value for synced slots while in > recovery from the primary, so be it. Why do we need to update it to > the current time when the slot is being created? If we update inactive_since at synced slot's creation or during restart (skipping setting it during sync), then this time reflects actual 'inactive_since' for that particular synced slot. Isn't that a clear info for the user and in alignment of what the name 'inactive_since' actually suggests? > We don't expose slot > creation time, no? No, we don't. But for synced slot, that is the time since that slot is inactive (unless promoted), so we are exposing inactive_since and not creation time. >Aren't we fine if we just sync the value from > primary and document that fact? After the promotion, we can reset it > to the current time so that it gets its own time. Do you see any > issues with it? Yes, we can do that. But curious to know, do we see any additional benefit of reflecting primary's inactive_since at standby which I might be missing? thanks Shveta
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Tue, Mar 26, 2024 at 04:49:18PM +0530, shveta malik wrote: > On Tue, Mar 26, 2024 at 4:35 PM Bharath Rupireddy > <bharath.rupireddyforpostgres@gmail.com> wrote: > > > > On Tue, Mar 26, 2024 at 4:18 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > > What about another approach?: inactive_since gives data synced from primary for > > > > synced slots and another dedicated field (could be added later...) could > > > > represent what you suggest as the other option. > > > > > > Yes, okay with me. I think there is some confusion here as well. In my > > > second approach above, I have not suggested anything related to > > > sync-worker. We can think on that later if we really need another > > > field which give us sync time. In my second approach, I have tried to > > > avoid updating inactive_since for synced slots during sync process. We > > > update that field during creation of synced slot so that > > > inactive_since reflects correct info even for synced slots (rather > > > than copying from primary). Please have a look at my patch and let me > > > know your thoughts. I am fine with copying it from primary as well and > > > documenting this behaviour. > > > > I took a look at your patch. > > > > --- a/src/backend/replication/logical/slotsync.c > > +++ b/src/backend/replication/logical/slotsync.c > > @@ -628,6 +628,7 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid > > remote_dbid) > > SpinLockAcquire(&slot->mutex); > > slot->effective_catalog_xmin = xmin_horizon; > > slot->data.catalog_xmin = xmin_horizon; > > + slot->inactive_since = GetCurrentTimestamp(); > > SpinLockRelease(&slot->mutex); > > > > If we just sync inactive_since value for synced slots while in > > recovery from the primary, so be it. Why do we need to update it to > > the current time when the slot is being created? > > If we update inactive_since at synced slot's creation or during > restart (skipping setting it during sync), then this time reflects > actual 'inactive_since' for that particular synced slot. Isn't that a > clear info for the user and in alignment of what the name > 'inactive_since' actually suggests? > > > We don't expose slot > > creation time, no? > > No, we don't. But for synced slot, that is the time since that slot is > inactive (unless promoted), so we are exposing inactive_since and not > creation time. > > >Aren't we fine if we just sync the value from > > primary and document that fact? After the promotion, we can reset it > > to the current time so that it gets its own time. Do you see any > > issues with it? > > Yes, we can do that. But curious to know, do we see any additional > benefit of reflecting primary's inactive_since at standby which I > might be missing? In case the primary goes down, then one could use the value on the standby to get the value coming from the primary. I think that could be useful info to have. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Tue, Mar 26, 2024 at 09:59:23PM +0530, Bharath Rupireddy wrote: > On Tue, Mar 26, 2024 at 4:35 PM Bharath Rupireddy > <bharath.rupireddyforpostgres@gmail.com> wrote: > > > > If we just sync inactive_since value for synced slots while in > > recovery from the primary, so be it. Why do we need to update it to > > the current time when the slot is being created? We don't expose slot > > creation time, no? Aren't we fine if we just sync the value from > > primary and document that fact? After the promotion, we can reset it > > to the current time so that it gets its own time. > > I'm attaching v24 patches. It implements the above idea proposed > upthread for synced slots. I've now separated > s/last_inactive_time/inactive_since and synced slots behaviour. Please > have a look. Thanks! ==== v24-0001 It's now pure mechanical changes and it looks good to me. ==== v24-0002 1 === This commit does two things: 1) Updates inactive_since for sync slots with the value received from the primary's slot. Tested it and it does that. 2 === 2) Ensures the value is set to current timestamp during the shutdown of slot sync machinery to help correctly interpret the time if the standby gets promoted without a restart. Tested it and it does that. 3 === +/* + * Reset the synced slots info such as inactive_since after shutting + * down the slot sync machinery. + */ +static void +update_synced_slots_inactive_time(void) Looks like the comment "reset" is not matching the name of the function and what it does. 4 === + /* + * We get the current time beforehand and only once to avoid + * system calls overhead while holding the lock. + */ + if (now == 0) + now = GetCurrentTimestamp(); Also +1 of having GetCurrentTimestamp() just called one time within the loop. 5 === - if (!(RecoveryInProgress() && slot->data.synced)) + if (!(InRecovery && slot->data.synced)) slot->inactive_since = GetCurrentTimestamp(); else slot->inactive_since = 0; Not related to this change but more the way RestoreSlotFromDisk() behaves here: For a sync slot on standby it will be set to zero and then later will be synchronized with the one coming from the primary. I think that's fine to have it to zero for this window of time. Now, if the standby is down and one sets sync_replication_slots to off, then inactive_since will be set to zero on the standby at startup and not synchronized (unless one triggers a manual sync). I also think that's fine but it might be worth to document this behavior (that after a standby startup inactive_since is zero until the next sync...). 6 === + print "HI $slot_name $name $inactive_since $slot_creation_time\n"; garbage? 7 === +# Capture and validate inactive_since of a given slot. +sub capture_and_validate_slot_inactive_since +{ + my ($node, $slot_name, $slot_creation_time) = @_; + my $name = $node->name; We know have capture_and_validate_slot_inactive_since at 2 places: 040_standby_failover_slots_sync.pl and 019_replslot_limit.pl. Worth to create a sub in Cluster.pm? Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
shveta malik
Date:
On Tue, Mar 26, 2024 at 9:59 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Tue, Mar 26, 2024 at 4:35 PM Bharath Rupireddy > <bharath.rupireddyforpostgres@gmail.com> wrote: > > > > If we just sync inactive_since value for synced slots while in > > recovery from the primary, so be it. Why do we need to update it to > > the current time when the slot is being created? We don't expose slot > > creation time, no? Aren't we fine if we just sync the value from > > primary and document that fact? After the promotion, we can reset it > > to the current time so that it gets its own time. > > I'm attaching v24 patches. It implements the above idea proposed > upthread for synced slots. I've now separated > s/last_inactive_time/inactive_since and synced slots behaviour. Please > have a look. Thanks for the patches. Few trivial comments for v24-002: 1) slot.c: + * data from the remote slot. We use InRecovery flag instead of + * RecoveryInProgress() as it always returns true even for normal + * server startup. a) Not clear what 'it' refers to. Better to use 'the latter' b) Is it better to mention the primary here: 'as the latter always returns true even on the primary server during startup'. 2) update_local_synced_slot(): - strcmp(remote_slot->plugin, NameStr(slot->data.plugin)) == 0) + strcmp(remote_slot->plugin, NameStr(slot->data.plugin)) == 0 && + remote_slot->inactive_since == slot->inactive_since) When this code was written initially, the intent was to do strcmp at the end (only if absolutely needed). It will be good if we maintain the same and add new checks before strcmp. 3) update_synced_slots_inactive_time(): This assert is removed, is it intentional? Assert(s->active_pid == 0); 4) 040_standby_failover_slots_sync.pl: +# Capture the inactive_since of the slot from the standby the logical failover +# slots are synced/created on the standby. The comment is unclear, something seems missing. thanks Shveta
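[As an illustration of comment 2 above, the idea is simply to keep the cheap scalar comparisons ahead of the string comparison; the field list and the return semantics here are abbreviated assumptions, not the actual update_local_synced_slot() code.]

    /* Illustrative only: cheap comparisons first, strcmp() last. */
    if (remote_slot->restart_lsn == slot->data.restart_lsn &&
        remote_slot->catalog_xmin == slot->data.catalog_xmin &&
        remote_slot->inactive_since == slot->inactive_since &&
        strcmp(remote_slot->plugin, NameStr(slot->data.plugin)) == 0)
        return false;   /* nothing changed for this slot; skip the local update */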
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Tue, Mar 26, 2024 at 11:22 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > > I'm attaching v24 patches. It implements the above idea proposed > > upthread for synced slots. > > ==== v24-0002 > > 1 === > > This commit does two things: > 1) Updates inactive_since for sync slots with the value > received from the primary's slot. > > Tested it and it does that. Thanks. I've added a test case for this. > 2 === > > 2) Ensures the value is set to current timestamp during the > shutdown of slot sync machinery to help correctly interpret the > time if the standby gets promoted without a restart. > > Tested it and it does that. Thanks. I've added a test case for this. > 3 === > > +/* > + * Reset the synced slots info such as inactive_since after shutting > + * down the slot sync machinery. > + */ > +static void > +update_synced_slots_inactive_time(void) > > Looks like the comment "reset" is not matching the name of the function and > what it does. Changed. I've also changed the function name to update_synced_slots_inactive_since to be precise on what it exactly does. > 4 === > > + /* > + * We get the current time beforehand and only once to avoid > + * system calls overhead while holding the lock. > + */ > + if (now == 0) > + now = GetCurrentTimestamp(); > > Also +1 of having GetCurrentTimestamp() just called one time within the loop. Right. > 5 === > > - if (!(RecoveryInProgress() && slot->data.synced)) > + if (!(InRecovery && slot->data.synced)) > slot->inactive_since = GetCurrentTimestamp(); > else > slot->inactive_since = 0; > > Not related to this change but more the way RestoreSlotFromDisk() behaves here: > > For a sync slot on standby it will be set to zero and then later will be > synchronized with the one coming from the primary. I think that's fine to have > it to zero for this window of time. Right. > Now, if the standby is down and one sets sync_replication_slots to off, > then inactive_since will be set to zero on the standby at startup and not > synchronized (unless one triggers a manual sync). I also think that's fine but > it might be worth to document this behavior (that after a standby startup > inactive_since is zero until the next sync...). Isn't this behaviour applicable for other slot parameters that the slot syncs from the remote slot on the primary? I've added the following note in the comments when we update inactive_since in RestoreSlotFromDisk. * Note that for synced slots after the standby starts up (i.e. after * the slots are loaded from the disk), the inactive_since will remain * zero until the next slot sync cycle. */ if (!(InRecovery && slot->data.synced)) slot->inactive_since = GetCurrentTimestamp(); else slot->inactive_since = 0; > 6 === > > + print "HI $slot_name $name $inactive_since $slot_creation_time\n"; > > garbage? Removed. > 7 === > > +# Capture and validate inactive_since of a given slot. > +sub capture_and_validate_slot_inactive_since > +{ > + my ($node, $slot_name, $slot_creation_time) = @_; > + my $name = $node->name; > > We know have capture_and_validate_slot_inactive_since at 2 places: > 040_standby_failover_slots_sync.pl and 019_replslot_limit.pl. > > Worth to create a sub in Cluster.pm? I'd second that thought for now. We might have to debate first if it's useful for all the nodes even without replication, and if yes, the naming stuff and all that. Historically, we've had such duplicated functions until recently, for instance advance_wal and log_contains. We moved them over to a common perl library Cluster.pm very recently. 
I'm sure we can come back later to move it to Cluster.pm. On Wed, Mar 27, 2024 at 9:02 AM shveta malik <shveta.malik@gmail.com> wrote: > > 1) > slot.c: > + * data from the remote slot. We use InRecovery flag instead of > + * RecoveryInProgress() as it always returns true even for normal > + * server startup. > > a) Not clear what 'it' refers to. Better to use 'the latter' > b) Is it better to mention the primary here: > 'as the latter always returns true even on the primary server during startup'. Modified. > 2) > update_local_synced_slot(): > > - strcmp(remote_slot->plugin, NameStr(slot->data.plugin)) == 0) > + strcmp(remote_slot->plugin, NameStr(slot->data.plugin)) == 0 && > + remote_slot->inactive_since == slot->inactive_since) > > When this code was written initially, the intent was to do strcmp at > the end (only if absolutely needed). It will be good if we maintain > the same and add new checks before strcmp. Done. > 3) > update_synced_slots_inactive_time(): > > This assert is removed, is it intentional? > Assert(s->active_pid == 0); Yes, the slot can get acquired in the corner case when someone runs pg_sync_replication_slots concurrently at this time. I'm referring to the issue reported upthread. We don't prevent one running pg_sync_replication_slots in promotion/ShutDownSlotSync phase right? Maybe we should prevent that otherwise some of the slots are synced and the standby gets promoted while others are yet-to-be-synced. > 4) > 040_standby_failover_slots_sync.pl: > > +# Capture the inactive_since of the slot from the standby the logical failover > +# slots are synced/created on the standby. > > The comment is unclear, something seems missing. Nice catch. Yes, that was wrong. I've modified it now. Please find the attached v25-0001 (made this 0001 patch now as inactive_since patch is committed) patch with the above changes. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Wed, Mar 27, 2024 at 10:08 AM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Tue, Mar 26, 2024 at 11:22 PM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > > 3) > > update_synced_slots_inactive_time(): > > > > This assert is removed, is it intentional? > > Assert(s->active_pid == 0); > > Yes, the slot can get acquired in the corner case when someone runs > pg_sync_replication_slots concurrently at this time. I'm referring to > the issue reported upthread. We don't prevent one running > pg_sync_replication_slots in promotion/ShutDownSlotSync phase right? > Maybe we should prevent that otherwise some of the slots are synced > and the standby gets promoted while others are yet-to-be-synced. > We should do something about it but that shouldn't be done in this patch. We can handle it separately and then add such an assert. -- With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
shveta malik
Date:
On Wed, Mar 27, 2024 at 10:22 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Wed, Mar 27, 2024 at 10:08 AM Bharath Rupireddy > <bharath.rupireddyforpostgres@gmail.com> wrote: > > > > On Tue, Mar 26, 2024 at 11:22 PM Bertrand Drouvot > > <bertranddrouvot.pg@gmail.com> wrote: > > > > > 3) > > > update_synced_slots_inactive_time(): > > > > > > This assert is removed, is it intentional? > > > Assert(s->active_pid == 0); > > > > Yes, the slot can get acquired in the corner case when someone runs > > pg_sync_replication_slots concurrently at this time. I'm referring to > > the issue reported upthread. We don't prevent one running > > pg_sync_replication_slots in promotion/ShutDownSlotSync phase right? > > Maybe we should prevent that otherwise some of the slots are synced > > and the standby gets promoted while others are yet-to-be-synced. > > > > We should do something about it but that shouldn't be done in this > patch. We can handle it separately and then add such an assert. Agreed. Once this patch is concluded, I can fix the slot sync shutdown issue and will also add this 'assert' back. thanks Shveta
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Wed, Mar 27, 2024 at 10:24 AM shveta malik <shveta.malik@gmail.com> wrote: > > On Wed, Mar 27, 2024 at 10:22 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Wed, Mar 27, 2024 at 10:08 AM Bharath Rupireddy > > <bharath.rupireddyforpostgres@gmail.com> wrote: > > > > > > On Tue, Mar 26, 2024 at 11:22 PM Bertrand Drouvot > > > <bertranddrouvot.pg@gmail.com> wrote: > > > > > > > 3) > > > > update_synced_slots_inactive_time(): > > > > > > > > This assert is removed, is it intentional? > > > > Assert(s->active_pid == 0); > > > > > > Yes, the slot can get acquired in the corner case when someone runs > > > pg_sync_replication_slots concurrently at this time. I'm referring to > > > the issue reported upthread. We don't prevent one running > > > pg_sync_replication_slots in promotion/ShutDownSlotSync phase right? > > > Maybe we should prevent that otherwise some of the slots are synced > > > and the standby gets promoted while others are yet-to-be-synced. > > > > > > > We should do something about it but that shouldn't be done in this > > patch. We can handle it separately and then add such an assert. > > Agreed. Once this patch is concluded, I can fix the slot sync shutdown > issue and will also add this 'assert' back. Agreed. Thanks. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
shveta malik
Date:
On Tue, Mar 26, 2024 at 6:05 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > > > We can think on that later if we really need another > > field which give us sync time. > > I think that calling GetCurrentTimestamp() so frequently could be too costly, so > I'm not sure we should. Agreed. > > In my second approach, I have tried to > > avoid updating inactive_since for synced slots during sync process. We > > update that field during creation of synced slot so that > > inactive_since reflects correct info even for synced slots (rather > > than copying from primary). > > Yeah, and I think we could create a dedicated field with this information > if we feel the need. Okay. thanks Shveta
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Wed, Mar 27, 2024 at 10:08 AM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > Please find the attached v25-0001 (made this 0001 patch now as > inactive_since patch is committed) patch with the above changes. Fixed an issue in synchronize_slots where DatumGetLSN is being used in place of DatumGetTimestampTz. Found this via CF bot member [1], not on my dev system. Please find the attached v6 patch. [1] [05:14:39.281] #7 DatumGetLSN (X=<optimized out>) at ../src/include/utils/pg_lsn.h:24 [05:14:39.281] No locals. [05:14:39.281] #8 synchronize_slots (wrconn=wrconn@entry=0x583cd170) at ../src/backend/replication/logical/slotsync.c:757 [05:14:39.281] isnull = false [05:14:39.281] remote_slot = 0x583ce1a8 [05:14:39.281] d = <optimized out> [05:14:39.281] col = 10 [05:14:39.281] slotRow = {25, 25, 3220, 3220, 28, 16, 16, 25, 25, 1184} [05:14:39.281] res = 0x583cd1b8 [05:14:39.281] tupslot = 0x583ce11c [05:14:39.281] remote_slot_list = 0x0 [05:14:39.281] some_slot_updated = false [05:14:39.281] started_tx = false [05:14:39.281] query = 0x57692bc4 "SELECT slot_name, plugin, confirmed_flush_lsn, restart_lsn, catalog_xmin, two_phase, failover, database, invalidation_reason, inactive_since FROM pg_catalog.pg_replication_slots WHERE failover and NOT"... [05:14:39.281] __func__ = "synchronize_slots" [05:14:39.281] #9 0x56ff9d1e in SyncReplicationSlots (wrconn=0x583cd170) at ../src/backend/replication/logical/slotsync.c:1504 -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
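[For readers following along, a sketch of the kind of fix being described; the column position and variable names are inferred from the stack trace above and are assumptions rather than the exact patch hunk.]

    /* inactive_since is the 10th (timestamptz) column of the slot-sync query. */
    bool    isnull;
    Datum   d = slot_getattr(tupslot, 10, &isnull);

    remote_slot->inactive_since = isnull ? 0 : DatumGetTimestampTz(d);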
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Wed, Mar 27, 2024 at 10:08:33AM +0530, Bharath Rupireddy wrote: > On Tue, Mar 26, 2024 at 11:22 PM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > > > - if (!(RecoveryInProgress() && slot->data.synced)) > > + if (!(InRecovery && slot->data.synced)) > > slot->inactive_since = GetCurrentTimestamp(); > > else > > slot->inactive_since = 0; > > > > Not related to this change but more the way RestoreSlotFromDisk() behaves here: > > > > For a sync slot on standby it will be set to zero and then later will be > > synchronized with the one coming from the primary. I think that's fine to have > > it to zero for this window of time. > > Right. > > > Now, if the standby is down and one sets sync_replication_slots to off, > > then inactive_since will be set to zero on the standby at startup and not > > synchronized (unless one triggers a manual sync). I also think that's fine but > > it might be worth to document this behavior (that after a standby startup > > inactive_since is zero until the next sync...). > > Isn't this behaviour applicable for other slot parameters that the > slot syncs from the remote slot on the primary? No they are persisted on disk. If not, we'd not know where to resume the decoding from on the standby in case primary is down and/or sync is off. > I've added the following note in the comments when we update > inactive_since in RestoreSlotFromDisk. > > * Note that for synced slots after the standby starts up (i.e. after > * the slots are loaded from the disk), the inactive_since will remain > * zero until the next slot sync cycle. > */ > if (!(InRecovery && slot->data.synced)) > slot->inactive_since = GetCurrentTimestamp(); > else > slot->inactive_since = 0; I think we should add some words in the doc too and also about what the meaning of inactive_since on the standby is (as suggested by Shveta in [1]). [1]: https://www.postgresql.org/message-id/CAJpy0uDkTW%2Bt1k3oPkaipFBzZePfFNB5DmiA%3D%3DpxRGcAdpF%3DPg%40mail.gmail.com > > 7 === > > > > +# Capture and validate inactive_since of a given slot. > > +sub capture_and_validate_slot_inactive_since > > +{ > > + my ($node, $slot_name, $slot_creation_time) = @_; > > + my $name = $node->name; > > > > We know have capture_and_validate_slot_inactive_since at 2 places: > > 040_standby_failover_slots_sync.pl and 019_replslot_limit.pl. > > > > Worth to create a sub in Cluster.pm? > > I'd second that thought for now. We might have to debate first if it's > useful for all the nodes even without replication, and if yes, the > naming stuff and all that. Historically, we've had such duplicated > functions until recently, for instance advance_wal and log_contains. > We > moved them over to a common perl library Cluster.pm very recently. I'm > sure we can come back later to move it to Cluster.pm. I thought that would be the right time not to introduce duplicated code. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
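[To make that distinction concrete, an abridged view of where the two fields live under the patch; the member list is trimmed to the ones discussed in this thread and is a reconstruction, not the full structs.]

    typedef struct ReplicationSlotPersistentData
    {
        /* other persisted members omitted */

        /* The amount of time in seconds the slot is allowed to be inactive */
        int         inactive_timeout;   /* written to disk with the slot */
    } ReplicationSlotPersistentData;

    typedef struct ReplicationSlot
    {
        /* other in-memory members omitted */

        ReplicationSlotPersistentData data;   /* the persisted part */

        TimestampTz inactive_since;           /* in shared memory only */
    } ReplicationSlot;

[Because inactive_since is not part of the persisted data, a synced slot comes back with it set to zero after a standby restart until the next synchronization, whereas restart_lsn, catalog_xmin and the other persisted parameters are restored from disk.]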
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
shveta malik
Date:
On Wed, Mar 27, 2024 at 11:05 AM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > Fixed an issue in synchronize_slots where DatumGetLSN is being used in > place of DatumGetTimestampTz. Found this via CF bot member [1], not on > my dev system. > > Please find the attached v6 patch. Thanks for the patch. Few trivial things: ---------- 1) system-views.sgml: a) "Note that the slots" --> "Note that the slots on the standbys," --it is good to mention "standbys" as synced could be true on primary as well (promoted standby) b) If you plan to add more info which Bertrand suggested, then it will be better to make a <note> section instead of using "Note" 2) commit msg: "The impact of this on a promoted standby inactive_since is always NULL for all synced slots even after server restart. " Sentence looks broken. --------- Apart from the above trivial things, v26-001 looks good to me. thanks Shveta
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Wed, Mar 27, 2024 at 11:39 AM shveta malik <shveta.malik@gmail.com> wrote: > > Thanks for the patch. Few trivial things: Thanks for reviewing. > ---------- > 1) > system-views.sgml: > > a) "Note that the slots" --> "Note that the slots on the standbys," > --it is good to mention "standbys" as synced could be true on primary > as well (promoted standby) Done. > b) If you plan to add more info which Bertrand suggested, then it will > be better to make a <note> section instead of using "Note" I added the note that Bertrand specified upthread. But, I couldn't find an instance of adding <note> ... </note> within a table. Hence with "Note that ...." statments just like any other notes in the system-views.sgml. pg_replication_slot in system-vews.sgml renders as table, so having <note> ... </note> may not be a great idea. > 2) > commit msg: > > "The impact of this > on a promoted standby inactive_since is always NULL for all > synced slots even after server restart. > " > Sentence looks broken. > --------- Reworded. > Apart from the above trivial things, v26-001 looks good to me. Please check the attached v27 patch which also has Bertrand's comment on deduplicating the TAP function. I've now moved it to Cluster.pm. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Wed, Mar 27, 2024 at 02:55:17PM +0530, Bharath Rupireddy wrote: > Please check the attached v27 patch which also has Bertrand's comment > on deduplicating the TAP function. I've now moved it to Cluster.pm. Thanks! 1 === + Note that the slots on the standbys that are being synced from a + primary server (whose <structfield>synced</structfield> field is + <literal>true</literal>), will get the + <structfield>inactive_since</structfield> value from the + corresponding remote slot on the primary. Also, note that for the + synced slots on the standby, after the standby starts up (i.e. after + the slots are loaded from the disk), the inactive_since will remain + zero until the next slot sync cycle. Not sure we should mention the "(i.e. after the slots are loaded from the disk)" and also "cycle" (as that does not sound right in case of manual sync). My proposal (in text) but feel free to reword it: Note that the slots on the standbys that are being synced from a primary server (whose synced field is true), will get the inactive_since value from the corresponding remote slot on the primary. Also, after the standby starts up, the inactive_since (for such synced slots) will remain zero until the next synchronization. 2 === +=item $node->create_logical_slot_on_standby(self, primary, slot_name, dbname) get_slot_inactive_since_value instead? 3 === +against given reference time. s/given reference/optional given reference/? Apart from the above, LGTM. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
shveta malik
Date:
On Wed, Mar 27, 2024 at 2:55 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Wed, Mar 27, 2024 at 11:39 AM shveta malik <shveta.malik@gmail.com> wrote: > > > > Thanks for the patch. Few trivial things: > > Thanks for reviewing. > > > ---------- > > 1) > > system-views.sgml: > > > > a) "Note that the slots" --> "Note that the slots on the standbys," > > --it is good to mention "standbys" as synced could be true on primary > > as well (promoted standby) > > Done. > > > b) If you plan to add more info which Bertrand suggested, then it will > > be better to make a <note> section instead of using "Note" > > I added the note that Bertrand specified upthread. But, I couldn't > find an instance of adding <note> ... </note> within a table. Hence > with "Note that ...." statments just like any other notes in the > system-views.sgml. pg_replication_slot in system-vews.sgml renders as > table, so having <note> ... </note> may not be a great idea. > > > 2) > > commit msg: > > > > "The impact of this > > on a promoted standby inactive_since is always NULL for all > > synced slots even after server restart. > > " > > Sentence looks broken. > > --------- > > Reworded. > > > Apart from the above trivial things, v26-001 looks good to me. > > Please check the attached v27 patch which also has Bertrand's comment > on deduplicating the TAP function. I've now moved it to Cluster.pm. > Thanks for the patch. Regarding doc, I have few comments. + Note that the slots on the standbys that are being synced from a + primary server (whose <structfield>synced</structfield> field is + <literal>true</literal>), will get the + <structfield>inactive_since</structfield> value from the + corresponding remote slot on the primary. Also, note that for the + synced slots on the standby, after the standby starts up (i.e. after + the slots are loaded from the disk), the inactive_since will remain + zero until the next slot sync cycle. a) "inactive_since will remain zero" Since it is user exposed info and the user finds it NULL in pg_replication_slots, shall we mention NULL instead of 0? b) Since we are referring to the sync cycle here, I feel it will be good to give a link to that page. + zero until the next slot sync cycle (see + <xref linkend="logicaldecoding-replication-slots-synchronization"/> for + slot synchronization details). thanks Shveta
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Wed, Mar 27, 2024 at 3:42 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > 1 === > > My proposal (in text) but feel free to reword it: > > Note that the slots on the standbys that are being synced from a > primary server (whose synced field is true), will get the inactive_since value > from the corresponding remote slot on the primary. Also, after the standby starts > up, the inactive_since (for such synced slots) will remain zero until the next > synchronization. WFM. > 2 === > > +=item $node->create_logical_slot_on_standby(self, primary, slot_name, dbname) > > get_slot_inactive_since_value instead? Ugh. Changed. > 3 === > > +against given reference time. > > s/given reference/optional given reference/? Done. > Apart from the above, LGTM. Thanks for reviewing. On Wed, Mar 27, 2024 at 3:43 PM shveta malik <shveta.malik@gmail.com> wrote: > > Thanks for the patch. Regarding doc, I have few comments. Thanks for reviewing. > a) "inactive_since will remain zero" > Since it is user exposed info and the user finds it NULL in > pg_replication_slots, shall we mention NULL instead of 0? Right. Changed. > b) Since we are referring to the sync cycle here, I feel it will be > good to give a link to that page. > + zero until the next slot sync cycle (see > + <xref linkend="logicaldecoding-replication-slots-synchronization"/> for > + slot synchronization details). WFM. Please see the attached v28 patch. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Wed, Mar 27, 2024 at 05:55:05PM +0530, Bharath Rupireddy wrote: > On Wed, Mar 27, 2024 at 3:42 PM Bertrand Drouvot > Please see the attached v28 patch. Thanks! 1 === sorry I missed it in the previous review if (!(RecoveryInProgress() && slot->data.synced)) + { now = GetCurrentTimestamp(); + update_inactive_since = true; + } + else + update_inactive_since = false; I think update_inactive_since is not needed, we could rely on (now > 0) instead. 2 === +=item $node->get_slot_inactive_since_value(self, primary, slot_name, dbname) + +Get inactive_since column value for a given replication slot validating it +against optional reference time. + +=cut + +sub get_slot_inactive_since_value +{ shouldn't be "=item $node->get_slot_inactive_since_value(self, slot_name, reference_time)" instead? Apart from the above, LGTM. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Wed, Mar 27, 2024 at 6:54 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > Hi, > > On Wed, Mar 27, 2024 at 05:55:05PM +0530, Bharath Rupireddy wrote: > > > On Wed, Mar 27, 2024 at 3:42 PM Bertrand Drouvot > > > Please see the attached v28 patch. > > Thanks! > > 1 === sorry I missed it in the previous review > > if (!(RecoveryInProgress() && slot->data.synced)) > + { > now = GetCurrentTimestamp(); > + update_inactive_since = true; > + } > + else > + update_inactive_since = false; > > I think update_inactive_since is not needed, we could rely on (now > 0) instead. I thought of doing that, but it comes at the expense of readability, so I prefer to keep a variable. However, I renamed the variable to the more meaningful is_slot_being_synced. > 2 === > > +=item $node->get_slot_inactive_since_value(self, primary, slot_name, dbname) > + > +Get inactive_since column value for a given replication slot validating it > +against optional reference time. > + > +=cut > + > +sub get_slot_inactive_since_value > +{ > > shouldn't be "=item $node->get_slot_inactive_since_value(self, slot_name, reference_time)" > instead? Ugh. Changed. > Apart from the above, LGTM. Thanks. I'm attaching v29 patches. 0001 managing inactive_since on the standby for sync slots. 0002 implementing inactive timeout GUC based invalidation mechanism. Please have a look. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
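For readers following the exchange above, the hunk being discussed amounts to roughly the following. This is an illustrative sketch rather than the exact v29 patch text: the helper name record_inactive_since_on_release is invented here, and in the real patch the logic lives inline in the slot release/restore paths.

#include "postgres.h"

#include "access/xlog.h"		/* RecoveryInProgress() */
#include "replication/slot.h"
#include "storage/spin.h"
#include "utils/timestamp.h"

/*
 * Sketch: record the release time in inactive_since, except for slots that
 * are being synced on a standby, which keep 0 (shown as NULL in
 * pg_replication_slots) until the next synchronization.
 */
static void
record_inactive_since_on_release(ReplicationSlot *slot)
{
	bool		is_slot_being_synced = RecoveryInProgress() && slot->data.synced;
	TimestampTz now = 0;

	if (!is_slot_being_synced)
		now = GetCurrentTimestamp();

	SpinLockAcquire(&slot->mutex);
	slot->inactive_since = now;
	SpinLockRelease(&slot->mutex);
}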
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Wed, Mar 27, 2024 at 09:00:37PM +0530, Bharath Rupireddy wrote: > On Wed, Mar 27, 2024 at 6:54 PM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > > > Hi, > > > > On Wed, Mar 27, 2024 at 05:55:05PM +0530, Bharath Rupireddy wrote: > > > On Wed, Mar 27, 2024 at 3:42 PM Bertrand Drouvot > > > Please see the attached v28 patch. > > > > Thanks! > > > > 1 === sorry I missed it in the previous review > > > > if (!(RecoveryInProgress() && slot->data.synced)) > > + { > > now = GetCurrentTimestamp(); > > + update_inactive_since = true; > > + } > > + else > > + update_inactive_since = false; > > > > I think update_inactive_since is not needed, we could rely on (now > 0) instead. > > Thought of using it, but, at the expense of readability. I prefer to > use a variable instead. That's fine too. > However, I changed the variable to be more meaningful to is_slot_being_synced. Yeah makes sense and even easier to read. v29-0001 LGTM. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
shveta malik
Date:
On Wed, Mar 27, 2024 at 9:00 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > Thanks. I'm attaching v29 patches. 0001 managing inactive_since on the > standby for sync slots. 0002 implementing inactive timeout GUC based > invalidation mechanism. > > Please have a look. Thanks for the patches. v29-001 looks good to me. thanks Shveta
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Wed, Mar 27, 2024 at 09:00:37PM +0530, Bharath Rupireddy wrote: > standby for sync slots. 0002 implementing inactive timeout GUC based > invalidation mechanism. > > Please have a look. Thanks! Regarding 0002: Some testing: T1 === When the slot is invalidated on the primary, then the reason is propagated to the sync slot (if any). That's fine but we are loosing the inactive_since on the standby: Primary: postgres=# select slot_name,inactive_since,conflicting,invalidation_reason from pg_replication_slots where slot_name='lsub29_slot'; slot_name | inactive_since | conflicting | invalidation_reason -------------+-------------------------------+-------------+--------------------- lsub29_slot | 2024-03-28 08:24:51.672528+00 | f | inactive_timeout (1 row) Standby: postgres=# select slot_name,inactive_since,conflicting,invalidation_reason from pg_replication_slots where slot_name='lsub29_slot'; slot_name | inactive_since | conflicting | invalidation_reason -------------+----------------+-------------+--------------------- lsub29_slot | | f | inactive_timeout (1 row) I think in this case it should always reflect the value from the primary (so that one can understand why it is invalidated). T2 === And it is set to a value during promotion: postgres=# select pg_promote(); pg_promote ------------ t (1 row) postgres=# select slot_name,inactive_since,conflicting,invalidation_reason from pg_replication_slots where slot_name='lsub29_slot'; slot_name | inactive_since | conflicting | invalidation_reason -------------+------------------------------+-------------+--------------------- lsub29_slot | 2024-03-28 08:30:11.74505+00 | f | inactive_timeout (1 row) I think when it is invalidated it should always reflect the value from the primary (so that one can understand why it is invalidated). T3 === As far the slot invalidation on the primary: postgres=# SELECT * FROM pg_logical_slot_get_changes('lsub29_slot', NULL, NULL, 'include-xids', '0'); ERROR: cannot acquire invalidated replication slot "lsub29_slot" Can we make the message more consistent with what can be found in CreateDecodingContext() for example? T4 === Also, it looks like querying pg_replication_slots() does not trigger an invalidation: I think it should if the slot is not invalidated yet (and matches the invalidation criteria). Code review: CR1 === + Invalidate replication slots that are inactive for longer than this + amount of time. If this value is specified without units, it is taken s/Invalidate/Invalidates/? Should we mention the relationship with inactive_since? CR2 === + * + * If check_for_invalidation is true, the slot is checked for invalidation + * based on replication_slot_inactive_timeout GUC and an error is raised after making the slot ours. */ void -ReplicationSlotAcquire(const char *name, bool nowait) +ReplicationSlotAcquire(const char *name, bool nowait, + bool check_for_invalidation) s/check_for_invalidation/check_for_timeout_invalidation/? CR3 === + if (slot->inactive_since == 0 || + replication_slot_inactive_timeout == 0) + return false; Better to test replication_slot_inactive_timeout first? (I mean there is no point of testing inactive_since if replication_slot_inactive_timeout == 0) CR4 === + if (slot->inactive_since > 0 && + replication_slot_inactive_timeout > 0) + { Same. So, instead of CR3 === and CR4 ===, I wonder if it wouldn't be better to do something like: if (replication_slot_inactive_timeout == 0) return false; else if (slot->inactive_since > 0) . . . . 
else return false; That would avoid checking replication_slot_inactive_timeout and inactive_since multiple times. CR5 === + * held to avoid race conditions -- for example the restart_lsn could move + * forward, or the slot could be dropped. Does the restart_lsn example makes sense here? CR6 === +static bool +InvalidateSlotForInactiveTimeout(ReplicationSlot *slot, bool need_locks) +{ InvalidatePossiblyInactiveSlot() maybe? CR7 === + /* Make sure the invalidated state persists across server restart */ + slot->just_dirtied = true; + slot->dirty = true; + SpinLockRelease(&slot->mutex); Maybe we could create a new function say MarkGivenReplicationSlotDirty() with a slot as parameter, that ReplicationSlotMarkDirty could call too? Then maybe we could set slot->data.invalidated = RS_INVAL_INACTIVE_TIMEOUT in InvalidateSlotForInactiveTimeout()? (to avoid multiple SpinLockAcquire/SpinLockRelease). CR8 === + if (persist_state) + { + char path[MAXPGPATH]; + + sprintf(path, "pg_replslot/%s", NameStr(slot->data.name)); + SaveSlotToPath(slot, path, ERROR); + } Maybe we could create a new function say GivenReplicationSlotSave() with a slot as parameter, that ReplicationSlotSave() could call too? CR9 === + if (check_for_invalidation) + { + /* The slot is ours by now */ + Assert(s->active_pid == MyProcPid); + + /* + * Well, the slot is not yet ours really unless we check for the + * invalidation below. + */ + s->active_pid = 0; + if (InvalidateReplicationSlotForInactiveTimeout(s, true, true)) + { + /* + * If the slot has been invalidated, recalculate the resource + * limits. + */ + ReplicationSlotsComputeRequiredXmin(false); + ReplicationSlotsComputeRequiredLSN(); + + /* Might need it for slot clean up on error, so restore it */ + s->active_pid = MyProcPid; + ereport(ERROR, + (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), + errmsg("cannot acquire invalidated replication slot \"%s\"", + NameStr(MyReplicationSlot->data.name)))); + } + s->active_pid = MyProcPid; Are we not missing some SpinLockAcquire/Release on the slot's mutex here? (the places where we set the active_pid). CR10 === @@ -1628,6 +1674,10 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause, if (SlotIsLogical(s)) invalidation_cause = cause; break; + case RS_INVAL_INACTIVE_TIMEOUT: + if (InvalidateReplicationSlotForInactiveTimeout(s, false, false)) + invalidation_cause = cause; + break; InvalidatePossiblyObsoleteSlot() is not called with such a reason, better to use an Assert here and in the caller too? CR11 === +++ b/src/test/recovery/t/050_invalidate_slots.pl why not using 019_replslot_limit.pl? Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
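To make the ordering suggested in CR3 and CR4 concrete, the check can be structured roughly as below. This is only a sketch, not the actual patch code: the function name is invented, and it assumes the proposed replication_slot_inactive_timeout GUC is an integer number of seconds declared elsewhere.

#include "postgres.h"

#include "replication/slot.h"
#include "utils/timestamp.h"

/* proposed GUC from the patch; assumed here to be in seconds */
extern int	replication_slot_inactive_timeout;

/*
 * Sketch of the CR3/CR4 suggestion: test the cheap GUC first, then
 * inactive_since, so neither is evaluated more often than necessary.
 */
static bool
slot_inactive_timeout_reached(ReplicationSlot *slot)
{
	if (replication_slot_inactive_timeout == 0)
		return false;
	else if (slot->inactive_since > 0)
		return TimestampDifferenceExceeds(slot->inactive_since,
										  GetCurrentTimestamp(),
										  replication_slot_inactive_timeout * 1000);
	else
		return false;
}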
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Wed, Mar 27, 2024 at 9:00 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > > Thanks. I'm attaching v29 patches. 0001 managing inactive_since on the > standby for sync slots. > Commit message states: "why we can't just update inactive_since for synced slots on the standby with the value received from remote slot on the primary. This is consistent with any other slot parameter i.e. all of them are synced from the primary." The inactive_since is not consistent with other slot parameters which we copy. We don't perform anything related to those other parameters like say two_phase phase which can change that property. However, we do acquire the slot, advance the slot (as per recent discussion [1]), and release it. Since these operations can impact inactive_since, it seems to me that inactive_since is not the same as other parameters. It can have a different value than the primary. Why would anyone want to know the value of inactive_since from primary after the standby is promoted? Now, the other concern is that calling GetCurrentTimestamp() could be costly when the values for the slot are not going to be updated but if that happens we can optimize such that before acquiring the slot we can have some minimal pre-checks to ensure whether we need to update the slot or not. [1] - https://www.postgresql.org/message-id/OS0PR01MB571615D35F486080616CA841943A2%40OS0PR01MB5716.jpnprd01.prod.outlook.com -- With Regards, Amit Kapila.
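The "minimal pre-checks" idea mentioned above could look something like this: compare the values that would be copied from the remote slot before acquiring the local slot, and skip the acquire/update/release cycle (and its GetCurrentTimestamp() call) when nothing changed. The struct and field names below are assumptions made for illustration; they are not taken from the actual slot sync code.

#include "postgres.h"

#include "access/xlogdefs.h"
#include "replication/slot.h"
#include "storage/spin.h"

/* Assumed shape of the data fetched from the primary, for illustration only. */
typedef struct RemoteSlotSketch
{
	XLogRecPtr	restart_lsn;
	XLogRecPtr	confirmed_lsn;
	TransactionId catalog_xmin;
} RemoteSlotSketch;

/*
 * Sketch of a pre-check: return true only if something we would sync
 * actually differs, so the caller can skip acquiring the slot otherwise.
 */
static bool
synced_slot_needs_update(ReplicationSlot *slot, RemoteSlotSketch *remote)
{
	bool		needs_update;

	SpinLockAcquire(&slot->mutex);
	needs_update = remote->restart_lsn != slot->data.restart_lsn ||
		remote->confirmed_lsn != slot->data.confirmed_flush ||
		remote->catalog_xmin != slot->data.catalog_xmin;
	SpinLockRelease(&slot->mutex);

	return needs_update;
}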
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Fri, Mar 29, 2024 at 09:39:31AM +0530, Amit Kapila wrote: > On Wed, Mar 27, 2024 at 9:00 PM Bharath Rupireddy > <bharath.rupireddyforpostgres@gmail.com> wrote: > > > > > > Thanks. I'm attaching v29 patches. 0001 managing inactive_since on the > > standby for sync slots. > > > > Commit message states: "why we can't just update inactive_since for > synced slots on the standby with the value received from remote slot > on the primary. This is consistent with any other slot parameter i.e. > all of them are synced from the primary." > > The inactive_since is not consistent with other slot parameters which > we copy. We don't perform anything related to those other parameters > like say two_phase phase which can change that property. However, we > do acquire the slot, advance the slot (as per recent discussion [1]), > and release it. Since these operations can impact inactive_since, it > seems to me that inactive_since is not the same as other parameters. > It can have a different value than the primary. Why would anyone want > to know the value of inactive_since from primary after the standby is > promoted? I think it can be useful "before" it is promoted and in case the primary is down. I agree that tracking the activity time of a synced slot can be useful, why not creating a dedicated field for that purpose (and keep inactive_since a perfect "copy" of the primary)? > Now, the other concern is that calling GetCurrentTimestamp() > could be costly when the values for the slot are not going to be > updated but if that happens we can optimize such that before acquiring > the slot we can have some minimal pre-checks to ensure whether we need > to update the slot or not. Right, but for a very active slot it is likely that we call GetCurrentTimestamp() during almost each sync cycle. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Fri, Mar 29, 2024 at 11:49 AM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > On Fri, Mar 29, 2024 at 09:39:31AM +0530, Amit Kapila wrote: > > > > Commit message states: "why we can't just update inactive_since for > > synced slots on the standby with the value received from remote slot > > on the primary. This is consistent with any other slot parameter i.e. > > all of them are synced from the primary." > > > > The inactive_since is not consistent with other slot parameters which > > we copy. We don't perform anything related to those other parameters > > like say two_phase phase which can change that property. However, we > > do acquire the slot, advance the slot (as per recent discussion [1]), > > and release it. Since these operations can impact inactive_since, it > > seems to me that inactive_since is not the same as other parameters. > > It can have a different value than the primary. Why would anyone want > > to know the value of inactive_since from primary after the standby is > > promoted? > > I think it can be useful "before" it is promoted and in case the primary is down. > It is not clear to me what is user going to do by checking the inactivity time for slots when the corresponding server is down. I thought the idea was to check such slots and see if they need to be dropped or enabled again to avoid excessive disk usage, etc. > I agree that tracking the activity time of a synced slot can be useful, why > not creating a dedicated field for that purpose (and keep inactive_since a > perfect "copy" of the primary)? > We can have a separate field for this but not sure if it is worth it. > > Now, the other concern is that calling GetCurrentTimestamp() > > could be costly when the values for the slot are not going to be > > updated but if that happens we can optimize such that before acquiring > > the slot we can have some minimal pre-checks to ensure whether we need > > to update the slot or not. > > Right, but for a very active slot it is likely that we call GetCurrentTimestamp() > during almost each sync cycle. > True, but if we have to save a slot to disk each time to persist the changes (for an active slot) then probably GetCurrentTimestamp() shouldn't be costly enough to matter. -- With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Fri, Mar 29, 2024 at 03:03:01PM +0530, Amit Kapila wrote: > On Fri, Mar 29, 2024 at 11:49 AM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > > > On Fri, Mar 29, 2024 at 09:39:31AM +0530, Amit Kapila wrote: > > > > > > Commit message states: "why we can't just update inactive_since for > > > synced slots on the standby with the value received from remote slot > > > on the primary. This is consistent with any other slot parameter i.e. > > > all of them are synced from the primary." > > > > > > The inactive_since is not consistent with other slot parameters which > > > we copy. We don't perform anything related to those other parameters > > > like say two_phase phase which can change that property. However, we > > > do acquire the slot, advance the slot (as per recent discussion [1]), > > > and release it. Since these operations can impact inactive_since, it > > > seems to me that inactive_since is not the same as other parameters. > > > It can have a different value than the primary. Why would anyone want > > > to know the value of inactive_since from primary after the standby is > > > promoted? > > > > I think it can be useful "before" it is promoted and in case the primary is down. > > > > It is not clear to me what is user going to do by checking the > inactivity time for slots when the corresponding server is down. Say a failover needs to be done, then it could be useful to know for which slots the activity needs to be resumed (thinking about external logical decoding plugin, not about pub/sub here). If one see an inactive slot (since long "enough") then he can start to reasonate about what to do with it. > I thought the idea was to check such slots and see if they need to be > dropped or enabled again to avoid excessive disk usage, etc. Yeah that's the case but it does not mean inactive_since can't be useful in other ways. Also, say the slot has been invalidated on the primary (due to inactivity timeout), primary is down and there is a failover. By keeping the inactive_since from the primary, one could know when the inactivity that lead to the timeout started. Again, more concerned about external logical decoding plugin than pub/sub here. > > I agree that tracking the activity time of a synced slot can be useful, why > > not creating a dedicated field for that purpose (and keep inactive_since a > > perfect "copy" of the primary)? > > > > We can have a separate field for this but not sure if it is worth it. OTOH I'm not sure that erasing this information from the primary is useful. I think that 2 fields would be the best option and would be less subject of misinterpretation. > > > Now, the other concern is that calling GetCurrentTimestamp() > > > could be costly when the values for the slot are not going to be > > > updated but if that happens we can optimize such that before acquiring > > > the slot we can have some minimal pre-checks to ensure whether we need > > > to update the slot or not. > > > > Right, but for a very active slot it is likely that we call GetCurrentTimestamp() > > during almost each sync cycle. > > > > True, but if we have to save a slot to disk each time to persist the > changes (for an active slot) then probably GetCurrentTimestamp() > shouldn't be costly enough to matter. Right, persisting the changes to disk would be even more costly. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Thu, Mar 28, 2024 at 3:13 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > Regarding 0002: Thanks for reviewing it. > Some testing: > > T1 === > > When the slot is invalidated on the primary, then the reason is propagated to > the sync slot (if any). That's fine but we are loosing the inactive_since on the > standby: > > Primary: > > postgres=# select slot_name,inactive_since,conflicting,invalidation_reason from pg_replication_slots where slot_name='lsub29_slot'; > slot_name | inactive_since | conflicting | invalidation_reason > -------------+-------------------------------+-------------+--------------------- > lsub29_slot | 2024-03-28 08:24:51.672528+00 | f | inactive_timeout > (1 row) > > Standby: > > postgres=# select slot_name,inactive_since,conflicting,invalidation_reason from pg_replication_slots where slot_name='lsub29_slot'; > slot_name | inactive_since | conflicting | invalidation_reason > -------------+----------------+-------------+--------------------- > lsub29_slot | | f | inactive_timeout > (1 row) > > I think in this case it should always reflect the value from the primary (so > that one can understand why it is invalidated). I'll come back to this as soon as we all agree on inactive_since behavior for synced slots. > T2 === > > And it is set to a value during promotion: > > postgres=# select pg_promote(); > pg_promote > ------------ > t > (1 row) > > postgres=# select slot_name,inactive_since,conflicting,invalidation_reason from pg_replication_slots where slot_name='lsub29_slot'; > slot_name | inactive_since | conflicting | invalidation_reason > -------------+------------------------------+-------------+--------------------- > lsub29_slot | 2024-03-28 08:30:11.74505+00 | f | inactive_timeout > (1 row) > > I think when it is invalidated it should always reflect the value from the > primary (so that one can understand why it is invalidated). I'll come back to this as soon as we all agree on inactive_since behavior for synced slots. > T3 === > > As far the slot invalidation on the primary: > > postgres=# SELECT * FROM pg_logical_slot_get_changes('lsub29_slot', NULL, NULL, 'include-xids', '0'); > ERROR: cannot acquire invalidated replication slot "lsub29_slot" > > Can we make the message more consistent with what can be found in CreateDecodingContext() > for example? Hm, that makes sense because slot acquisition and release is something internal to the server. > T4 === > > Also, it looks like querying pg_replication_slots() does not trigger an > invalidation: I think it should if the slot is not invalidated yet (and matches > the invalidation criteria). There's a different opinion on this, check comment #3 from https://www.postgresql.org/message-id/CAA4eK1LLj%2BeaMN-K8oeOjfG%2BUuzTY%3DL5PXbcMJURZbFm%2B_aJSA%40mail.gmail.com. > Code review: > > CR1 === > > + Invalidate replication slots that are inactive for longer than this > + amount of time. If this value is specified without units, it is taken > > s/Invalidate/Invalidates/? Done. > Should we mention the relationship with inactive_since? Done. > CR2 === > > + * > + * If check_for_invalidation is true, the slot is checked for invalidation > + * based on replication_slot_inactive_timeout GUC and an error is raised after making the slot ours. > */ > void > -ReplicationSlotAcquire(const char *name, bool nowait) > +ReplicationSlotAcquire(const char *name, bool nowait, > + bool check_for_invalidation) > > > s/check_for_invalidation/check_for_timeout_invalidation/? Done. 
> CR3 === > > + if (slot->inactive_since == 0 || > + replication_slot_inactive_timeout == 0) > + return false; > > Better to test replication_slot_inactive_timeout first? (I mean there is no > point of testing inactive_since if replication_slot_inactive_timeout == 0) > > CR4 === > > + if (slot->inactive_since > 0 && > + replication_slot_inactive_timeout > 0) > + { > > Same. > > So, instead of CR3 === and CR4 ===, I wonder if it wouldn't be better to do > something like: > > if (replication_slot_inactive_timeout == 0) > return false; > else if (slot->inactive_since > 0) > . > else > return false; > > That would avoid checking replication_slot_inactive_timeout and inactive_since > multiple times. Done. > CR5 === > > + * held to avoid race conditions -- for example the restart_lsn could move > + * forward, or the slot could be dropped. > > Does the restart_lsn example makes sense here? No, it doesn't. Modified that. > CR6 === > > +static bool > +InvalidateSlotForInactiveTimeout(ReplicationSlot *slot, bool need_locks) > +{ > > InvalidatePossiblyInactiveSlot() maybe? I think we will lose the essence i.e. timeout from the suggested function name, otherwise just the inactive doesn't give a clearer meaning. I kept it that way unless anyone suggests otherwise. > CR7 === > > + /* Make sure the invalidated state persists across server restart */ > + slot->just_dirtied = true; > + slot->dirty = true; > + SpinLockRelease(&slot->mutex); > > Maybe we could create a new function say MarkGivenReplicationSlotDirty() > with a slot as parameter, that ReplicationSlotMarkDirty could call too? Done that. > Then maybe we could set slot->data.invalidated = RS_INVAL_INACTIVE_TIMEOUT in > InvalidateSlotForInactiveTimeout()? (to avoid multiple SpinLockAcquire/SpinLockRelease). Done that. > CR8 === > > + if (persist_state) > + { > + char path[MAXPGPATH]; > + > + sprintf(path, "pg_replslot/%s", NameStr(slot->data.name)); > + SaveSlotToPath(slot, path, ERROR); > + } > > Maybe we could create a new function say GivenReplicationSlotSave() > with a slot as parameter, that ReplicationSlotSave() could call too? Done that. > CR9 === > > + if (check_for_invalidation) > + { > + /* The slot is ours by now */ > + Assert(s->active_pid == MyProcPid); > + > + /* > + * Well, the slot is not yet ours really unless we check for the > + * invalidation below. > + */ > + s->active_pid = 0; > + if (InvalidateReplicationSlotForInactiveTimeout(s, true, true)) > + { > + /* > + * If the slot has been invalidated, recalculate the resource > + * limits. > + */ > + ReplicationSlotsComputeRequiredXmin(false); > + ReplicationSlotsComputeRequiredLSN(); > + > + /* Might need it for slot clean up on error, so restore it */ > + s->active_pid = MyProcPid; > + ereport(ERROR, > + (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), > + errmsg("cannot acquire invalidated replication slot \"%s\"", > + NameStr(MyReplicationSlot->data.name)))); > + } > + s->active_pid = MyProcPid; > > Are we not missing some SpinLockAcquire/Release on the slot's mutex here? (the > places where we set the active_pid). Hm, yes. But, shall I acquire the mutex, set active_pid to 0 for a moment just to satisfy Assert(slot->active_pid == 0); in InvalidateReplicationSlotForInactiveTimeout and InvalidateSlotForInactiveTimeout? I just removed the assertions because being replication_slot_inactive_timeout > 0 and inactive_since > 0 is enough for these functions to think and decide on inactive timeout invalidation. 
> CR10 === > > @@ -1628,6 +1674,10 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause, > if (SlotIsLogical(s)) > invalidation_cause = cause; > break; > + case RS_INVAL_INACTIVE_TIMEOUT: > + if (InvalidateReplicationSlotForInactiveTimeout(s, false, false)) > + invalidation_cause = cause; > + break; > > InvalidatePossiblyObsoleteSlot() is not called with such a reason, better to use > an Assert here and in the caller too? Done. > CR11 === > > +++ b/src/test/recovery/t/050_invalidate_slots.pl > > why not using 019_replslot_limit.pl? I understand that 019_replslot_limit covers wal_removed related invalidations. But, I don't want to kludge it with a bunch of other tests. The new tests anyway need a bunch of new nodes and a couple of helper functions. Any future invalidation mechanisms can be added here in this new file. Also, having a separate file quickly helps isolate any test failures that BF animals might report in future. I don't think a separate test file here hurts anyone unless there's a strong reason against it. Please see the attached v30 patch. 0002 is where all of the above review comments have been addressed. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
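For reference, the CR7 refactoring agreed above amounts to roughly the following; this is a sketch of the idea, and the exact v30 code may differ in details.

#include "postgres.h"

#include "replication/slot.h"
#include "storage/spin.h"

/*
 * Sketch of CR7: a variant of the dirty-marking code that takes the slot as
 * a parameter, so both the existing ReplicationSlotMarkDirty() and the
 * inactive-timeout invalidation path can share it.
 */
static void
MarkGivenReplicationSlotDirty(ReplicationSlot *slot)
{
	SpinLockAcquire(&slot->mutex);
	slot->just_dirtied = true;
	slot->dirty = true;
	SpinLockRelease(&slot->mutex);
}

void
ReplicationSlotMarkDirty(void)
{
	Assert(MyReplicationSlot != NULL);
	MarkGivenReplicationSlotDirty(MyReplicationSlot);
}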
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Fri, Mar 29, 2024 at 9:39 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > Commit message states: "why we can't just update inactive_since for > synced slots on the standby with the value received from remote slot > on the primary. This is consistent with any other slot parameter i.e. > all of them are synced from the primary." > > The inactive_since is not consistent with other slot parameters which > we copy. We don't perform anything related to those other parameters > like say two_phase phase which can change that property. However, we > do acquire the slot, advance the slot (as per recent discussion [1]), > and release it. Since these operations can impact inactive_since, it > seems to me that inactive_since is not the same as other parameters. > It can have a different value than the primary. Why would anyone want > to know the value of inactive_since from primary after the standby is > promoted? After thinking about it for a while now, it feels to me that the synced slots (slots on the standby that are being synced from the primary) can have their own inactive_since value. Fundamentally, inactive_since is set to 0 when the slot is acquired and set to the current time when the slot is released, no matter who acquires and releases it - be it walsenders for replication, or backends for slot advance, or backends for slot sync using pg_sync_replication_slots, or backends for other slot functions, or the background sync worker. Remember the earlier patch was updating inactive_since just for walsenders, but then the suggestion was to update it unconditionally - https://www.postgresql.org/message-id/CAJpy0uD64X%3D2ENmbHaRiWTKeQawr-rbGoy_GdhQQLVXzUSKTMg%40mail.gmail.com. Whoever syncs the slot *actually* acquires the slot i.e. makes it theirs, syncs it from the primary, and releases it. IMO, no differentiation is to be made for synced slots. There was a suggestion on using inactive_since of the synced slot on the standby to know the inactivity of the slot on the primary. If one wants to do that, they had better look at/monitor the primary slot info/logs/pg_replication_slots/whatever. I really don't see a point in having two different meanings for a single property of a replication slot - inactive_since for a regular slot tells since when this slot has become inactive, and for a synced slot since when the corresponding remote slot has become inactive. I think this will confuse users for sure. Also, if inactive_since is changing on the primary very frequently and none of the other parameters are changing, then if we copy inactive_since to the synced slots, the standby will just be doing *sync* work (marking the slots dirty and saving them to disk) only to update inactive_since. I think this is unnecessary behaviour for sure. Coming to a future patch for inactive timeout based slot invalidation, we can either allow invalidation without any differentiation for synced slots or restrict invalidation to avoid more sync work. For instance, if the inactive timeout is kept low on the standby, the sync worker will be doing more work as it drops and recreates a slot repeatedly if it keeps getting invalidated. Another thing is that the standby takes independent invalidation decisions for synced slots. AFAICS, invalidation due to wal_removal is the sole reason (out of all available invalidation reasons) for a synced slot to get invalidated independently of the primary. 
Check https://www.postgresql.org/message-id/CAA4eK1JXBwTaDRD_%3D8t6UB1fhRNjC1C%2BgH4YdDxj_9U6djLnXw%40mail.gmail.com for the suggestion that we had better not differentiate invalidation decisions for synced slots. The assumption of letting synced slots have their own inactive_since not only simplifies the code, but also looks less confusing and more meaningful to the user. The only code that we put in on top of the committed code is to use InRecovery in place of RecoveryInProgress() in RestoreSlotFromDisk() to fix the issue raised by Shveta upthread. > Now, the other concern is that calling GetCurrentTimestamp() > could be costly when the values for the slot are not going to be > updated but if that happens we can optimize such that before acquiring > the slot we can have some minimal pre-checks to ensure whether we need > to update the slot or not. > > [1] - https://www.postgresql.org/message-id/OS0PR01MB571615D35F486080616CA841943A2%40OS0PR01MB5716.jpnprd01.prod.outlook.com A quick test with a function to measure the cost of GetCurrentTimestamp [1] on my Ubuntu dev system (an AWS EC2 c5.4xlarge instance) gives me [2]. It took 0.388 ms, 2.269 ms, 21.144 ms, 209.333 ms, 2091.174 ms, 20908.942 ms for 10K, 100K, 1 million, 10 million, 100 million, and 1 billion calls respectively. Costs might be different on various systems with different OSes, but it gives us a rough idea. If we are too concerned about the cost of GetCurrentTimestamp(), a possible approach is to just not set inactive_since for slots being synced on the standby. Just let the first acquisition and release after the promotion do that job. We can always call this out in the docs saying "replication slots on the streaming standbys which are being synced from the primary are not inactive in practice, so the inactive_since is always NULL for them unless the standby is promoted". [1] Datum pg_get_current_timestamp(PG_FUNCTION_ARGS) { int loops = PG_GETARG_INT32(0); TimestampTz ctime; for (int i = 0; i < loops; i++) ctime = GetCurrentTimestamp(); PG_RETURN_TIMESTAMPTZ(ctime); } [2] postgres=# \timing Timing is on. postgres=# SELECT pg_get_current_timestamp(1000000000); pg_get_current_timestamp ------------------------------- 2024-03-30 19:07:57.374797+00 (1 row) Time: 20908.942 ms (00:20.909) postgres=# SELECT pg_get_current_timestamp(100000000); pg_get_current_timestamp ------------------------------- 2024-03-30 19:08:21.038064+00 (1 row) Time: 2091.174 ms (00:02.091) postgres=# SELECT pg_get_current_timestamp(10000000); pg_get_current_timestamp ------------------------------- 2024-03-30 19:08:24.329949+00 (1 row) Time: 209.333 ms postgres=# SELECT pg_get_current_timestamp(1000000); pg_get_current_timestamp ------------------------------- 2024-03-30 19:08:26.978016+00 (1 row) Time: 21.144 ms postgres=# SELECT pg_get_current_timestamp(100000); pg_get_current_timestamp ------------------------------- 2024-03-30 19:08:29.142248+00 (1 row) Time: 2.269 ms postgres=# SELECT pg_get_current_timestamp(10000); pg_get_current_timestamp ------------------------------ 2024-03-30 19:08:31.34621+00 (1 row) Time: 0.388 ms -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Fri, Mar 29, 2024 at 6:17 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > On Fri, Mar 29, 2024 at 03:03:01PM +0530, Amit Kapila wrote: > > On Fri, Mar 29, 2024 at 11:49 AM Bertrand Drouvot > > <bertranddrouvot.pg@gmail.com> wrote: > > > > > > On Fri, Mar 29, 2024 at 09:39:31AM +0530, Amit Kapila wrote: > > > > > > > > Commit message states: "why we can't just update inactive_since for > > > > synced slots on the standby with the value received from remote slot > > > > on the primary. This is consistent with any other slot parameter i.e. > > > > all of them are synced from the primary." > > > > > > > > The inactive_since is not consistent with other slot parameters which > > > > we copy. We don't perform anything related to those other parameters > > > > like say two_phase phase which can change that property. However, we > > > > do acquire the slot, advance the slot (as per recent discussion [1]), > > > > and release it. Since these operations can impact inactive_since, it > > > > seems to me that inactive_since is not the same as other parameters. > > > > It can have a different value than the primary. Why would anyone want > > > > to know the value of inactive_since from primary after the standby is > > > > promoted? > > > > > > I think it can be useful "before" it is promoted and in case the primary is down. > > > > > > > It is not clear to me what is user going to do by checking the > > inactivity time for slots when the corresponding server is down. > > Say a failover needs to be done, then it could be useful to know for which > slots the activity needs to be resumed (thinking about external logical decoding > plugin, not about pub/sub here). If one see an inactive slot (since long "enough") > then he can start to reasonate about what to do with it. > > > I thought the idea was to check such slots and see if they need to be > > dropped or enabled again to avoid excessive disk usage, etc. > > Yeah that's the case but it does not mean inactive_since can't be useful in other > ways. > > Also, say the slot has been invalidated on the primary (due to inactivity timeout), > primary is down and there is a failover. By keeping the inactive_since from > the primary, one could know when the inactivity that lead to the timeout started. > So, this means at promotion, we won't set the current_time for inactive_since which is not what the currently proposed patch is doing. Moreover, doing the invalidation on promoted standby based on inactive_since of the primary node sounds debatable because the inactive_timeout could be different on the new node (promoted standby). > Again, more concerned about external logical decoding plugin than pub/sub here. > > > > I agree that tracking the activity time of a synced slot can be useful, why > > > not creating a dedicated field for that purpose (and keep inactive_since a > > > perfect "copy" of the primary)? > > > > > > > We can have a separate field for this but not sure if it is worth it. > > OTOH I'm not sure that erasing this information from the primary is useful. I > think that 2 fields would be the best option and would be less subject of > misinterpretation. > > > > > Now, the other concern is that calling GetCurrentTimestamp() > > > > could be costly when the values for the slot are not going to be > > > > updated but if that happens we can optimize such that before acquiring > > > > the slot we can have some minimal pre-checks to ensure whether we need > > > > to update the slot or not. 
> > > > > > Right, but for a very active slot it is likely that we call GetCurrentTimestamp() > > > during almost each sync cycle. > > > > > > > True, but if we have to save a slot to disk each time to persist the > > changes (for an active slot) then probably GetCurrentTimestamp() > > shouldn't be costly enough to matter. > > Right, persisting the changes to disk would be even more costly. > The point I was making is that currently after copying the remote_node's values, we always persist the slots to disk, so the cost of current_time shouldn't be much. Now, if the values won't change then probably there is some cost but in most cases (active slots), the values will always change. Also, if all the slots are inactive then we will slow down the speed of sync. We also need to consider if we want to copy the value of inactive_since from the primary and if that is the only value changed then shall we persist the slot or not? -- With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Mon, Apr 01, 2024 at 09:04:43AM +0530, Amit Kapila wrote: > On Fri, Mar 29, 2024 at 6:17 PM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > > > On Fri, Mar 29, 2024 at 03:03:01PM +0530, Amit Kapila wrote: > > > On Fri, Mar 29, 2024 at 11:49 AM Bertrand Drouvot > > > <bertranddrouvot.pg@gmail.com> wrote: > > > > > > > > On Fri, Mar 29, 2024 at 09:39:31AM +0530, Amit Kapila wrote: > > > > > > > > > > Commit message states: "why we can't just update inactive_since for > > > > > synced slots on the standby with the value received from remote slot > > > > > on the primary. This is consistent with any other slot parameter i.e. > > > > > all of them are synced from the primary." > > > > > > > > > > The inactive_since is not consistent with other slot parameters which > > > > > we copy. We don't perform anything related to those other parameters > > > > > like say two_phase phase which can change that property. However, we > > > > > do acquire the slot, advance the slot (as per recent discussion [1]), > > > > > and release it. Since these operations can impact inactive_since, it > > > > > seems to me that inactive_since is not the same as other parameters. > > > > > It can have a different value than the primary. Why would anyone want > > > > > to know the value of inactive_since from primary after the standby is > > > > > promoted? > > > > > > > > I think it can be useful "before" it is promoted and in case the primary is down. > > > > > > > > > > It is not clear to me what is user going to do by checking the > > > inactivity time for slots when the corresponding server is down. > > > > Say a failover needs to be done, then it could be useful to know for which > > slots the activity needs to be resumed (thinking about external logical decoding > > plugin, not about pub/sub here). If one see an inactive slot (since long "enough") > > then he can start to reasonate about what to do with it. > > > > > I thought the idea was to check such slots and see if they need to be > > > dropped or enabled again to avoid excessive disk usage, etc. > > > > Yeah that's the case but it does not mean inactive_since can't be useful in other > > ways. > > > > Also, say the slot has been invalidated on the primary (due to inactivity timeout), > > primary is down and there is a failover. By keeping the inactive_since from > > the primary, one could know when the inactivity that lead to the timeout started. > > > > So, this means at promotion, we won't set the current_time for > inactive_since which is not what the currently proposed patch is > doing. Yeah, that's why I made the comment T2 in [1]. > Moreover, doing the invalidation on promoted standby based on > inactive_since of the primary node sounds debatable because the > inactive_timeout could be different on the new node (promoted > standby). I think that if the slot is not invalidated before the promotion then we should erase the value from the primary and use the promotion time. > > Again, more concerned about external logical decoding plugin than pub/sub here. > > > > > > I agree that tracking the activity time of a synced slot can be useful, why > > > > not creating a dedicated field for that purpose (and keep inactive_since a > > > > perfect "copy" of the primary)? > > > > > > > > > > We can have a separate field for this but not sure if it is worth it. > > > > OTOH I'm not sure that erasing this information from the primary is useful. I > > think that 2 fields would be the best option and would be less subject of > > misinterpretation. 
> > > > > > > Now, the other concern is that calling GetCurrentTimestamp() > > > > > could be costly when the values for the slot are not going to be > > > > > updated but if that happens we can optimize such that before acquiring > > > > > the slot we can have some minimal pre-checks to ensure whether we need > > > > > to update the slot or not. > > > > > > > > Right, but for a very active slot it is likely that we call GetCurrentTimestamp() > > > > during almost each sync cycle. > > > > > > > > > > True, but if we have to save a slot to disk each time to persist the > > > changes (for an active slot) then probably GetCurrentTimestamp() > > > shouldn't be costly enough to matter. > > > > Right, persisting the changes to disk would be even more costly. > > > > The point I was making is that currently after copying the > remote_node's values, we always persist the slots to disk, so the cost > of current_time shouldn't be much. Oh right, I missed this (was focusing only on inactive_since that we don't persist to disk IIRC). BTW, If we are going this way, maybe we could accept a bit less accuracy and use GetCurrentTransactionStopTimestamp() instead? > Now, if the values won't change > then probably there is some cost but in most cases (active slots), the > values will always change. Right. > Also, if all the slots are inactive then we > will slow down the speed of sync. Yes. > We also need to consider if we want > to copy the value of inactive_since from the primary and if that is > the only value changed then shall we persist the slot or not? Good point, then I don't think we should as inactive_since is not persisted on disk. [1]: https://www.postgresql.org/message-id/ZgU70MjdOfO6l0O0%40ip-10-97-1-34.eu-west-3.compute.internal Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Mon, Apr 01, 2024 at 08:47:59AM +0530, Bharath Rupireddy wrote: > On Fri, Mar 29, 2024 at 9:39 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > Commit message states: "why we can't just update inactive_since for > > synced slots on the standby with the value received from remote slot > > on the primary. This is consistent with any other slot parameter i.e. > > all of them are synced from the primary." > > > > The inactive_since is not consistent with other slot parameters which > > we copy. We don't perform anything related to those other parameters > > like say two_phase phase which can change that property. However, we > > do acquire the slot, advance the slot (as per recent discussion [1]), > > and release it. Since these operations can impact inactive_since, it > > seems to me that inactive_since is not the same as other parameters. > > It can have a different value than the primary. Why would anyone want > > to know the value of inactive_since from primary after the standby is > > promoted? > > After thinking about it for a while now, it feels to me that the > synced slots (slots on the standby that are being synced from the > primary) can have their own inactive_sicne value. Fundamentally, > inactive_sicne is set to 0 when slot is acquired and set to current > time when slot is released, no matter who acquires and releases it - > be it walsenders for replication, or backends for slot advance, or > backends for slot sync using pg_sync_replication_slots, or backends > for other slot functions, or background sync worker. Remember the > earlier patch was updating inactive_since just for walsenders, but > then the suggestion was to update it unconditionally - > https://www.postgresql.org/message-id/CAJpy0uD64X%3D2ENmbHaRiWTKeQawr-rbGoy_GdhQQLVXzUSKTMg%40mail.gmail.com. > Whoever syncs the slot, *acutally* acquires the slot i.e. makes it > theirs, syncs it from the primary, and releases it. IMO, no > differentiation is to be made for synced slots. > > There was a suggestion on using inactive_since of the synced slot on > the standby to know the inactivity of the slot on the primary. If one > wants to do that, they better look at/monitor the primary slot > info/logs/pg_replication_slot/whatever. Yeah but the use case was in case the primary is down for whatever reason. > I really don't see a point in > having two different meanings for a single property of a replication > slot - inactive_since for a regular slot tells since when this slot > has become inactive, and for a synced slot since when the > corresponding remote slot has become inactive. I think this will > confuse users for sure. I'm not sure as we are speaking about "synced" slots. I can also see some confusion if this value is not "synced". > Also, if inactive_since is being changed on the primary so frequently, > and none of the other parameters are changing, if we copy > inactive_since to the synced slots, then standby will just be doing > *sync* work (mark the slots dirty and save to disk) for updating > inactive_since. I think this is unnecessary behaviour for sure. Right, I think we should avoid the save slot to disk in that case (question raised by Amit in [1]). > Coming to a future patch for inactive timeout based slot invalidation, > we can either allow invalidation without any differentiation for > synced slots or restrict invalidation to avoid more sync work. 
For > instance, if inactive timeout is kept low on the standby, the sync > worker will be doing more work as it drops and recreates a slot > repeatedly if it keeps getting invalidated. Another thing is that the > standby takes independent invalidation decisions for synced slots. > AFAICS, invalidation due to wal_removal is the only sole reason (out > of all available invalidation reasons) for a synced slot to get > invalidated independently of the primary. Check > https://www.postgresql.org/message-id/CAA4eK1JXBwTaDRD_%3D8t6UB1fhRNjC1C%2BgH4YdDxj_9U6djLnXw%40mail.gmail.com > for the suggestion on we better not differentiaing invalidation > decisions for synced slots. Yeah, I think the invalidation decision on the standby is highly linked to what inactive_since on the standby is: synced from primary or not. > The assumption of letting synced slots have their own inactive_since > not only simplifies the code, but also looks less-confusing and more > meaningful to the user. I'm not sure at all. But if the majority of us thinks it's the case then let's go that way. > > Now, the other concern is that calling GetCurrentTimestamp() > > could be costly when the values for the slot are not going to be > > updated but if that happens we can optimize such that before acquiring > > the slot we can have some minimal pre-checks to ensure whether we need > > to update the slot or not. Also maybe we could accept a bit less accuracy and use GetCurrentTransactionStopTimestamp() instead? > If we are too much concerned about the cost of GetCurrentTimestamp(), > a possible approach is just don't set inactive_since for slots being > synced on the standby. > Just let the first acquisition and release > after the promotion do that job. We can always call this out in the > docs saying "replication slots on the streaming standbys which are > being synced from the primary are not inactive in practice, so the > inactive_since is always NULL for them unless the standby is > promoted". I think that was the initial behavior that lead to Robert's remark (see [2]): " And I'm suspicious that having an exception for slots being synced is a bad idea. That makes too much of a judgement about how the user will use this field. It's usually better to just expose the data, and if the user needs helps to make sense of that data, then give them that help separately. " [1]: https://www.postgresql.org/message-id/CAA4eK1JtKieWMivbswYg5FVVB5FugCftLvQKVsxh%3Dm_8nk04vw%40mail.gmail.com [2]: https://www.postgresql.org/message-id/CA%2BTgmob_Ta-t2ty8QrKHBGnNLrf4ZYcwhGHGFsuUoFrAEDw4sA%40mail.gmail.com Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Sun, Mar 31, 2024 at 10:25:46AM +0530, Bharath Rupireddy wrote: > On Thu, Mar 28, 2024 at 3:13 PM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > I think in this case it should always reflect the value from the primary (so > > that one can understand why it is invalidated). > > I'll come back to this as soon as we all agree on inactive_since > behavior for synced slots. Makes sense. Also if the majority of us thinks it's not needed for inactive_since to be an exact copy of the primary, then let's go that way. > > I think when it is invalidated it should always reflect the value from the > > primary (so that one can understand why it is invalidated). > > I'll come back to this as soon as we all agree on inactive_since > behavior for synced slots. Yeah. > > T4 === > > > > Also, it looks like querying pg_replication_slots() does not trigger an > > invalidation: I think it should if the slot is not invalidated yet (and matches > > the invalidation criteria). > > There's a different opinion on this, check comment #3 from > https://www.postgresql.org/message-id/CAA4eK1LLj%2BeaMN-K8oeOjfG%2BUuzTY%3DL5PXbcMJURZbFm%2B_aJSA%40mail.gmail.com. Oh right, I can see Amit's point too. Let's put pg_replication_slots() out of the game then. > > CR6 === > > > > +static bool > > +InvalidateSlotForInactiveTimeout(ReplicationSlot *slot, bool need_locks) > > +{ > > > > InvalidatePossiblyInactiveSlot() maybe? > > I think we will lose the essence i.e. timeout from the suggested > function name, otherwise just the inactive doesn't give a clearer > meaning. I kept it that way unless anyone suggests otherwise. Right. OTOH I think that "Possibly" adds some nuance (like InvalidatePossiblyObsoleteSlot() is already doing). > Please see the attached v30 patch. 0002 is where all of the above > review comments have been addressed. Thanks! FYI, I did not look at the content yet, just replied to the above comments. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Masahiko Sawada
Date:
On Mon, Apr 1, 2024 at 12:18 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Fri, Mar 29, 2024 at 9:39 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > Commit message states: "why we can't just update inactive_since for > > synced slots on the standby with the value received from remote slot > > on the primary. This is consistent with any other slot parameter i.e. > > all of them are synced from the primary." > > > > The inactive_since is not consistent with other slot parameters which > > we copy. We don't perform anything related to those other parameters > > like say two_phase phase which can change that property. However, we > > do acquire the slot, advance the slot (as per recent discussion [1]), > > and release it. Since these operations can impact inactive_since, it > > seems to me that inactive_since is not the same as other parameters. > > It can have a different value than the primary. Why would anyone want > > to know the value of inactive_since from primary after the standby is > > promoted? > > After thinking about it for a while now, it feels to me that the > synced slots (slots on the standby that are being synced from the > primary) can have their own inactive_sicne value. Fundamentally, > inactive_sicne is set to 0 when slot is acquired and set to current > time when slot is released, no matter who acquires and releases it - > be it walsenders for replication, or backends for slot advance, or > backends for slot sync using pg_sync_replication_slots, or backends > for other slot functions, or background sync worker. Remember the > earlier patch was updating inactive_since just for walsenders, but > then the suggestion was to update it unconditionally - > https://www.postgresql.org/message-id/CAJpy0uD64X%3D2ENmbHaRiWTKeQawr-rbGoy_GdhQQLVXzUSKTMg%40mail.gmail.com. > Whoever syncs the slot, *acutally* acquires the slot i.e. makes it > theirs, syncs it from the primary, and releases it. IMO, no > differentiation is to be made for synced slots. FWIW, coming to this thread late, I think that the inactive_since should not be synchronized from the primary. The wall clocks are different on the primary and the standby so having the primary's timestamp on the standby can confuse users, especially when there is a big clock drift. Also, as Amit mentioned, inactive_since seems not to be consistent with other parameters we copy. The replication_slot_inactive_timeout feature should work on the standby independent from the primary, like other slot invalidation mechanisms, and it should be based on its own local clock. > Coming to a future patch for inactive timeout based slot invalidation, > we can either allow invalidation without any differentiation for > synced slots or restrict invalidation to avoid more sync work. For > instance, if inactive timeout is kept low on the standby, the sync > worker will be doing more work as it drops and recreates a slot > repeatedly if it keeps getting invalidated. Another thing is that the > standby takes independent invalidation decisions for synced slots. > AFAICS, invalidation due to wal_removal is the only sole reason (out > of all available invalidation reasons) for a synced slot to get > invalidated independently of the primary. Check > https://www.postgresql.org/message-id/CAA4eK1JXBwTaDRD_%3D8t6UB1fhRNjC1C%2BgH4YdDxj_9U6djLnXw%40mail.gmail.com > for the suggestion on we better not differentiaing invalidation > decisions for synced slots. 
> > The assumption of letting synced slots have their own inactive_since > not only simplifies the code, but also looks less-confusing and more > meaningful to the user. The only code that we put in on top of the > committed code is to use InRecovery in place of > RecoveryInProgress() in RestoreSlotFromDisk() to fix the issue raised > by Shveta upthread. If we want to invalidate the synced slots due to the timeout, I think we need to define what is "inactive" for synced slots. Suppose that the slotsync worker updates the local (synced) slot's inactive_since whenever releasing the slot, irrespective of the actual LSNs (or other slot parameters) having been updated. I think that this idea cannot handle a slot that is not acquired on the primary. In this case, the remote slot is inactive but the local slot is regarded as active. WAL files are piled up on the standby (and on the primary) as the slot's LSNs don't move forward. I think we want to regard such a slot as "inactive" also on the standby and invalidate it because of the timeout. > > > Now, the other concern is that calling GetCurrentTimestamp() > > could be costly when the values for the slot are not going to be > > updated but if that happens we can optimize such that before acquiring > > the slot we can have some minimal pre-checks to ensure whether we need > > to update the slot or not. If we use such pre-checks, another problem might happen; it cannot handle a case where the slot is acquired on the primary but its LSNs don't move forward. Imagine a logical replication conflict happened on the subscriber, and the logical replication enters the retry loop. In this case, the remote slot's inactive_since gets updated for every retry, but it looks inactive from the standby since the slot LSNs don't change. Therefore, only the local slot could be invalidated due to the timeout but probably we don't want to regard such a slot as "inactive". Another idea I came up with is that the slotsync worker updates the local slot's inactive_since to the local timestamp only when the remote slot might have got inactive. If the remote slot is acquired by someone, the local slot's inactive_since is also NULL. If the remote slot gets inactive, the slotsync worker sets the local timestamp to the local slot's inactive_since. Since the remote slot could be acquired and released before the slotsync worker gets the remote slot data again, if the remote slot's inactive_since > the local slot's inactive_since, the slotsync worker updates the local one. IOW, we detect whether the remote slot was acquired and released since the last synchronization, by checking the remote slot's inactive_since. This idea seems to handle these cases I mentioned unless I'm missing something, but it requires for the slotsync worker to update inactive_since in a different way than other parameters. Or a simple solution is that the slotsync worker updates inactive_since as it does for non-synced slots, and disables timeout-based slot invalidation for synced slots. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
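As a concrete illustration of the WAL-retention concern mentioned above, the amount of WAL a slot is still holding back can already be observed with existing columns and functions; the following is only a monitoring sketch, not part of the patch set (on a standby, substitute pg_last_wal_replay_lsn() for pg_current_wal_lsn()):

-- how much WAL each slot is still retaining
SELECT slot_name, active, restart_lsn,
       pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS retained_wal
FROM pg_replication_slots
ORDER BY pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) DESC;

A slot whose restart_lsn stops advancing shows up here with a growing retained_wal, regardless of what its inactive_since says.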
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Tue, Apr 02, 2024 at 12:07:54PM +0900, Masahiko Sawada wrote: > On Mon, Apr 1, 2024 at 12:18 PM Bharath Rupireddy > > FWIW, coming to this thread late, I think that the inactive_since > should not be synchronized from the primary. The wall clocks are > different on the primary and the standby so having the primary's > timestamp on the standby can confuse users, especially when there is a > big clock drift. Also, as Amit mentioned, inactive_since seems not to > be consistent with other parameters we copy. The > replication_slot_inactive_timeout feature should work on the standby > independent from the primary, like other slot invalidation mechanisms, > and it should be based on its own local clock. Thanks for sharing your thoughts! So, it looks like that most of us agree to not sync inactive_since from the primary, I'm fine with that. > If we want to invalidate the synced slots due to the timeout, I think > we need to define what is "inactive" for synced slots. > > Suppose that the slotsync worker updates the local (synced) slot's > inactive_since whenever releasing the slot, irrespective of the actual > LSNs (or other slot parameters) having been updated. I think that this > idea cannot handle a slot that is not acquired on the primary. In this > case, the remote slot is inactive but the local slot is regarded as > active. WAL files are piled up on the standby (and on the primary) as > the slot's LSNs don't move forward. I think we want to regard such a > slot as "inactive" also on the standby and invalidate it because of > the timeout. I think that makes sense to somehow link inactive_since on the standby to the actual LSNs (or other slot parameters) being updated or not. > > > Now, the other concern is that calling GetCurrentTimestamp() > > > could be costly when the values for the slot are not going to be > > > updated but if that happens we can optimize such that before acquiring > > > the slot we can have some minimal pre-checks to ensure whether we need > > > to update the slot or not. > > If we use such pre-checks, another problem might happen; it cannot > handle a case where the slot is acquired on the primary but its LSNs > don't move forward. Imagine a logical replication conflict happened on > the subscriber, and the logical replication enters the retry loop. In > this case, the remote slot's inactive_since gets updated for every > retry, but it looks inactive from the standby since the slot LSNs > don't change. Therefore, only the local slot could be invalidated due > to the timeout but probably we don't want to regard such a slot as > "inactive". > > Another idea I came up with is that the slotsync worker updates the > local slot's inactive_since to the local timestamp only when the > remote slot might have got inactive. If the remote slot is acquired by > someone, the local slot's inactive_since is also NULL. If the remote > slot gets inactive, the slotsync worker sets the local timestamp to > the local slot's inactive_since. Since the remote slot could be > acquired and released before the slotsync worker gets the remote slot > data again, if the remote slot's inactive_since > the local slot's > inactive_since, the slotsync worker updates the local one. Then I think we would need to be careful about time zone comparison. > IOW, we > detect whether the remote slot was acquired and released since the > last synchronization, by checking the remote slot's inactive_since. 
> This idea seems to handle these cases I mentioned unless I'm missing > something, but it requires for the slotsync worker to update > inactive_since in a different way than other parameters. > > Or a simple solution is that the slotsync worker updates > inactive_since as it does for non-synced slots, and disables > timeout-based slot invalidation for synced slots. Yeah, I think the main question to help us decide is: do we want to invalidate "inactive" synced slots locally (in addition to synchronizing the invalidation from the primary)? Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
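On the time zone comparison point above: timestamptz values compare on the underlying absolute instant, so the displayed zone itself is not a problem; the remaining risk is real clock drift between the two machines. A quick illustration (not from the patch):

-- both literals denote the same instant, so this returns true
SELECT '2024-04-03 09:00:00+02'::timestamptz = '2024-04-03 07:00:00+00'::timestamptz;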
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Tue, Apr 2, 2024 at 11:58 AM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > > Or a simple solution is that the slotsync worker updates > > inactive_since as it does for non-synced slots, and disables > > timeout-based slot invalidation for synced slots. > > Yeah, I think the main question to help us decide is: do we want to invalidate > "inactive" synced slots locally (in addition to synchronizing the invalidation > from the primary)? I think this approach looks way simpler than the other one. The other approach of linking inactive_since on the standby for synced slots to whether the actual LSNs (or other slot parameters) are being updated looks more complicated, and might not go down well with the end user. However, we need to be able to say why we don't invalidate synced slots due to inactive timeout, unlike the wal_removed invalidation that can happen right now on the standby for synced slots. This leads us to actually define what it means for a slot to be active. Is syncing the data from the remote slot considered the slot being active? On the other hand, it may not sound great if we don't invalidate synced slots due to inactive timeout even though they hold resources such as WAL and XIDs. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Tue, Apr 02, 2024 at 12:41:35PM +0530, Bharath Rupireddy wrote: > On Tue, Apr 2, 2024 at 11:58 AM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > > > > Or a simple solution is that the slotsync worker updates > > > inactive_since as it does for non-synced slots, and disables > > > timeout-based slot invalidation for synced slots. > > > > Yeah, I think the main question to help us decide is: do we want to invalidate > > "inactive" synced slots locally (in addition to synchronizing the invalidation > > from the primary)? > > I think this approach looks way simpler than the other one. The other > approach of linking inactive_since on the standby for synced slots to > the actual LSNs (or other slot parameters) being updated or not looks > more complicated, and might not go well with the end user. However, > we need to be able to say why we don't invalidate synced slots due to > inactive timeout unlike the wal_removed invalidation that can happen > right now on the standby for synced slots. This leads us to define > actually what a slot being active means. Is syncing the data from the > remote slot considered as the slot being active? > > On the other hand, it may not sound great if we don't invalidate > synced slots due to inactive timeout even though they hold resources > such as WAL and XIDs. Right and the "only" benefit then would be to give an idea as to when the last sync did occur on the local slot. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
shveta malik
Date:
On Tue, Apr 2, 2024 at 11:58 AM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > Hi, > > On Tue, Apr 02, 2024 at 12:07:54PM +0900, Masahiko Sawada wrote: > > On Mon, Apr 1, 2024 at 12:18 PM Bharath Rupireddy > > > > FWIW, coming to this thread late, I think that the inactive_since > > should not be synchronized from the primary. The wall clocks are > > different on the primary and the standby so having the primary's > > timestamp on the standby can confuse users, especially when there is a > > big clock drift. Also, as Amit mentioned, inactive_since seems not to > > be consistent with other parameters we copy. The > > replication_slot_inactive_timeout feature should work on the standby > > independent from the primary, like other slot invalidation mechanisms, > > and it should be based on its own local clock. > > Thanks for sharing your thoughts! So, it looks like that most of us agree to not > sync inactive_since from the primary, I'm fine with that. +1 on not syncing slots from primary. > > If we want to invalidate the synced slots due to the timeout, I think > > we need to define what is "inactive" for synced slots. > > > > Suppose that the slotsync worker updates the local (synced) slot's > > inactive_since whenever releasing the slot, irrespective of the actual > > LSNs (or other slot parameters) having been updated. I think that this > > idea cannot handle a slot that is not acquired on the primary. In this > > case, the remote slot is inactive but the local slot is regarded as > > active. WAL files are piled up on the standby (and on the primary) as > > the slot's LSNs don't move forward. I think we want to regard such a > > slot as "inactive" also on the standby and invalidate it because of > > the timeout. > > I think that makes sense to somehow link inactive_since on the standby to > the actual LSNs (or other slot parameters) being updated or not. > > > > > Now, the other concern is that calling GetCurrentTimestamp() > > > > could be costly when the values for the slot are not going to be > > > > updated but if that happens we can optimize such that before acquiring > > > > the slot we can have some minimal pre-checks to ensure whether we need > > > > to update the slot or not. > > > > If we use such pre-checks, another problem might happen; it cannot > > handle a case where the slot is acquired on the primary but its LSNs > > don't move forward. Imagine a logical replication conflict happened on > > the subscriber, and the logical replication enters the retry loop. In > > this case, the remote slot's inactive_since gets updated for every > > retry, but it looks inactive from the standby since the slot LSNs > > don't change. Therefore, only the local slot could be invalidated due > > to the timeout but probably we don't want to regard such a slot as > > "inactive". > > > > Another idea I came up with is that the slotsync worker updates the > > local slot's inactive_since to the local timestamp only when the > > remote slot might have got inactive. If the remote slot is acquired by > > someone, the local slot's inactive_since is also NULL. If the remote > > slot gets inactive, the slotsync worker sets the local timestamp to > > the local slot's inactive_since. Since the remote slot could be > > acquired and released before the slotsync worker gets the remote slot > > data again, if the remote slot's inactive_since > the local slot's > > inactive_since, the slotsync worker updates the local one. > > Then I think we would need to be careful about time zone comparison. Yes. 
Also, we need to consider the case where a user is relying on pg_sync_replication_slots() and has not enabled the slot-sync worker. In such a case, if the synced slot's inactive_since is derived from the inactivity of the remote slot, it might not be updated very frequently (it depends on when the user actually runs the SQL function) and may thus be misleading. OTOH, if inactive_since of synced slots represents their own inactivity, then it will give correct info even when the SQL function is run after a long time and the slot-sync worker is disabled. > > IOW, we > > detect whether the remote slot was acquired and released since the > > last synchronization, by checking the remote slot's inactive_since. > > This idea seems to handle these cases I mentioned unless I'm missing > > something, but it requires for the slotsync worker to update > > inactive_since in a different way than other parameters. > > > > Or a simple solution is that the slotsync worker updates > > inactive_since as it does for non-synced slots, and disables > > timeout-based slot invalidation for synced slots. I like this idea better; it also takes care of the case where the user is relying on the sync function rather than the worker and does not want the slots to get invalidated between two sync function calls. > Yeah, I think the main question to help us decide is: do we want to invalidate > "inactive" synced slots locally (in addition to synchronizing the invalidation > from the primary)? thanks Shveta
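For readers following along, the manual path referred to above is simply the SQL function run on the standby; the sketch below assumes the usual slot-sync prerequisites (dbname in primary_conninfo, hot_standby_feedback, a physical slot on the primary) are already configured and that the build includes the committed inactive_since column:

-- one-off synchronization instead of the slot-sync worker
SELECT pg_sync_replication_slots();
-- then inspect the synced slots and their locally tracked inactivity
SELECT slot_name, synced, active, inactive_since FROM pg_replication_slots WHERE synced;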
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Wed, Apr 3, 2024 at 8:38 AM shveta malik <shveta.malik@gmail.com> wrote: > > > > Or a simple solution is that the slotsync worker updates > > > inactive_since as it does for non-synced slots, and disables > > > timeout-based slot invalidation for synced slots. > > I like this idea better, it takes care of such a case too when the > user is relying on sync-function rather than worker and does not want > to get the slots invalidated in between 2 sync function calls. Please find the attached v31 patches implementing the above idea: - synced slots get their own inactive_since just like any other slot - synced slots don't get invalidated due to inactive timeout because such slots are not considered active at all, as they don't perform logical decoding (of course, they will perform it in fast_forward mode to fix the other data loss issue, but they don't generate changes for them to be called *active* slots) - a synced slot's inactive_since is set to the current timestamp after the standby gets promoted, to help interpret inactive_since correctly just like for any other slot. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
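One rough way to observe the third bullet above on a test setup (a sketch only; it assumes a throwaway standby you can promote and a build with these patches applied):

-- on the standby, note the synced slots' inactive_since before promotion
SELECT slot_name, inactive_since FROM pg_replication_slots WHERE synced;
SELECT pg_promote();
-- shortly afterwards, inactive_since should have moved to the promotion time
SELECT slot_name, inactive_since FROM pg_replication_slots WHERE synced;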
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Wed, Apr 03, 2024 at 11:17:41AM +0530, Bharath Rupireddy wrote: > On Wed, Apr 3, 2024 at 8:38 AM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > Or a simple solution is that the slotsync worker updates > > > > inactive_since as it does for non-synced slots, and disables > > > > timeout-based slot invalidation for synced slots. > > > > I like this idea better, it takes care of such a case too when the > > user is relying on sync-function rather than worker and does not want > > to get the slots invalidated in between 2 sync function calls. > > Please find the attached v31 patches implementing the above idea: Thanks! Some comments related to v31-0001: === testing the behavior T1 === > - synced slots get their on inactive_since just like any other slot It behaves as described. T2 === > - synced slots inactive_since is set to current timestamp after the > standby gets promoted to help inactive_since interpret correctly just > like any other slot. It behaves as described. CR1 === + <structfield>inactive_since</structfield> value will get updated + after every synchronization indicates the last synchronization time? (I think that after every synchronization could lead to confusion). CR2 === + /* + * Set the time since the slot has become inactive after shutting + * down slot sync machinery. This helps correctly interpret the + * time if the standby gets promoted without a restart. + */ It looks to me that this comment is not at the right place because there is nothing after the comment that indicates that we shutdown the "slot sync machinery". Maybe a better place is before the function definition and mention that this is currently called when we shutdown the "slot sync machinery"? CR3 === + * We get the current time beforehand and only once to avoid + * system calls overhead while holding the lock. s/avoid system calls overhead while holding the lock/avoid system calls while holding the spinlock/? CR4 === + * Set the time since the slot has become inactive. We get the current + * time beforehand to avoid system call overhead while holding the lock Same. CR5 === + # Check that the captured time is sane + if (defined $reference_time) + { s/Check that the captured time is sane/Check that the inactive_since is sane/? Sorry if some of those comments could have been done while I did review v29-0001. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Wed, Apr 03, 2024 at 11:17:41AM +0530, Bharath Rupireddy wrote: > On Wed, Apr 3, 2024 at 8:38 AM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > Or a simple solution is that the slotsync worker updates > > > > inactive_since as it does for non-synced slots, and disables > > > > timeout-based slot invalidation for synced slots. > > > > I like this idea better, it takes care of such a case too when the > > user is relying on sync-function rather than worker and does not want > > to get the slots invalidated in between 2 sync function calls. > > Please find the attached v31 patches implementing the above idea: Thanks! Some comments regarding v31-0002: === testing the behavior T1 === > - synced slots don't get invalidated due to inactive timeout because > such slots not considered active at all as they don't perform logical > decoding (of course, they will perform in fast_forward mode to fix the > other data loss issue, but they don't generate changes for them to be > called as *active* slots) It behaves as described. OTOH non synced logical slots on the standby and physical slots on the standby are invalidated which is what is expected. T2 === In case the slot is invalidated on the primary, primary: postgres=# select slot_name, inactive_since, invalidation_reason from pg_replication_slots where slot_name = 's1'; slot_name | inactive_since | invalidation_reason -----------+-------------------------------+--------------------- s1 | 2024-04-03 06:56:28.075637+00 | inactive_timeout then on the standby we get: standby: postgres=# select slot_name, inactive_since, invalidation_reason from pg_replication_slots where slot_name = 's1'; slot_name | inactive_since | invalidation_reason -----------+------------------------------+--------------------- s1 | 2024-04-03 07:06:43.37486+00 | inactive_timeout shouldn't the slot be dropped/recreated instead of updating inactive_since? === code CR1 === + Invalidates replication slots that are inactive for longer the + specified amount of time s/for longer the/for longer that/? CR2 === + <literal>true</literal>) as such synced slots don't actually perform + logical decoding. We're switching in fast forward logical due to [1], so I'm not sure that's 100% accurate here. I'm not sure we need to specify a reason. CR3 === + errdetail("This slot has been invalidated because it was inactive for more than the time specified by replication_slot_inactive_timeoutparameter."))); I think we can remove "parameter" (see for example the error message in validate_remote_info()) and reduce it a bit, something like? "This slot has been invalidated because it was inactive for more than replication_slot_inactive_timeout"? CR4 === + appendStringInfoString(&err_detail, _("The slot has been inactive for more than the time specified by replication_slot_inactive_timeoutparameter.")); Same. CR5 === + /* + * This function isn't expected to be called for inactive timeout based + * invalidation. A separate function InvalidateInactiveReplicationSlot is + * to be used for that. Do you think it's worth to explain why? CR6 === + if (replication_slot_inactive_timeout == 0) + return false; + else if (slot->inactive_since > 0) "else" is not needed here. CR7 === + SpinLockAcquire(&slot->mutex); + + /* + * Check if the slot needs to be invalidated due to + * replication_slot_inactive_timeout GUC. We do this with the spinlock + * held to avoid race conditions -- for example the inactive_since + * could change, or the slot could be dropped. 
+ */ + now = GetCurrentTimestamp(); We should not call GetCurrentTimestamp() while holding a spinlock. CR8 === +# Testcase start: Invalidate streaming standby's slot as well as logical +# failover slot on primary due to inactive timeout GUC. Also, check the logical s/inactive timeout GUC/replication_slot_inactive_timeout/? CR9 === +# Start: Helper functions used for this test file +# End: Helper functions used for this test file I think that's the first TAP test with this comment. Not saying we should not but why did you feel the need to add those? [1]: https://www.postgresql.org/message-id/OS0PR01MB5716B3942AE49F3F725ACA92943B2@OS0PR01MB5716.jpnprd01.prod.outlook.com Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
shveta malik
Date:
On Wed, Apr 3, 2024 at 11:17 AM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Wed, Apr 3, 2024 at 8:38 AM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > Or a simple solution is that the slotsync worker updates > > > > inactive_since as it does for non-synced slots, and disables > > > > timeout-based slot invalidation for synced slots. > > > > I like this idea better, it takes care of such a case too when the > > user is relying on sync-function rather than worker and does not want > > to get the slots invalidated in between 2 sync function calls. > > Please find the attached v31 patches implementing the above idea: > Thanks for the patches, please find few comments: v31-001: 1) system-views.sgml: value will get updated after every synchronization from the corresponding remote slot on the primary. --This is confusing. It will be good to rephrase it. 2) update_synced_slots_inactive_since() --May be, we should mention in the header that this function is called only during promotion. 3) 040_standby_failover_slots_sync.pl: We capture inactive_since_on_primary when we do this for the first time at #175 ALTER SUBSCRIPTION regress_mysub1 DISABLE" But we again recreate the sub and disable it at line #280. Do you think we shall get inactive_since_on_primary again here, to be compared with inactive_since_on_new_primary later? v31-002: (I had reviewed v29-002 but missed to post comments, I think these are still applicable) 1) I think replication_slot_inactivity_timeout was recommended here (instead of replication_slot_inactive_timeout, so please give it a thought): https://www.postgresql.org/message-id/202403260739.udlp7lxixktx%40alvherre.pgsql 2) Commit msg: a) "It is often easy for developers to set a timeout of say 1 or 2 or 3 days at slot level, after which the inactive slots get dropped." Shall we say invalidated rather than dropped? b) "To achieve the above, postgres introduces a GUC allowing users set inactive timeout and then a slot stays inactive for this much amount of time it invalidates the slot." Broken sentence. <have not reviewed 002 patch in detail yet> thanks Shveta
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Wed, Apr 3, 2024 at 12:20 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > On Wed, Apr 03, 2024 at 11:17:41AM +0530, Bharath Rupireddy wrote: > > On Wed, Apr 3, 2024 at 8:38 AM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > > > Or a simple solution is that the slotsync worker updates > > > > > inactive_since as it does for non-synced slots, and disables > > > > > timeout-based slot invalidation for synced slots. > > > > > > I like this idea better, it takes care of such a case too when the > > > user is relying on sync-function rather than worker and does not want > > > to get the slots invalidated in between 2 sync function calls. > > > > Please find the attached v31 patches implementing the above idea: > > Thanks! > > Some comments related to v31-0001: > > === testing the behavior > > T1 === > > > - synced slots get their on inactive_since just like any other slot > > It behaves as described. > > T2 === > > > - synced slots inactive_since is set to current timestamp after the > > standby gets promoted to help inactive_since interpret correctly just > > like any other slot. > > It behaves as described. > > CR1 === > > + <structfield>inactive_since</structfield> value will get updated > + after every synchronization > > indicates the last synchronization time? (I think that after every synchronization > could lead to confusion). > +1. > CR2 === > > + /* > + * Set the time since the slot has become inactive after shutting > + * down slot sync machinery. This helps correctly interpret the > + * time if the standby gets promoted without a restart. > + */ > > It looks to me that this comment is not at the right place because there is > nothing after the comment that indicates that we shutdown the "slot sync machinery". > > Maybe a better place is before the function definition and mention that this is > currently called when we shutdown the "slot sync machinery"? > Won't it be better to have an assert for SlotSyncCtx->pid? IIRC, we have some existing issues where we don't ensure that no one is running sync API before shutdown is complete but I think we can deal with that separately and here we can still have an Assert. > CR3 === > > + * We get the current time beforehand and only once to avoid > + * system calls overhead while holding the lock. > > s/avoid system calls overhead while holding the lock/avoid system calls while holding the spinlock/? > Is it valid to say that there is overhead of this call while holding spinlock? Because I don't think at the time of promotion we expect any other concurrent slot activity. The first reason seems good enough. One other observation: --- a/src/backend/replication/slot.c +++ b/src/backend/replication/slot.c @@ -42,6 +42,7 @@ #include "access/transam.h" #include "access/xlog_internal.h" #include "access/xlogrecovery.h" +#include "access/xlogutils.h" Is there a reason for this inclusion? I don't see any change which should need this one. -- With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Wed, Apr 3, 2024 at 2:58 PM shveta malik <shveta.malik@gmail.com> wrote: > > On Wed, Apr 3, 2024 at 11:17 AM Bharath Rupireddy > <bharath.rupireddyforpostgres@gmail.com> wrote: > > > > On Wed, Apr 3, 2024 at 8:38 AM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > > > Or a simple solution is that the slotsync worker updates > > > > > inactive_since as it does for non-synced slots, and disables > > > > > timeout-based slot invalidation for synced slots. > > > > > > I like this idea better, it takes care of such a case too when the > > > user is relying on sync-function rather than worker and does not want > > > to get the slots invalidated in between 2 sync function calls. > > > > Please find the attached v31 patches implementing the above idea: > > > > Thanks for the patches, please find few comments: > > v31-001: > > 1) > system-views.sgml: > value will get updated after every synchronization from the > corresponding remote slot on the primary. > > --This is confusing. It will be good to rephrase it. > > 2) > update_synced_slots_inactive_since() > > --May be, we should mention in the header that this function is called > only during promotion. > > 3) 040_standby_failover_slots_sync.pl: > We capture inactive_since_on_primary when we do this for the first time at #175 > ALTER SUBSCRIPTION regress_mysub1 DISABLE" > > But we again recreate the sub and disable it at line #280. > Do you think we shall get inactive_since_on_primary again here, to be > compared with inactive_since_on_new_primary later? > I think so. Few additional comments on tests: 1. +is( $standby1->safe_psql( + 'postgres', + "SELECT '$inactive_since_on_primary'::timestamptz < '$inactive_since_on_standby'::timestamptz AND + '$inactive_since_on_standby'::timestamptz < '$slot_sync_time'::timestamptz;" Shall we do <= check as we are doing in the main function get_slot_inactive_since_value as the time duration is less so it can be the same as well? Similarly, please check other tests. 2. +=item $node->get_slot_inactive_since_value(self, slot_name, reference_time) + +Get inactive_since column value for a given replication slot validating it +against optional reference time. + +=cut + +sub get_slot_inactive_since_value I see that all callers validate against reference time. It is better to name it validate_slot_inactive_since rather than using get_* as the main purpose is to validate the passed value. -- With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Wed, Apr 3, 2024 at 12:20 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > > Please find the attached v31 patches implementing the above idea: > > Some comments related to v31-0001: > > === testing the behavior > > T1 === > > > - synced slots get their on inactive_since just like any other slot > > It behaves as described. > > T2 === > > > - synced slots inactive_since is set to current timestamp after the > > standby gets promoted to help inactive_since interpret correctly just > > like any other slot. > > It behaves as described. Thanks for testing. > CR1 === > > + <structfield>inactive_since</structfield> value will get updated > + after every synchronization > > indicates the last synchronization time? (I think that after every synchronization > could lead to confusion). Done. > CR2 === > > + /* > + * Set the time since the slot has become inactive after shutting > + * down slot sync machinery. This helps correctly interpret the > + * time if the standby gets promoted without a restart. > + */ > > It looks to me that this comment is not at the right place because there is > nothing after the comment that indicates that we shutdown the "slot sync machinery". > > Maybe a better place is before the function definition and mention that this is > currently called when we shutdown the "slot sync machinery"? Done. > CR3 === > > + * We get the current time beforehand and only once to avoid > + * system calls overhead while holding the lock. > > s/avoid system calls overhead while holding the lock/avoid system calls while holding the spinlock/? Done. > CR4 === > > + * Set the time since the slot has become inactive. We get the current > + * time beforehand to avoid system call overhead while holding the lock > > Same. Done. > CR5 === > > + # Check that the captured time is sane > + if (defined $reference_time) > + { > > s/Check that the captured time is sane/Check that the inactive_since is sane/? > > Sorry if some of those comments could have been done while I did review v29-0001. Done. On Wed, Apr 3, 2024 at 2:58 PM shveta malik <shveta.malik@gmail.com> wrote: > > Thanks for the patches, please find few comments: > > v31-001: > > 1) > system-views.sgml: > value will get updated after every synchronization from the > corresponding remote slot on the primary. > > --This is confusing. It will be good to rephrase it. Done as per Bertrand's suggestion. > 2) > update_synced_slots_inactive_since() > > --May be, we should mention in the header that this function is called > only during promotion. Done as per Bertrand's suggestion. > 3) 040_standby_failover_slots_sync.pl: > We capture inactive_since_on_primary when we do this for the first time at #175 > ALTER SUBSCRIPTION regress_mysub1 DISABLE" > > But we again recreate the sub and disable it at line #280. > Do you think we shall get inactive_since_on_primary again here, to be > compared with inactive_since_on_new_primary later? Hm. Done that. Recapturing both slot_creation_time_on_primary and inactive_since_on_primary before and after CREATE SUBSCRIPTION creates the slot again on the primary/publisher. On Wed, Apr 3, 2024 at 3:32 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > CR2 === > > > > + /* > > + * Set the time since the slot has become inactive after shutting > > + * down slot sync machinery. This helps correctly interpret the > > + * time if the standby gets promoted without a restart. 
> > + */ > > > > It looks to me that this comment is not at the right place because there is > > nothing after the comment that indicates that we shutdown the "slot sync machinery". > > > > Maybe a better place is before the function definition and mention that this is > > currently called when we shutdown the "slot sync machinery"? > > > Won't it be better to have an assert for SlotSyncCtx->pid? IIRC, we > have some existing issues where we don't ensure that no one is running > sync API before shutdown is complete but I think we can deal with that > separately and here we can still have an Assert. That can work to ensure the slot sync worker isn't running as SlotSyncCtx->pid gets updated only for the slot sync worker. I added this assertion for now. We need to ensure (in a separate patch and thread) there is no backend acquiring it and performing sync while the slot sync worker is shutting down. Otherwise, some of the slots can get resynced and some are not while we are shutting down the slot sync worker as part of the standby promotion which might leave the slots in an inconsistent state. > > CR3 === > > > > + * We get the current time beforehand and only once to avoid > > + * system calls overhead while holding the lock. > > > > s/avoid system calls overhead while holding the lock/avoid system calls while holding the spinlock/? > > > Is it valid to say that there is overhead of this call while holding > spinlock? Because I don't think at the time of promotion we expect any > other concurrent slot activity. The first reason seems good enough. No slot activity but why GetCurrentTimestamp needs to be called every time in a loop. > One other observation: > --- a/src/backend/replication/slot.c > +++ b/src/backend/replication/slot.c > @@ -42,6 +42,7 @@ > #include "access/transam.h" > #include "access/xlog_internal.h" > #include "access/xlogrecovery.h" > +#include "access/xlogutils.h" > > Is there a reason for this inclusion? I don't see any change which > should need this one. Not anymore. It was earlier needed for using the InRecovery flag in the then approach. On Wed, Apr 3, 2024 at 4:19 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > 3) 040_standby_failover_slots_sync.pl: > > We capture inactive_since_on_primary when we do this for the first time at #175 > > ALTER SUBSCRIPTION regress_mysub1 DISABLE" > > > > But we again recreate the sub and disable it at line #280. > > Do you think we shall get inactive_since_on_primary again here, to be > > compared with inactive_since_on_new_primary later? > > > > I think so. Modified this to recapture the times before and after the slot gets recreated. > Few additional comments on tests: > 1. > +is( $standby1->safe_psql( > + 'postgres', > + "SELECT '$inactive_since_on_primary'::timestamptz < > '$inactive_since_on_standby'::timestamptz AND > + '$inactive_since_on_standby'::timestamptz < '$slot_sync_time'::timestamptz;" > > Shall we do <= check as we are doing in the main function > get_slot_inactive_since_value as the time duration is less so it can > be the same as well? Similarly, please check other tests. I get you. If the tests are so fast that losing a bit of precision might cause tests to fail. So, I'v added equality check for all the tests. > 2. > +=item $node->get_slot_inactive_since_value(self, slot_name, reference_time) > + > +Get inactive_since column value for a given replication slot validating it > +against optional reference time. 
> + > +=cut > + > +sub get_slot_inactive_since_value > > I see that all callers validate against reference time. It is better > to name it validate_slot_inactive_since rather than using get_* as the > main purpose is to validate the passed value. Existing callers yes. Also, I've removed the reference time as an optional parameter. Per an offlist chat with Amit, I've added the following note in synchronize_one_slot: @@ -584,6 +585,11 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid) * overwriting 'invalidated' flag to remote_slot's value. See * InvalidatePossiblyObsoleteSlot() where it invalidates slot directly * if the slot is not acquired by other processes. + * + * XXX: If it ever turns out that slot acquire/release is costly for + * cases when none of the slot property is changed then we can do a + * pre-check to ensure that at least one of the slot property is + * changed before acquiring the slot. */ ReplicationSlotAcquire(remote_slot->name, true); Please find the attached v32-0001 patch with the above review comments addressed. I'm working on review comments for 0002. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Wed, Apr 03, 2024 at 05:12:12PM +0530, Bharath Rupireddy wrote: > On Wed, Apr 3, 2024 at 4:19 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > + 'postgres', > > + "SELECT '$inactive_since_on_primary'::timestamptz < > > '$inactive_since_on_standby'::timestamptz AND > > + '$inactive_since_on_standby'::timestamptz < '$slot_sync_time'::timestamptz;" > > > > Shall we do <= check as we are doing in the main function > > get_slot_inactive_since_value as the time duration is less so it can > > be the same as well? Similarly, please check other tests. > > I get you. If the tests are so fast that losing a bit of precision > might cause tests to fail. So, I'v added equality check for all the > tests. > Please find the attached v32-0001 patch with the above review comments > addressed. Thanks! Just one comment on v32-0001: +# Synced slot on the standby must get its own inactive_since. +is( $standby1->safe_psql( + 'postgres', + "SELECT '$inactive_since_on_primary'::timestamptz <= '$inactive_since_on_standby'::timestamptz AND + '$inactive_since_on_standby'::timestamptz <= '$slot_sync_time'::timestamptz;" + ), + "t", + 'synchronized slot has got its own inactive_since'); + By using <= we are not testing that it must get its own inactive_since (as we allow them to be equal in the test). I think we should just add some usleep() where appropriate and deny equality during the tests on inactive_since. Except for the above, v32-0001 LGTM. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Wed, Apr 3, 2024 at 6:46 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > Just one comment on v32-0001: > > +# Synced slot on the standby must get its own inactive_since. > +is( $standby1->safe_psql( > + 'postgres', > + "SELECT '$inactive_since_on_primary'::timestamptz <= '$inactive_since_on_standby'::timestamptz AND > + '$inactive_since_on_standby'::timestamptz <= '$slot_sync_time'::timestamptz;" > + ), > + "t", > + 'synchronized slot has got its own inactive_since'); > + > > By using <= we are not testing that it must get its own inactive_since (as we > allow them to be equal in the test). I think we should just add some usleep() > where appropriate and deny equality during the tests on inactive_since. Thanks. It looks like we can ignore the equality in all of the inactive_since comparisons. IIUC, all the TAP tests do run with primary and standbys on the single BF animals. And, it looks like assigning the inactive_since timestamps to perl variables is giving the microseconds precision level (./tmp_check/log/regress_log_040_standby_failover_slots_sync:inactive_since 2024-04-03 14:30:09.691648+00). FWIW, we already have some TAP and SQL tests relying on stats_reset timestamps without equality. So, I've left the equality for the inactive_since tests. > Except for the above, v32-0001 LGTM. Thanks. Please see the attached v33-0001 patch after removing equality on inactive_since TAP tests. On Wed, Apr 3, 2024 at 1:47 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > Some comments regarding v31-0002: > > === testing the behavior > > T1 === > > > - synced slots don't get invalidated due to inactive timeout because > > such slots not considered active at all as they don't perform logical > > decoding (of course, they will perform in fast_forward mode to fix the > > other data loss issue, but they don't generate changes for them to be > > called as *active* slots) > > It behaves as described. OTOH non synced logical slots on the standby and > physical slots on the standby are invalidated which is what is expected. Right. > T2 === > > In case the slot is invalidated on the primary, > > primary: > > postgres=# select slot_name, inactive_since, invalidation_reason from pg_replication_slots where slot_name = 's1'; > slot_name | inactive_since | invalidation_reason > -----------+-------------------------------+--------------------- > s1 | 2024-04-03 06:56:28.075637+00 | inactive_timeout > > then on the standby we get: > > standby: > > postgres=# select slot_name, inactive_since, invalidation_reason from pg_replication_slots where slot_name = 's1'; > slot_name | inactive_since | invalidation_reason > -----------+------------------------------+--------------------- > s1 | 2024-04-03 07:06:43.37486+00 | inactive_timeout > > shouldn't the slot be dropped/recreated instead of updating inactive_since? The sync slots that are invalidated on the primary aren't dropped and recreated on the standby. There's no point in doing so because invalidated slots on the primary can't be made useful. However, I found that the synced slot is acquired and released unnecessarily after the invalidation_reason is synced from the primary. I added a skip check in synchronize_one_slot to skip acquiring and releasing the slot if it's locally found inactive. With this, inactive_since won't get updated for invalidated sync slots on the standby as we don't acquire and release the slot. 
> === code > > CR1 === > > + Invalidates replication slots that are inactive for longer the > + specified amount of time > > s/for longer the/for longer that/? Fixed. > CR2 === > > + <literal>true</literal>) as such synced slots don't actually perform > + logical decoding. > > We're switching in fast forward logical due to [1], so I'm not sure that's 100% > accurate here. I'm not sure we need to specify a reason. Fixed. > CR3 === > > + errdetail("This slot has been invalidated because it was inactive for more than the time specified by replication_slot_inactive_timeoutparameter."))); > > I think we can remove "parameter" (see for example the error message in > validate_remote_info()) and reduce it a bit, something like? > > "This slot has been invalidated because it was inactive for more than replication_slot_inactive_timeout"? Done. > CR4 === > > + appendStringInfoString(&err_detail, _("The slot has been inactive for more than the time specified by replication_slot_inactive_timeoutparameter.")); > > Same. Done. Changed it to "The slot has been inactive for more than replication_slot_inactive_timeout." > CR5 === > > + /* > + * This function isn't expected to be called for inactive timeout based > + * invalidation. A separate function InvalidateInactiveReplicationSlot is > + * to be used for that. > > Do you think it's worth to explain why? Hm, I just wanted to point out the actual function here. I modified it to something like the following, if others feel we don't need that, I can remove it. /* * Use InvalidateInactiveReplicationSlot for inactive timeout based * invalidation. */ > CR6 === > > + if (replication_slot_inactive_timeout == 0) > + return false; > + else if (slot->inactive_since > 0) > > "else" is not needed here. Nothing wrong there, but removed. > CR7 === > > + SpinLockAcquire(&slot->mutex); > + > + /* > + * Check if the slot needs to be invalidated due to > + * replication_slot_inactive_timeout GUC. We do this with the spinlock > + * held to avoid race conditions -- for example the inactive_since > + * could change, or the slot could be dropped. > + */ > + now = GetCurrentTimestamp(); > > We should not call GetCurrentTimestamp() while holding a spinlock. I was thinking why to add up the wait time to acquire LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);. Now that I moved it up before the spinlock but after the LWLockAcquire. > CR8 === > > +# Testcase start: Invalidate streaming standby's slot as well as logical > +# failover slot on primary due to inactive timeout GUC. Also, check the logical > > s/inactive timeout GUC/replication_slot_inactive_timeout/? Done. > CR9 === > > +# Start: Helper functions used for this test file > +# End: Helper functions used for this test file > > I think that's the first TAP test with this comment. Not saying we should not but > why did you feel the need to add those? Hm. Removed. > [1]: https://www.postgresql.org/message-id/OS0PR01MB5716B3942AE49F3F725ACA92943B2@OS0PR01MB5716.jpnprd01.prod.outlook.com On Wed, Apr 3, 2024 at 2:58 PM shveta malik <shveta.malik@gmail.com> wrote: > > v31-002: > (I had reviewed v29-002 but missed to post comments, I think these > are still applicable) > > 1) I think replication_slot_inactivity_timeout was recommended here > (instead of replication_slot_inactive_timeout, so please give it a > thought): > https://www.postgresql.org/message-id/202403260739.udlp7lxixktx%40alvherre.pgsql Yeah. It's synonymous with inactive_since. 
If others have an opinion to have replication_slot_inactivity_timeout, I'm fine with it. > 2) Commit msg: > a) > "It is often easy for developers to set a timeout of say 1 > or 2 or 3 days at slot level, after which the inactive slots get > dropped." > > Shall we say invalidated rather than dropped? Right. Done that. > b) > "To achieve the above, postgres introduces a GUC allowing users > set inactive timeout and then a slot stays inactive for this much > amount of time it invalidates the slot." > > Broken sentence. Reworded it a bit. Please find the attached v33 patches. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
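To make the 0002 behavior concrete, a minimal usage sketch follows; the GUC name and whether it is reloadable are as proposed in the patch and still under discussion, so treat this as illustrative only:

-- proposed in 0002; name may still change (e.g. replication_slot_inactivity_timeout)
ALTER SYSTEM SET replication_slot_inactive_timeout = '1d';
SELECT pg_reload_conf();
-- later, see which slots the checkpointer invalidated and why
SELECT slot_name, inactive_since, invalidation_reason
FROM pg_replication_slots
WHERE invalidation_reason = 'inactive_timeout';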
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Wed, Apr 03, 2024 at 08:28:04PM +0530, Bharath Rupireddy wrote: > On Wed, Apr 3, 2024 at 6:46 PM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > > > Just one comment on v32-0001: > > > > +# Synced slot on the standby must get its own inactive_since. > > +is( $standby1->safe_psql( > > + 'postgres', > > + "SELECT '$inactive_since_on_primary'::timestamptz <= '$inactive_since_on_standby'::timestamptz AND > > + '$inactive_since_on_standby'::timestamptz <= '$slot_sync_time'::timestamptz;" > > + ), > > + "t", > > + 'synchronized slot has got its own inactive_since'); > > + > > > > By using <= we are not testing that it must get its own inactive_since (as we > > allow them to be equal in the test). I think we should just add some usleep() > > where appropriate and deny equality during the tests on inactive_since. > > > Except for the above, v32-0001 LGTM. > > Thanks. Please see the attached v33-0001 patch after removing equality > on inactive_since TAP tests. Thanks! v33-0001 LGTM. > On Wed, Apr 3, 2024 at 1:47 PM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > Some comments regarding v31-0002: > > > > T2 === > > > > In case the slot is invalidated on the primary, > > > > primary: > > > > postgres=# select slot_name, inactive_since, invalidation_reason from pg_replication_slots where slot_name = 's1'; > > slot_name | inactive_since | invalidation_reason > > -----------+-------------------------------+--------------------- > > s1 | 2024-04-03 06:56:28.075637+00 | inactive_timeout > > > > then on the standby we get: > > > > standby: > > > > postgres=# select slot_name, inactive_since, invalidation_reason from pg_replication_slots where slot_name = 's1'; > > slot_name | inactive_since | invalidation_reason > > -----------+------------------------------+--------------------- > > s1 | 2024-04-03 07:06:43.37486+00 | inactive_timeout > > > > shouldn't the slot be dropped/recreated instead of updating inactive_since? > > The sync slots that are invalidated on the primary aren't dropped and > recreated on the standby. Yeah, right (I was confused with synced slots that are invalidated locally). > However, I > found that the synced slot is acquired and released unnecessarily > after the invalidation_reason is synced from the primary. I added a > skip check in synchronize_one_slot to skip acquiring and releasing the > slot if it's locally found inactive. With this, inactive_since won't > get updated for invalidated sync slots on the standby as we don't > acquire and release the slot. CR1 === Yeah, I can see: @@ -575,6 +575,13 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid) " name slot \"%s\" already exists on the standby", remote_slot->name)); + /* + * Skip the sync if the local slot is already invalidated. We do this + * beforehand to save on slot acquire and release. + */ + if (slot->data.invalidated != RS_INVAL_NONE) + return false; Thanks to the drop_local_obsolete_slots() call I think we are not missing the case where the slot has been invalidated on the primary, invalidation reason has been synced on the standby and later the slot is dropped/ recreated manually on the primary (then it should be dropped/recreated on the standby too). Also it seems we are not missing the case where a sync slot is invalidated locally due to wal removal (it should be dropped/recreated). > > > CR5 === > > > > + /* > > + * This function isn't expected to be called for inactive timeout based > > + * invalidation. 
A separate function InvalidateInactiveReplicationSlot is > > + * to be used for that. > > > > Do you think it's worth to explain why? > > Hm, I just wanted to point out the actual function here. I modified it > to something like the following, if others feel we don't need that, I > can remove it. Sorry If I was not clear but I meant to say "Do you think it's worth to explain why we decided to create a dedicated function"? (currently we "just" explain that we created one). Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Masahiko Sawada
Date:
On Wed, Apr 3, 2024 at 11:58 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > > Please find the attached v33 patches. @@ -1368,6 +1416,7 @@ ShutDownSlotSync(void) if (SlotSyncCtx->pid == InvalidPid) { SpinLockRelease(&SlotSyncCtx->mutex); + update_synced_slots_inactive_since(); return; } SpinLockRelease(&SlotSyncCtx->mutex); @@ -1400,6 +1449,8 @@ ShutDownSlotSync(void) } SpinLockRelease(&SlotSyncCtx->mutex); + + update_synced_slots_inactive_since(); } Why do we want to update all synced slots' inactive_since values at shutdown in spite of updating the value every time when releasing the slot? It seems to contradict the fact that inactive_since is updated when releasing or restoring the slot. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Thu, Apr 4, 2024 at 9:42 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > @@ -1368,6 +1416,7 @@ ShutDownSlotSync(void) > if (SlotSyncCtx->pid == InvalidPid) > { > SpinLockRelease(&SlotSyncCtx->mutex); > + update_synced_slots_inactive_since(); > return; > } > SpinLockRelease(&SlotSyncCtx->mutex); > @@ -1400,6 +1449,8 @@ ShutDownSlotSync(void) > } > > SpinLockRelease(&SlotSyncCtx->mutex); > + > + update_synced_slots_inactive_since(); > } > > Why do we want to update all synced slots' inactive_since values at > shutdown in spite of updating the value every time when releasing the > slot? It seems to contradict the fact that inactive_since is updated > when releasing or restoring the slot. It is to get inactive_since right for the case where the standby is promoted without a restart, similar to when a standby is promoted with a restart, in which case inactive_since is set to the current time in RestoreSlotFromDisk. Imagine the slot was last synced at time t1 and then a few hours passed before the standby is promoted without a restart. If we don't set inactive_since to the current time in ShutDownSlotSync in this case, the inactive timeout invalidation mechanism can kick in immediately. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
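To see why a stale inactive_since matters here, this is essentially the value the timeout check compares against (illustrative query only): if inactive_since still carried the last sync time from hours before the promotion, inactive_for would already exceed a small timeout the moment the invalidation check runs.

SELECT slot_name, inactive_since, now() - inactive_since AS inactive_for
FROM pg_replication_slots
WHERE NOT active;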
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Wed, Apr 3, 2024 at 8:28 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Wed, Apr 3, 2024 at 6:46 PM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > > > Just one comment on v32-0001: > > > > +# Synced slot on the standby must get its own inactive_since. > > +is( $standby1->safe_psql( > > + 'postgres', > > + "SELECT '$inactive_since_on_primary'::timestamptz <= '$inactive_since_on_standby'::timestamptz AND > > + '$inactive_since_on_standby'::timestamptz <= '$slot_sync_time'::timestamptz;" > > + ), > > + "t", > > + 'synchronized slot has got its own inactive_since'); > > + > > > > By using <= we are not testing that it must get its own inactive_since (as we > > allow them to be equal in the test). I think we should just add some usleep() > > where appropriate and deny equality during the tests on inactive_since. > > Thanks. It looks like we can ignore the equality in all of the > inactive_since comparisons. IIUC, all the TAP tests do run with > primary and standbys on the single BF animals. And, it looks like > assigning the inactive_since timestamps to perl variables is giving > the microseconds precision level > (./tmp_check/log/regress_log_040_standby_failover_slots_sync:inactive_since > 2024-04-03 14:30:09.691648+00). FWIW, we already have some TAP and SQL > tests relying on stats_reset timestamps without equality. So, I've > left the equality for the inactive_since tests. > > > Except for the above, v32-0001 LGTM. > > Thanks. Please see the attached v33-0001 patch after removing equality > on inactive_since TAP tests. > The v33-0001 looks good to me. I have made minor changes in the comments/commit message and removed one part of the test which was a bit confusing and didn't seem to add much value. Let me know what you think of the attached? -- With Regards, Amit Kapila.
Attachment
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Thu, Apr 4, 2024 at 10:48 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > The v33-0001 looks good to me. I have made minor changes in the > comments/commit message and removed one part of the test which was a > bit confusing and didn't seem to add much value. Let me know what you > think of the attached? Thanks for the changes. v34-0001 LGTM. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Masahiko Sawada
Date:
On Thu, Apr 4, 2024 at 1:34 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Thu, Apr 4, 2024 at 9:42 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > @@ -1368,6 +1416,7 @@ ShutDownSlotSync(void) > > if (SlotSyncCtx->pid == InvalidPid) > > { > > SpinLockRelease(&SlotSyncCtx->mutex); > > + update_synced_slots_inactive_since(); > > return; > > } > > SpinLockRelease(&SlotSyncCtx->mutex); > > @@ -1400,6 +1449,8 @@ ShutDownSlotSync(void) > > } > > > > SpinLockRelease(&SlotSyncCtx->mutex); > > + > > + update_synced_slots_inactive_since(); > > } > > > > Why do we want to update all synced slots' inactive_since values at > > shutdown in spite of updating the value every time when releasing the > > slot? It seems to contradict the fact that inactive_since is updated > > when releasing or restoring the slot. > > It is to get the inactive_since right for the cases where the standby > is promoted without a restart similar to when a standby is promoted > with restart in which case the inactive_since is set to current time > in RestoreSlotFromDisk. > > Imagine the slot is synced last time at time t1 and then a few hours > passed, the standby is promoted without a restart. If we don't set > inactive_since to current time in this case in ShutDownSlotSync, the > inactive timeout invalidation mechanism can kick in immediately. > Thank you for the explanation! I understood the needs. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Thu, Apr 4, 2024 at 1:32 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > On Thu, Apr 4, 2024 at 1:34 PM Bharath Rupireddy > <bharath.rupireddyforpostgres@gmail.com> wrote: > > > > On Thu, Apr 4, 2024 at 9:42 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > > > @@ -1368,6 +1416,7 @@ ShutDownSlotSync(void) > > > if (SlotSyncCtx->pid == InvalidPid) > > > { > > > SpinLockRelease(&SlotSyncCtx->mutex); > > > + update_synced_slots_inactive_since(); > > > return; > > > } > > > SpinLockRelease(&SlotSyncCtx->mutex); > > > @@ -1400,6 +1449,8 @@ ShutDownSlotSync(void) > > > } > > > > > > SpinLockRelease(&SlotSyncCtx->mutex); > > > + > > > + update_synced_slots_inactive_since(); > > > } > > > > > > Why do we want to update all synced slots' inactive_since values at > > > shutdown in spite of updating the value every time when releasing the > > > slot? It seems to contradict the fact that inactive_since is updated > > > when releasing or restoring the slot. > > > > It is to get the inactive_since right for the cases where the standby > > is promoted without a restart similar to when a standby is promoted > > with restart in which case the inactive_since is set to current time > > in RestoreSlotFromDisk. > > > > Imagine the slot is synced last time at time t1 and then a few hours > > passed, the standby is promoted without a restart. If we don't set > > inactive_since to current time in this case in ShutDownSlotSync, the > > inactive timeout invalidation mechanism can kick in immediately. > > > > Thank you for the explanation! I understood the needs. > Do you want to review the v34_0001* further or shall I proceed with the commit of the same? -- With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Masahiko Sawada
Date:
On Thu, Apr 4, 2024 at 5:36 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Thu, Apr 4, 2024 at 1:32 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > On Thu, Apr 4, 2024 at 1:34 PM Bharath Rupireddy > > <bharath.rupireddyforpostgres@gmail.com> wrote: > > > > > > On Thu, Apr 4, 2024 at 9:42 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > > > > > @@ -1368,6 +1416,7 @@ ShutDownSlotSync(void) > > > > if (SlotSyncCtx->pid == InvalidPid) > > > > { > > > > SpinLockRelease(&SlotSyncCtx->mutex); > > > > + update_synced_slots_inactive_since(); > > > > return; > > > > } > > > > SpinLockRelease(&SlotSyncCtx->mutex); > > > > @@ -1400,6 +1449,8 @@ ShutDownSlotSync(void) > > > > } > > > > > > > > SpinLockRelease(&SlotSyncCtx->mutex); > > > > + > > > > + update_synced_slots_inactive_since(); > > > > } > > > > > > > > Why do we want to update all synced slots' inactive_since values at > > > > shutdown in spite of updating the value every time when releasing the > > > > slot? It seems to contradict the fact that inactive_since is updated > > > > when releasing or restoring the slot. > > > > > > It is to get the inactive_since right for the cases where the standby > > > is promoted without a restart similar to when a standby is promoted > > > with restart in which case the inactive_since is set to current time > > > in RestoreSlotFromDisk. > > > > > > Imagine the slot is synced last time at time t1 and then a few hours > > > passed, the standby is promoted without a restart. If we don't set > > > inactive_since to current time in this case in ShutDownSlotSync, the > > > inactive timeout invalidation mechanism can kick in immediately. > > > > > > > Thank you for the explanation! I understood the needs. > > > > Do you want to review the v34_0001* further or shall I proceed with > the commit of the same? Thanks for asking. The v34-0001 patch looks good to me. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Thu, Apr 4, 2024 at 11:12 AM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Thu, Apr 4, 2024 at 10:48 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > The v33-0001 looks good to me. I have made minor changes in the > > comments/commit message and removed one part of the test which was a > > bit confusing and didn't seem to add much value. Let me know what you > > think of the attached? > > Thanks for the changes. v34-0001 LGTM. > I was doing a final review before pushing 0001 and found that 'inactive_since' could be set twice during startup after promotion, once while restoring slots and then via ShutDownSlotSync(). The reason is that ShutDownSlotSync() will be invoked in normal startup on primary though it won't do anything apart from setting inactive_since if we have synced slots. I think you need to check 'StandbyMode' in update_synced_slots_inactive_since() and return if the same is not set. We can't use 'InRecovery' flag as that will be set even during crash recovery. Can you please test this once unless you don't agree with the above theory? -- With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Thu, Apr 4, 2024 at 4:35 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > Thanks for the changes. v34-0001 LGTM. > > I was doing a final review before pushing 0001 and found that > 'inactive_since' could be set twice during startup after promotion, > once while restoring slots and then via ShutDownSlotSync(). The reason > is that ShutDownSlotSync() will be invoked in normal startup on > primary though it won't do anything apart from setting inactive_since > if we have synced slots. I think you need to check 'StandbyMode' in > update_synced_slots_inactive_since() and return if the same is not > set. We can't use 'InRecovery' flag as that will be set even during > crash recovery. > > Can you please test this once unless you don't agree with the above theory? Nice catch. I've verified that update_synced_slots_inactive_since is called even for normal server startups/crash recovery. I've added a check to exit if the StandbyMode isn't set. Please find the attached v35 patch. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
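For anyone skimming, the v35 change amounts to an early return; a minimal sketch, assuming the helper sketched earlier and that StandbyMode is the existing recovery flag, is:

static void
update_synced_slots_inactive_since(void)
{
    /*
     * ShutDownSlotSync() also runs during a normal startup or crash
     * recovery on the primary, where there are no synced slots to stamp,
     * so bail out unless the server was actually in standby mode.
     * (Sketch of the v35 guard, not the literal patch text.)
     */
    if (!StandbyMode)
        return;

    /* ... stamp inactive_since on synced slots as sketched earlier ... */
}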
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
shveta malik
Date:
On Thu, Apr 4, 2024 at 5:53 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Thu, Apr 4, 2024 at 4:35 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > Thanks for the changes. v34-0001 LGTM. > > > > I was doing a final review before pushing 0001 and found that > > 'inactive_since' could be set twice during startup after promotion, > > once while restoring slots and then via ShutDownSlotSync(). The reason > > is that ShutDownSlotSync() will be invoked in normal startup on > > primary though it won't do anything apart from setting inactive_since > > if we have synced slots. I think you need to check 'StandbyMode' in > > update_synced_slots_inactive_since() and return if the same is not > > set. We can't use 'InRecovery' flag as that will be set even during > > crash recovery. > > > > Can you please test this once unless you don't agree with the above theory? > > Nice catch. I've verified that update_synced_slots_inactive_since is > called even for normal server startups/crash recovery. I've added a > check to exit if the StandbyMode isn't set. > > Please find the attached v35 patch. Thanks for the patch. Tested it , works well. Few cosmetic changes needed: in 040 test file: 1) # Capture the inactive_since of the slot from the primary. Note that the slot # will be inactive since the corresponding subscription is disabled.. 2 .. at the end. Replace with one. 2) # Synced slot on the standby must get its own inactive_since. . not needed in single line comment (to be consistent with neighbouring comments) 3) update_synced_slots_inactive_since(): if (!StandbyMode) return; It will be good to add comments here. thanks Shveta
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Wed, Apr 3, 2024 at 9:57 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > > > shouldn't the slot be dropped/recreated instead of updating inactive_since? > > > > The sync slots that are invalidated on the primary aren't dropped and > > recreated on the standby. > > Yeah, right (I was confused with synced slots that are invalidated locally). > > > However, I > > found that the synced slot is acquired and released unnecessarily > > after the invalidation_reason is synced from the primary. I added a > > skip check in synchronize_one_slot to skip acquiring and releasing the > > slot if it's locally found inactive. With this, inactive_since won't > > get updated for invalidated sync slots on the standby as we don't > > acquire and release the slot. > > CR1 === > > Yeah, I can see: > > @@ -575,6 +575,13 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid) > " name slot \"%s\" already exists on the standby", > remote_slot->name)); > > + /* > + * Skip the sync if the local slot is already invalidated. We do this > + * beforehand to save on slot acquire and release. > + */ > + if (slot->data.invalidated != RS_INVAL_NONE) > + return false; > > Thanks to the drop_local_obsolete_slots() call I think we are not missing the case > where the slot has been invalidated on the primary, invalidation reason has been > synced on the standby and later the slot is dropped/ recreated manually on the > primary (then it should be dropped/recreated on the standby too). > > Also it seems we are not missing the case where a sync slot is invalidated > locally due to wal removal (it should be dropped/recreated). Right. > > > CR5 === > > > > > > + /* > > > + * This function isn't expected to be called for inactive timeout based > > > + * invalidation. A separate function InvalidateInactiveReplicationSlot is > > > + * to be used for that. > > > > > > Do you think it's worth to explain why? > > > > Hm, I just wanted to point out the actual function here. I modified it > > to something like the following, if others feel we don't need that, I > > can remove it. > > Sorry If I was not clear but I meant to say "Do you think it's worth to explain > why we decided to create a dedicated function"? (currently we "just" explain that > we created one). We added a new function (InvalidateInactiveReplicationSlot) to invalidate slot based on inactive timeout because 1) we do the inactive timeout invalidation at slot level as opposed to InvalidateObsoleteReplicationSlots which does loop over all the slots, 2) InvalidatePossiblyObsoleteSlot does release the lock in some cases, has a lot of unneeded code for inactive timeout invalidation check, 3) we want some control over saving the slot to disk because we hook the inactive timeout invalidation into the loop that checkpoints the slot info to the disk in CheckPointReplicationSlots. I've added a comment atop InvalidateInactiveReplicationSlot. Please find the attached v36 patch. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
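As an outline of point 3 above (riding along with the checkpoint-time save loop), the shape being described is roughly the following. InvalidateInactiveReplicationSlot() is the name used in this discussion; the rest is paraphrased from CheckPointReplicationSlots() and is a sketch, not the patch itself.

/* Inside CheckPointReplicationSlots() -- outline only */
LWLockAcquire(ReplicationSlotAllocationLock, LW_SHARED);
for (int i = 0; i < max_replication_slots; i++)
{
    ReplicationSlot *s = &ReplicationSlotCtl->replication_slots[i];
    char        path[MAXPGPATH];

    if (!s->in_use)
        continue;

    /*
     * Run the inactive-timeout check first so that, if the slot gets
     * invalidated, the change is persisted by the very save the
     * checkpoint performs anyway (no second write of the slot file).
     */
    InvalidateInactiveReplicationSlot(s);

    sprintf(path, "pg_replslot/%s", NameStr(s->data.name));
    SaveSlotToPath(s, path, LOG);
}
LWLockRelease(ReplicationSlotAllocationLock);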
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Fri, Apr 05, 2024 at 11:21:43AM +0530, Bharath Rupireddy wrote: > On Wed, Apr 3, 2024 at 9:57 PM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > Please find the attached v36 patch. Thanks! A few comments: 1 === + <para> + The timeout is measured from the time since the slot has become + inactive (known from its + <structfield>inactive_since</structfield> value) until it gets + used (i.e., its <structfield>active</structfield> is set to true). + </para> That's right except when it's invalidated during the checkpoint (as the slot is not acquired in CheckPointReplicationSlots()). So, what about adding: "or a checkpoint occurs"? That would also explain that the invalidation could occur during checkpoint. 2 === + /* If the slot has been invalidated, recalculate the resource limits */ + if (invalidated) + { /If the slot/If a slot/? 3 === + * NB - this function also runs as part of checkpoint, so avoid raising errors s/NB - this/NB: This function/? (that looks more consistent with other comments in the code) 4 === + * Note that having a new function for RS_INVAL_INACTIVE_TIMEOUT cause instead I understand it's "the RS_INVAL_INACTIVE_TIMEOUT cause" but reading "cause instead" looks weird to me. Maybe it would make sense to reword this a bit. 5 === + * considered not active as they don't actually perform logical decoding. Not sure that's 100% accurate as we switched in fast forward logical in 2ec005b4e2. "as they perform only fast forward logical decoding (or not at all)", maybe? 6 === + if (RecoveryInProgress() && slot->data.synced) + return false; + + if (replication_slot_inactive_timeout == 0) + return false; What about just using one if? It's more a matter of taste but it also probably reduces the object file size a bit for non optimized build. 7 === + /* + * Do not invalidate the slots which are currently being synced from + * the primary to the standby. + */ + if (RecoveryInProgress() && slot->data.synced) + return false; I think we don't need this check as the exact same one is done just before. 8 === +sub check_for_slot_invalidation_in_server_log +{ + my ($node, $slot_name, $offset) = @_; + my $invalidated = 0; + + for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++) + { + $node->safe_psql('postgres', "CHECKPOINT"); Wouldn't be better to wait for the replication_slot_inactive_timeout time before instead of triggering all those checkpoints? (it could be passed as an extra arg to wait_for_slot_invalidation()). 9 === # Synced slot mustn't get invalidated on the standby, it must sync invalidation # from the primary. So, we must not see the slot's invalidation message in server # log. ok( !$standby1->log_contains( "invalidating obsolete replication slot \"lsub1_sync_slot\"", $standby1_logstart), 'check that syned slot has not been invalidated on the standby'); Would that make sense to trigger a checkpoint on the standby before this test? I mean I think that without a checkpoint on the standby we should not see the invalidation in the log anyway. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Fri, Apr 5, 2024 at 1:14 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > > Please find the attached v36 patch. > > A few comments: > > 1 === > > + <para> > + The timeout is measured from the time since the slot has become > + inactive (known from its > + <structfield>inactive_since</structfield> value) until it gets > + used (i.e., its <structfield>active</structfield> is set to true). > + </para> > > That's right except when it's invalidated during the checkpoint (as the slot > is not acquired in CheckPointReplicationSlots()). > > So, what about adding: "or a checkpoint occurs"? That would also explain that > the invalidation could occur during checkpoint. Reworded. > 2 === > > + /* If the slot has been invalidated, recalculate the resource limits */ > + if (invalidated) > + { > > /If the slot/If a slot/? Modified it to be like elsewhere. > 3 === > > + * NB - this function also runs as part of checkpoint, so avoid raising errors > > s/NB - this/NB: This function/? (that looks more consistent with other comments > in the code) Done. > 4 === > > + * Note that having a new function for RS_INVAL_INACTIVE_TIMEOUT cause instead > > I understand it's "the RS_INVAL_INACTIVE_TIMEOUT cause" but reading "cause instead" > looks weird to me. Maybe it would make sense to reword this a bit. Reworded. > 5 === > > + * considered not active as they don't actually perform logical decoding. > > Not sure that's 100% accurate as we switched in fast forward logical > in 2ec005b4e2. > > "as they perform only fast forward logical decoding (or not at all)", maybe? Changed it to "as they don't perform logical decoding to produce the changes". In fast_forward mode no changes are produced. > 6 === > > + if (RecoveryInProgress() && slot->data.synced) > + return false; > + > + if (replication_slot_inactive_timeout == 0) > + return false; > > What about just using one if? It's more a matter of taste but it also probably > reduces the object file size a bit for non optimized build. Changed. > 7 === > > + /* > + * Do not invalidate the slots which are currently being synced from > + * the primary to the standby. > + */ > + if (RecoveryInProgress() && slot->data.synced) > + return false; > > I think we don't need this check as the exact same one is done just before. Right. Removed. > 8 === > > +sub check_for_slot_invalidation_in_server_log > +{ > + my ($node, $slot_name, $offset) = @_; > + my $invalidated = 0; > + > + for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++) > + { > + $node->safe_psql('postgres', "CHECKPOINT"); > > Wouldn't be better to wait for the replication_slot_inactive_timeout time before > instead of triggering all those checkpoints? (it could be passed as an extra arg > to wait_for_slot_invalidation()). Done. > 9 === > > # Synced slot mustn't get invalidated on the standby, it must sync invalidation > # from the primary. So, we must not see the slot's invalidation message in server > # log. > ok( !$standby1->log_contains( > "invalidating obsolete replication slot \"lsub1_sync_slot\"", > $standby1_logstart), > 'check that syned slot has not been invalidated on the standby'); > > Would that make sense to trigger a checkpoint on the standby before this test? > I mean I think that without a checkpoint on the standby we should not see the > invalidation in the log anyway. Done. Please find the attached v37 patch for further review. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Sat, Apr 6, 2024 at 11:55 AM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > Why is the handling w.r.t active_pid in InvalidatePossiblyInactiveSlot() not similar to that in InvalidatePossiblyObsoleteSlot()? Won't we need to ensure that there is no other active slot user? Is it sufficient to check inactive_since for the same? If so, we need some comments to explain the same. Can we avoid introducing the new functions like SaveGivenReplicationSlot() and MarkGivenReplicationSlotDirty(), if we do the required work in the caller? -- With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Sat, Apr 6, 2024 at 12:18 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > Why the handling w.r.t active_pid in InvalidatePossiblyInactiveSlot() > is not similar to InvalidatePossiblyObsoleteSlot(). Won't we need to > ensure that there is no other active slot user? Is it sufficient to > check inactive_since for the same? If so, we need some comments to > explain the same. I removed the separate functions and with minimal changes, I've now placed the RS_INVAL_INACTIVE_TIMEOUT logic into InvalidatePossiblyObsoleteSlot and use that even in CheckPointReplicationSlots. > Can we avoid introducing the new functions like > SaveGivenReplicationSlot() and MarkGivenReplicationSlotDirty(), if we > do the required work in the caller? Hm. Removed them now. Please see the attached v38 patch. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
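For context, the RS_INVAL_INACTIVE_TIMEOUT cause ultimately reduces to a timestamp comparison against the GUC. A minimal sketch of that check, assuming the GUC is in seconds and the in-memory field is inactive_since as discussed upthread (the helper name itself is hypothetical), could be:

static bool
SlotInactiveTimeoutExceeded(ReplicationSlot *s)
{
    if (replication_slot_inactive_timeout == 0)
        return false;           /* feature disabled */

    /* Synced slots on a standby get invalidation via sync, not locally. */
    if (RecoveryInProgress() && s->data.synced)
        return false;

    if (s->inactive_since == 0)
        return false;           /* slot is currently in use */

    return TimestampDifferenceExceeds(s->inactive_since,
                                      GetCurrentTimestamp(),
                                      replication_slot_inactive_timeout * 1000);
}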
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Sat, Apr 6, 2024 at 5:10 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > Please see the attached v38 patch. Hi, thanks everyone for reviewing the design and patches so far. Here I'm with the v39 patches implementing inactive timeout based (0001) and XID age based (0002) invalidation mechanisms. I'm quoting the hackers who are okay with inactive timeout based invalidation mechanism: Bertrand Drouvot - https://www.postgresql.org/message-id/ZgL0N%2BxVJNkyqsKL%40ip-10-97-1-34.eu-west-3.compute.internal and https://www.postgresql.org/message-id/ZgPHDAlM79iLtGIH%40ip-10-97-1-34.eu-west-3.compute.internal Amit Kapila - https://www.postgresql.org/message-id/CAA4eK1L3awyzWMuymLJUm8SoFEQe%3DDa9KUwCcAfC31RNJ1xdJA%40mail.gmail.com Nathan Bossart - https://www.postgresql.org/message-id/20240325195443.GA2923888%40nathanxps13 Robert Haas - https://www.postgresql.org/message-id/CA%2BTgmoZTbaaEjSZUG1FL0mzxAdN3qmXksO3O9_PZhEuXTkVnRQ%40mail.gmail.com I'm quoting the hackers who are okay with XID age based invalidation mechanism: Nathan Bossart - https://www.postgresql.org/message-id/20240326150918.GB3181099%40nathanxps13 and https://www.postgresql.org/message-id/20240327150557.GA3994937%40nathanxps13 Alvaro Herrera - https://www.postgresql.org/message-id/202403261539.xcjfle7sksz7%40alvherre.pgsql Bertrand Drouvot - https://www.postgresql.org/message-id/ZgPHDAlM79iLtGIH%40ip-10-97-1-34.eu-west-3.compute.internal Amit Kapila - https://www.postgresql.org/message-id/CAA4eK1L3awyzWMuymLJUm8SoFEQe%3DDa9KUwCcAfC31RNJ1xdJA%40mail.gmail.com There was a point raised by Robert https://www.postgresql.org/message-id/CA%2BTgmoaRECcnyqxAxUhP5dk2S4HX%3DpGh-p-PkA3uc%2BjG_9hiMw%40mail.gmail.com for XID age based invalidation. An issue related to vacuum_defer_cleanup_age https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=be504a3e974d75be6f95c8f9b7367126034f2d12 led to the removal of the GUC https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=1118cd37eb61e6a2428f457a8b2026a7bb3f801a. The same issue may not happen for the XID age based invaliation. This is because the XID age is not calculated using FullTransactionId but using TransactionId as the slot's xmin and catalog_xmin are tracked as TransactionId. There was a point raised by Amit https://www.postgresql.org/message-id/CAA4eK1K8wqLsMw6j0hE_SFoWAeo3Kw8UNnMfhsWaYDF1GWYQ%2Bg%40mail.gmail.com on when to do the XID age based invalidation - whether in checkpointer or when vacuum is being run or whenever ComputeXIDHorizons gets called or in autovacuum process. For now, I've chosen the design to do these new invalidation checks in two places - 1) whenever the slot is acquired and the slot acquisition errors out if invalidated, 2) during checkpoint. However, I'm open to suggestions on this. I've also verified the case whether the replication_slot_xid_age setting can help in case of server inching towards the XID wraparound. I've created a primary and streaming standby setup with hot_standby_feedback set to on (so that the slot gets an xmin). Then, I've set replication_slot_xid_age to 2 billion on the primary, and used xid_wraparound extension to reach XID wraparound on the primary. Once I start receiving the WARNINGs about VACUUM, I did a checkpoint after which the slot got invalidated enabling my VACUUM to freeze XIDs saving my database from XID wraparound problem. Thanks a lot Masahiko Sawada for an offlist chat about the XID age calculation logic. 
-- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
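Since the XID age computation comes up repeatedly (plain TransactionId rather than FullTransactionId), a small illustrative helper may make the comparison concrete. The GUC name follows the thread; the function itself is hypothetical, not the posted patch:

static bool
SlotXidAgeExceeded(TransactionId slot_xid, int max_slot_xid_age)
{
    TransactionId next_xid;
    int64       age;

    if (max_slot_xid_age == 0 || !TransactionIdIsNormal(slot_xid))
        return false;

    /* Unsigned 32-bit subtraction wraps, giving a wraparound-aware age. */
    next_xid = ReadNextTransactionId();
    age = (int64) (uint32) (next_xid - slot_xid);

    return age > max_slot_xid_age;
}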
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Masahiko Sawada
Date:
Hi, On Thu, Apr 4, 2024 at 9:23 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Thu, Apr 4, 2024 at 4:35 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > Thanks for the changes. v34-0001 LGTM. > > > > I was doing a final review before pushing 0001 and found that > > 'inactive_since' could be set twice during startup after promotion, > > once while restoring slots and then via ShutDownSlotSync(). The reason > > is that ShutDownSlotSync() will be invoked in normal startup on > > primary though it won't do anything apart from setting inactive_since > > if we have synced slots. I think you need to check 'StandbyMode' in > > update_synced_slots_inactive_since() and return if the same is not > > set. We can't use 'InRecovery' flag as that will be set even during > > crash recovery. > > > > Can you please test this once unless you don't agree with the above theory? > > Nice catch. I've verified that update_synced_slots_inactive_since is > called even for normal server startups/crash recovery. I've added a > check to exit if the StandbyMode isn't set. > > Please find the attached v35 patch. > The documentation says about both 'active' and 'inactive_since' columns of pg_replication_slots say: --- active bool True if this slot is currently actively being used inactive_since timestamptz The time since the slot has become inactive. NULL if the slot is currently being used. Note that for slots on the standby that are being synced from a primary server (whose synced field is true), the inactive_since indicates the last synchronization (see Section 47.2.3) time. --- When reading the description I thought if 'active' is true, 'inactive_since' is NULL, but it doesn't seem to apply for temporary slots. Since we don't reset the active_pid field of temporary slots when the release, the 'active' is still true in the view but 'inactive_since' is not NULL. Do you think we need to mention it in the documentation? As for the timeout-based slot invalidation feature, we could end up invalidating the temporary slots even if they are shown as active, which could confuse users. Do we want to somehow deal with it? Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Mon, Apr 22, 2024 at 7:21 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > Please find the attached v35 patch. > > The documentation says about both 'active' and 'inactive_since' > columns of pg_replication_slots say: > > --- > active bool > True if this slot is currently actively being used > > inactive_since timestamptz > The time since the slot has become inactive. NULL if the slot is > currently being used. Note that for slots on the standby that are > being synced from a primary server (whose synced field is true), the > inactive_since indicates the last synchronization (see Section 47.2.3) > time. > --- > > When reading the description I thought if 'active' is true, > 'inactive_since' is NULL, but it doesn't seem to apply for temporary > slots. Right. > Since we don't reset the active_pid field of temporary slots > when the release, the 'active' is still true in the view but > 'inactive_since' is not NULL. Right. inactive_since is reset whenever the temporary slot is acquired again within the same backend that created the temporary slot. > Do you think we need to mention it in > the documentation? I think that's the reason we dropped "active" from the statement. It was earlier "NULL if the slot is currently actively being used.". But, per Bertrand's comment https://www.postgresql.org/message-id/ZehE2IJcsetSJMHC%40ip-10-97-1-34.eu-west-3.compute.internal changed it to ""NULL if the slot is currently being used.". Temporary slots retain the active = true and active_pid = <pid of the backend that created it> even when the slot is not being used until the lifetime of the backend process. We haven't tied active or active_pid flags to inactive_since, doing so now to represent the temporary slot behaviour for active and active_pid will confuse users more. As far as the inactive_since of a slot is concerned, it is set to 0 when the slot is being used (acquired) and set to current timestamp when the slot is not being used (released). > As for the timeout-based slot invalidation feature, we could end up > invalidating the temporary slots even if they are shown as active, > which could confuse users. Do we want to somehow deal with it? Yes. As long as the temporary slot is lying unused holding up resources for more than the specified replication_slot_inactive_timeout, it is bound to get invalidated. This keeps behaviour consistent and less-confusing to the users. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Thu, Apr 25, 2024 at 11:11 AM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Mon, Apr 22, 2024 at 7:21 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > > Please find the attached v35 patch. > > > > The documentation says about both 'active' and 'inactive_since' > > columns of pg_replication_slots say: > > > > --- > > active bool > > True if this slot is currently actively being used > > > > inactive_since timestamptz > > The time since the slot has become inactive. NULL if the slot is > > currently being used. Note that for slots on the standby that are > > being synced from a primary server (whose synced field is true), the > > inactive_since indicates the last synchronization (see Section 47.2.3) > > time. > > --- > > > > When reading the description I thought if 'active' is true, > > 'inactive_since' is NULL, but it doesn't seem to apply for temporary > > slots. > > Right. > > > Since we don't reset the active_pid field of temporary slots > > when the release, the 'active' is still true in the view but > > 'inactive_since' is not NULL. > > Right. inactive_since is reset whenever the temporary slot is acquired > again within the same backend that created the temporary slot. > > > Do you think we need to mention it in > > the documentation? > > I think that's the reason we dropped "active" from the statement. It > was earlier "NULL if the slot is currently actively being used.". But, > per Bertrand's comment > https://www.postgresql.org/message-id/ZehE2IJcsetSJMHC%40ip-10-97-1-34.eu-west-3.compute.internal > changed it to ""NULL if the slot is currently being used.". > > Temporary slots retain the active = true and active_pid = <pid of the > backend that created it> even when the slot is not being used until > the lifetime of the backend process. We haven't tied active or > active_pid flags to inactive_since, doing so now to represent the > temporary slot behaviour for active and active_pid will confuse users > more. > This is true and it's probably easy for us to understand as we developed this feature but the same may not be true for others. I wonder if we can be explicit about the difference of active/inactive_since by adding something like the following for inactive_since: Note that this field is not related to the active flag as temporary slots can remain active till the session ends even when they are not being used. Sawada-San, do you have any suggestions on the wording? > As far as the inactive_since of a slot is concerned, it is set > to 0 when the slot is being used (acquired) and set to current > timestamp when the slot is not being used (released). > > > As for the timeout-based slot invalidation feature, we could end up > > invalidating the temporary slots even if they are shown as active, > > which could confuse users. Do we want to somehow deal with it? > > Yes. As long as the temporary slot is lying unused holding up > resources for more than the specified > replication_slot_inactive_timeout, it is bound to get invalidated. > This keeps behaviour consistent and less-confusing to the users. > Agreed. We may want to add something in the docs for this to avoid confusion with the active flag. -- With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
Hi, On Sat, Apr 13, 2024 at 9:36 AM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > There was a point raised by Amit > https://www.postgresql.org/message-id/CAA4eK1K8wqLsMw6j0hE_SFoWAeo3Kw8UNnMfhsWaYDF1GWYQ%2Bg%40mail.gmail.com > on when to do the XID age based invalidation - whether in checkpointer > or when vacuum is being run or whenever ComputeXIDHorizons gets called > or in autovacuum process. For now, I've chosen the design to do these > new invalidation checks in two places - 1) whenever the slot is > acquired and the slot acquisition errors out if invalidated, 2) during > checkpoint. However, I'm open to suggestions on this. Here are my thoughts on when to do the XID age invalidation. In all the patches sent so far, the XID age invalidation happens in two places - one during the slot acquisition, and another during the checkpoint. As the suggestion is to do it during the vacuum (manual and auto), so that even if the checkpoint isn't happening in the database for whatever reasons, a vacuum command or autovacuum can invalidate the slots whose XID is aged. An idea is to check for XID age based invalidation for all the slots in ComputeXidHorizons() before it reads replication_slot_xmin and replication_slot_catalog_xmin, and obviously before the proc array lock is acquired. A potential problem with this approach is that the invalidation check can become too aggressive as XID horizons are computed from many places. Another idea is to check for XID age based invalidation for all the slots in higher levels than ComputeXidHorizons(), for example in vacuum() which is an entry point for both vacuum command and autovacuum. This approach seems similar to vacuum_failsafe_age GUC which checks each relation for the failsafe age before vacuum gets triggered on it. Does anyone see any issues or risks with the above two approaches or have any other ideas? Thoughts? I attached v40 patches here. I reworded some of the ERROR messages, and did some code clean-up. Note that I haven't implemented any of the above approaches yet. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Nathan Bossart
Date:
On Mon, Jun 17, 2024 at 05:55:04PM +0530, Bharath Rupireddy wrote: > Here are my thoughts on when to do the XID age invalidation. In all > the patches sent so far, the XID age invalidation happens in two > places - one during the slot acquisition, and another during the > checkpoint. As the suggestion is to do it during the vacuum (manual > and auto), so that even if the checkpoint isn't happening in the > database for whatever reasons, a vacuum command or autovacuum can > invalidate the slots whose XID is aged. +1. IMHO this is a principled choice. The similar max_slot_wal_keep_size parameter is considered where it arguably matters most: when we are trying to remove/recycle WAL segments. Since this parameter is intended to prevent the server from running out of space, it makes sense that we'd apply it at the point where we are trying to free up space. The proposed max_slot_xid_age parameter is intended to prevent the server from running out of transaction IDs, so it follows that we'd apply it at the point where we reclaim them, which happens to be vacuum. > An idea is to check for XID age based invalidation for all the slots > in ComputeXidHorizons() before it reads replication_slot_xmin and > replication_slot_catalog_xmin, and obviously before the proc array > lock is acquired. A potential problem with this approach is that the > invalidation check can become too aggressive as XID horizons are > computed from many places. > > Another idea is to check for XID age based invalidation for all the > slots in higher levels than ComputeXidHorizons(), for example in > vacuum() which is an entry point for both vacuum command and > autovacuum. This approach seems similar to vacuum_failsafe_age GUC > which checks each relation for the failsafe age before vacuum gets > triggered on it. I don't presently have any strong opinion on where this logic should go, but in general, I think we should only invalidate slots if invalidating them would allow us to advance the vacuum cutoff. If the cutoff is held back by something else, I don't see a point in invalidating slots because we'll just be breaking replication in return for no additional reclaimed transaction IDs. -- nathan
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
Hi, On Mon, Jun 17, 2024 at 5:55 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > Here are my thoughts on when to do the XID age invalidation. In all > the patches sent so far, the XID age invalidation happens in two > places - one during the slot acquisition, and another during the > checkpoint. As the suggestion is to do it during the vacuum (manual > and auto), so that even if the checkpoint isn't happening in the > database for whatever reasons, a vacuum command or autovacuum can > invalidate the slots whose XID is aged. > > An idea is to check for XID age based invalidation for all the slots > in ComputeXidHorizons() before it reads replication_slot_xmin and > replication_slot_catalog_xmin, and obviously before the proc array > lock is acquired. A potential problem with this approach is that the > invalidation check can become too aggressive as XID horizons are > computed from many places. > > Another idea is to check for XID age based invalidation for all the > slots in higher levels than ComputeXidHorizons(), for example in > vacuum() which is an entry point for both vacuum command and > autovacuum. This approach seems similar to vacuum_failsafe_age GUC > which checks each relation for the failsafe age before vacuum gets > triggered on it. I am attaching the patches implementing the idea of invalidating replication slots during vacuum when current slot xmin limits (procArray->replication_slot_xmin and procArray->replication_slot_catalog_xmin) are aged as per the new XID age GUC. When either of these limits are aged, there must be at least one replication slot that is aged, because the xmin limits, after all, are the minimum of xmin or catalog_xmin of all replication slots. In this approach, the new XID age GUC will help vacuum when needed, because the current slot xmin limits are recalculated after invalidating replication slots that are holding xmins for longer than the age. The code is placed in vacuum() which is common for both vacuum command and autovacuum, and gets executed only once every vacuum cycle to not be too aggressive in invalidating. However, there might be some concerns with this approach like the following: 1) Adding more code to vacuum might not be acceptable 2) What if invalidation of replication slots emits an error, will it block vacuum forever? Currently, InvalidateObsoleteReplicationSlots() is also called as part of the checkpoint, and emitting ERRORs from within is avoided already. Therefore, there is no concern here for now. 3) What if there are more replication slots to be invalidated, will it delay the vacuum? If yes, by how much? <<TODO>> 4) Will the invalidation based on just current replication slot xmin limits suffice irrespective of vacuum cutoffs? IOW, if the replication slots are invalidated but vacuum isn't going to do any work because vacuum cutoffs are not yet met? Is the invalidation work wasteful here? 5) Is it okay to take just one more time the proc array lock to get current replication slot xmin limits via ProcArrayGetReplicationSlotXmin() once every vacuum cycle? <<TODO>> 6) Vacuum command can't be run on the standby in recovery. So, to help invalidate replication slots on the standby, I have for now let the checkpointer also do the XID age based invalidation. I know invalidating both in checkpointer and vacuum may not be a great idea, but I'm open to thoughts. 
Following are some of the alternative approaches which IMHO don't help vacuum when needed: a) Let the checkpointer do the XID age based invalidation, and call it out in the documentation that if the checkpoint doesn't happen, the new GUC doesn't help even if the vacuum is run. This has been the approach until v40 patch. b) Checkpointer and/or other backends add an autovacuum work item via AutoVacuumRequestWork(), and autovacuum when it gets to it will invalidate the replication slots. But, what to do for the vacuum command here? Please find the attached v41 patches implementing the idea of vacuum doing the invalidation. Thoughts? Thanks to Sawada-san for a detailed off-list discussion. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
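A condensed sketch of the vacuum()-time check being proposed is below. RS_INVAL_XID_AGE and max_slot_xid_age are the names used in this thread, the age helper is the hypothetical one sketched earlier, and where exactly this would sit inside vacuum() is deliberately left open:

static void
InvalidateAgedSlotsBeforeVacuum(void)
{
    TransactionId slot_xmin;
    TransactionId slot_catalog_xmin;

    /* Current xmin horizons held back by replication slots. */
    ProcArrayGetReplicationSlotXmin(&slot_xmin, &slot_catalog_xmin);

    if (!SlotXidAgeExceeded(slot_xmin, max_slot_xid_age) &&
        !SlotXidAgeExceeded(slot_catalog_xmin, max_slot_xid_age))
        return;                 /* no slot is holding back the horizon */

    if (InvalidateObsoleteReplicationSlots(RS_INVAL_XID_AGE, 0,
                                           InvalidOid, InvalidTransactionId))
    {
        /* Horizons may have advanced; recompute before vacuum proceeds. */
        ReplicationSlotsComputeRequiredXmin(false);
        ReplicationSlotsComputeRequiredLSN();
    }
}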
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Nathan Bossart
Date:
On Mon, Jun 24, 2024 at 11:30:00AM +0530, Bharath Rupireddy wrote: > 6) Vacuum command can't be run on the standby in recovery. So, to help > invalidate replication slots on the standby, I have for now let the > checkpointer also do the XID age based invalidation. I know > invalidating both in checkpointer and vacuum may not be a great idea, > but I'm open to thoughts. Hm. I hadn't considered this angle. > a) Let the checkpointer do the XID age based invalidation, and call it > out in the documentation that if the checkpoint doesn't happen, the > new GUC doesn't help even if the vacuum is run. This has been the > approach until v40 patch. My first reaction is that this is probably okay. I guess you might run into problems if you set max_slot_xid_age to 2B and checkpoint_timeout to 1 day, but even in that case your transaction ID usage rate would need to be pretty high for wraparound to occur. -- nathan
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Ajin Cherian
Date:
On Mon, Jun 24, 2024 at 4:01 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote:
> [...]
> Please find the attached v41 patches implementing the idea of vacuum
> doing the invalidation.
>
> Thoughts?
The patch no longer applies on HEAD, please rebase.
regards,
Ajin Cherian
Fujitsu Australia
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Masahiko Sawada
Date:
On Tue, Jul 9, 2024 at 3:01 PM Nathan Bossart <nathandbossart@gmail.com> wrote: > > On Mon, Jun 24, 2024 at 11:30:00AM +0530, Bharath Rupireddy wrote: > > 6) Vacuum command can't be run on the standby in recovery. So, to help > > invalidate replication slots on the standby, I have for now let the > > checkpointer also do the XID age based invalidation. I know > > invalidating both in checkpointer and vacuum may not be a great idea, > > but I'm open to thoughts. > > Hm. I hadn't considered this angle. Another idea would be to let the startup process do slot invalidation when replaying a RUNNING_XACTS record. Since a RUNNING_XACTS record has the latest XID on the primary, I think the startup process can compare it to the slot-xmin, and invalidate slots which are older than the age limit. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
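To visualise that idea: the startup process could drive the same check from RUNNING_XACTS replay, roughly as below. The record's nextXid field exists today; the invalidation cause, the GUC, and the overall helper are the thread's proposed/hypothetical names, so this is only a sketch:

/*
 * Called from the startup process while replaying a RUNNING_XACTS record;
 * next_xid is the primary's nextXid carried by the record.
 */
static void
InvalidateAgedSlotsDuringReplay(TransactionId next_xid)
{
    TransactionId slot_xmin;
    TransactionId slot_catalog_xmin;
    int64       xmin_age = 0;
    int64       catalog_xmin_age = 0;

    if (max_slot_xid_age == 0)
        return;                 /* feature disabled */

    ProcArrayGetReplicationSlotXmin(&slot_xmin, &slot_catalog_xmin);

    /* Age the slot horizons against the primary's nextXid from the record. */
    if (TransactionIdIsNormal(slot_xmin))
        xmin_age = (int64) (uint32) (next_xid - slot_xmin);
    if (TransactionIdIsNormal(slot_catalog_xmin))
        catalog_xmin_age = (int64) (uint32) (next_xid - slot_catalog_xmin);

    if (xmin_age > max_slot_xid_age || catalog_xmin_age > max_slot_xid_age)
        InvalidateObsoleteReplicationSlots(RS_INVAL_XID_AGE, 0,
                                           InvalidOid, InvalidTransactionId);
}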
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Ajin Cherian
Date:
On Mon, Jun 24, 2024 at 4:01 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote:
> Hi,
> On Mon, Jun 17, 2024 at 5:55 PM Bharath Rupireddy
> <bharath.rupireddyforpostgres@gmail.com> wrote:
> Please find the attached v41 patches implementing the idea of vacuum
> doing the invalidation.
> Thoughts?
Some minor comments on the patch:
1.
+ /*
+ * Release the lock if it's not yet to keep the cleanup path on
+ * error happy.
+ */
I suggest rephrasing to: "Release the lock if it hasn't been already to ensure smooth cleanup on error."
2.
elog(DEBUG1, "performing replication slot invalidation");
Probably change it to "performing replication slot invalidation checks" as we might not actually invalidate any slot here.
3.
In CheckPointReplicationSlots()
+ invalidated = InvalidateObsoleteReplicationSlots(RS_INVAL_INACTIVE_TIMEOUT,
+ 0,
+ InvalidOid,
+ InvalidTransactionId);
+
+ if (invalidated)
+ {
+ /*
+ * If any slots have been invalidated, recalculate the resource
+ * limits.
+ */
+ ReplicationSlotsComputeRequiredXmin(false);
+ ReplicationSlotsComputeRequiredLSN();
+ }
Is this calculation of resource limits really required here when the same is already done inside InvalidateObsoleteReplicationSlots()?
regards,
Ajin Cherian
Fujitsu Australia
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Mon, Aug 26, 2024 at 11:44 AM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > Few comments on 0001: 1. @@ -651,6 +651,13 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid) " name slot \"%s\" already exists on the standby", remote_slot->name)); + /* + * Skip the sync if the local slot is already invalidated. We do this + * beforehand to avoid slot acquire and release. + */ + if (slot->data.invalidated != RS_INVAL_NONE) + return false; + /* * The slot has been synchronized before. I was wondering why you have added this new check as part of this patch. If you see the following comments in the related code, you will know why we haven't done this previously. /* * The slot has been synchronized before. * * It is important to acquire the slot here before checking * invalidation. If we don't acquire the slot first, there could be a * race condition that the local slot could be invalidated just after * checking the 'invalidated' flag here and we could end up * overwriting 'invalidated' flag to remote_slot's value. See * InvalidatePossiblyObsoleteSlot() where it invalidates slot directly * if the slot is not acquired by other processes. * * XXX: If it ever turns out that slot acquire/release is costly for * cases when none of the slot properties is changed then we can do a * pre-check to ensure that at least one of the slot properties is * changed before acquiring the slot. */ ReplicationSlotAcquire(remote_slot->name, true); We need some modifications in these comments if you want to add a pre-check here. 2. @@ -1907,6 +2033,31 @@ CheckPointReplicationSlots(bool is_shutdown) SaveSlotToPath(s, path, LOG); } LWLockRelease(ReplicationSlotAllocationLock); + + elog(DEBUG1, "performing replication slot invalidation checks"); + + /* + * Note that we will make another pass over replication slots for + * invalidations to keep the code simple. The assumption here is that the + * traversal over replication slots isn't that costly even with hundreds + * of replication slots. If it ever turns out that this assumption is + * wrong, we might have to put the invalidation check logic in the above + * loop, for that we might have to do the following: + * + * - Acqure ControlLock lock once before the loop. + * + * - Call InvalidatePossiblyObsoleteSlot for each slot. + * + * - Handle the cases in which ControlLock gets released just like + * InvalidateObsoleteReplicationSlots does. + * + * - Avoid saving slot info to disk two times for each invalidated slot. + * + * XXX: Should we move inactive_timeout inavalidation check closer to + * wal_removed in CreateCheckPoint and CreateRestartPoint? + */ + InvalidateObsoleteReplicationSlots(RS_INVAL_INACTIVE_TIMEOUT, + 0, InvalidOid, InvalidTransactionId); Why do we want to call this for shutdown case (when is_shutdown is true)? I understand trying to invalidate slots during regular checkpoint but not sure if we need it at the time of shutdown. The other point is can we try to check the performance impact with 100s of slots as mentioned in the code comments? -- With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Thu, Aug 29, 2024 at 11:31 AM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > Thanks for looking into this. > > On Mon, Aug 26, 2024 at 4:35 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > Few comments on 0001: > > 1. > > @@ -651,6 +651,13 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid > > > > + /* > > + * Skip the sync if the local slot is already invalidated. We do this > > + * beforehand to avoid slot acquire and release. > > + */ > > > > I was wondering why you have added this new check as part of this > > patch. If you see the following comments in the related code, you will > > know why we haven't done this previously. > > Removed. Can deal with optimization separately. > > > 2. > > + */ > > + InvalidateObsoleteReplicationSlots(RS_INVAL_INACTIVE_TIMEOUT, > > + 0, InvalidOid, InvalidTransactionId); > > > > Why do we want to call this for shutdown case (when is_shutdown is > > true)? I understand trying to invalidate slots during regular > > checkpoint but not sure if we need it at the time of shutdown. > > Changed it to invalidate only for non-shutdown checkpoints. inactive_timeout invalidation isn't critical for shutdown unlike wal_removed which can help shutdown by freeing up some disk space. > > > The > > other point is can we try to check the performance impact with 100s of > > slots as mentioned in the code comments? > > I first checked how much does the wal_removed invalidation check add to the checkpoint (see 2nd and 3rd column). I then checked how much inactive_timeout invalidation check adds to the checkpoint (see 4th column), it is not more than wal_removed invalidation check. I then checked how much the wal_removed invalidation check adds for replication slots that have already been invalidated due to inactive_timeout (see 5th column), looks like not much.
> | # of slots | HEAD (no invalidation) ms | HEAD (wal_removed) ms | PATCHED (inactive_timeout) ms | PATCHED (inactive_timeout+wal_removed) ms |
> |------------|---------------------------|-----------------------|-------------------------------|-------------------------------------------|
> | 100        | 18.591                    | 370.586               | 359.299                       | 373.882                                   |
> | 1000       | 15.722                    | 4834.901              | 5081.751                      | 5072.128                                  |
> | 10000      | 19.261                    | 59801.062             | 61270.406                     | 60270.099                                 |
> Having said that, I'm okay to implement the optimization specified. Thoughts?
The other possibility is to try invalidating due to timeout along with wal_removed case during checkpoint. The idea is that if the slot can be invalidated due to WAL then fine, otherwise check if it can be invalidated due to timeout. This can avoid looping the slots and doing similar work multiple times during the checkpoint. -- With Regards, Amit Kapila.
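The single-pass idea in the last paragraph could be sketched as below; the function name is hypothetical and the structure only illustrates folding the timeout check into the existing wal_removed scan, using the cause names from this thread:

static ReplicationSlotInvalidationCause
DetermineSlotInvalidationCause(ReplicationSlot *s, XLogRecPtr oldest_needed_lsn)
{
    /* WAL removal wins if it applies; the slot is already unusable then. */
    if (!XLogRecPtrIsInvalid(s->data.restart_lsn) &&
        s->data.restart_lsn < oldest_needed_lsn)
        return RS_INVAL_WAL_REMOVED;

    /* Otherwise fall back to the inactive-timeout check in the same pass. */
    if (SlotInactiveTimeoutExceeded(s)) /* hypothetical helper, sketched earlier */
        return RS_INVAL_INACTIVE_TIMEOUT;

    return RS_INVAL_NONE;
}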
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Sat, Aug 31, 2024 at 1:45 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > Please find the attached v44 patch with the above changes. I will > include the 0002 xid_age based invalidation patch later. > It is better to get the 0001 reviewed and committed first. We can discuss 0002 afterwards as 0001 is in itself a complete and separate patch that can be committed. -- With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Peter Smith
Date:
Hi, my previous review posts did not cover the test code. Here are my review comments for the v44-0001 test code ====== TEST CASE #1 1. +# Wait for the inactive replication slot to be invalidated. +$standby1->poll_query_until( + 'postgres', qq[ + SELECT COUNT(slot_name) = 1 FROM pg_replication_slots + WHERE slot_name = 'lsub1_sync_slot' AND + invalidation_reason = 'inactive_timeout'; +]) + or die + "Timed out while waiting for lsub1_sync_slot invalidation to be synced on standby"; + Is that comment correct? IIUC the synced slot should *already* be invalidated from the primary, so here we are not really "waiting" for it to be invalidated; Instead, we are just "confirming" that the synchronized slot is already invalidated with the correct reason as expected. ~~~ 2. +# Synced slot mustn't get invalidated on the standby even after a checkpoint, +# it must sync invalidation from the primary. So, we must not see the slot's +# invalidation message in server log. +$standby1->safe_psql('postgres', "CHECKPOINT"); +ok( !$standby1->log_contains( + "invalidating obsolete replication slot \"lsub1_sync_slot\"", + $standby1_logstart), + 'check that syned lsub1_sync_slot has not been invalidated on the standby' +); + This test case seemed bogus, for a couple of reasons: 2a. IIUC this 'lsub1_sync_slot' is the same one that is already invalid (from the primary), so nobody should be surprised that an already invalid slot doesn't get flagged as invalid again. i.e. Shouldn't your test scenario here be done using a valid synced slot? 2b. AFAICT it was only moments above this CHECKPOINT where you assigned the standby inactivity timeout to 2s. So even if there was some bug invalidating synced slots I don't think you gave it enough time to happen -- e.g. I doubt 2s has elapsed yet. ~ 3. +# Stop standby to make the standby's replication slot on the primary inactive +$standby1->stop; + +# Wait for the standby's replication slot to become inactive +wait_for_slot_invalidation($primary, 'sb1_slot', $logstart, + $inactive_timeout); This seems a bit tricky. Both these (the stop and the wait) seem to belong together, so I think maybe a single bigger explanatory comment covering both parts would help for understanding. ====== TEST CASE #2 4. +# Stop subscriber to make the replication slot on publisher inactive +$subscriber->stop; + +# Wait for the replication slot to become inactive and then invalidated due to +# timeout. +wait_for_slot_invalidation($publisher, 'lsub1_slot', $logstart, + $inactive_timeout); IIUC, this is just like comment #3 above. Both these (the stop and the wait) seem to belong together, so I think maybe a single bigger explanatory comment covering both parts would help for understanding. ~~~ 5. +# Testcase end: Invalidate logical subscriber's slot due to +# replication_slot_inactive_timeout. +# ============================================================================= IMO the rest of the comment after "Testcase end" isn't very useful. ====== sub wait_for_slot_invalidation 6. +sub wait_for_slot_invalidation +{ An explanatory header comment for this subroutine would be helpful. ~~~ 7. 
+ # Wait for the replication slot to become inactive + $node->poll_query_until( + 'postgres', qq[ + SELECT COUNT(slot_name) = 1 FROM pg_replication_slots + WHERE slot_name = '$slot_name' AND active = 'f'; + ]) + or die + "Timed out while waiting for slot $slot_name to become inactive on node $name"; + + # Wait for the replication slot info to be updated + $node->poll_query_until( + 'postgres', qq[ + SELECT COUNT(slot_name) = 1 FROM pg_replication_slots + WHERE inactive_since IS NOT NULL + AND slot_name = '$slot_name' AND active = 'f'; + ]) + or die + "Timed out while waiting for info of slot $slot_name to be updated on node $name"; + Why are there are 2 separate poll_query_until's here? Can't those be combined into just one? ~~~ 8. + # Sleep at least $inactive_timeout duration to avoid multiple checkpoints + # for the slot to get invalidated. + sleep($inactive_timeout); + Maybe this special sleep to prevent too many CHECKPOINTs should be moved to be inside the other subroutine, which is actually doing those CHECKPOINTs. ~~~ 9. + # Wait for the inactive replication slot to be invalidated + $node->poll_query_until( + 'postgres', qq[ + SELECT COUNT(slot_name) = 1 FROM pg_replication_slots + WHERE slot_name = '$slot_name' AND + invalidation_reason = 'inactive_timeout'; + ]) + or die + "Timed out while waiting for inactive slot $slot_name to be invalidated on node $name"; + The comment seems misleading. IIUC you are not "waiting" for the invalidation here, because it is the other subroutine doing the waiting for the invalidation message in the logs. Instead, here I think you are just confirming the 'invalidation_reason' got set correctly. The comment should say what it is really doing. ====== sub check_for_slot_invalidation_in_server_log 10. +# Check for invalidation of slot in server log +sub check_for_slot_invalidation_in_server_log +{ I think the main function of this subroutine is the CHECKPOINT and the waiting for the server log to say invalidation happened. It is doing a loop of a) CHECKPOINT then b) inspecting the server log for the slot invalidation, and c) waiting for a bit. Repeat 10 times. A comment describing the logic for this subroutine would be helpful. The most important side-effect of this function is the CHECKPOINT because without that nothing will ever get invalidated due to inactivity, but this key point is not obvious from the subroutine name. IMO it would be better to name this differently to reflect what it is really doing: e.g. "CHECKPOINT_and_wait_for_slot_invalidation_in_server_log" ====== Kind Regards, Peter Smith. Fujitsu Australia
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
shveta malik
Date:
On Sat, Aug 31, 2024 at 1:45 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > Hi, > > > Please find the attached v44 patch with the above changes. I will > include the 0002 xid_age based invalidation patch later. > Thanks for the patch Bharath. My review and testing is WIP, but please find a few comments and queries: 1) I see that ReplicationSlotAlter() will error out if the slot is invalidated due to timeout. I have not tested it myself, but do you know if slot-alter errors out for other invalidation causes as well? Just wanted to confirm that the behaviour is consistent for all invalidation causes. 2) When a slot is invalidated, and we try to use that slot, it gives this msg: ERROR: can no longer get changes from replication slot "mysubnew1_2" DETAIL: The slot became invalid because it was inactive since 2024-09-03 14:23:34.094067+05:30, which is more than 600 seconds ago. HINT: You might need to increase "replication_slot_inactive_timeout.". Isn't HINT misleading? Even if we increase it now, the slot cannot be reused again. 3) When the slot is invalidated, the 'inactive_since' still keeps on changing when there is a subscriber trying to start replication continuously. I think ReplicationSlotAcquire() keeps on failing and thus Release keeps on setting it again and again. Shouldn't we stop setting/changing 'inactive_since' once the slot is already invalidated? Otherwise it will be misleading. postgres=# select failover,synced,inactive_since,invalidation_reason from pg_replication_slots; failover | synced | inactive_since | invalidation_reason ----------+--------+----------------------------------+--------------------- t | f | 2024-09-03 14:23:.. | inactive_timeout after some time: failover | synced | inactive_since | invalidation_reason ----------+--------+----------------------------------+--------------------- t | f | 2024-09-03 14:26:..| inactive_timeout 4) src/sgml/config.sgml: 4a) + A value of zero (which is default) disables the timeout mechanism. Better will be: A value of zero (which is default) disables the inactive timeout invalidation mechanism . or A value of zero (which is default) disables the slot invalidation due to the inactive timeout mechanism. i.e. rephrase to indicate that invalidation is disabled. 4b) 'synced' and inactive_since should point to pg_replication_slots: example: <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>synced</structfield> 5) src/sgml/system-views.sgml: + ..the slot has been inactive for longer than the duration specified by replication_slot_inactive_timeout parameter. Better to have: ..the slot has been inactive for a time longer than the duration specified by the replication_slot_inactive_timeout parameter. thanks Shveta
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
shveta malik
Date:
On Tue, Sep 3, 2024 at 3:01 PM shveta malik <shveta.malik@gmail.com> wrote: > > > 1) > I see that ReplicationSlotAlter() will error out if the slot is > invalidated due to timeout. I have not tested it myself, but do you > know if slot-alter errors out for other invalidation causes as well? > Just wanted to confirm that the behaviour is consistent for all > invalidation causes. I was able to test this and as anticipated behavior is different. When slot is invalidated due to say 'wal_removed', I am still able to do 'alter' of that slot. Please see: Pub: slot_name | failover | synced | inactive_since | invalidation_reason -------------+----------+--------+----------------------------------+--------------------- mysubnew1_1 | t | f | 2024-09-04 08:58:12.802278+05:30 | wal_removed Sub: newdb1=# alter subscription mysubnew1_1 disable; ALTER SUBSCRIPTION newdb1=# alter subscription mysubnew1_1 set (failover=false); ALTER SUBSCRIPTION Pub: (failover altered) slot_name | failover | synced | inactive_since | invalidation_reason -------------+----------+--------+----------------------------------+--------------------- mysubnew1_1 | f | f | 2024-09-04 08:58:47.824471+05:30 | wal_removed while when invalidation_reason is 'inactive_timeout', it fails: Pub: slot_name | failover | synced | inactive_since | invalidation_reason -------------+----------+--------+----------------------------------+--------------------- mysubnew1_1 | t | f | 2024-09-03 14:30:57.532206+05:30 | inactive_timeout Sub: newdb1=# alter subscription mysubnew1_1 disable; ALTER SUBSCRIPTION newdb1=# alter subscription mysubnew1_1 set (failover=false); ERROR: could not alter replication slot "mysubnew1_1": ERROR: can no longer get changes from replication slot "mysubnew1_1" DETAIL: The slot became invalid because it was inactive since 2024-09-04 08:54:20.308996+05:30, which is more than 0 seconds ago. HINT: You might need to increase "replication_slot_inactive_timeout.". I think the behavior should be same. thanks Shveta
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
shveta malik
Date:
On Wed, Sep 4, 2024 at 9:17 AM shveta malik <shveta.malik@gmail.com> wrote: > > On Tue, Sep 3, 2024 at 3:01 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > 1) It is related to one of my previous comments (pt 3 in [1]) where I stated that inactive_since should not keep on changing once a slot is invalidated. Below is one side effect if inactive_since keeps on changing: postgres=# SELECT * FROM pg_replication_slot_advance('mysubnew1_1', pg_current_wal_lsn()); ERROR: can no longer get changes from replication slot "mysubnew1_1" DETAIL: The slot became invalid because it was inactive since 2024-09-04 10:03:56.68053+05:30, which is more than 10 seconds ago. HINT: You might need to increase "replication_slot_inactive_timeout.". postgres=# select now(); now --------------------------------- 2024-09-04 10:04:00.26564+05:30 'DETAIL' gives wrong information, we are not past 10-seconds. This is because inactive_since got updated even in ERROR scenario. 2) One more issue in this message is, once I set replication_slot_inactive_timeout to a bigger value, it becomes more misleading. This is because invalidation was done in the past using previous value while message starts showing new value: ALTER SYSTEM SET replication_slot_inactive_timeout TO '36h'; --see 129600 secs in DETAIL and the current time. postgres=# SELECT * FROM pg_replication_slot_advance('mysubnew1_1', pg_current_wal_lsn()); ERROR: can no longer get changes from replication slot "mysubnew1_1" DETAIL: The slot became invalid because it was inactive since 2024-09-04 10:06:38.980939+05:30, which is more than 129600 seconds ago. postgres=# select now(); now ---------------------------------- 2024-09-04 10:07:35.201894+05:30 I feel we should change this message itself. ~~~~~ When invalidation is due to wal_removed, we get a way simpler message: newdb1=# SELECT * FROM pg_replication_slot_advance('mysubnew1_2', pg_current_wal_lsn()); ERROR: replication slot "mysubnew1_2" cannot be advanced DETAIL: This slot has never previously reserved WAL, or it has been invalidated. This message does not mention 'max_slot_wal_keep_size'. We should have a similar message for our case. Thoughts? [1]: https://www.postgresql.org/message-id/CAJpy0uC8Dg-0JS3NRUwVUemgz5Ar2v3_EQQFXyAigWSEQ8U47Q%40mail.gmail.com thanks Shveta
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Wed, Sep 4, 2024 at 2:49 PM shveta malik <shveta.malik@gmail.com> wrote: > > On Wed, Sep 4, 2024 at 9:17 AM shveta malik <shveta.malik@gmail.com> wrote: > > > > On Tue, Sep 3, 2024 at 3:01 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > > > > 1) > It is related to one of my previous comments (pt 3 in [1]) where I > stated that inactive_since should not keep on changing once a slot is > invalidated. > Agreed. Updating the inactive_since for a slot that is already invalid is misleading. > > > 2) > One more issue in this message is, once I set > replication_slot_inactive_timeout to a bigger value, it becomes more > misleading. This is because invalidation was done in the past using > previous value while message starts showing new value: > > ALTER SYSTEM SET replication_slot_inactive_timeout TO '36h'; > > --see 129600 secs in DETAIL and the current time. > postgres=# SELECT * FROM pg_replication_slot_advance('mysubnew1_1', > pg_current_wal_lsn()); > ERROR: can no longer get changes from replication slot "mysubnew1_1" > DETAIL: The slot became invalid because it was inactive since > 2024-09-04 10:06:38.980939+05:30, which is more than 129600 seconds > ago. > postgres=# select now(); > now > ---------------------------------- > 2024-09-04 10:07:35.201894+05:30 > > I feel we should change this message itself. > +1. -- With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Wed, Sep 4, 2024 at 9:17 AM shveta malik <shveta.malik@gmail.com> wrote: > > On Tue, Sep 3, 2024 at 3:01 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > 1) > > I see that ReplicationSlotAlter() will error out if the slot is > > invalidated due to timeout. I have not tested it myself, but do you > > know if slot-alter errors out for other invalidation causes as well? > > Just wanted to confirm that the behaviour is consistent for all > > invalidation causes. > > I was able to test this and as anticipated behavior is different. When > slot is invalidated due to say 'wal_removed', I am still able to do > 'alter' of that slot. > Please see: > > Pub: > slot_name | failover | synced | inactive_since | > invalidation_reason > -------------+----------+--------+----------------------------------+--------------------- > mysubnew1_1 | t | f | 2024-09-04 08:58:12.802278+05:30 | > wal_removed > > Sub: > newdb1=# alter subscription mysubnew1_1 disable; > ALTER SUBSCRIPTION > > newdb1=# alter subscription mysubnew1_1 set (failover=false); > ALTER SUBSCRIPTION > > Pub: (failover altered) > slot_name | failover | synced | inactive_since | > invalidation_reason > -------------+----------+--------+----------------------------------+--------------------- > mysubnew1_1 | f | f | 2024-09-04 08:58:47.824471+05:30 | > wal_removed > > > while when invalidation_reason is 'inactive_timeout', it fails: > > Pub: > slot_name | failover | synced | inactive_since | > invalidation_reason > -------------+----------+--------+----------------------------------+--------------------- > mysubnew1_1 | t | f | 2024-09-03 14:30:57.532206+05:30 | > inactive_timeout > > Sub: > newdb1=# alter subscription mysubnew1_1 disable; > ALTER SUBSCRIPTION > > newdb1=# alter subscription mysubnew1_1 set (failover=false); > ERROR: could not alter replication slot "mysubnew1_1": ERROR: can no > longer get changes from replication slot "mysubnew1_1" > DETAIL: The slot became invalid because it was inactive since > 2024-09-04 08:54:20.308996+05:30, which is more than 0 seconds ago. > HINT: You might need to increase "replication_slot_inactive_timeout.". > > I think the behavior should be same. > We should not allow the invalid replication slot to be altered irrespective of the reason unless there is any benefit. -- With Regards, Amit Kapila.
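A minimal sketch of the kind of check being discussed here, assuming it is placed in ReplicationSlotAlter() after the slot has been acquired; the error wording is illustrative only, not committed text:

    /* Reject ALTER on any invalidated slot, regardless of the invalidation cause. */
    if (MyReplicationSlot->data.invalidated != RS_INVAL_NONE)
        ereport(ERROR,
                (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
                 errmsg("cannot alter invalid replication slot \"%s\"",
                        NameStr(MyReplicationSlot->data.name))));

With such a check the behaviour becomes uniform: altering a slot invalidated due to wal_removed fails the same way as one invalidated due to inactive_timeout.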
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
Hi, Thanks for reviewing. On Tue, Sep 3, 2024 at 3:01 PM shveta malik <shveta.malik@gmail.com> wrote: > > 1) > I see that ReplicationSlotAlter() will error out if the slot is > invalidated due to timeout. I have not tested it myself, but do you > know if slot-alter errors out for other invalidation causes as well? > Just wanted to confirm that the behaviour is consistent for all > invalidation causes. Will respond to Amit's comment soon. > 2) > When a slot is invalidated, and we try to use that slot, it gives this msg: > > ERROR: can no longer get changes from replication slot "mysubnew1_2" > DETAIL: The slot became invalid because it was inactive since > 2024-09-03 14:23:34.094067+05:30, which is more than 600 seconds ago. > HINT: You might need to increase "replication_slot_inactive_timeout.". > > Isn't HINT misleading? Even if we increase it now, the slot can not be > reused again. > > Below is one side effect if inactive_since keeps on changing: > > postgres=# SELECT * FROM pg_replication_slot_advance('mysubnew1_1', > pg_current_wal_lsn()); > ERROR: can no longer get changes from replication slot "mysubnew1_1" > DETAIL: The slot became invalid because it was inactive since > 2024-09-04 10:03:56.68053+05:30, which is more than 10 seconds ago. > HINT: You might need to increase "replication_slot_inactive_timeout.". > > postgres=# select now(); > now > --------------------------------- > 2024-09-04 10:04:00.26564+05:30 > > 'DETAIL' gives wrong information, we are not past 10-seconds. This is > because inactive_since got updated even in ERROR scenario. > > ERROR: can no longer get changes from replication slot "mysubnew1_1" > DETAIL: The slot became invalid because it was inactive since > 2024-09-04 10:06:38.980939+05:30, which is more than 129600 seconds > ago. > postgres=# select now(); > now > ---------------------------------- > 2024-09-04 10:07:35.201894+05:30 > > I feel we should change this message itself. Removed the hint and corrected the detail message as following: errmsg("can no longer get changes from replication slot \"%s\"", NameStr(s->data.name)), errdetail("This slot has been invalidated because it was inactive for longer than the amount of time specified by \"%s\".", "replication_slot_inactive_timeout."))); > 3) > When the slot is invalidated, the' inactive_since' still keeps on > changing when there is a subscriber trying to start replication > continuously. I think ReplicationSlotAcquire() keeps on failing and > thus Release keeps on setting it again and again. Shouldn't we stop > setting/chnaging 'inactive_since' once the slot is invalidated > already, otherwise it will be misleading. > > postgres=# select failover,synced,inactive_since,invalidation_reason > from pg_replication_slots; > > failover | synced | inactive_since | invalidation_reason > ----------+--------+----------------------------------+--------------------- > t | f | 2024-09-03 14:23:.. | inactive_timeout > > after sometime: > failover | synced | inactive_since | invalidation_reason > ----------+--------+----------------------------------+--------------------- > t | f | 2024-09-03 14:26:..| inactive_timeout Changed it to not update inactive_since for slots invalidated due to inactive timeout. > 4) > src/sgml/config.sgml: > > 4a) > + A value of zero (which is default) disables the timeout mechanism. > > Better will be: > A value of zero (which is default) disables the inactive timeout > invalidation mechanism . Changed. 
> 4b) > 'synced' and inactive_since should point to pg_replication_slots: > > example: > <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>synced</structfield> Modified. > 5) > src/sgml/system-views.sgml: > + ..the slot has been inactive for longer than the duration specified > by replication_slot_inactive_timeout parameter. > > Better to have: > ..the slot has been inactive for a time longer than the duration > specified by the replication_slot_inactive_timeout parameter. Changed it to the following to be consistent with the config.sgml. <literal>inactive_timeout</literal> means that the slot has been inactive for longer than the amount of time specified by the <xref linkend="guc-replication-slot-inactive-timeout"/> parameter. Please find the v45 patch posted upthread at https://www.postgresql.org/message-id/CALj2ACWXQT3_HY40ceqKf1DadjLQP6b1r%3D0sZRh-xhAOd-b0pA%40mail.gmail.com for the changes. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
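The change mentioned above, not refreshing inactive_since once a slot is invalidated, amounts to guarding the places that stamp the timestamp, roughly as in this sketch (field names as in the patch; the spinlock usage mirrors how ReplicationSlot fields are normally updated):

    TimestampTz now = GetCurrentTimestamp();

    SpinLockAcquire(&s->mutex);
    /* Keep the original invalidation time; do not overwrite it on every release. */
    if (s->data.invalidated == RS_INVAL_NONE)
        s->inactive_since = now;
    SpinLockRelease(&s->mutex);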
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
shveta malik
Date:
On Thu, Sep 5, 2024 at 9:30 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Wed, Sep 4, 2024 at 9:17 AM shveta malik <shveta.malik@gmail.com> wrote: > > > > On Tue, Sep 3, 2024 at 3:01 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > > > > 1) > > > I see that ReplicationSlotAlter() will error out if the slot is > > > invalidated due to timeout. I have not tested it myself, but do you > > > know if slot-alter errors out for other invalidation causes as well? > > > Just wanted to confirm that the behaviour is consistent for all > > > invalidation causes. > > > > I was able to test this and as anticipated behavior is different. When > > slot is invalidated due to say 'wal_removed', I am still able to do > > 'alter' of that slot. > > Please see: > > > > Pub: > > slot_name | failover | synced | inactive_since | > > invalidation_reason > > -------------+----------+--------+----------------------------------+--------------------- > > mysubnew1_1 | t | f | 2024-09-04 08:58:12.802278+05:30 | > > wal_removed > > > > Sub: > > newdb1=# alter subscription mysubnew1_1 disable; > > ALTER SUBSCRIPTION > > > > newdb1=# alter subscription mysubnew1_1 set (failover=false); > > ALTER SUBSCRIPTION > > > > Pub: (failover altered) > > slot_name | failover | synced | inactive_since | > > invalidation_reason > > -------------+----------+--------+----------------------------------+--------------------- > > mysubnew1_1 | f | f | 2024-09-04 08:58:47.824471+05:30 | > > wal_removed > > > > > > while when invalidation_reason is 'inactive_timeout', it fails: > > > > Pub: > > slot_name | failover | synced | inactive_since | > > invalidation_reason > > -------------+----------+--------+----------------------------------+--------------------- > > mysubnew1_1 | t | f | 2024-09-03 14:30:57.532206+05:30 | > > inactive_timeout > > > > Sub: > > newdb1=# alter subscription mysubnew1_1 disable; > > ALTER SUBSCRIPTION > > > > newdb1=# alter subscription mysubnew1_1 set (failover=false); > > ERROR: could not alter replication slot "mysubnew1_1": ERROR: can no > > longer get changes from replication slot "mysubnew1_1" > > DETAIL: The slot became invalid because it was inactive since > > 2024-09-04 08:54:20.308996+05:30, which is more than 0 seconds ago. > > HINT: You might need to increase "replication_slot_inactive_timeout.". > > > > I think the behavior should be same. > > > > We should not allow the invalid replication slot to be altered > irrespective of the reason unless there is any benefit. > Okay, then I think we need to change the existing behaviour of the other invalidation causes which still allow alter-slot. thanks Shveta
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
Hi, On Mon, Sep 9, 2024 at 9:17 AM shveta malik <shveta.malik@gmail.com> wrote: > > > We should not allow the invalid replication slot to be altered > > irrespective of the reason unless there is any benefit. > > Okay, then I think we need to change the existing behaviour of the > other invalidation causes which still allow alter-slot. +1. Perhaps, track it in a separate thread? -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
shveta malik
Date:
On Mon, Sep 9, 2024 at 10:26 AM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > Hi, > > On Mon, Sep 9, 2024 at 9:17 AM shveta malik <shveta.malik@gmail.com> wrote: > > > > > We should not allow the invalid replication slot to be altered > > > irrespective of the reason unless there is any benefit. > > > > Okay, then I think we need to change the existing behaviour of the > > other invalidation causes which still allow alter-slot. > > +1. Perhaps, track it in a separate thread? I think so. It does not come under the scope of this thread. thanks Shveta
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
shveta malik
Date:
On Sun, Sep 8, 2024 at 5:25 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > > Please find the v45 patch. Addressed above and Shveta's review comments [1]. > Thanks for the patch. Please find my comments: 1) src/sgml/config.sgml: + Synced slots are always considered to be inactive because they don't perform logical decoding to produce changes. It is better we avoid such a statement, as internally we use logical decoding to advance restart-lsn, see 'LogicalSlotAdvanceAndCheckSnapState' called form slotsync.c. <Also see related comment 6 below> 2) src/sgml/config.sgml: + disables the inactive timeout invalidation mechanism + Slot invalidation due to inactivity timeout occurs during checkpoint. Either have 'inactive' at both the places or 'inactivity'. 3) slot.c: +static bool InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause, + ReplicationSlot *s, + XLogRecPtr oldestLSN, + Oid dboid, + TransactionId snapshotConflictHorizon, + bool *invalidated); +static inline bool SlotInactiveTimeoutCheckAllowed(ReplicationSlot *s); I think, we do not need above 2 declarations. The code compile fine without these as the usage is later than the definition. 4) + /* + * An error is raised if error_if_invalid is true and the slot has been + * invalidated previously. + */ + if (error_if_invalid && s->data.invalidated == RS_INVAL_INACTIVE_TIMEOUT) The comment is generic while the 'if condition' is specific to one invalidation cause. Even though I feel it can be made generic test for all invalidation causes but that is not under scope of this thread and needs more testing/analysis. For the time being, we can make comment specific to the concerned invalidation cause. The header of function will also need the same change. 5) SlotInactiveTimeoutCheckAllowed(): + * Check if inactive timeout invalidation mechanism is disabled or slot is + * currently being used or server is in recovery mode or slot on standby is + * currently being synced from the primary. + * These comments say exact opposite of what we are checking in code. Since the function name has 'Allowed' in it, we should be putting comments which say what allows it instead of what disallows it. 6) + * Synced slots are always considered to be inactive because they don't + * perform logical decoding to produce changes. + */ +static inline bool +SlotInactiveTimeoutCheckAllowed(ReplicationSlot *s) Perhaps we should avoid mentioning logical decoding here. When slots are synced, they are performing decoding and their inactive_since is changing continuously. A better way to make this statement will be: We want to ensure that the slots being synchronized are not invalidated, as they need to be preserved for future use when the standby server is promoted to the primary. This is necessary for resuming logical replication from the new primary server. <Rephrase if needed> 7) InvalidatePossiblyObsoleteSlot() we are calling SlotInactiveTimeoutCheckAllowed() twice in this function. We shall optimize. At the first usage place, shall we simply get timestamp when cause is RS_INVAL_INACTIVE_TIMEOUT without checking SlotInactiveTimeoutCheckAllowed() as IMO it does not seem a performance critical section. Or if we retain check at first place, then at the second place we can avoid calling it again based on whether 'now' is NULL or not. thanks Shveta
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Mon, Sep 9, 2024 at 10:28 AM shveta malik <shveta.malik@gmail.com> wrote: > > On Mon, Sep 9, 2024 at 10:26 AM Bharath Rupireddy > <bharath.rupireddyforpostgres@gmail.com> wrote: > > > > Hi, > > > > On Mon, Sep 9, 2024 at 9:17 AM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > > We should not allow the invalid replication slot to be altered > > > > irrespective of the reason unless there is any benefit. > > > > > > Okay, then I think we need to change the existing behaviour of the > > > other invalidation causes which still allow alter-slot. > > > > +1. Perhaps, track it in a separate thread? > > I think so. It does not come under the scope of this thread. > It makes sense to me as well. But let's go ahead and get that sorted out first. -- With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
Hi, On Mon, Sep 9, 2024 at 3:04 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > We should not allow the invalid replication slot to be altered > > > > > irrespective of the reason unless there is any benefit. > > > > > > > > Okay, then I think we need to change the existing behaviour of the > > > > other invalidation causes which still allow alter-slot. > > > > > > +1. Perhaps, track it in a separate thread? > > > > I think so. It does not come under the scope of this thread. > > It makes sense to me as well. But let's go ahead and get that sorted out first. Moved the discussion to new thread - https://www.postgresql.org/message-id/CALj2ACW4fSOMiKjQ3%3D2NVBMTZRTG8Ujg6jsK9z3EvOtvA4vzKQ%40mail.gmail.com. Please have a look. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Tue, Sep 10, 2024 at 12:13 AM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Mon, Sep 9, 2024 at 3:04 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > We should not allow the invalid replication slot to be altered > > > > > > irrespective of the reason unless there is any benefit. > > > > > > > > > > Okay, then I think we need to change the existing behaviour of the > > > > > other invalidation causes which still allow alter-slot. > > > > > > > > +1. Perhaps, track it in a separate thread? > > > > > > I think so. It does not come under the scope of this thread. > > > > It makes sense to me as well. But let's go ahead and get that sorted out first. > > Moved the discussion to new thread - > https://www.postgresql.org/message-id/CALj2ACW4fSOMiKjQ3%3D2NVBMTZRTG8Ujg6jsK9z3EvOtvA4vzKQ%40mail.gmail.com. > Please have a look. > That is pushed now. Please send the rebased patch after addressing the pending comments. -- With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
Hi, Thanks for reviewing. On Mon, Sep 9, 2024 at 10:54 AM shveta malik <shveta.malik@gmail.com> wrote: > > 2) > src/sgml/config.sgml: > > + disables the inactive timeout invalidation mechanism > > + Slot invalidation due to inactivity timeout occurs during checkpoint. > > Either have 'inactive' at both the places or 'inactivity'. Used "inactive timeout". > 3) > slot.c: > +static bool InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause > cause, > + ReplicationSlot *s, > + XLogRecPtr oldestLSN, > + Oid dboid, > + TransactionId snapshotConflictHorizon, > + bool *invalidated); > +static inline bool SlotInactiveTimeoutCheckAllowed(ReplicationSlot *s); > > I think, we do not need above 2 declarations. The code compile fine > without these as the usage is later than the definition. Hm, it's a usual practice that I follow irrespective of the placement of function declarations. Since it was brought up, I removed the declarations. > 4) > + /* > + * An error is raised if error_if_invalid is true and the slot has been > + * invalidated previously. > + */ > + if (error_if_invalid && s->data.invalidated == RS_INVAL_INACTIVE_TIMEOUT) > > The comment is generic while the 'if condition' is specific to one > invalidation cause. Even though I feel it can be made generic test for > all invalidation causes but that is not under scope of this thread and > needs more testing/analysis. Right. > For the time being, we can make comment > specific to the concerned invalidation cause. The header of function > will also need the same change. Adjusted the comment, but left the variable name error_if_invalid as is. Didn't want to make it long, one can look at the code to understand what it is used for. > 5) > SlotInactiveTimeoutCheckAllowed(): > > + * Check if inactive timeout invalidation mechanism is disabled or slot is > + * currently being used or server is in recovery mode or slot on standby is > + * currently being synced from the primary. > + * > > These comments say exact opposite of what we are checking in code. > Since the function name has 'Allowed' in it, we should be putting > comments which say what allows it instead of what disallows it. Modified. > 1) > src/sgml/config.sgml: > > + Synced slots are always considered to be inactive because they > don't perform logical decoding to produce changes. > > It is better we avoid such a statement, as internally we use logical > decoding to advance restart-lsn, see > 'LogicalSlotAdvanceAndCheckSnapState' called form slotsync.c. > <Also see related comment 6 below> > > 6) > > + * Synced slots are always considered to be inactive because they don't > + * perform logical decoding to produce changes. > + */ > +static inline bool > +SlotInactiveTimeoutCheckAllowed(ReplicationSlot *s) > > Perhaps we should avoid mentioning logical decoding here. When slots > are synced, they are performing decoding and their inactive_since is > changing continuously. A better way to make this statement will be: > > We want to ensure that the slots being synchronized are not > invalidated, as they need to be preserved for future use when the > standby server is promoted to the primary. This is necessary for > resuming logical replication from the new primary server. > <Rephrase if needed> They are performing logical decoding, but not producing the changes for the clients to consume. So, IMO, the accompanying "to produce changes" next to the "logical decoding" is good here. 
> 7) > > InvalidatePossiblyObsoleteSlot() > > we are calling SlotInactiveTimeoutCheckAllowed() twice in this > function. We shall optimize. > > At the first usage place, shall we simply get timestamp when cause is > RS_INVAL_INACTIVE_TIMEOUT without checking > SlotInactiveTimeoutCheckAllowed() as IMO it does not seem a > performance critical section. Or if we retain check at first place, > then at the second place we can avoid calling it again based on > whether 'now' is NULL or not. Getting a current timestamp can get costlier on platforms that use various clock sources, so assigning 'now' unconditionally isn't the way IMO. Using the inline function in two places improves the readability. Can optimize it if there's any performance impact of calling the inline function in two places. Will post the new patch version soon. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
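For context, the pattern being defended in point 7 looks roughly like this inside InvalidatePossiblyObsoleteSlot(); it is a sketch only, and SlotInactiveTimeoutCheckAllowed() and replication_slot_inactive_timeout are names from the patch under discussion, not committed code:

    /* Fetch the clock only when the inactive-timeout check can actually apply. */
    if (cause == RS_INVAL_INACTIVE_TIMEOUT && SlotInactiveTimeoutCheckAllowed(s))
        now = GetCurrentTimestamp();

    /* ... later, while examining the slot ... */
    if (SlotInactiveTimeoutCheckAllowed(s) &&
        TimestampDifferenceExceeds(s->inactive_since, now,
                                   replication_slot_inactive_timeout * 1000))
        invalidation_cause = RS_INVAL_INACTIVE_TIMEOUT;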
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Mon, Sep 16, 2024 at 3:31 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > Please find the attached v46 patch having changes for the above review > comments and your test review comments and Shveta's review comments. > -ReplicationSlotAcquire(const char *name, bool nowait) +ReplicationSlotAcquire(const char *name, bool nowait, bool error_if_invalid) { ReplicationSlot *s; int active_pid; @@ -615,6 +620,22 @@ retry: /* We made this slot active, so it's ours now. */ MyReplicationSlot = s; + /* + * An error is raised if error_if_invalid is true and the slot has been + * previously invalidated due to inactive timeout. + */ + if (error_if_invalid && + s->data.invalidated == RS_INVAL_INACTIVE_TIMEOUT) + { + Assert(s->inactive_since > 0); + ereport(ERROR, + (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), + errmsg("can no longer get changes from replication slot \"%s\"", + NameStr(s->data.name)), + errdetail("This slot has been invalidated because it was inactive for longer than the amount of time specified by \"%s\".", + "replication_slot_inactive_timeout"))); + } Why raise the ERROR just for timeout invalidation here and why not if the slot is invalidated for other reasons? This raises the question of what happens before this patch if the invalid slot is used from places where we call ReplicationSlotAcquire(). I did a brief code analysis and found that for StartLogicalReplication(), even if the error won't occur in ReplicationSlotAcquire(), it would have been caught in CreateDecodingContext(). I think that is where we should also add this new error. Similarly, pg_logical_slot_get_changes_guts() and other logical replication functions should be calling CreateDecodingContext() which can raise the new ERROR. I am not sure about how the invalid slots are handled during physical replication, please check the behavior of that before this patch. -- With Regards, Amit Kapila.
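For reference, moving the new error into CreateDecodingContext(), as suggested above, might look something like the sketch below; the message text is illustrative. Since both the walsender's START_REPLICATION ... LOGICAL path and SQL functions such as pg_logical_slot_get_changes() go through CreateDecodingContext(), a check there would cover all logical decoding entry points:

    /* Refuse to decode from a slot that has already been invalidated. */
    if (MyReplicationSlot->data.invalidated != RS_INVAL_NONE)
        ereport(ERROR,
                (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
                 errmsg("cannot read from replication slot \"%s\"",
                        NameStr(MyReplicationSlot->data.name)),
                 errdetail("This slot has been invalidated and cannot be used for logical decoding.")));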
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
Hi,
Thanks for looking into this.
On Mon, Sep 16, 2024 at 4:54 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> Why raise the ERROR just for timeout invalidation here and why not if
> the slot is invalidated for other reasons? This raises the question of
> what happens before this patch if the invalid slot is used from places
> where we call ReplicationSlotAcquire(). I did a brief code analysis
> and found that for StartLogicalReplication(), even if the error won't
> occur in ReplicationSlotAcquire(), it would have been caught in
> CreateDecodingContext(). I think that is where we should also add this
> new error. Similarly, pg_logical_slot_get_changes_guts() and other
> logical replication functions should be calling
> CreateDecodingContext() which can raise the new ERROR. I am not sure
> about how the invalid slots are handled during physical replication,
> please check the behavior of that before this patch.
When physical slots are invalidated due to wal_removed reason, the failure happens at a much later point for the streaming standbys while reading the requested WAL files like the following:
2024-09-16 16:29:52.416 UTC [876059] FATAL: could not receive data from WAL stream: ERROR: requested WAL segment 000000010000000000000005 has already been removed
2024-09-16 16:29:52.416 UTC [872418] LOG: waiting for WAL to become available at 0/5002000
At this point, despite the slot being invalidated, its wal_status can still come back to 'unreserved' even from 'lost', and the standby can catch up if removed WAL files are copied either manually or by a tool/script to the primary's pg_wal directory. IOW, the physical slots invalidated due to wal_removed are *somehow* recoverable unlike the logical slots.
IIUC, the invalidation of a slot implies that it is not guaranteed to hold any resources like WAL and XMINs. Does it also imply that the slot must be unusable?
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
shveta malik
Date:
On Mon, Sep 16, 2024 at 3:31 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > Hi, > > > Please find the attached v46 patch having changes for the above review > comments and your test review comments and Shveta's review comments. > Thanks for addressing comments. Is there a reason that we don't support this invalidation on hot standby for non-synced slots? Shouldn't we support this time-based invalidation there too just like other invalidations? thanks Shveta
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
shveta malik
Date:
On Wed, Sep 18, 2024 at 12:21 PM shveta malik <shveta.malik@gmail.com> wrote: > > On Mon, Sep 16, 2024 at 3:31 PM Bharath Rupireddy > <bharath.rupireddyforpostgres@gmail.com> wrote: > > > > Hi, > > > > > > Please find the attached v46 patch having changes for the above review > > comments and your test review comments and Shveta's review comments. > > > > Thanks for addressing comments. > > Is there a reason that we don't support this invalidation on hot > standby for non-synced slots? Shouldn't we support this time-based > invalidation there too just like other invalidations? > Now since we are not changing inactive_since once it is invalidated, we are not even initializing it during restart; and thus later when someone tries to use the slot, it leads to an assertion failure in ReplicationSlotAcquire(): Assert(s->inactive_since > 0); Steps: --Disable the logical subscriber and let the slot on the publisher get invalidated due to inactive_timeout. --Enable the logical subscriber again. --Restart the publisher. a) We should initialize inactive_since when ReplicationSlotSetInactiveSince() is called from RestoreSlotFromDisk() even though it is invalidated. b) And shall we mention in the doc of 'inactive_since', that once the slot is invalidated, this value will remain unchanged until we shut down the server? On server restart, it is initialized to the start time. Thoughts? thanks Shveta
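A sketch of what (a) above could look like at the end of RestoreSlotFromDisk(); the helper name and its (slot, timestamp, acquire_lock) signature are taken from the patch under discussion and are assumptions here, not committed code:

    /*
     * Stamp inactive_since for every restored slot at startup, including
     * already-invalidated ones, so that the later
     * Assert(s->inactive_since > 0) in ReplicationSlotAcquire() holds; only
     * subsequent updates are suppressed for invalidated slots.
     */
    ReplicationSlotSetInactiveSince(slot, GetCurrentTimestamp(), false);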
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
shveta malik
Date:
On Wed, Sep 18, 2024 at 2:49 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > Please find the attached v46 patch having changes for the above review > > > comments and your test review comments and Shveta's review comments. > > > When the synced slot is marked as 'inactive_timeout' invalidated on the hot standby due to invalidation of the publisher's failover slot, the former starts showing a NULL 'inactive_since'. Is this intentional behaviour? I feel inactive_since should be non-NULL here too. Thoughts? physical standby: postgres=# select slot_name, inactive_since, invalidation_reason, failover, synced from pg_replication_slots; slot_name | inactive_since | invalidation_reason | failover | synced -------------+----------------------------------+---------------------+----------+-------- sub2 | 2024-09-18 15:20:04.364998+05:30 | | t | t sub3 | 2024-09-18 15:20:04.364953+05:30 | | t | t After sync of invalidation_reason: slot_name | inactive_since | invalidation_reason | failover | synced -------------+----------------------------------+---------------------+----------+-------- sub2 | | inactive_timeout | t | t sub3 | | inactive_timeout | t | t thanks shveta
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Mon, Sep 16, 2024 at 10:41 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > Thanks for looking into this. > > On Mon, Sep 16, 2024 at 4:54 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > Why raise the ERROR just for timeout invalidation here and why not if > > the slot is invalidated for other reasons? This raises the question of > > what happens before this patch if the invalid slot is used from places > > where we call ReplicationSlotAcquire(). I did a brief code analysis > > and found that for StartLogicalReplication(), even if the error won't > > occur in ReplicationSlotAcquire(), it would have been caught in > > CreateDecodingContext(). I think that is where we should also add this > > new error. Similarly, pg_logical_slot_get_changes_guts() and other > > logical replication functions should be calling > > CreateDecodingContext() which can raise the new ERROR. I am not sure > > about how the invalid slots are handled during physical replication, > > please check the behavior of that before this patch. > > When physical slots are invalidated due to wal_removed reason, the failure happens at a much later point for the streaming standbys while reading the requested WAL files like the following: > > 2024-09-16 16:29:52.416 UTC [876059] FATAL: could not receive data from WAL stream: ERROR: requested WAL segment 000000010000000000000005 has already been removed > 2024-09-16 16:29:52.416 UTC [872418] LOG: waiting for WAL to become available at 0/5002000 > > At this point, despite the slot being invalidated, its wal_status can still come back to 'unreserved' even from 'lost', and the standby can catch up if removed WAL files are copied either manually or by a tool/script to the primary's pg_wal directory. IOW, the physical slots invalidated due to wal_removed are *somehow* recoverable unlike the logical slots. > > IIUC, the invalidation of a slot implies that it is not guaranteed to hold any resources like WAL and XMINs. Does it also imply that the slot must be unusable? > If we can't hold the dead rows against xmin of the invalid slot, then how can we make it usable even after copying the required WAL? -- With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
shveta malik
Date:
On Wed, Sep 18, 2024 at 3:31 PM shveta malik <shveta.malik@gmail.com> wrote: > > On Wed, Sep 18, 2024 at 2:49 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > Please find the attached v46 patch having changes for the above review > > > > comments and your test review comments and Shveta's review comments. > > > > > When we promote hot standby with synced logical slots to become new primary, the logical slots are never invalidated with 'inactive_timeout' on new primary. It seems the check in SlotInactiveTimeoutCheckAllowed() is wrong. We should allow invalidation of slots on primary even if they are marked as 'synced'. Please see [4]. I have raised 4 issues so far on v46, the first 3 are in [1],[2],[3]. Once all these are addressed, I can continue reviewing further. [1]: https://www.postgresql.org/message-id/CAJpy0uAwxc49Dz6t%3D-y_-z-MU%2BA4RWX4BR3Zri_jj2qgGMq_8g%40mail.gmail.com [2]: https://www.postgresql.org/message-id/CAJpy0uC6nN3SLbEuCvz7-CpaPdNdXxH%3DfeW5MhYQch-JWV0tLg%40mail.gmail.com [3]: https://www.postgresql.org/message-id/CAJpy0uBXXJC6f04%2BFU1axKaU%2Bp78wN0SEhUNE9XoqbjXj%3Dhhgw%40mail.gmail.com [4]: -------------------- postgres=# select pg_is_in_recovery(); -------- f postgres=# show replication_slot_inactive_timeout; replication_slot_inactive_timeout ----------------------------------- 10s postgres=# select slot_name, inactive_since, invalidation_reason, synced from pg_replication_slots; slot_name | inactive_since | invalidation_reason | synced -------------+----------------------------------+---------------------+----------+-------- mysubnew1_1 | 2024-09-19 09:04:09.714283+05:30 | | t postgres=# select now(); now ---------------------------------- 2024-09-19 09:06:28.871354+05:30 postgres=# checkpoint; CHECKPOINT postgres=# select slot_name, inactive_since, invalidation_reason, synced from pg_replication_slots; slot_name | inactive_since | invalidation_reason | synced -------------+----------------------------------+---------------------+----------+-------- mysubnew1_1 | 2024-09-19 09:04:09.714283+05:30 | | t -------------------- thanks Shveta
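To illustrate the suggested fix, the allow-check might exempt synced slots only while the server is still in recovery, so that after promotion the lingering 'synced' flag no longer blocks invalidation. This is a sketch built from names in the v46 patch (SlotInactiveTimeoutCheckAllowed, replication_slot_inactive_timeout, inactive_since), not committed code:

    static inline bool
    SlotInactiveTimeoutCheckAllowed(ReplicationSlot *s)
    {
        /*
         * Exempt synced slots only while the server is a standby; once it is
         * promoted, RecoveryInProgress() returns false and the slot becomes
         * subject to inactive-timeout invalidation like any other slot.
         */
        return replication_slot_inactive_timeout > 0 &&
               s->inactive_since > 0 &&
               !(RecoveryInProgress() && s->data.synced);
    }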
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Nisha Moond
Date:
On Wed, Sep 18, 2024 at 3:31 PM shveta malik <shveta.malik@gmail.com> wrote: > > On Wed, Sep 18, 2024 at 2:49 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > Please find the attached v46 patch having changes for the above review > > > > comments and your test review comments and Shveta's review comments. > > > > > > When the synced slot is marked as 'inactive_timeout' invalidated on > hot standby due to invalidation of publisher 's failover slot, the > former starts showing NULL' inactive_since'. Is this intentional > behaviour? I feel inactive_since should be non-NULL here too? > Thoughts? > > physical standby: > postgres=# select slot_name, inactive_since, invalidation_reason, > failover, synced from pg_replication_slots; > slot_name | inactive_since | > invalidation_reason | failover | synced > -------------+----------------------------------+---------------------+----------+-------- > sub2 | 2024-09-18 15:20:04.364998+05:30 | | t | t > sub3 | 2024-09-18 15:20:04.364953+05:30 | | t | t > > After sync of invalidation_reason: > > slot_name | inactive_since | invalidation_reason | > failover | synced > -------------+----------------------------------+---------------------+----------+-------- > sub2 | | inactive_timeout | t | t > sub3 | | inactive_timeout | t | t > > For synced slots on the standby, inactive_since indicates the last synchronization time rather than the time the slot became inactive (see doc - https://www.postgresql.org/docs/devel/view-pg-replication-slots.html). In the reported case above, once a synced slot is invalidated we don't even keep the last synchronization time for it. This is because when a synced slot on the standby is marked invalid, inactive_since is reset to NULL each time the slot-sync worker acquires a lock on it. This lock acquisition before checking invalidation is done to avoid certain race conditions and will activate the slot temporarily, resetting inactive_since. Later, the slot-sync worker updates inactive_since for all synced slots to the current synchronization time. However, for invalid slots, this update is skipped, as per the patch’s design. If we want to preserve the inactive_since value for the invalid synced slots on standby, we need to clarify the time it should display. Here are three possible approaches: 1) Copy the primary's inactive_since upon invalidation: When a slot becomes invalid on the primary, the slot-sync worker could copy the primary slot’s inactive_since to the standby slot and retain it, by preventing future updates on the standby. 2) Use the current time of standby when the synced slot is marked invalid for the first time and do not update it in subsequent sync cycles if the slot is invalid. Approach (2) seems more reasonable to me, however, Both 1) & 2) approaches contradicts the purpose of inactive_since, as it no longer represents either the true "last sync time" or the "time slot became inactive" because the slot-sync worker acquires locks periodically for syncing, and keeps activating the slot. 3) Continuously update inactive_since for invalid synced slots as well: Treat invalid synced slots like valid ones by updating inactive_since with each sync cycle. This way, we can keep the "last sync time" in the inactive_since. However, this could confuse users when "invalidation_reason=inactive_timeout" is set for a synced slot on standby but inactive_since would reflect sync time rather than the time slot became inactive. 
IIUC, on the primary, when invalidation_reason=inactive_timeout for a slot, the inactive_since represents the actual time the slot became inactive before getting invalidated, unless the primary is restarted. Thoughts? -- Thanks, Nisha
On Thu, 7 Nov 2024 at 15:33, Nisha Moond <nisha.moond412@gmail.com> wrote: > > On Mon, Sep 16, 2024 at 3:31 PM Bharath Rupireddy > <bharath.rupireddyforpostgres@gmail.com> wrote: > > > > Please find the attached v46 patch having changes for the above review > > comments and your test review comments and Shveta's review comments. > > > Hi, > > I’ve reviewed this thread and am interested in working on the > remaining tasks and comments, as well as the future review comments. > However, Bharath, please let me know if you'd prefer to continue with > it. > > Attached the rebased v47 patch, which also addresses Peter’s comments > #2, #3, and #4 at [1]. I will try addressing other comments as well in > next versions. The following crash occurs while upgrading: 2024-11-13 14:19:45.955 IST [44539] LOG: checkpoint starting: time TRAP: failed Assert("!(*invalidated && SlotIsLogical(s) && IsBinaryUpgrade)"), File: "slot.c", Line: 1793, PID: 44539 postgres: checkpointer (ExceptionalCondition+0xbb)[0x555555e305bd] postgres: checkpointer (+0x63ab04)[0x555555b8eb04] postgres: checkpointer (InvalidateObsoleteReplicationSlots+0x149)[0x555555b8ee5f] postgres: checkpointer (CheckPointReplicationSlots+0x267)[0x555555b8f125] postgres: checkpointer (+0x1f3ee8)[0x555555747ee8] postgres: checkpointer (CreateCheckPoint+0x78f)[0x5555557475ee] postgres: checkpointer (CheckpointerMain+0x632)[0x555555b2f1e7] postgres: checkpointer (postmaster_child_launch+0x119)[0x555555b30892] postgres: checkpointer (+0x5e2dc8)[0x555555b36dc8] postgres: checkpointer (PostmasterMain+0x14bd)[0x555555b33647] postgres: checkpointer (+0x487f2e)[0x5555559dbf2e] /lib/x86_64-linux-gnu/libc.so.6(+0x29d90)[0x7ffff6c29d90] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x80)[0x7ffff6c29e40] postgres: checkpointer (_start+0x25)[0x555555634c25] 2024-11-13 14:19:45.967 IST [44538] LOG: checkpointer process (PID 44539) was terminated by signal 6: Aborted This can happen in the following case: 1) Setup a logical replication cluster with enough data so that it will take at least few minutes to upgrade 2) Stop the publisher node 3) Configure replication_slot_inactive_timeout and checkpoint_timeout to 30 seconds 4) Upgrade the publisher node. This is happening because logical replication slots are getting invalidated during upgrade and there is an assertion which checks that the slots are not invalidated. I feel this can be fixed by having a function similar to check_max_slot_wal_keep_size which will make sure that replication_slot_inactive_timeout is 0 during upgrade. Regards, Vignesh
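The guard proposed above could be a GUC check hook along the lines of the existing check_max_slot_wal_keep_size(); the sketch below is illustrative, and the hook name and error text are assumptions rather than code from the patch:

    bool
    check_replication_slot_inactive_timeout(int *newval, void **extra, GucSource source)
    {
        /*
         * pg_upgrade runs with IsBinaryUpgrade set; invalidating slots there
         * would break the upgrade, so accept only 0 (feature disabled).
         */
        if (IsBinaryUpgrade && *newval != 0)
        {
            GUC_check_errdetail("\"%s\" must be set to 0 during binary upgrade mode.",
                                "replication_slot_inactive_timeout");
            return false;
        }

        return true;
    }

Such a hook would then be wired up as the check_hook of the replication_slot_inactive_timeout entry in guc_tables.c.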
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Nisha Moond
Date:
On Wed, Sep 18, 2024 at 12:22 PM shveta malik <shveta.malik@gmail.com> wrote: > > On Mon, Sep 16, 2024 at 3:31 PM Bharath Rupireddy > <bharath.rupireddyforpostgres@gmail.com> wrote: > > > > Hi, > > > > > > Please find the attached v46 patch having changes for the above review > > comments and your test review comments and Shveta's review comments. > > > > Thanks for addressing comments. > > Is there a reason that we don't support this invalidation on hot > standby for non-synced slots? Shouldn't we support this time-based > invalidation there too just like other invalidations? > I don’t see any reason to *not* support this invalidation on hot standby for non-synced slots. Therefore, I’ve added the same in v48. -- Thanks, Nisha
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Peter Smith
Date:
Hi Nisha. Thanks for the recent patch updates. Here are my review comments for the latest patch v48-0001. ====== Commit message 1. Till now, postgres has the ability to invalidate inactive replication slots based on the amount of WAL (set via max_slot_wal_keep_size GUC) that will be needed for the slots in case they become active. However, choosing a default value for this GUC is a bit tricky. Because the amount of WAL a database generates, and the allocated storage for instance will vary greatly in production, making it difficult to pin down a one-size-fits-all value. ~ What do the words "for instance" mean here? Did it mean "per instance" or "(for example)" or something else? ====== doc/src/sgml/system-views.sgml 2. <para> The time since the slot has become inactive. - <literal>NULL</literal> if the slot is currently being used. - Note that for slots on the standby that are being synced from a + <literal>NULL</literal> if the slot is currently being used. Once the + slot is invalidated, this value will remain unchanged until we shutdown + the server. Note that for slots on the standby that are being synced from a primary server (whose <structfield>synced</structfield> field is <literal>true</literal>), the Is this change related to the new inactivity timeout feature or are you just clarifying the existing behaviour of the 'active_since' field. Note there is already another thread [1] created to patch/clarify this same field. So if you are just clarifying existing behavior then IMO it would be better if you can to try and get your desired changes included there quickly before that other patch gets pushed. ~~~ 3. + <para> + <literal>inactive_timeout</literal> means that the slot has been + inactive for longer than the amount of time specified by the + <xref linkend="guc-replication-slot-inactive-timeout"/> parameter. + </para> Maybe there is a slightly shorter/simpler way to express this. For example, BEFORE inactive_timeout means that the slot has been inactive for longer than the amount of time specified by the replication_slot_inactive_timeout parameter. SUGGESTION inactive_timeout means that the slot has remained inactive beyond the duration specified by the replication_slot_inactive_timeout parameter. ====== src/backend/replication/slot.c 4. +int replication_slot_inactive_timeout = 0; IMO it would be more informative to give the units in the variable name (but not in the GUC name). e.g. 'replication_slot_inactive_timeout_secs'. ~~~ ReplicationSlotAcquire: 5. + * + * An error is raised if error_if_invalid is true and the slot has been + * invalidated previously. */ void -ReplicationSlotAcquire(const char *name, bool nowait) +ReplicationSlotAcquire(const char *name, bool nowait, bool error_if_invalid) This function comment makes it seem like "invalidated previously" might mean *any* kind of invalidation, but later in the body of the function we find the logic is really only used for inactive timeout. + /* + * An error is raised if error_if_invalid is true and the slot has been + * previously invalidated due to inactive timeout. + */ So, I think a better name for that parameter might be 'error_if_inactive_timeout' OTOH, if it really is supposed to erro for *any* kind of invalidation then there needs to be more ereports. ~~~ 6. + errdetail("This slot has been invalidated because it was inactive for longer than the amount of time specified by \"%s\".", This errdetail message seems quite long. 
I think it can be shortened like below and still retain exactly the same meaning: BEFORE: This slot has been invalidated because it was inactive for longer than the amount of time specified by \"%s\". SUGGESTION: This slot has been invalidated due to inactivity exceeding the time limit set by "%s". ~~~ ReportSlotInvalidation: 7. + case RS_INVAL_INACTIVE_TIMEOUT: + Assert(inactive_since > 0); + appendStringInfo(&err_detail, + _("The slot has been inactive since %s for longer than the amount of time specified by \"%s\"."), + timestamptz_to_str(inactive_since), + "replication_slot_inactive_timeout"); + break; Here also as in the above review comment #6 I think the message can be shorter and still say the same thing BEFORE: _("The slot has been inactive since %s for longer than the amount of time specified by \"%s\"."), SUGGESTION: _("The slot has been inactive since %s, exceeding the time limit set by \"%s\"."), ~~~ SlotInactiveTimeoutCheckAllowed: 8. +/* + * Is this replication slot allowed for inactive timeout invalidation check? + * + * Inactive timeout invalidation is allowed only when: + * + * 1. Inactive timeout is set + * 2. Slot is inactive + * 3. Server is in recovery and slot is not being synced from the primary + * + * Note that the inactive timeout invalidation mechanism is not + * applicable for slots on the standby server that are being synced + * from the primary server (i.e., standby slots having 'synced' field 'true'). + * Synced slots are always considered to be inactive because they don't + * perform logical decoding to produce changes. + */ 8a. Somehow that first sentence seems strange. Would it be better to write it like: SUGGESTION Can this replication slot timeout due to inactivity? ~ 8b. AFAICT that reason 3 ("Server is in recovery and slot is not being synced from the primary") seems not quite worded right... Should it say more like: The slot is not being synced from the primary while the server is in recovery or maybe like: The slot is not currently being synced from the primary (e.g. not 'synced' is true when server is in recovery) ~ 8c. Similarly, I think something about that "Note that the inactive timeout invalidation mechanism is not applicable..." paragraph needs tweaking because IMO that should also now be saying something about 'RecoveryInProgress'. ~~~ 9. +static inline bool +SlotInactiveTimeoutCheckAllowed(ReplicationSlot *s) Maybe the function name should be 'IsSlotInactiveTimeoutPossible' or something better. ~~~ InvalidatePossiblyObsoleteSlot: 10. break; + case RS_INVAL_INACTIVE_TIMEOUT: + + /* + * Check if the slot needs to be invalidated due to + * replication_slot_inactive_timeout GUC. + */ Since there are no other blank lines anywhere in this switch, the introduction of this one in v48 looks out of place to me. IMO it would be more readable if a blank line followed each/every of the breaks, but then that is not a necessary change for this patch so... ~~~ 11. + /* + * Invalidation due to inactive timeout implies that + * no one is using the slot. + */ + Assert(s->active_pid == 0); Given this assertion, does it mean that "(s->active_pid == 0)" should have been another condition done up-front in the function 'SlotInactiveTimeoutCheckAllowed'? ~~~ 12. /* - * If the slot can be acquired, do so and mark it invalidated - * immediately. Otherwise we'll signal the owning process, below, and - * retry. + * If the slot can be acquired, do so and mark it as invalidated. If + * the slot is already ours, mark it as invalidated. 
Otherwise, we'll + * signal the owning process below and retry. */ - if (active_pid == 0) + if (active_pid == 0 || + (MyReplicationSlot == s && + active_pid == MyProcPid)) I wasn't sure how this change belongs to this patch, because the logic of the previous review comment said for the case of invalidation due to inactivity that active_id must be 0. e.g. Assert(s->active_pid == 0); ~~~ RestoreSlotFromDisk: 13. - slot->inactive_since = GetCurrentTimestamp(); + slot->inactive_since = now; In v47 this assignment used to call the function 'ReplicationSlotSetInactiveSince'. I recognise there is a very subtle difference between direct assignment and the function, because the function will skip assignment if the slot is already invalidated. Anyway, if you are *deliberately* not wanting to call ReplicationSlotSetInactiveSince here then I think this assignment should be commented to explain the reason why not, otherwise someone in the future might be tempted to think it was just an oversight and add the call back in that you don't want. ====== src/test/recovery/t/050_invalidate_slots.pl 14. +# Despite inactive timeout being set, the synced slot won't get invalidated on +# its own on the standby. So, we must not see invalidation message in server +# log. +$standby1->safe_psql('postgres', "CHECKPOINT"); +is( $standby1->safe_psql( + 'postgres', + q{SELECT count(*) = 1 FROM pg_replication_slots + WHERE slot_name = 'sync_slot1' + AND invalidation_reason IS NULL;} + ), + "t", + 'check that synced slot sync_slot1 has not been invalidated on standby'); + But, now, we are confirming this by another way -- not checking the logs here, so the comment "So, we must not see invalidation message in server log." is no longer appropriate here. ====== [1] https://www.postgresql.org/message-id/flat/CAA4eK1JQFdssaBBh-oQskpKM-UpG8jPyUdtmGWa_0qCDy%2BK7_A%40mail.gmail.com#ab98379f220288ed40d34f8c2a21cf96 Kind Regards, Peter Smith. Fujitsu Australia
On Wed, 13 Nov 2024 at 15:00, Nisha Moond <nisha.moond412@gmail.com> wrote: > > Please find the v48 patch attached. > > On Thu, Sep 19, 2024 at 9:40 AM shveta malik <shveta.malik@gmail.com> wrote: > > > > When we promote hot standby with synced logical slots to become new > > primary, the logical slots are never invalidated with > > 'inactive_timeout' on new primary. It seems the check in > > SlotInactiveTimeoutCheckAllowed() is wrong. We should allow > > invalidation of slots on primary even if they are marked as 'synced'. > > fixed. > > > I have raised 4 issues so far on v46, the first 3 are in [1],[2],[3]. > > Once all these are addressed, I can continue reviewing further. > > > > Fixed issues reported in [1], [2]. Few comments: 1) Since we don't change the value of now in ReplicationSlotSetInactiveSince, the function parameter can be passed by value: +/* + * Set slot's inactive_since property unless it was previously invalidated. + */ +static inline void +ReplicationSlotSetInactiveSince(ReplicationSlot *s, TimestampTz *now, + bool acquire_lock) +{ + if (s->data.invalidated != RS_INVAL_NONE) + return; + + if (acquire_lock) + SpinLockAcquire(&s->mutex); + + s->inactive_since = *now; 2) Currently it allows a minimum value of less than 1 second like in milliseconds, I feel we can have some minimum value at least something like checkpoint_timeout: diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c index 8a67f01200..367f510118 100644 --- a/src/backend/utils/misc/guc_tables.c +++ b/src/backend/utils/misc/guc_tables.c @@ -3028,6 +3028,18 @@ struct config_int ConfigureNamesInt[] = NULL, NULL, NULL }, + { + {"replication_slot_inactive_timeout", PGC_SIGHUP, REPLICATION_SENDING, + gettext_noop("Sets the amount of time a replication slot can remain inactive before " + "it will be invalidated."), + NULL, + GUC_UNIT_S + }, + &replication_slot_inactive_timeout, + 0, 0, INT_MAX, + NULL, NULL, NULL + }, 3) Since SlotInactiveTimeoutCheckAllowed check is just done above and the current time has been retrieved can we used "now" variable instead of SlotInactiveTimeoutCheckAllowed again second time: @@ -1651,6 +1713,26 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause, if (SlotIsLogical(s)) invalidation_cause = cause; break; + case RS_INVAL_INACTIVE_TIMEOUT: + + /* + * Check if the slot needs to be invalidated due to + * replication_slot_inactive_timeout GUC. + */ + if (SlotInactiveTimeoutCheckAllowed(s) && + TimestampDifferenceExceeds(s->inactive_since, now, + replication_slot_inactive_timeout * 1000)) + { + invalidation_cause = cause; + inactive_since = s->inactive_since; 4) I'm not sure if this change required by this patch or is it a general optimization, if it is required for this patch we can detail the comments: @@ -2208,6 +2328,7 @@ RestoreSlotFromDisk(const char *name) bool restored = false; int readBytes; pg_crc32c checksum; + TimestampTz now; /* no need to lock here, no concurrent access allowed yet */ @@ -2368,6 +2489,9 @@ RestoreSlotFromDisk(const char *name) NameStr(cp.slotdata.name)), errhint("Change \"wal_level\" to be \"replica\" or higher."))); + /* Use same inactive_since time for all slots */ + now = GetCurrentTimestamp(); + /* nothing can be active yet, don't lock anything */ for (i = 0; i < max_replication_slots; i++) { @@ -2400,7 +2524,7 @@ RestoreSlotFromDisk(const char *name) * slot from the disk into memory. Whoever acquires the slot i.e. * makes the slot active will reset it. 
*/ - slot->inactive_since = GetCurrentTimestamp(); + slot->inactive_since = now; 5) Why should the slot invalidation be updated during shutdown, shouldn't the inactive_since value be intact during shutdown? - <literal>NULL</literal> if the slot is currently being used. - Note that for slots on the standby that are being synced from a + <literal>NULL</literal> if the slot is currently being used. Once the + slot is invalidated, this value will remain unchanged until we shutdown + the server. Note that for slots on the standby that are being synced from a 6) New Style of ereport does not need braces around errcode, it can be changed similarly: + if (error_if_invalid && + s->data.invalidated == RS_INVAL_INACTIVE_TIMEOUT) + { + Assert(s->inactive_since > 0); + ereport(ERROR, + (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), + errmsg("can no longer get changes from replication slot \"%s\"", + NameStr(s->data.name)), + errdetail("This slot has been invalidated because it was inactive for longer than the amount of time specified by \"%s\".", + "replication_slot_inactive_timeout"))); Regards, Vignesh
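On point 6 above, the brace-less (new-style) ereport would look roughly like the following -- same message text, just without the extra parentheses around the auxiliary calls:

ereport(ERROR,
        errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
        errmsg("can no longer get changes from replication slot \"%s\"",
               NameStr(s->data.name)),
        errdetail("This slot has been invalidated because it was inactive for longer than the amount of time specified by \"%s\".",
                  "replication_slot_inactive_timeout"));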
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Nisha Moond
Date:
On Thu, Nov 14, 2024 at 5:29 AM Peter Smith <smithpb2250@gmail.com> wrote: > > Hi Nisha. > > Thanks for the recent patch updates. Here are my review comments for > the latest patch v48-0001. > Thank you for the review. Comments are addressed in v49 version. Below is my response to comments that may require further discussion. > ====== > doc/src/sgml/system-views.sgml > > 2. > <para> > The time since the slot has become inactive. > - <literal>NULL</literal> if the slot is currently being used. > - Note that for slots on the standby that are being synced from a > + <literal>NULL</literal> if the slot is currently being used. Once the > + slot is invalidated, this value will remain unchanged until we shutdown > + the server. Note that for slots on the standby that are being > synced from a > primary server (whose <structfield>synced</structfield> field is > <literal>true</literal>), the > > Is this change related to the new inactivity timeout feature or are > you just clarifying the existing behaviour of the 'active_since' > field. > Yes, this patch introduces inactive_timeout invalidation and prevents updates to inactive_since for invalid slots. Only a node restart can modify it, so, I believe we should retain these lines in this patch. > Note there is already another thread [1] created to patch/clarify this > same field. So if you are just clarifying existing behavior then IMO > it would be better if you can to try and get your desired changes > included there quickly before that other patch gets pushed. > Thanks for the reference, I have posted my suggestion on the thread. > > ReplicationSlotAcquire: > > 5. > + * > + * An error is raised if error_if_invalid is true and the slot has been > + * invalidated previously. > */ > void > -ReplicationSlotAcquire(const char *name, bool nowait) > +ReplicationSlotAcquire(const char *name, bool nowait, bool error_if_invalid) > > This function comment makes it seem like "invalidated previously" > might mean *any* kind of invalidation, but later in the body of the > function we find the logic is really only used for inactive timeout. > > + /* > + * An error is raised if error_if_invalid is true and the slot has been > + * previously invalidated due to inactive timeout. > + */ > > So, I think a better name for that parameter might be > 'error_if_inactive_timeout' > > OTOH, if it really is supposed to erro for *any* kind of invalidation > then there needs to be more ereports. > +1 to the idea. I have created a separate patch v49-0001 adding more ereports for all kinds of invalidations. > ~~~ > SlotInactiveTimeoutCheckAllowed: > > 8. > +/* > + * Is this replication slot allowed for inactive timeout invalidation check? > + * > + * Inactive timeout invalidation is allowed only when: > + * > + * 1. Inactive timeout is set > + * 2. Slot is inactive > + * 3. Server is in recovery and slot is not being synced from the primary > + * > + * Note that the inactive timeout invalidation mechanism is not > + * applicable for slots on the standby server that are being synced > + * from the primary server (i.e., standby slots having 'synced' field 'true'). > + * Synced slots are always considered to be inactive because they don't > + * perform logical decoding to produce changes. > + */ > > 8a. > Somehow that first sentence seems strange. Would it be better to write it like: > > SUGGESTION > Can this replication slot timeout due to inactivity? 
> I feel the suggestion is not very clear on the purpose of the function, This function doesn't check inactivity or decide slot timeout invalidation. It only pre-checks if the slot qualifies for an inactivity check, which the caller will perform. As I have changed function name too as per commnet#9, I used the following - "Is inactive timeout invalidation possible for this replication slot?" Thoughts? > ~ > 8c. > Similarly, I think something about that "Note that the inactive > timeout invalidation mechanism is not applicable..." paragraph needs > tweaking because IMO that should also now be saying something about > 'RecoveryInProgress'. > 'RecoveryInProgress' check indicates that the server is a standby, and the mentioned paragraph uses the term "standby" to describe the condition. It seems unnecessary to mention RecoveryInProgress separately. > ~~~ > > InvalidatePossiblyObsoleteSlot: > > 10. > break; > + case RS_INVAL_INACTIVE_TIMEOUT: > + > + /* > + * Check if the slot needs to be invalidated due to > + * replication_slot_inactive_timeout GUC. > + */ > > Since there are no other blank lines anywhere in this switch, the > introduction of this one in v48 looks out of place to me. pgindent automatically added this blank line after 'case RS_INVAL_INACTIVE_TIMEOUT'. > IMO it would > be more readable if a blank line followed each/every of the breaks, > but then that is not a necessary change for this patch so... > Since it's not directly related to the patch, I feel it might be best to leave it as is for now. > ~~~ > > 11. > + /* > + * Invalidation due to inactive timeout implies that > + * no one is using the slot. > + */ > + Assert(s->active_pid == 0); > > Given this assertion, does it mean that "(s->active_pid == 0)" should > have been another condition done up-front in the function > 'SlotInactiveTimeoutCheckAllowed'? > I don't think it's a good idea to check (s->active_pid == 0) upfront, before the timeout-invalidation check. AFAIU, this assertion is meant to ensure active_pid = 0 only if the slot is going to be invalidated, i.e., when the following condition is true: TimestampDifferenceExceeds(s->inactive_since, now, replication_slot_inactive_timeout_sec * 1000) Thoughts? Open to others' opinions too. > ~~~ > > 12. > /* > - * If the slot can be acquired, do so and mark it invalidated > - * immediately. Otherwise we'll signal the owning process, below, and > - * retry. > + * If the slot can be acquired, do so and mark it as invalidated. If > + * the slot is already ours, mark it as invalidated. Otherwise, we'll > + * signal the owning process below and retry. > */ > - if (active_pid == 0) > + if (active_pid == 0 || > + (MyReplicationSlot == s && > + active_pid == MyProcPid)) > > I wasn't sure how this change belongs to this patch, because the logic > of the previous review comment said for the case of invalidation due > to inactivity that active_id must be 0. e.g. Assert(s->active_pid == > 0); > I don't fully understand the purpose of this change yet. I'll look into it further and get back. > ~~~ > > RestoreSlotFromDisk: > > 13. > - slot->inactive_since = GetCurrentTimestamp(); > + slot->inactive_since = now; > > In v47 this assignment used to call the function > 'ReplicationSlotSetInactiveSince'. I recognise there is a very subtle > difference between direct assignment and the function, because the > function will skip assignment if the slot is already invalidated. 
> Anyway, if you are *deliberately* not wanting to call > ReplicationSlotSetInactiveSince here then I think this assignment > should be commented to explain the reason why not, otherwise someone > in the future might be tempted to think it was just an oversight and > add the call back in that you don't want. > Added comment saying avoid using ReplicationSlotSetInactiveSince() here as it will skip the invalid slots. ~~~~ -- Thanks, Nisha
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Nisha Moond
Date:
On Thu, Nov 14, 2024 at 9:14 AM vignesh C <vignesh21@gmail.com> wrote: > > On Wed, 13 Nov 2024 at 15:00, Nisha Moond <nisha.moond412@gmail.com> wrote: > > > > Please find the v48 patch attached. > > > > On Thu, Sep 19, 2024 at 9:40 AM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > When we promote hot standby with synced logical slots to become new > > > primary, the logical slots are never invalidated with > > > 'inactive_timeout' on new primary. It seems the check in > > > SlotInactiveTimeoutCheckAllowed() is wrong. We should allow > > > invalidation of slots on primary even if they are marked as 'synced'. > > > > fixed. > > > > > I have raised 4 issues so far on v46, the first 3 are in [1],[2],[3]. > > > Once all these are addressed, I can continue reviewing further. > > > > > > > Fixed issues reported in [1], [2]. > > Few comments: Thanks for the review. > > 2) Currently it allows a minimum value of less than 1 second like in > milliseconds, I feel we can have some minimum value at least something > like checkpoint_timeout: > diff --git a/src/backend/utils/misc/guc_tables.c > b/src/backend/utils/misc/guc_tables.c > index 8a67f01200..367f510118 100644 > --- a/src/backend/utils/misc/guc_tables.c > +++ b/src/backend/utils/misc/guc_tables.c > @@ -3028,6 +3028,18 @@ struct config_int ConfigureNamesInt[] = > NULL, NULL, NULL > }, > > + { > + {"replication_slot_inactive_timeout", PGC_SIGHUP, > REPLICATION_SENDING, > + gettext_noop("Sets the amount of time a > replication slot can remain inactive before " > + "it will be invalidated."), > + NULL, > + GUC_UNIT_S > + }, > + &replication_slot_inactive_timeout, > + 0, 0, INT_MAX, > + NULL, NULL, NULL > + }, > Currently, the feature is disabled by default when replication_slot_inactive_timeout = 0. However, if we set a minimum value, the default_val cannot be less than min_val, making it impossible to use 0 to disable the feature. Thoughts or any suggestions? > > 4) I'm not sure if this change required by this patch or is it a > general optimization, if it is required for this patch we can detail > the comments: > @@ -2208,6 +2328,7 @@ RestoreSlotFromDisk(const char *name) > bool restored = false; > int readBytes; > pg_crc32c checksum; > + TimestampTz now; > > /* no need to lock here, no concurrent access allowed yet */ > > @@ -2368,6 +2489,9 @@ RestoreSlotFromDisk(const char *name) > NameStr(cp.slotdata.name)), > errhint("Change \"wal_level\" to be > \"replica\" or higher."))); > > + /* Use same inactive_since time for all slots */ > + now = GetCurrentTimestamp(); > + > /* nothing can be active yet, don't lock anything */ > for (i = 0; i < max_replication_slots; i++) > { > @@ -2400,7 +2524,7 @@ RestoreSlotFromDisk(const char *name) > * slot from the disk into memory. Whoever acquires > the slot i.e. > * makes the slot active will reset it. > */ > - slot->inactive_since = GetCurrentTimestamp(); > + slot->inactive_since = now; > After removing the "ReplicationSlotSetInactiveSince" from here, it became irrelevant to this patch. Now, it is a general optimization to set the same timestamp for all slots while restoring from disk. I have added a few comments as per Peter's suggestion. > 5) Why should the slot invalidation be updated during shutdown, > shouldn't the inactive_since value be intact during shutdown? > - <literal>NULL</literal> if the slot is currently being used. > - Note that for slots on the standby that are being synced from a > + <literal>NULL</literal> if the slot is currently being used. 
Once the > + slot is invalidated, this value will remain unchanged until we shutdown > + the server. Note that for slots on the standby that are being > synced from a > The "inactive_since" data of a slot is not stored on disk, so the older value cannot be restored after a restart. -- Thanks, Nisha
On Tue, 19 Nov 2024 at 12:43, Nisha Moond <nisha.moond412@gmail.com> wrote: > > Attached is the v49 patch set: > - Fixed the bug reported in [1]. > - Addressed comments in [2] and [3]. > > I've split the patch into two, implementing the suggested idea in > comment #5 of [2] separately in 001: > > Patch-001: Adds additional error reports (for all invalidation types) > in ReplicationSlotAcquire() for invalid slots when error_if_invalid = > true. > Patch-002: The original patch with comments addressed. Few comments: 1) I felt this check in wait_for_slot_invalidation is not required as there is a call to trigger_slot_invalidation which sleeps for inactive_timeout seconds and ensures checkpoint is triggered, also the test passes without this: + # Wait for slot to become inactive + $node->poll_query_until( + 'postgres', qq[ + SELECT COUNT(slot_name) = 1 FROM pg_replication_slots + WHERE slot_name = '$slot' AND active = 'f' AND + inactive_since IS NOT NULL; + ]) + or die + "Timed out while waiting for slot $slot to become inactive on node $node_name"; 2) Instead of calling this in a loop, won't it be enough to call checkpoint only once explicitly: + for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++) + { + $node->safe_psql('postgres', "CHECKPOINT"); + if ($node->log_contains( + "invalidating obsolete replication slot \"$slot\"", $offset)) + { + $invalidated = 1; + last; + } + usleep(100_000); + } + ok($invalidated, + "check that slot $slot invalidation has been logged on node $node_name" + ); 3) Since pg_sync_replication_slots is a sync call, we can directly use "is( $standby1->safe_psql('postgres', SELECT COUNT(slot_name) = 1 FROM pg_replication_slots..." instead of poll_query_until: +$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();"); +$standby1->poll_query_until( + 'postgres', qq[ + SELECT COUNT(slot_name) = 1 FROM pg_replication_slots + WHERE slot_name = 'sync_slot1' AND + invalidation_reason = 'inactive_timeout'; +]) + or die + "Timed out while waiting for sync_slot1 invalidation to be synced on standby"; 4) Since this variable is being referred to at many places, how about changing it to inactive_timeout_1s so that it is easier while reviewing across many places: # Set timeout GUC on the standby to verify that the next checkpoint will not # invalidate synced slots. my $inactive_timeout = 1; 5) Since we have already tested invalidation of logical replication slot 'sync_slot1' above, this test might not be required: +# ============================================================================= +# Testcase start +# Invalidate logical subscriber slot due to inactive timeout. + +my $publisher = $primary; + +# Prepare for test +$publisher->safe_psql( + 'postgres', qq[ + ALTER SYSTEM SET replication_slot_inactive_timeout TO '0'; +]); +$publisher->reload; Regards, Vignesh
On Tue, 19 Nov 2024 at 12:43, Nisha Moond <nisha.moond412@gmail.com> wrote: > > Attached is the v49 patch set: > - Fixed the bug reported in [1]. > - Addressed comments in [2] and [3]. > > I've split the patch into two, implementing the suggested idea in > comment #5 of [2] separately in 001: > > Patch-001: Adds additional error reports (for all invalidation types) > in ReplicationSlotAcquire() for invalid slots when error_if_invalid = > true. > Patch-002: The original patch with comments addressed. This Assert can fail: + /* + * Check if the slot needs to be invalidated due to + * replication_slot_inactive_timeout GUC. + */ + if (now && + TimestampDifferenceExceeds(s->inactive_since, now, + replication_slot_inactive_timeout_sec * 1000)) + { + invalidation_cause = cause; + inactive_since = s->inactive_since; + + /* + * Invalidation due to inactive timeout implies that + * no one is using the slot. + */ + Assert(s->active_pid == 0); With the following scenario: Set replication_slot_inactive_timeout to 10 seconds -- Create a slot postgres=# select pg_create_logical_replication_slot ('test', 'pgoutput', true, true); pg_create_logical_replication_slot ------------------------------------ (test,0/1748068) (1 row) -- Wait for 10 seconds and execute checkpoint postgres=# checkpoint; WARNING: terminating connection because of crash of another server process DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. HINT: In a moment you should be able to reconnect to the database and repeat your command. server closed the connection unexpectedly The assert fails: #5 0x00005b074f0c922f in ExceptionalCondition (conditionName=0x5b074f2f0b4c "s->active_pid == 0", fileName=0x5b074f2f0010 "slot.c", lineNumber=1762) at assert.c:66 #6 0x00005b074ee26ead in InvalidatePossiblyObsoleteSlot (cause=RS_INVAL_INACTIVE_TIMEOUT, s=0x740925361780, oldestLSN=0, dboid=0, snapshotConflictHorizon=0, invalidated=0x7fffaee87e63) at slot.c:1762 #7 0x00005b074ee273b2 in InvalidateObsoleteReplicationSlots (cause=RS_INVAL_INACTIVE_TIMEOUT, oldestSegno=0, dboid=0, snapshotConflictHorizon=0) at slot.c:1952 #8 0x00005b074ee27678 in CheckPointReplicationSlots (is_shutdown=false) at slot.c:2061 #9 0x00005b074e9dfda7 in CheckPointGuts (checkPointRedo=24412528, flags=108) at xlog.c:7513 #10 0x00005b074e9df4ad in CreateCheckPoint (flags=108) at xlog.c:7179 #11 0x00005b074edc6bfc in CheckpointerMain (startup_data=0x0, startup_data_len=0) at checkpointer.c:463 Regards, Vignesh
On Tue, 19 Nov 2024 at 12:51, Nisha Moond <nisha.moond412@gmail.com> wrote: > > On Thu, Nov 14, 2024 at 9:14 AM vignesh C <vignesh21@gmail.com> wrote: > > > > On Wed, 13 Nov 2024 at 15:00, Nisha Moond <nisha.moond412@gmail.com> wrote: > > > > > > Please find the v48 patch attached. > > > > > 2) Currently it allows a minimum value of less than 1 second like in > > milliseconds, I feel we can have some minimum value at least something > > like checkpoint_timeout: > > diff --git a/src/backend/utils/misc/guc_tables.c > > b/src/backend/utils/misc/guc_tables.c > > index 8a67f01200..367f510118 100644 > > --- a/src/backend/utils/misc/guc_tables.c > > +++ b/src/backend/utils/misc/guc_tables.c > > @@ -3028,6 +3028,18 @@ struct config_int ConfigureNamesInt[] = > > NULL, NULL, NULL > > }, > > > > + { > > + {"replication_slot_inactive_timeout", PGC_SIGHUP, > > REPLICATION_SENDING, > > + gettext_noop("Sets the amount of time a > > replication slot can remain inactive before " > > + "it will be invalidated."), > > + NULL, > > + GUC_UNIT_S > > + }, > > + &replication_slot_inactive_timeout, > > + 0, 0, INT_MAX, > > + NULL, NULL, NULL > > + }, > > > > Currently, the feature is disabled by default when > replication_slot_inactive_timeout = 0. However, if we set a minimum > value, the default_val cannot be less than min_val, making it > impossible to use 0 to disable the feature. > Thoughts or any suggestions? We could implement this similarly to how the vacuum_buffer_usage_limit GUC is handled. Setting the value to 0 would allow the operation to use any amount of shared_buffers. Otherwise, valid sizes would range from 128 kB to 16 GB. Similarly, we can modify check_replication_slot_inactive_timeout to behave in the same way as check_vacuum_buffer_usage_limit function. Regards, Vignesh
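For illustration, a check hook following the check_vacuum_buffer_usage_limit() pattern might look like the sketch below. It is untested, and the 30-second minimum is only a placeholder -- no particular minimum has been agreed on in this thread; such a check would also have to coexist with the binary-upgrade restriction discussed earlier:

#include "postgres.h"
#include "utils/guc.h"

/* Illustrative minimum; the actual value would need discussion. */
#define REPLICATION_SLOT_INACTIVE_TIMEOUT_MIN_SEC 30

bool
check_replication_slot_inactive_timeout(int *newval, void **extra,
                                        GucSource source)
{
    /* 0 keeps timeout-based invalidation disabled */
    if (*newval == 0 ||
        *newval >= REPLICATION_SLOT_INACTIVE_TIMEOUT_MIN_SEC)
        return true;

    GUC_check_errdetail("\"%s\" must be 0 or at least %d seconds.",
                        "replication_slot_inactive_timeout",
                        REPLICATION_SLOT_INACTIVE_TIMEOUT_MIN_SEC);
    return false;
}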
On Thu, 21 Nov 2024 at 17:35, Nisha Moond <nisha.moond412@gmail.com> wrote: > > On Wed, Nov 20, 2024 at 1:29 PM vignesh C <vignesh21@gmail.com> wrote: > > > > On Tue, 19 Nov 2024 at 12:43, Nisha Moond <nisha.moond412@gmail.com> wrote: > > > > > > Attached is the v49 patch set: > > > - Fixed the bug reported in [1]. > > > - Addressed comments in [2] and [3]. > > > > > > I've split the patch into two, implementing the suggested idea in > > > comment #5 of [2] separately in 001: > > > > > > Patch-001: Adds additional error reports (for all invalidation types) > > > in ReplicationSlotAcquire() for invalid slots when error_if_invalid = > > > true. > > > Patch-002: The original patch with comments addressed. > > > > This Assert can fail: > > > > Attached v50 patch-set addressing review comments in [1] and [2]. We are setting inactive_since when the replication slot is released. We are marking the slot as inactive only if it has been released. However, there's a scenario where the network connection between the publisher and subscriber may be lost where the replication slot is not released, but no changes are replicated due to the network problem. In this case, no updates would occur in the replication slot for a period exceeding the replication_slot_inactive_timeout. Should we invalidate these replication slots as well, or is it intentionally left out? Regards, Vignesh
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Peter Smith
Date:
Hi Nisha, Here are my review comments for the patch v50-0001. ====== Commit message 1. In ReplicationSlotAcquire(), raise an error for invalid slots if caller specify error_if_invalid=true. /caller/the caller/ /specify/specifies/ ====== src/backend/replication/slot.c ReplicationSlotAcquire: 2. + * + * An error is raised if error_if_invalid is true and the slot has been + * invalidated previously. */ void -ReplicationSlotAcquire(const char *name, bool nowait) +ReplicationSlotAcquire(const char *name, bool nowait, bool error_if_invalid) The "has been invalidated previously." sounds a bit tricky. Do you just mean: "An error is raised if error_if_invalid is true and the slot is found to be invalid." ~ 3. + /* + * An error is raised if error_if_invalid is true and the slot has been + * previously invalidated. + */ (ditto previous comment) ~ 4. + appendStringInfo(&err_detail, _("This slot has been invalidated because ")); + + switch (s->data.invalidated) + { + case RS_INVAL_WAL_REMOVED: + appendStringInfo(&err_detail, _("the required WAL has been removed.")); + break; + + case RS_INVAL_HORIZON: + appendStringInfo(&err_detail, _("the required rows have been removed.")); + break; + + case RS_INVAL_WAL_LEVEL: + appendStringInfo(&err_detail, _("wal_level is insufficient for slot.")); + break; 4a. I suspect that building the errdetail in 2 parts like this will be troublesome for the translators of some languages. Probably it is safer to have the entire errdetail for each case. ~ 4b. By convention, I think the GUC "wal_level" should be double-quoted in the message. ====== Kind Regards, Peter Smith. Fujitsu Australia
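Regarding 4a/4b, one way to keep every string a complete, independently translatable sentence is to carry the full errdetail in each switch arm, roughly as sketched below (the wording is only indicative, not a proposal for the final text):

switch (s->data.invalidated)
{
    case RS_INVAL_WAL_REMOVED:
        appendStringInfoString(&err_detail,
                               _("This slot has been invalidated because the required WAL has been removed."));
        break;

    case RS_INVAL_HORIZON:
        appendStringInfoString(&err_detail,
                               _("This slot has been invalidated because the required rows have been removed."));
        break;

    case RS_INVAL_WAL_LEVEL:
        appendStringInfoString(&err_detail,
                               _("This slot has been invalidated because \"wal_level\" is insufficient for the slot."));
        break;

    case RS_INVAL_INACTIVE_TIMEOUT:
        appendStringInfo(&err_detail,
                         _("This slot has been invalidated because it was inactive for longer than \"%s\"."),
                         "replication_slot_inactive_timeout");
        break;

    case RS_INVAL_NONE:
        break;              /* slot is valid; no detail to build */
}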
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Peter Smith
Date:
Hi Nisha, Here are some review comments for the patch v50-0002. ====== src/backend/replication/slot.c InvalidatePossiblyObsoleteSlot: 1. + if (now && + TimestampDifferenceExceeds(s->inactive_since, now, + replication_slot_inactive_timeout_sec * 1000)) Previously this was using an additional call to SlotInactiveTimeoutCheckAllowed: + if (SlotInactiveTimeoutCheckAllowed(s) && + TimestampDifferenceExceeds(s->inactive_since, now, + replication_slot_inactive_timeout * 1000)) Is it OK to skip that call? e.g. can the slot fields possibly change between assigning the 'now' and acquiring the mutex? If not, then the current code is fine. The only reason for asking is because it is slightly suspicious that it was not done this "easy" way in the first place. ~~~ check_replication_slot_inactive_timeout: 2. +/* + * GUC check_hook for replication_slot_inactive_timeout + * + * We don't allow the value of replication_slot_inactive_timeout other than 0 + * during the binary upgrade. + */ The "We don't allow..." sentence seems like a backward way of saying: The value of replication_slot_inactive_timeout must be set to 0 during the binary upgrade. ====== src/test/recovery/t/050_invalidate_slots.pl 3. +# Despite inactive timeout being set, the synced slot won't get invalidated on +# its own on the standby. What does "on its own" mean here? Do you mean it won't get invalidated unless the invalidation state is propagated from the primary? Maybe the comment can be clearer. ~ 4. +# Wait for slot to first become inactive and then get invalidated +sub wait_for_slot_invalidation +{ + my ($node, $slot, $offset, $inactive_timeout_1s) = @_; + my $node_name = $node->name; + It was OK to change the variable name to 'inactive_timeout_1s' outside of here, but within the subroutine, I don't think it is appropriate because this is a parameter that potentially could have any value. ~ 5. +# Trigger slot invalidation and confirm it in the server log +sub trigger_slot_invalidation +{ + my ($node, $slot, $offset, $inactive_timeout_1s) = @_; + my $node_name = $node->name; + my $invalidated = 0; It was OK to change the variable name to 'inactive_timeout_1s' outside of here, but within the subroutine, I don't think it is appropriate because this is a parameter that potentially could have any value. ~ 6. + # Give enough time to avoid multiple checkpoints + sleep($inactive_timeout_1s + 1); + + # Run a checkpoint + $node->safe_psql('postgres', "CHECKPOINT"); Since you are not doing multiple checkpoints anymore, it looks like that "Give enough time..." comment needs updating. ====== Kind Regards, Peter Smith. Fujitsu Australia
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Peter Smith
Date:
Hi Nisha, here are my review comments for the patch v51-0001. ====== src/backend/replication/slot.c ReplicationSlotAcquire: 1. + ereport(ERROR, + errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), + errmsg("can no longer get changes from replication slot \"%s\"", + NameStr(s->data.name)), + errdetail_internal("%s", err_detail.data)); + + pfree(err_detail.data); + } + Won't the 'pfree' be unreachable due to the prior ereport ERROR? ====== Kind Regards, Peter Smith. Fujitsu Australia
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Peter Smith
Date:
Hi Nisha. Here are some review comments for patch v51-0002. ====== doc/src/sgml/system-views.sgml 1. The time when the slot became inactive. <literal>NULL</literal> if the - slot is currently being streamed. + slot is currently being streamed. Once the slot is invalidated, this + value will remain unchanged until we shutdown the server. . I think "Once the ..." kind of makes it sound like invalidation is inevitable. Also maybe it's better to remove the "we". SUGGESTION: If the slot becomes invalidated, this value will remain unchanged until server shutdown. ====== src/backend/replication/slot.c ReplicationSlotAcquire: 2. GENERAL. This just is a question/idea. It may not be feasible to change. It seems like there is a lot of overlap between the error messages in 'ReplicationSlotAcquire' which are saying "This slot has been invalidated because...", and with the other function 'ReportSlotInvalidation' which is kind of the same but called in different circumstances and with slightly different message text. I wondered if there is a way to use common code to unify these messages instead of having a nearly duplicate set of messages for all the invalidation causes? ~~~ 3. + case RS_INVAL_INACTIVE_TIMEOUT: + appendStringInfo(&err_detail, _("inactivity exceeded the time limit set by \"%s\"."), + "replication_slot_inactive_timeout"); + break; Should this err_detail also say "This slot has been invalidated because ..." like all the others? ~~~ InvalidatePossiblyObsoleteSlot: 4. + case RS_INVAL_INACTIVE_TIMEOUT: + + /* + * Check if the slot needs to be invalidated due to + * replication_slot_inactive_timeout GUC. + */ + if (IsSlotInactiveTimeoutPossible(s) && + TimestampDifferenceExceeds(s->inactive_since, now, + replication_slot_inactive_timeout_sec * 1000)) + { Maybe this code should have Assert(now > 0); before the condition just as a way to 'document' that it is assumed 'now' was already set this outside the mutex. ====== Kind Regards, Peter Smith. Fujitsu Australia
RE: Introduce XID age and inactive timeout based replication slot invalidation
From
"Hayato Kuroda (Fujitsu)"
Date:
Dear Nisha, > > Attached v51 patch-set addressing all comments in [1] and [2]. > Thanks for working on the feature! I've started to review the patch. Here are my comments - sorry if there is something which has already been discussed. The thread is too long to follow correctly. Comments for 0001 ============= 01. binary_upgrade_logical_slot_has_caught_up ISTM that error_if_invalid is set to true when the slot can be moved forward, otherwise it is set to false. Regarding the binary_upgrade_logical_slot_has_caught_up, however, only valid slots will be passed to the function (see pg_upgrade/info.c) so I feel it is OK to set to true. Thoughts? 02. ReplicationSlotAcquire According to other functions, we are adding a note to the translator when parameters represent some common nouns, GUC names. I feel we should add a comment for RS_INVAL_WAL_LEVEL part based on it. Comments for 0002 ============= 03. check_replication_slot_inactive_timeout Can we overwrite replication_slot_inactive_timeout to zero when pg_upgrade (and also pg_createsubscriber?) starts a server process? Several parameters have already been specified via -c option at that time. This can avoid an error during the upgrade. Note that this part is still needed even if you accept the comment. Users can manually boot the server in upgrade mode. 04. ReplicationSlotAcquire Same comment as 02. 05. ReportSlotInvalidation Same comment as 02. 06. found bug While testing the patch, I found that slots can be invalidated too early when the GUC is quite large. I think this is because an overflow is caused in InvalidatePossiblyObsoleteSlot(). - Reproducer I set the replication_slot_inactive_timeout to INT_MAX and executed the below commands, and found that the slot is invalidated. ``` postgres=# SHOW replication_slot_inactive_timeout; replication_slot_inactive_timeout ----------------------------------- 2147483647s (1 row) postgres=# SELECT * FROM pg_create_logical_replication_slot('test', 'test_decoding'); slot_name | lsn -----------+----------- test | 0/18B7F38 (1 row) postgres=# CHECKPOINT ; CHECKPOINT postgres=# SELECT slot_name, inactive_since, invalidation_reason FROM pg_replication_slots ; slot_name | inactive_since | invalidation_reason -----------+-------------------------------+--------------------- test | 2024-11-28 07:50:25.927594+00 | inactive_timeout (1 row) ``` - analysis In InvalidatePossiblyObsoleteSlot(), replication_slot_inactive_timeout_sec * 1000 is passed to the third argument of TimestampDifferenceExceeds(), which is also of integer datatype. This causes an overflow and the parameter is handled as a small value. - solution I think there are two possible solutions. You can choose one of them: a. Make the maximum INT_MAX/1000, or b. Change the unit to milliseconds. Best regards, Hayato Kuroda FUJITSU LIMITED
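To spell out the overflow in 06: TimestampDifferenceExceeds() takes its third argument as an int number of milliseconds, so replication_slot_inactive_timeout_sec * 1000 is computed in plain int arithmetic and wraps around once the GUC exceeds INT_MAX / 1000 (about 2147483 seconds). Besides options (a) and (b) above, another way out is to do the comparison in 64-bit microseconds so no narrow intermediate value is involved; a rough, untested sketch (the helper name is made up here):

#include "postgres.h"
#include "datatype/timestamp.h"     /* TimestampTz, USECS_PER_SEC */

/*
 * Sketch: has the slot been inactive for longer than timeout_sec?
 * TimestampTz is int64 microseconds, so promoting the timeout to int64
 * before multiplying avoids the int overflow hit by "timeout_sec * 1000".
 */
static bool
slot_inactive_timeout_exceeded(TimestampTz inactive_since, TimestampTz now,
                               int timeout_sec)
{
    return (now - inactive_since) >= (int64) timeout_sec * USECS_PER_SEC;
}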
On Fri, 22 Nov 2024 at 17:43, vignesh C <vignesh21@gmail.com> wrote: > > On Thu, 21 Nov 2024 at 17:35, Nisha Moond <nisha.moond412@gmail.com> wrote: > > > > On Wed, Nov 20, 2024 at 1:29 PM vignesh C <vignesh21@gmail.com> wrote: > > > > > > On Tue, 19 Nov 2024 at 12:43, Nisha Moond <nisha.moond412@gmail.com> wrote: > > > > > > > > Attached is the v49 patch set: > > > > - Fixed the bug reported in [1]. > > > > - Addressed comments in [2] and [3]. > > > > > > > > I've split the patch into two, implementing the suggested idea in > > > > comment #5 of [2] separately in 001: > > > > > > > > Patch-001: Adds additional error reports (for all invalidation types) > > > > in ReplicationSlotAcquire() for invalid slots when error_if_invalid = > > > > true. > > > > Patch-002: The original patch with comments addressed. > > > > > > This Assert can fail: > > > > > > > Attached v50 patch-set addressing review comments in [1] and [2]. > > We are setting inactive_since when the replication slot is released. > We are marking the slot as inactive only if it has been released. > However, there's a scenario where the network connection between the > publisher and subscriber may be lost where the replication slot is not > released, but no changes are replicated due to the network problem. In > this case, no updates would occur in the replication slot for a period > exceeding the replication_slot_inactive_timeout. > Should we invalidate these replication slots as well, or is it > intentionally left out? On further thinking, I felt we can keep the current implementation as is and simply add a brief comment in the code to address this. Additionally, we can mention it in the commit message for clarity. Regards, Vignesh
On Wed, 27 Nov 2024 at 16:25, Nisha Moond <nisha.moond412@gmail.com> wrote: > > On Wed, Nov 27, 2024 at 8:39 AM Peter Smith <smithpb2250@gmail.com> wrote: > > > > Hi Nisha, > > > > Here are some review comments for the patch v50-0002. > > > > ====== > > src/backend/replication/slot.c > > > > InvalidatePossiblyObsoleteSlot: > > > > 1. > > + if (now && > > + TimestampDifferenceExceeds(s->inactive_since, now, > > + replication_slot_inactive_timeout_sec * 1000)) > > > > Previously this was using an additional call to SlotInactiveTimeoutCheckAllowed: > > > > + if (SlotInactiveTimeoutCheckAllowed(s) && > > + TimestampDifferenceExceeds(s->inactive_since, now, > > + replication_slot_inactive_timeout * 1000)) > > > > Is it OK to skip that call? e.g. can the slot fields possibly change > > between assigning the 'now' and acquiring the mutex? If not, then the > > current code is fine. The only reason for asking is because it is > > slightly suspicious that it was not done this "easy" way in the first > > place. > > > Good catch! While the mutex was being acquired right after the now > assignment, there was a rare chance of another process modifying the > slot in the meantime. So, I reverted the change in v51. To optimize > the SlotInactiveTimeoutCheckAllowed() call, it's sufficient to check > it here instead of during the 'now' assignment. > > Attached v51 patch-set addressing all comments in [1] and [2]. Few comments: 1) replication_slot_inactive_timeout can be mentioned in logical replication config, we could mention something like: Logical replication slot is also affected by replication_slot_inactive_timeout 2.a) Is this change applicable only for inactive timeout or it is applicable to others like wal removed, wal level etc also? If it is applicable to all of them we could move this to the first patch and update the commit message: + * If the slot can be acquired, do so and mark it as invalidated. If + * the slot is already ours, mark it as invalidated. Otherwise, we'll + * signal the owning process below and retry. */ - if (active_pid == 0) + if (active_pid == 0 || + (MyReplicationSlot == s && + active_pid == MyProcPid)) 2.b) Also this MyReplicationSlot and active_pid check can be in same line: + (MyReplicationSlot == s && + active_pid == MyProcPid)) 3) Error detail should start in upper case here similar to how others are done: + case RS_INVAL_INACTIVE_TIMEOUT: + appendStringInfo(&err_detail, _("inactivity exceeded the time limit set by \"%s\"."), + "replication_slot_inactive_timeout"); + break; 4) Since this change is not related to this patch, we can move this to the first patch and update the commit message: --- a/src/backend/replication/logical/slotsync.c +++ b/src/backend/replication/logical/slotsync.c @@ -1508,7 +1508,7 @@ ReplSlotSyncWorkerMain(char *startup_data, size_t startup_data_len) static void update_synced_slots_inactive_since(void) { - TimestampTz now = 0; + TimestampTz now; /* * We need to update inactive_since only when we are promoting standby to @@ -1523,6 +1523,9 @@ update_synced_slots_inactive_since(void) /* The slot sync worker or SQL function mustn't be running by now */ Assert((SlotSyncCtx->pid == InvalidPid) && !SlotSyncCtx->syncing); + /* Use same inactive_since time for all slots */ + now = GetCurrentTimestamp(); 5) Since this change is not related to this patch, we can move this to the first patch. 
@@ -2250,6 +2350,7 @@ RestoreSlotFromDisk(const char *name) bool restored = false; int readBytes; pg_crc32c checksum; + TimestampTz now; /* no need to lock here, no concurrent access allowed yet */ @@ -2410,6 +2511,9 @@ RestoreSlotFromDisk(const char *name) NameStr(cp.slotdata.name)), errhint("Change \"wal_level\" to be \"replica\" or higher."))); + /* Use same inactive_since time for all slots */ + now = GetCurrentTimestamp(); + /* nothing can be active yet, don't lock anything */ for (i = 0; i < max_replication_slots; i++) { @@ -2440,9 +2544,11 @@ RestoreSlotFromDisk(const char *name) /* * Set the time since the slot has become inactive after loading the * slot from the disk into memory. Whoever acquires the slot i.e. - * makes the slot active will reset it. + * makes the slot active will reset it. Avoid calling + * ReplicationSlotSetInactiveSince() here, as it will not set the time + * for invalid slots. */ - slot->inactive_since = GetCurrentTimestamp(); + slot->inactive_since = now; [1] - https://www.postgresql.org/docs/current/logical-replication-config.html Regards, Vignesh
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Nisha Moond
Date:
On Tue, Nov 19, 2024 at 12:47 PM Nisha Moond <nisha.moond412@gmail.com> wrote: > > On Thu, Nov 14, 2024 at 5:29 AM Peter Smith <smithpb2250@gmail.com> wrote: > > > > > > 12. > > /* > > - * If the slot can be acquired, do so and mark it invalidated > > - * immediately. Otherwise we'll signal the owning process, below, and > > - * retry. > > + * If the slot can be acquired, do so and mark it as invalidated. If > > + * the slot is already ours, mark it as invalidated. Otherwise, we'll > > + * signal the owning process below and retry. > > */ > > - if (active_pid == 0) > > + if (active_pid == 0 || > > + (MyReplicationSlot == s && > > + active_pid == MyProcPid)) > > > > I wasn't sure how this change belongs to this patch, because the logic > > of the previous review comment said for the case of invalidation due > > to inactivity that active_id must be 0. e.g. Assert(s->active_pid == > > 0); > > > > I don't fully understand the purpose of this change yet. I'll look > into it further and get back. > This change applies to all types of invalidation, not just inactive_timeout case, so moved the change to patch-001. It’s a general optimization for the case when the current process is the active PID for the slot. Also, the Assert(s->active_pid == 0); has been removed (in v50) as it was unnecessary. -- Thanks, Nisha
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Nisha Moond
Date:
On Thu, Nov 28, 2024 at 1:29 PM Hayato Kuroda (Fujitsu) <kuroda.hayato@fujitsu.com> wrote: > > Dear Nisha, > > > > > Attached v51 patch-set addressing all comments in [1] and [2]. > > > > Thanks for working on the feature! I've stated to review the patch. > Here are my comments - sorry if there are something which have already been discussed. > The thread is too long to follow correctly. > > Comments for 0001 > ============= > > 01. binary_upgrade_logical_slot_has_caught_up > > ISTM that error_if_invalid is set to true when the slot can be moved forward, otherwise > it is set to false. Regarding the binary_upgrade_logical_slot_has_caught_up, however, > only valid slots will be passed to the funciton (see pg_upgrade/info.c) so I feel > it is OK to set to true. Thought? > Right, corrected the call with error_if_invalid as true. > Comments for 0002 > ============= > > 03. check_replication_slot_inactive_timeout > > Can we overwrite replication_slot_inactive_timeout to zero when pg_uprade (and also > pg_createsubscriber?) starts a server process? Several parameters have already been > specified via -c option at that time. This can avoid an error while the upgrading. > Note that this part is still needed even if you accept the comment. Users can > manually boot with upgrade mode. > Done. > 06. found bug > > While testing the patch, I found that slots can be invalidated too early when when > the GUC is quite large. I think because an overflow is caused in InvalidatePossiblyObsoleteSlot(). > > - Reproducer > > I set the replication_slot_inactive_timeout to INT_MAX and executed below commands, > and found that the slot is invalidated. > > ``` > postgres=# SHOW replication_slot_inactive_timeout; > replication_slot_inactive_timeout > ----------------------------------- > 2147483647s > (1 row) > postgres=# SELECT * FROM pg_create_logical_replication_slot('test', 'test_decoding'); > slot_name | lsn > -----------+----------- > test | 0/18B7F38 > (1 row) > postgres=# CHECKPOINT ; > CHECKPOINT > postgres=# SELECT slot_name, inactive_since, invalidation_reason FROM pg_replication_slots ; > slot_name | inactive_since | invalidation_reason > -----------+-------------------------------+--------------------- > test | 2024-11-28 07:50:25.927594+00 | inactive_timeout > (1 row) > ``` > > - analysis > > In InvalidatePossiblyObsoleteSlot(), replication_slot_inactive_timeout_sec * 1000 > is passed to the third argument of TimestampDifferenceExceeds(), which is also the > integer datatype. This causes an overflow and parameter is handled as the small > value. > > - solution > > I think there are two possible solutions. You can choose one of them: > > a. Make the maximum INT_MAX/1000, or > b. Change the unit to millisecond. > Fixed. It is reasonable to align with other timeout parameters by using milliseconds as the unit. -- Thanks, Nisha
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Nisha Moond
Date:
On Thu, Nov 28, 2024 at 5:20 AM Peter Smith <smithpb2250@gmail.com> wrote: > > Hi Nisha. Here are some review comments for patch v51-0002. > > ====== > src/backend/replication/slot.c > > ReplicationSlotAcquire: > > 2. > GENERAL. > > This just is a question/idea. It may not be feasible to change. It > seems like there is a lot of overlap between the error messages in > 'ReplicationSlotAcquire' which are saying "This slot has been > invalidated because...", and with the other function > 'ReportSlotInvalidation' which is kind of the same but called in > different circumstances and with slightly different message text. I > wondered if there is a way to use common code to unify these messages > instead of having a nearly duplicate set of messages for all the > invalidation causes? > The error handling could be moved to a new function; however, as you pointed out, the contexts in which these functions are called differ. IMO, a single error message may not suit both cases. For example, ReportSlotInvalidation provides additional details and a hint in its message, which isn’t necessary for ReplicationSlotAcquire. Thoughts? -- Thanks, Nisha
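One possible middle ground, purely as a sketch (the helper name is invented here): keep the cause-specific sentence in a single lookup function and let ReplicationSlotAcquire() and ReportSlotInvalidation() each wrap it with their own errmsg/errhint, so the per-cause wording at least lives in one place:

/*
 * Sketch only: map an invalidation cause to its translated, cause-specific
 * detail sentence.  Callers add their own surrounding message and hint.
 */
static const char *
SlotInvalidationCauseDetail(ReplicationSlotInvalidationCause cause)
{
    switch (cause)
    {
        case RS_INVAL_WAL_REMOVED:
            return _("The required WAL has been removed.");
        case RS_INVAL_HORIZON:
            return _("The required rows have been removed.");
        case RS_INVAL_WAL_LEVEL:
            return _("\"wal_level\" is insufficient for the slot.");
        case RS_INVAL_INACTIVE_TIMEOUT:
            return _("The slot has been inactive for longer than \"replication_slot_inactive_timeout\".");
        case RS_INVAL_NONE:
            break;
    }
    return NULL;                /* valid slot; nothing to report */
}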
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Peter Smith
Date:
Hi Nisha, here are a couple of review comments for patch v52-0001. ====== Commit Message Add check if slot is already acquired, then mark it invalidate directly. ~ /slot/the slot/ "mark it invalidate" ? Maybe you meant: "then invalidate it directly", or "then mark it 'invalidated' directly", or etc. ====== src/backend/replication/logical/slotsync.c 1. @@ -1508,7 +1508,7 @@ ReplSlotSyncWorkerMain(char *startup_data, size_t startup_data_len) static void update_synced_slots_inactive_since(void) { - TimestampTz now = 0; + TimestampTz now; /* * We need to update inactive_since only when we are promoting standby to @@ -1523,6 +1523,9 @@ update_synced_slots_inactive_since(void) /* The slot sync worker or SQL function mustn't be running by now */ Assert((SlotSyncCtx->pid == InvalidPid) && !SlotSyncCtx->syncing); + /* Use same inactive_since time for all slots */ + now = GetCurrentTimestamp(); + Something is broken with these changes. AFAICT, the result after applying patch 0001 still has code: /* Use the same inactive_since time for all the slots. */ if (now == 0) now = GetCurrentTimestamp(); So the end result has multiple/competing assignments to variable 'now'. ====== Kind Regards, Peter Smith. Fujitsu Australia
RE: Introduce XID age and inactive timeout based replication slot invalidation
From
"Hayato Kuroda (Fujitsu)"
Date:
Dear Nisha, Thanks for updating the patch! > Fixed. It is reasonable to align with other timeout parameters by > using milliseconds as the unit. It looks like you only replaced the unit with GUC_UNIT_MS, but the documentation and postgresql.conf.sample have not been changed yet. They should follow the code. Anyway, here are other comments, mostly cosmetic. 01. slot.c ``` +int replication_slot_inactive_timeout_ms = 0; ``` According to other lines, we should add a short comment for the GUC. 02. 050_invalidate_slots.pl Do you have a reason why you use the number 050? I feel it can be 043. 03. 050_invalidate_slots.pl Also, I'm not sure the file name is correct. This file contains only slot invalidation due to the replication_slot_inactive_timeout. But I feel the current name is too general. 04. 050_invalidate_slots.pl ``` +use Time::HiRes qw(usleep); ``` This line is not needed because usleep() is not used in this file. Best regards, Hayato Kuroda FUJITSU LIMITED
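For comment 01, something like the following is presumably all that is being asked for (the comment wording here is only a suggestion):
```
/*
 * Invalidate replication slots that have remained inactive longer than
 * this duration, in milliseconds; 0 disables the mechanism.
 */
int			replication_slot_inactive_timeout_ms = 0;
```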
On Wed, 4 Dec 2024 at 15:01, Nisha Moond <nisha.moond412@gmail.com> wrote: > > On Tue, Dec 3, 2024 at 1:09 PM Hayato Kuroda (Fujitsu) > <kuroda.hayato@fujitsu.com> wrote: > > > > Dear Nisha, > > > > Thanks for updating the patch! > > > > > Fixed. It is reasonable to align with other timeout parameters by > > > using milliseconds as the unit. > > > > It looks like you only replaced the unit with GUC_UNIT_MS, but the documentation and > > postgresql.conf.sample have not been changed yet. They should follow the code. > > Anyway, here are other comments, mostly cosmetic. > > > > Here is v53 patch-set addressing all the comments in [1] and [2]. Currently, replication slots are invalidated based on the replication_slot_inactive_timeout only during a checkpoint. This means that if the checkpoint_timeout is set to a higher value than the replication_slot_inactive_timeout, slot invalidation will occur only when the checkpoint is triggered. Identifying the slots to invalidate might therefore be slightly delayed in this case. As an alternative, users can forcefully invalidate inactive slots that have exceeded the replication_slot_inactive_timeout by forcing a checkpoint. I was thinking we could suggest this in the documentation. + <para> + Slot invalidation due to inactive timeout occurs during checkpoint. + The duration of slot inactivity is calculated using the slot's + <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>inactive_since</structfield> + value. + </para> + We could accurately invalidate the slots using the checkpointer process by calculating the invalidation time based on the inactive_since timestamp and the replication_slot_inactive_timeout, and then setting the checkpointer's main wait latch accordingly to trigger the next checkpoint. Ideally, a different process handling this task would be better, but there is currently no dedicated daemon capable of identifying and managing slots across streaming replication, logical replication, and other slots used by plugins. Additionally, overloading the checkpointer with this responsibility may not be ideal. As an alternative, we could document this delay in identifying such slots and mention that invalidation can be triggered sooner by a forced manual checkpoint. Regards, Vignesh
On Wed, 4 Dec 2024 at 15:01, Nisha Moond <nisha.moond412@gmail.com> wrote: > > On Tue, Dec 3, 2024 at 1:09 PM Hayato Kuroda (Fujitsu) > <kuroda.hayato@fujitsu.com> wrote: > > > > Dear Nisha, > > > > Thanks for updating the patch! > > > > > Fixed. It is reasonable to align with other timeout parameters by > > > using milliseconds as the unit. > > > > It looks like you only replaced the unit with GUC_UNIT_MS, but the documentation and > > postgresql.conf.sample have not been changed yet. They should follow the code. > > Anyway, here are other comments, mostly cosmetic. > > > > Here is v53 patch-set addressing all the comments in [1] and [2]. CFBot is failing at [1] because the file name has been changed to 043_invalidate_inactive_slots; the meson.build file should be updated accordingly: diff --git a/src/test/recovery/meson.build b/src/test/recovery/meson.build index b1eb77b1ec..708a2a3798 100644 --- a/src/test/recovery/meson.build +++ b/src/test/recovery/meson.build @@ -51,6 +51,7 @@ tests += { 't/040_standby_failover_slots_sync.pl', 't/041_checkpoint_at_promote.pl', 't/042_low_level_backup.pl', + 't/050_invalidate_slots.pl', ], }, } [1] - https://cirrus-ci.com/task/6266479424831488 Regards, Vignesh
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Peter Smith
Date:
Hi Nisha, Here are my review comments for the v53* patch set ////////// Patch v53-0001. ====== src/backend/replication/slot.c 1. + if (error_if_invalid && + s->data.invalidated != RS_INVAL_NONE) Looks like some unnecessary wrapping here. I think this condition can be on one line. ////////// Patch v53-0002. ====== GENERAL - How about using the term "idle"? 1. I got to wondering why this new GUC was called "replication_slot_inactive_timeout", with invalidation_reason = "inactive_timeout". When I look at similar GUCs I don't see words like "inactivity" or "inactive" anywhere; Instead, they are using the term "idle" to refer to when something is inactive: e.g. #idle_in_transaction_session_timeout = 0 # in milliseconds, 0 is disabled #idle_session_timeout = 0 # in milliseconds, 0 is disabled I know the "inactive" term is used a bit in the slot code but that is (mostly) not exposed to the user. Therefore, I am beginning to feel it would be better (e.g. more consistent) to use "idle" for the user-facing stuff. e.g. New Slot GUC = "idle_replication_slot_timeout" Slot invalidation_reason = "idle_timeout" Of course, changing this will cascade to impact quite a lot of other things in the patch -- comments, error messages, some function names etc. ====== doc/src/sgml/logical-replication.sgml 2. + <para> + Logical replication slot is also affected by + <link linkend="guc-replication-slot-inactive-timeout"><varname>replication_slot_inactive_timeout</varname></link>. + </para> + /Logical replication slot is also affected by/Logical replication slots are also affected by/ ====== Kind Regards, Peter Smith. Fujitsu Australia
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Peter Smith
Date:
On Wed, Dec 4, 2024 at 9:27 PM vignesh C <vignesh21@gmail.com> wrote: > ... > > Currently, replication slots are invalidated based on the > replication_slot_inactive_timeout only during a checkpoint. This means > that if the checkpoint_timeout is set to a higher value than the > replication_slot_inactive_timeout, slot invalidation will occur only > when the checkpoint is triggered. Identifying the invalidation slots > might be slightly delayed in this case. As an alternative, users can > forcefully invalidate inactive slots that have exceeded the > replication_slot_inactive_timeout by forcing a checkpoint. I was > thinking we could suggest this in the documentation. > > + <para> > + Slot invalidation due to inactive timeout occurs during checkpoint. > + The duration of slot inactivity is calculated using the slot's > + <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>inactive_since</structfield> > + value. > + </para> > + > > We could accurately invalidate the slots using the checkpointer > process by calculating the invalidation time based on the active_since > timestamp and the replication_slot_inactive_timeout, and then set the > checkpointer's main wait-latch accordingly for triggering the next > checkpoint. Ideally, a different process handling this task would be > better, but there is currently no dedicated daemon capable of > identifying and managing slots across streaming replication, logical > replication, and other slots used by plugins. Additionally, > overloading the checkpointer with this responsibility may not be > ideal. As an alternative, we could document about this delay in > identifying and mention that it could be triggered by forceful manual > checkpoint. > Hi Vignesh. I felt that manipulating the checkpoint timing behind the scenes without the user's consent might be a bit of an overreach. But there might still be something else we could do: 1. We can add the documentation note like you suggested ("we could document about this delay in identifying and mention that it could be triggered by forceful manual checkpoint"). 2. We can also detect such delays in the code. When the invalidation occurs (e.g. code fragment below) we could check if there was some excessive lag between the slot becoming idle and it being invalidated. If the lag is too much (whatever "too much" means) we can log a hint for the user to increase the checkpoint frequency (or whatever else we might advise them to do). + /* + * Check if the slot needs to be invalidated due to + * replication_slot_inactive_timeout GUC. + */ + if (IsSlotInactiveTimeoutPossible(s) && + TimestampDifferenceExceeds(s->inactive_since, now, + replication_slot_inactive_timeout_ms)) + { + invalidation_cause = cause; + inactive_since = s->inactive_since; pseudo-code: if (slot invalidation occurred much later after the replication_slot_inactive_timeout GUC elapsed) { elog(LOG, "This slot was inactive for a period of %s. Slot timeout invalidation only occurs at a checkpoint so if you want inactive slots to be invalidated in a more timely manner consider reducing the time between checkpoints or executing a manual checkpoint. (replication_slot_inactive_timeout = %s; checkpoint_timeout = %s, ....)" } + } ====== Kind Regards, Peter Smith. Fujitsu Australia
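To make the pseudo-code a little more concrete, here is a rough sketch of the kind of check being suggested, reusing the fields from the fragment above; the "2 *" threshold, the message wording and the placement are placeholders rather than a worked-out proposal:
```
	{
		long		idle_ms = TimestampDifferenceMilliseconds(s->inactive_since, now);

		/* "2 *" is an arbitrary stand-in for "much longer than the timeout" */
		if (idle_ms > 2L * replication_slot_inactive_timeout_ms)
			ereport(LOG,
					errmsg("replication slot \"%s\" remained idle much longer than \"%s\" before being invalidated",
						   NameStr(s->data.name),
						   "replication_slot_inactive_timeout"),
					errhint("Idle-timeout invalidation only happens at checkpoints; consider reducing checkpoint_timeout or running a manual CHECKPOINT."));
	}
```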
On Thu, 5 Dec 2024 at 06:44, Peter Smith <smithpb2250@gmail.com> wrote: > > On Wed, Dec 4, 2024 at 9:27 PM vignesh C <vignesh21@gmail.com> wrote: > > > ... > > > > Currently, replication slots are invalidated based on the > > replication_slot_inactive_timeout only during a checkpoint. This means > > that if the checkpoint_timeout is set to a higher value than the > > replication_slot_inactive_timeout, slot invalidation will occur only > > when the checkpoint is triggered. Identifying the invalidation slots > > might be slightly delayed in this case. As an alternative, users can > > forcefully invalidate inactive slots that have exceeded the > > replication_slot_inactive_timeout by forcing a checkpoint. I was > > thinking we could suggest this in the documentation. > > > > + <para> > > + Slot invalidation due to inactive timeout occurs during checkpoint. > > + The duration of slot inactivity is calculated using the slot's > > + <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>inactive_since</structfield> > > + value. > > + </para> > > + > > > > We could accurately invalidate the slots using the checkpointer > > process by calculating the invalidation time based on the active_since > > timestamp and the replication_slot_inactive_timeout, and then set the > > checkpointer's main wait-latch accordingly for triggering the next > > checkpoint. Ideally, a different process handling this task would be > > better, but there is currently no dedicated daemon capable of > > identifying and managing slots across streaming replication, logical > > replication, and other slots used by plugins. Additionally, > > overloading the checkpointer with this responsibility may not be > > ideal. As an alternative, we could document about this delay in > > identifying and mention that it could be triggered by forceful manual > > checkpoint. > > > > Hi Vignesh. > > I felt that manipulating the checkpoint timing behind the scenes > without the user's consent might be a bit of an overreach. Agree > But there might still be something else we could do: > > 1. We can add the documentation note like you suggested ("we could > document about this delay in identifying and mention that it could be > triggered by forceful manual checkpoint"). Yes, that makes sense > 2. We can also detect such delays in the code. When the invalidation > occurs (e.g. code fragment below) we could check if there was some > excessive lag between the slot becoming idle and it being invalidated. > If the lag is too much (whatever "too much" means) we can log a hint > for the user to increase the checkpoint frequency (or whatever else we > might advise them to do). > > + /* > + * Check if the slot needs to be invalidated due to > + * replication_slot_inactive_timeout GUC. > + */ > + if (IsSlotInactiveTimeoutPossible(s) && > + TimestampDifferenceExceeds(s->inactive_since, now, > + replication_slot_inactive_timeout_ms)) > + { > + invalidation_cause = cause; > + inactive_since = s->inactive_since; > > pseudo-code: > if (slot invalidation occurred much later after the > replication_slot_inactive_timeout GUC elapsed) > { > elog(LOG, "This slot was inactive for a period of %s. Slot timeout > invalidation only occurs at a checkpoint so if you want inactive slots > to be invalidated in a more timely manner consider reducing the time > between checkpoints or executing a manual checkpoint. 
> (replication_slot_inactive_timeout = %s; checkpoint_timeout = %s, > ....)" > } > > + } Determining the correct time may be challenging for users, as it depends on when the inactive_since value is set, as well as when the checkpoint_timeout occurs and the subsequent checkpoint is triggered. Even if the user sets it to an appropriate value, there is still a possibility of delayed identification due to the timing of when the slot's inactive_since is set. Including this information in the documentation should be sufficient. Regards, Vignesh
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Peter Smith
Date:
Hi Nisha. Here are some review comments for patch v54-0002. (I had also checked patch v54-0001, but have no further review comments for that one). ====== doc/src/sgml/config.sgml 1. + <para> + Slot invalidation due to idle timeout occurs during checkpoint. + If the <varname>checkpoint_timeout</varname> exceeds + <varname>idle_replication_slot_timeout</varname>, the slot + invalidation will be delayed until the next checkpoint is triggered. + To avoid delays, users can force a checkpoint to promptly invalidate + inactive slots. The duration of slot inactivity is calculated using the slot's + <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>inactive_since</structfield> + value. + </para> + The wording of "If the checkpoint_timeout exceeds idle_replication_slot_timeout, the slot invalidation will be delayed until the next checkpoint is triggered." seems slightly misleading, because AFAIK it is not conditional on the GUC value differences like that -- i.e. slot invalidation is *always* delayed until the next checkpoint occurs. SUGGESTION: Slot invalidation due to idle timeout occurs during checkpoint. Because checkpoints happen at checkpoint_timeout intervals, there can be some lag between when the idle_replication_slot_timeout was exceeded and when the slot invalidation is triggered at the next checkpoint. To avoid such lags, users can force... ======= src/backend/replication/slot.c 2. GENERAL +/* Invalidate replication slots idle beyond this time; '0' disables it */ +int idle_replication_slot_timeout_ms = 0; I noticed this patch is using a variety of ways of describing the same thing: * guc var: Invalidate replication slots idle beyond this time... * guc_tables: ... the amount of time a replication slot can remain idle before it will be invalidated. * docs: means that the slot has remained idle beyond the duration specified by the idle_replication_slot_timeout parameter * errmsg: ... slot has been invalidated because inactivity exceeded the time limit set by ... * etc.. They are all the same, but they are all worded slightly differently: * "idle" vs "inactivity" vs ... * "time" vs "amount of time" vs "duration" vs "time limit" vs ... There may not be a one-size-fits-all, but still, it might be better to try to search for all different phrasing and use common wording as much as possible. ~~~ CheckPointReplicationSlots: 3. + * XXX: Slot invalidation due to 'idle_timeout' occurs only for + * released slots, based on 'idle_replication_slot_timeout'. Active + * slots in use for replication are excluded, preventing accidental + * invalidation. Slots where communication between the publisher and + * subscriber is down are also excluded, as they are managed by the + * 'wal_sender_timeout'. Maybe a slight rewording like below is better. Maybe not. YMMV. SUGGESTION: XXX: Slot invalidation due to 'idle_timeout' applies only to released slots, and is based on the 'idle_replication_slot_timeout' GUC. Active slots currently in use for replication are excluded to prevent accidental invalidation. Slots... ====== src/bin/pg_upgrade/server.c 4. + /* + * Use idle_replication_slot_timeout=0 to prevent slot invalidation due to + * inactive_timeout by checkpointer process during upgrade. + */ + if (GET_MAJOR_VERSION(cluster->major_version) >= 1800) + appendPQExpBufferStr(&pgoptions, " -c idle_replication_slot_timeout=0"); + /inactive_timeout/idle_timeout/ ====== src/test/recovery/t/043_invalidate_inactive_slots.pl 5. 
+# Wait for slot to first become idle and then get invalidated +sub wait_for_slot_invalidation +{ + my ($node, $slot, $offset, $idle_timeout) = @_; + my $node_name = $node->name; AFAICT this 'idle_timeout' parameter is passed units of "seconds", so it would be better to call it something like 'idle_timeout_s' to make the units clear. ~~~ 6. +# Trigger slot invalidation and confirm it in the server log +sub trigger_slot_invalidation +{ + my ($node, $slot, $offset, $idle_timeout) = @_; + my $node_name = $node->name; + my $invalidated = 0; Ditto above review comment #5 -- better to call it something like 'idle_timeout_s' to make the units clear. ====== Kind Regards, Peter Smith. Fujitsu Australia
On Tue, 10 Dec 2024 at 17:21, Nisha Moond <nisha.moond412@gmail.com> wrote: > > On Fri, Dec 6, 2024 at 11:04 AM vignesh C <vignesh21@gmail.com> wrote: > > > > > > Determining the correct time may be challenging for users, as it > > depends on when the inactive_since value is set, as well as when the > > checkpoint_timeout occurs and the subsequent checkpoint is triggered. > > Even if the user sets it to an appropriate value, there is still a > > possibility of delayed identification due to the timing of when the > > slot's inactive_since is set. Including this information in the > > documentation should be sufficient. > > > > +1 > v54 documents this information as suggested. > > Attached the v54 patch-set addressing all the comments till now in A few comments on the test added: 1) Can we remove this and instead set idle_replication_slot_timeout via append_conf when the standby node itself is created: +# Set timeout GUC on the standby to verify that the next checkpoint will not +# invalidate synced slots. +my $idle_timeout_1s = 1; +$standby1->safe_psql( + 'postgres', qq[ + ALTER SYSTEM SET idle_replication_slot_timeout TO '${idle_timeout_1s}s'; +]); +$standby1->reload; 2) You can move these statements before the standby node is created: +# Create sync slot on the primary +$primary->psql('postgres', + q{SELECT pg_create_logical_replication_slot('sync_slot1', 'test_decoding', false, false, true);} +); + +# Create standby slot on the primary +$primary->safe_psql( + 'postgres', qq[ + SELECT pg_create_physical_replication_slot(slot_name := 'sb_slot1', immediately_reserve := true); +]); 3) Do we need autovacuum set to off for these tests? Is there any probability of a test failure without it? I felt it should not impact these tests; if not, we can remove this: +# Avoid unpredictability +$primary->append_conf( + 'postgresql.conf', qq{ +checkpoint_timeout = 1h +autovacuum = off +}); 4) Generally we write a single character in single quotes, so we can update "t" to 't': + ), + "t", + 'logical slot sync_slot1 is synced to standby'); + 5) Similarly here too: + WHERE slot_name = 'sync_slot1' + AND invalidation_reason IS NULL;} + ), + "t", + 'check that synced slot sync_slot1 has not been invalidated on standby'); 6) This standby offset is not used anywhere; it can be removed: +my $logstart = -s $standby1->logfile; + +# Set timeout GUC on the standby to verify that the next checkpoint will not +# invalidate synced slots. Regards, Vignesh
On Tue, 10 Dec 2024 at 17:21, Nisha Moond <nisha.moond412@gmail.com> wrote: > > On Fri, Dec 6, 2024 at 11:04 AM vignesh C <vignesh21@gmail.com> wrote: > > > > > > Determining the correct time may be challenging for users, as it > > depends on when the inactive_since value is set, as well as when the > > checkpoint_timeout occurs and the subsequent checkpoint is triggered. > > Even if the user sets it to an appropriate value, there is still a > > possibility of delayed identification due to the timing of when the > > slot's inactive_since is set. Including this information in the > > documentation should be sufficient. > > > > +1 > v54 documents this information as suggested. > > Attached the v54 patch-set addressing all the comments till now in > [1], [2] and [3]. Now that we support idle_replication_slot_timeout in milliseconds, we can reduce this value from 1s to 1ms or 10 milliseconds and change sleep to usleep; this will bring down the test execution time significantly: +# Set timeout GUC on the standby to verify that the next checkpoint will not +# invalidate synced slots. +my $idle_timeout_1s = 1; +$standby1->safe_psql( + 'postgres', qq[ + ALTER SYSTEM SET idle_replication_slot_timeout TO '${idle_timeout_1s}s'; +]); +$standby1->reload; + +# Sync the primary slots to the standby +$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();"); + +# Confirm that the logical failover slot is created on the standby +is( $standby1->safe_psql( + 'postgres', + q{SELECT count(*) = 1 FROM pg_replication_slots + WHERE slot_name = 'sync_slot1' AND synced + AND NOT temporary + AND invalidation_reason IS NULL;} + ), + "t", + 'logical slot sync_slot1 is synced to standby'); + +# Give enough time for inactive_since to exceed the timeout +sleep($idle_timeout_1s + 1); Regards, Vignesh
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Nisha Moond
Date:
On Wed, Dec 11, 2024 at 8:14 AM Peter Smith <smithpb2250@gmail.com> wrote: > > Hi Nisha. > > Here are some review comments for patch v54-0002. > ====== > src/test/recovery/t/043_invalidate_inactive_slots.pl > > 5. > +# Wait for slot to first become idle and then get invalidated > +sub wait_for_slot_invalidation > +{ > + my ($node, $slot, $offset, $idle_timeout) = @_; > + my $node_name = $node->name; > > AFAICT this 'idle_timeout' parameter is passed units of "seconds", so > it would be better to call it something like 'idle_timeout_s' to make > the units clear. > As per the suggestion in [1], the test has been updated to use idle_timeout=1ms. Since the parameter uses the default unit of "milliseconds," keeping it as 'idle_timeout' seems reasonable to me. > ~~~ > > 6. > +# Trigger slot invalidation and confirm it in the server log > +sub trigger_slot_invalidation > +{ > + my ($node, $slot, $offset, $idle_timeout) = @_; > + my $node_name = $node->name; > + my $invalidated = 0; > > Ditto above review comment #5 -- better to call it something like > 'idle_timeout_s' to make the units clear. > The 'idle_timeout' parameter name remains unchanged as explained above. [1] https://www.postgresql.org/message-id/CALDaNm1FQS04aG0C0gCRpvi-o-OTdq91y6Az34YKN-dVc9r5Ng%40mail.gmail.com -- Thanks, Nisha
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Peter Smith
Date:
Hi Nisha. Thanks for the v55* patches. I have no comments for patch v55-0001. I have only 1 comment for patch v55-0002 regarding some remaining nitpicks (below) about the consistency of phrases. ====== I scanned again over all the phrases for consistency: CURRENT PATCH: Docs (idle_replication_slot_timeout): Invalidate replication slots that are idle for longer than this amount of time Docs (idle_timeout): means that the slot has remained idle longer than the duration specified by the idle_replication_slot_timeout parameter. Code (guc var comment): Invalidate replication slots idle longer than this time Code (guc_tables): Sets the time limit for how long a replication slot can remain idle before it is invalidated. Msg (errdetail): This slot has been invalidated because it has remained idle longer than the configured \"%s\" time. Msg (errdetail): The slot has been inactive since %s and has remained idle longer than the configured \"%s\" time. ~ NITPICKS: nit -- There are still some variations "amount of time" versus "time" versus "duration". I think the term "duration" best describes the meaning, so we can use that everywhere. nit - Should consistently say "remained idle" instead of just "idle" or "are idle". nit - The last errdetail is also rearranged a bit because IMO we don't need to say inactive and idle in the same sentence. nit - Just say "longer than" instead of sometimes saying "for longer than". ~ SUGGESTIONS: Docs (idle_replication_slot_timeout): Invalidate replication slots that have remained idle longer than this duration. Docs (idle_timeout): means that the slot has remained idle longer than the configured idle_replication_slot_timeout duration. Code (guc var comment): Invalidate replication slots that have remained idle longer than this duration. Code (guc_tables): Sets the duration a replication slot can remain idle before it is invalidated. Msg (errdetail): This slot has been invalidated because it has remained idle longer than the configured \"%s\" duration. Msg (errdetail): The slot has remained idle since %s, which is longer than the configured \"%s\" duration. ====== Kind Regards, Peter Smith. Fujitsu Australia
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Peter Smith
Date:
On Mon, Dec 16, 2024 at 9:40 PM Nisha Moond <nisha.moond412@gmail.com> wrote: > > On Mon, Dec 16, 2024 at 9:58 AM Peter Smith <smithpb2250@gmail.com> wrote: > > ... > > SUGGESTIONS: > > > > Docs (idle_replication_slot_timeout): Invalidate replication slots > > that have remained idle longer than this duration. > > Docs (idle_timeout): means that the slot has remained idle longer than > > the configured idle_replication_slot_timeout duration. > > > > Code (guc var comment): Invalidate replication slots that have > > remained idle longer than this duration. > > Code (guc_tables): Sets the duration a replication slot can remain > > idle before it is invalidated. > > > > Msg (errdetail): This slot has been invalidated because it has > > remained idle longer than the configured \"%s\" duration. > > Msg (errdetail): The slot has remained idle since %s, which is longer > > than the configured \"%s\" duration. > > > > Here is the v56 patch set with the above comments incorporated. > Hi Nisha. Thanks for the updates. - Both patches could be applied cleanly. - Tests (make check, TAP subscriber, TAP recovery) are all passing. - The rendering of the documentation changes from patch 0002 looked good. - I have no more review comments. So, the v56* patchset LGTM. ====== Kind Regards, Peter Smith. Fujitsu Australia
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Mon, Dec 16, 2024 at 4:10 PM Nisha Moond <nisha.moond412@gmail.com> wrote: > > Here is the v56 patch set with the above comments incorporated. > Review comments: =============== 1. + { + {"idle_replication_slot_timeout", PGC_SIGHUP, REPLICATION_SENDING, + gettext_noop("Sets the duration a replication slot can remain idle before " + "it is invalidated."), + NULL, + GUC_UNIT_MS + }, + &idle_replication_slot_timeout_ms, I think users are going to keep the idle_slot timeout at least in hours. So, milliseconds seem the wrong choice to me. I suggest keeping the unit in minutes. I understand that writing a test would be challenging as spending a minute or more on one test is not advisable. But I don't see any test testing the other GUCs that are in minutes (wal_summary_keep_time and log_rotation_age). The default value should be one day. 2. + /* + * An error is raised if error_if_invalid is true and the slot is found to + * be invalid. + */ + if (error_if_invalid && s->data.invalidated != RS_INVAL_NONE) + { + StringInfoData err_detail; + + initStringInfo(&err_detail); + + switch (s->data.invalidated) + { + case RS_INVAL_WAL_REMOVED: + appendStringInfo(&err_detail, _("This slot has been invalidated because the required WAL has been removed.")); + break; + + case RS_INVAL_HORIZON: + appendStringInfo(&err_detail, _("This slot has been invalidated because the required rows have been removed.")); + break; + + case RS_INVAL_WAL_LEVEL: + /* translator: %s is a GUC variable name */ + appendStringInfo(&err_detail, _("This slot has been invalidated because \"%s\" is insufficient for slot."), + "wal_level"); + break; + + case RS_INVAL_NONE: + pg_unreachable(); + } + + ereport(ERROR, + errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), + errmsg("can no longer get changes from replication slot \"%s\"", + NameStr(s->data.name)), + errdetail_internal("%s", err_detail.data)); + } + This should be moved to a separate function. 3. +static inline bool +IsSlotIdleTimeoutPossible(ReplicationSlot *s) Would it be better to name this function as CanInvalidateIdleSlot()? The current name doesn't seem to match other similar functionality. -- With Regards, Amit Kapila.
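For comment 2, a sketch of how the quoted block could be pulled out into a helper; the function name and the exact split are only suggestions, and the body is lifted from the fragment above:
```
/* Raise an ERROR when an already-invalidated slot is acquired. */
static void
RaiseSlotInvalidationError(ReplicationSlot *s)
{
	StringInfoData err_detail;

	Assert(s->data.invalidated != RS_INVAL_NONE);

	initStringInfo(&err_detail);

	switch (s->data.invalidated)
	{
		case RS_INVAL_WAL_REMOVED:
			appendStringInfo(&err_detail, _("This slot has been invalidated because the required WAL has been removed."));
			break;

		case RS_INVAL_HORIZON:
			appendStringInfo(&err_detail, _("This slot has been invalidated because the required rows have been removed."));
			break;

		case RS_INVAL_WAL_LEVEL:
			/* translator: %s is a GUC variable name */
			appendStringInfo(&err_detail, _("This slot has been invalidated because \"%s\" is insufficient for slot."),
							 "wal_level");
			break;

		case RS_INVAL_NONE:
			pg_unreachable();
	}

	ereport(ERROR,
			errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
			errmsg("can no longer get changes from replication slot \"%s\"",
				   NameStr(s->data.name)),
			errdetail_internal("%s", err_detail.data));
}
```
The caller in ReplicationSlotAcquire() would then reduce to: if (error_if_invalid && s->data.invalidated != RS_INVAL_NONE) RaiseSlotInvalidationError(s);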
RE: Introduce XID age and inactive timeout based replication slot invalidation
From
"Zhijie Hou (Fujitsu)"
Date:
On Tuesday, December 24, 2024 8:57 PM Michail Nikolaev <michail.nikolaev@gmail.com> wrote: Hi, > Yesterday I got a strange set of test errors, probably somehow related to > that patch. It happened on changed master branch (based on > d96d1d5152f30d15678e08e75b42756101b7cab6) but I don't think my changes were > affecting it. > > My setup is a little bit tricky: Windows 11 running WSL2 with Ubuntu, meson. > > So, the `recovery` suite started failing on: > > 1) at /src/test/recovery/t/019_replslot_limit.pl line 530. > 2) at /src/test/recovery/t/040_standby_failover_slots_sync.pl line > 198. > > It was failing almost every run, one test or another. I was lurking around > for about 10 min, and..... it just stopped failing. And I can't reproduce it > anymore. > > But I have logs of two fails. I am not sure if it is helpful, but decided to > mail them here just in case. Thanks for reporting the issue. After checking the log, I think the failure is caused by the unexpected behavior of the local system clock. It's clear from the '019_replslot_limit_primary4.log'[1] that the clock went backwards, which makes the slot's inactive_since go backwards as well. That's why the last testcase didn't pass. And for 040_standby_failover_slots_sync, we can see that the clock of the standby lags behind that of the primary, which caused the inactive_since of the newly synced slot on the standby to be earlier than the one on the primary. So, I think it's not a bug in the committed patch but an issue in the testing environment. Besides, since we have not seen such failures on BF, I think it may not be necessary to improve the testcases. [1] 2024-12-24 01:37:19.967 CET [161409] sub STATEMENT: START_REPLICATION SLOT "lsub4_slot" LOGICAL 0/0 (proto_version '4',streaming 'parallel', origin 'any', publication_names '"pub"') ... 2024-12-24 01:37:20.025 CET [161447] 019_replslot_limit.pl LOG: statement: SELECT '0/30003D8' <= replay_lsn AND state ='streaming' ... 2024-12-24 01:37:19.388 CET [161097] LOG: received fast shutdown request Best Regards, Hou zj