Re: Introduce XID age and inactive timeout based replication slot invalidation - Mailing list pgsql-hackers

From Bertrand Drouvot
Subject Re: Introduce XID age and inactive timeout based replication slot invalidation
Date
Msg-id Zf1TOYzC9fniRft1@ip-10-97-1-34.eu-west-3.compute.internal
Whole thread Raw
In response to Re: Introduce XID age and inactive timeout based replication slot invalidation  (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
Responses Re: Introduce XID age and inactive timeout based replication slot invalidation
Re: Introduce XID age and inactive timeout based replication slot invalidation
List pgsql-hackers
Hi,

On Fri, Mar 22, 2024 at 01:45:01PM +0530, Bharath Rupireddy wrote:
> On Fri, Mar 22, 2024 at 12:39 PM Bertrand Drouvot
> <bertranddrouvot.pg@gmail.com> wrote:
> >
> > > > Please find the v14-0001 patch for now.
> >
> > Thanks!
> >
> > > LGTM. Let's wait for Bertrand to see if he has more comments on 0001
> > > and then I'll push it.
> >
> > LGTM too.
> 
> 
> Please see the attached v14 patch set. No change in the attached
> v14-0001 from the previous patch.

Looking at v14-0002:

1 ===

@@ -691,6 +699,13 @@ ReplicationSlotRelease(void)
                ConditionVariableBroadcast(&slot->active_cv);
        }

+       if (slot->data.persistency == RS_PERSISTENT)
+       {
+               SpinLockAcquire(&slot->mutex);
+               slot->last_inactive_at = GetCurrentTimestamp();
+               SpinLockRelease(&slot->mutex);
+       }

I'm not sure we should do system calls while we're holding a spinlock.
Assign a variable before?

2 ===

Also, what about moving this here?

"
    if (slot->data.persistency == RS_PERSISTENT)
    {
        /*
         * Mark persistent slot inactive.  We're not freeing it, just
         * disconnecting, but wake up others that may be waiting for it.
         */
        SpinLockAcquire(&slot->mutex);
        slot->active_pid = 0;
        SpinLockRelease(&slot->mutex);
        ConditionVariableBroadcast(&slot->active_cv);
    }
"

That would avoid testing twice "slot->data.persistency == RS_PERSISTENT".

3 ===

@@ -2341,6 +2356,7 @@ RestoreSlotFromDisk(const char *name)

                slot->in_use = true;
                slot->active_pid = 0;
+               slot->last_inactive_at = 0;

I think we should put GetCurrentTimestamp() here. It's done in v14-0006 but I
think it's better to do it in 0002 (and not taking care of inactive_timeout).

4 ===

    Track last_inactive_at in pg_replication_slots

 doc/src/sgml/system-views.sgml       | 11 +++++++++++
 src/backend/catalog/system_views.sql |  3 ++-
 src/backend/replication/slot.c       | 16 ++++++++++++++++
 src/backend/replication/slotfuncs.c  |  7 ++++++-
 src/include/catalog/pg_proc.dat      |  6 +++---
 src/include/replication/slot.h       |  3 +++
 src/test/regress/expected/rules.out  |  5 +++--
 7 files changed, 44 insertions(+), 7 deletions(-)

Worth to add some tests too (or we postpone them in future commits because we're
confident enough they will follow soon)?

5 ===

Most of the fields that reflect a time (not duration) in the system views are
xxxx_time, so I'm wondering if instead of "last_inactive_at" we should use
something like "last_inactive_time"?

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: Introduce XID age and inactive timeout based replication slot invalidation
Next
From: Bertrand Drouvot
Date:
Subject: Re: Introduce XID age and inactive timeout based replication slot invalidation