Re: Conflict detection for update_deleted in logical replication - Mailing list pgsql-hackers

From vignesh C
Subject Re: Conflict detection for update_deleted in logical replication
Date
Msg-id CALDaNm1_V+WPThOkZy+R9_sWgHH5H6hN6UtEmq4Mj3QbUc3G8g@mail.gmail.com
Whole thread Raw
In response to Re: Conflict detection for update_deleted in logical replication  (Masahiko Sawada <sawada.mshk@gmail.com>)
Responses Re: Conflict detection for update_deleted in logical replication
List pgsql-hackers
On Wed, 25 Dec 2024 at 08:13, Zhijie Hou (Fujitsu)
<houzj.fnst@fujitsu.com> wrote:
>
> On Monday, December 23, 2024 2:15 PM Kuroda, Hayato/黒田 隼人 <kuroda.hayato@fujitsu.com> wrote:
> >
> > Dear Hou,
> >
> > Thanks for updating the patch. Few comments:
>
> Thanks for the comments!
>
> > 02.  ErrorOnReservedSlotName()
> >
> > Currently the function is callsed from three points -
> > create_physical_replication_slot(),
> > create_logical_replication_slot() and CreateReplicationSlot().
> > Can we move them to the ReplicationSlotCreate(), or combine into
> > ReplicationSlotValidateName()?
>
> I am not sure because moving the check into these functions because that would
> prevent the launcher from creating the slot as well unless we add a new
> parameter for these functions, but I am not sure if it's worth it at this
> stage.
>
> >
> > 03. advance_conflict_slot_xmin()
> >
> > ```
> >       Assert(TransactionIdIsValid(MyReplicationSlot->data.xmin));
> > ```
> >
> > Assuming the case that the launcher crashed just after
> > ReplicationSlotCreate(CONFLICT_DETECTION_SLOT).
> > After the restart, the slot can be acquired since
> > SearchNamedReplicationSlot(CONFLICT_DETECTION_SLOT)
> > is true, but the process would fail the assert because data.xmin is still invalid.
> >
> > I think we should re-create the slot when the xmin is invalid. Thought?
>
> After thinking more, the standard approach to me would be to mark the slot as
> EPHEMERAL during creation and persist it after initializing, so changed like
> that.
>
> > 05. check_remote_recovery()
> >
> > Can we add a test case related with this?
>
> I think the code path is already tested, and I am a bit unsure if we want to setup
> a standby to test the ERROR case, so didn't add this.
>
> ---
>
> Attach the new version patch set which addressed all other comments.

Few comments:
1) In case there are no logical replication workers, the launcher
process just logs a warning "out of logical replication worker slots"
and continues. Whereas in case of "pg_conflict_detection" replication
slot creation launcher throw an error and the launcher exits, can we
throw a warning in this case too:
2025-01-02 10:24:41.899 IST [4280] ERROR:  all replication slots are in use
2025-01-02 10:24:41.899 IST [4280] HINT:  Free one or increase
"max_replication_slots".
2025-01-02 10:24:42.148 IST [4272] LOG:  background worker "logical
replication launcher" (PID 4280) exited with exit code 1

2) Currently, we do not detect when the track_commit_timestamp setting
is disabled for a subscription immediately after the worker starts.
Instead, it is detected later during conflict detection. Since
changing the track_commit_timestamp GUC requires a server restart, it
would be beneficial for DBAs if the error were raised immediately.
This way, DBAs would be aware of the issue as they monitor the server
restart and can take the necessary corrective actions without delay.

3) Tab completion missing for CREATE SUBSCRIPTION does not include
detect_update_deleted option:
postgres=# create subscription sub3 CONNECTION 'dbname=postgres
host=localhost port=5432' publication pub1 with (
BINARY              COPY_DATA           DISABLE_ON_ERROR    FAILOVER
         PASSWORD_REQUIRED   SLOT_NAME           SYNCHRONOUS_COMMIT
CONNECT             CREATE_SLOT         ENABLED             ORIGIN
         RUN_AS_OWNER        STREAMING           TWO_PHASE

4) Tab completion missing for ALTER SUBSCRIPTION does not include
detect_update_deleted option:
ALTER SUBSCRIPTION sub3 SET (
BINARY              FAILOVER            PASSWORD_REQUIRED   SLOT_NAME
         SYNCHRONOUS_COMMIT
DISABLE_ON_ERROR    ORIGIN              RUN_AS_OWNER        STREAMING
         TWO_PHASE

5) Copyright year can be updated to 2025:
+++ b/src/test/subscription/t/035_confl_update_deleted.pl
@@ -0,0 +1,169 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test the CREATE SUBSCRIPTION 'detect_update_deleted' parameter and its
+# interaction with the xmin value of replication slots.
+use strict;
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;

6) This include is not required, I was able to compile without it:
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -173,12 +173,14 @@
 #include "replication/logicalrelation.h"
 #include "replication/logicalworker.h"
 #include "replication/origin.h"
+#include "replication/slot.h"
 #include "replication/walreceiver.h"

Regards,
Vignesh



pgsql-hackers by date:

Previous
From: Masahiko Sawada
Date:
Subject: Re: Conflict detection for update_deleted in logical replication
Next
From: Amit Kapila
Date:
Subject: Re: Conflict detection for update_deleted in logical replication