Re: Handle infinite recursion in logical replication setup - Mailing list pgsql-hackers

From Jonathan S. Katz
Subject Re: Handle infinite recursion in logical replication setup
Msg-id 957e1b95-7dcd-a880-3cf4-7a2f7b7cbe97@postgresql.org
In response to Re: Handle infinite recursion in logical replication setup  (vignesh C <vignesh21@gmail.com>)
Responses Re: Handle infinite recursion in logical replication setup  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
On 9/12/22 1:23 AM, vignesh C wrote:
> On Fri, 9 Sept 2022 at 11:12, Amit Kapila <amit.kapila16@gmail.com> wrote:
>>
>> On Thu, Sep 8, 2022 at 9:32 AM vignesh C <vignesh21@gmail.com> wrote:
>>>
>>>
>>> The attached patch has the changes to handle the same.
>>>
>>
>> Pushed. I am not completely sure whether we want the remaining
>> documentation patch in this thread in its current form or by modifying
>> it. Johnathan has shown some interest in it. I feel you can start a
>> separate thread for it to see if there is any interest in the same and
>> close the CF entry for this work.
> 
> Thanks for pushing the patch. I have closed this entry in commitfest.
> I will wait for some more time and see the response regarding the
> documentation patch and then start a new thread if required.

I've been testing this patch in advance of working on the 
documentation and came across a behavior I wanted to note. Specifically, 
I am hitting a deadlock while trying to replicate synchronously between 
the two instances at any `synchronous_commit` level above `local`.

Here is my setup. I have two instances, "A" and "B".

On A and B, run:

   CREATE TABLE sync (id int PRIMARY KEY, info float);
   CREATE PUBLICATION sync FOR TABLE sync;

On A, run:

   CREATE SUBSCRIPTION sync
   CONNECTION 'connstr-to-B'
   PUBLICATION sync
   WITH (
     streaming=true, copy_data=false,
     origin=none, synchronous_commit='on');

On B, run:

   CREATE SUBSCRIPTION sync
   CONNECTION 'connstr-to-A'
   PUBLICATION sync
   WITH (
     streaming=true, copy_data=false,
     origin=none, synchronous_commit='on');

On A and B, run:

   ALTER SYSTEM SET synchronous_standby_names TO 'sync';
   SELECT pg_reload_conf();

Verify on A and B that pg_stat_replication.sync_state is set to "sync":

   SELECT application_name, sync_state = 'sync' AS is_sync
   FROM pg_stat_replication
   WHERE application_name = 'sync';

The next two commands should be run simultaneously on A and B:

   -- run this on A
   INSERT INTO sync
   SELECT x, random() FROM generate_series(1,2000000, 2) x;

   -- run this on B
   INSERT INTO sync
   SELECT x, random() FROM generate_series(2,2000000, 2) x;

This consistently created the deadlock in my testing.

Per an off-list discussion with Masahiko, this is a deadlock among four 
processes: the walsenders on A and B, and the apply workers on A and B. 
The walsenders are waiting for feedback from the apply workers, and the 
apply workers are waiting for the walsenders to synchronize (I may be 
oversimplifying).
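
One way to look at this state while the two INSERTs hang is to check 
what each process is waiting on. The query below is only a sketch, 
assuming the standard pg_stat_activity view. The backend_type strings 
for the apply workers differ between releases, hence the LIKE pattern, 
and the choice of columns is illustrative:

```sql
-- Run on both A and B while the INSERTs are blocked.
-- Lists the walsenders, the logical replication processes (the LIKE
-- pattern also matches the launcher), and any backend currently
-- waiting on synchronous replication.
SELECT pid, backend_type, application_name, state,
       wait_event_type, wait_event
FROM pg_stat_activity
WHERE backend_type = 'walsender'
   OR backend_type LIKE 'logical replication%'
   OR wait_event = 'SyncRep'
ORDER BY backend_type, pid;
```

A process blocked on synchronous replication is expected to report the 
`SyncRep` wait event. The output would need to be compared on both 
instances to see the cycle described above.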

He suggested I try the above example instead with `synchronous_commit` 
set to `local`. In this case, I verified that there is no more deadlock, 
but he informed me that we would not be able to use cascading 
synchronous replication when "origin=none".
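
For reference, a sketch of switching the existing subscriptions over 
rather than recreating them. This assumes the setting in question is 
the subscription's `synchronous_commit` parameter (as in the CREATE 
SUBSCRIPTION commands above) and not the server-level GUC:

```sql
-- Run on both A and B.
ALTER SUBSCRIPTION sync SET (synchronous_commit = 'local');

-- Check the value stored for the subscription.
SELECT subname, subsynccommit
FROM pg_subscription
WHERE subname = 'sync';
```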

If we decide that this is a documentation issue, I'd suggest we improve 
the guidance around using `synchronous_commit`[1] on the CREATE 
SUBSCRIPTION page, as the GUC page[2] warns against using `local`:

"The setting local causes commits to wait for local flush to disk, but 
not for replication. This is usually not desirable when synchronous 
replication is in use, but is provided for completeness."

Thanks,

Jonathan

[1] https://www.postgresql.org/docs/devel/sql-createsubscription.html
[2] https://www.postgresql.org/docs/devel/runtime-config-wal.html#GUC-SYNCHRONOUS-COMMIT
