Re: Excessive number of replication slots for 12->14 logical replication - Mailing list pgsql-bugs

From Ajin Cherian
Subject Re: Excessive number of replication slots for 12->14 logical replication
Msg-id CAFPTHDbppsYxW3Ttg0fpYrJ8F8Zx1HVFr3Sn56XLK54C2GUYCg@mail.gmail.com
In response to Re: Excessive number of replication slots for 12->14 logical replication  (Amit Kapila <amit.kapila16@gmail.com>)
Responses RE: Excessive number of replication slots for 12->14 logical replication  ("houzj.fnst@fujitsu.com" <houzj.fnst@fujitsu.com>)
List pgsql-bugs
On Sun, Jul 24, 2022 at 6:12 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Jul 18, 2022 at 3:13 PM hubert depesz lubaczewski
> <depesz@depesz.com> wrote:
> >
> > On Mon, Jul 18, 2022 at 09:07:35AM +0530, Amit Kapila wrote:
> >
> > First error:
> > #v+
> > 2022-07-18 09:22:07.046 UTC,,,4145917,,62d5263f.3f42fd,2,,2022-07-18 09:22:07 UTC,28/21641,1219146,ERROR,53400,"could not find free replication state slot for replication origin with OID 51",,"Increase max_replication_slots and try again.",,,,,,,"","logical replication worker",,0
> > #v-
> >
> > Nothing else errored out before, no warning, no fatals.
> >
> > from the first ERROR I was getting them in the range of 40-70 per minute.
> >
> > At the same time I was logging data from `select now(), * from pg_replication_slots`, every 2 seconds.
> >
> ...
> >
> > So, it looks that there are up to 10 focal slots, all active, and then there are sync slots with weirdly high
> > counts for inactive ones.
> >
> > At most, I had 11 active sync slots.
> >
> > Looks like some kind of timing issue, which would be in line with what
> > Kyotaro Horiguchi wrote initially.
> >
>
> I think this is a timing issue similar to what Horiguchi-San has
> pointed out but due to replication origins. We drop the replication
> origin after the sync worker that has used it is finished. This is
> done by the apply worker because we don't allow dropping the origin
> while the process owning it is still alive. I am not sure of
> repercussions but maybe we can allow dropping the origin by the
> process that owns it.
>

I have written a patch that drops the replication origin in the sync
worker itself.
To make that possible, I had to reset the origin session (which also
resets the owned-by flag) prior to the dropping of the slots.
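
To illustrate why the order matters, here is a minimal sketch in Python.
It is a toy model, not the attached patch and not PostgreSQL code: the
class OriginRegistry, the function sync_worker_finish, and the origin
name used in the example are hypothetical. It assumes only what is
described above: an origin cannot be dropped while a live process owns
it, and resetting the session clears that ownership.

```python
class OriginInUse(Exception):
    """Raised when an origin is dropped while a process still owns it."""


class OriginRegistry:
    """Toy stand-in for the shared replication origin state (hypothetical)."""

    def __init__(self, max_slots):
        self.max_slots = max_slots
        self.states = {}  # origin name -> owning pid, or None if unowned

    def session_setup(self, name, pid):
        # Claim a state slot for the origin and record the owning process.
        if name not in self.states and len(self.states) >= self.max_slots:
            raise RuntimeError("could not find free replication state slot")
        self.states[name] = pid

    def session_reset(self, name, pid):
        # Clear ownership only; the state slot stays occupied until drop.
        if self.states.get(name) == pid:
            self.states[name] = None

    def drop(self, name):
        # Dropping is refused while any process still owns the origin.
        if self.states.get(name) is not None:
            raise OriginInUse(name)
        self.states.pop(name, None)


def sync_worker_finish(registry, name, pid):
    """Assumed ordering: reset the session first, then drop the origin."""
    registry.session_reset(name, pid)
    registry.drop(name)
```

In this model, calling drop() before session_reset() raises OriginInUse,
while sync_worker_finish() frees the state slot immediately instead of
leaving it occupied until another process cleans up.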

regards,
Ajin Cherian
Fujitsu Australia

