Re: subscription disable_on_error not working after ALTER SUBSCRIPTION set bad conninfo

From Masahiko Sawada
Subject Re: subscription disable_on_error not working after ALTER SUBSCRIPTION set bad conninfo
Msg-id CAD21AoBF0EGGshy0LGdfRoBUz+m9QZ-U24eJP6bheamh9cMmoA@mail.gmail.com
In response to subscription disable_on_error not working after ALTER SUBSCRIPTION set bad conninfo  (Peter Smith <smithpb2250@gmail.com>)
Responses Re: subscription disable_on_error not working after ALTER SUBSCRIPTION set bad conninfo
List pgsql-hackers
On Thu, Jan 18, 2024 at 8:16 AM Peter Smith <smithpb2250@gmail.com> wrote:
>
> Hi,
>
> I had reported a possible subscription 'disable_on_error' bug found
> while reviewing another patch.
>
> I am including my initial report and Nisha's analysis again here so
> that this topic has its own thread.
>
> ==================
> INITIAL REPORT [1]
> ==================
>
> ...
> I see now that any ALTER of the subscription's connection, even to
> some value that fails, will restart a new worker (like ALTER of any
> other subscription parameters). For a bad connection, it will continue
> to relaunch-worker/ERROR over and over. e.g.
>
> ----------
> 2024-01-17 09:34:28.665 AEDT [11274] LOG:  logical
> replication apply worker for subscription "sub4" has started
> 2024-01-17 09:34:28.666 AEDT [11274] ERROR:  could not connect to the
> publisher: invalid port number: "-1"
> 2024-01-17 09:34:28.667 AEDT [928] LOG:  background worker "logical
> replication apply worker" (PID 11274) exited with exit code 1
> 2024-01-17 09:34:33.669 AEDT [11391] LOG:  logical replication apply
> worker for subscription "sub4" has started
> 2024-01-17 09:34:33.669 AEDT [11391] ERROR:  could not connect to the
> publisher: invalid port number: "-1"
> 2024-01-17 09:34:33.670 AEDT [928] LOG:  background worker "logical
> replication apply worker" (PID 11391) exited with exit code 1
> etc...
> ----------
>
> While experimenting with the bad connection ALTER I also tried setting
> 'disable_on_error' like below:
>
> ALTER SUBSCRIPTION sub4 SET (disable_on_error);
> ALTER SUBSCRIPTION sub4 CONNECTION 'port = -1';
>
> ...but here the subscription did not become DISABLED as I expected it
> would do on the next connection error iteration. It remains enabled
> and just continues to loop relaunch/ERROR indefinitely same as before.
>
> That looks like it may be a bug. Thoughts?

We could improve it to handle this case too, but I'm not sure it's a
bug. The documentation for disable_on_error says [1]:

Specifies whether the subscription should be automatically disabled if
any errors are detected by subscription workers during data
replication from the publisher.

While an apply worker is still trying to establish a connection, it is
not yet replicating data from the publisher, so a connection failure
falls outside what the documented behavior covers.
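
To illustrate the distinction, here is a rough sketch in Python. It is
only a simplified model under my own assumptions, not the actual worker
code: the names launch_worker, connect and apply_changes and the
dictionary layout are hypothetical. The point it shows is that the
disable_on_error check sits only around the replication phase, so a
failure in the connection phase never reaches it.

```python
# Simplified model of the distinction above. All names are hypothetical;
# this is not PostgreSQL source code.

class WorkerError(Exception):
    """Stands in for an ERROR raised inside a modeled apply worker."""


def connect(conninfo):
    # Phase 1: establish the connection. No data is replicated yet.
    port = conninfo.get("port", 5432)
    if not 0 < port < 65536:
        raise WorkerError(f'invalid port number: "{port}"')


def apply_changes(sub):
    # Phase 2: "data replication from the publisher".
    if sub.get("apply_fails"):
        raise WorkerError("error while applying changes")


def launch_worker(sub):
    """Model one worker launch; return a short description of how it ended."""
    try:
        connect(sub["conninfo"])
    except WorkerError:
        # disable_on_error is not consulted in this phase.
        return "exited; will be relaunched"
    try:
        apply_changes(sub)
    except WorkerError:
        if sub["disable_on_error"]:
            sub["enabled"] = False
            return "subscription disabled"
        return "exited; will be relaunched"
    return "running"


# Bad conninfo: the subscription stays enabled however often we relaunch.
bad_conn = {"conninfo": {"port": -1}, "disable_on_error": True, "enabled": True}
for _ in range(3):
    print(launch_worker(bad_conn), "| enabled:", bad_conn["enabled"])

# Error during apply: the subscription is disabled on the first failure.
bad_apply = {"conninfo": {"port": 5432}, "disable_on_error": True,
             "enabled": True, "apply_fails": True}
print(launch_worker(bad_apply), "| enabled:", bad_apply["enabled"])
```

In this model, handling the reported case would mean consulting
disable_on_error in the first except branch as well.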

Regards,

[1]
https://www.postgresql.org/docs/devel/sql-createsubscription.html#SQL-CREATESUBSCRIPTION-PARAMS-WITH-DISABLE-ON-ERROR

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com


