Thread: [HACKERS] Logical replication in the same cluster

[HACKERS] Logical replication in the same cluster

From
Bruce Momjian
Date:
I tried setting up logical replication on the same server between two
different databases, and got, from database test:
test=> CREATE TABLE test (x INT PRIMARY KEY);CREATE TABLEtest=>test=> INSERT INTO test VALUES (1);INSERT 0 1test=>
CREATEPUBLICATION mypub FOR TABLE test;CREATE PUBLICATION
 

then from database test2:
test2=> CREATE TABLE test (x INT PRIMARY KEY);CREATE TABLEtest2=> CREATE SUBSCRIPTION mysub CONNECTION 'dbname=test
port=5432'PUBLICATIONmypub;NOTICE:  synchronized table states
 

and it just hangs.  My server logs say:
2017-04-26 12:50:53.694 EDT [29363] LOG:  logical decoding found initialstarting point at 0/15FF3E02017-04-26
12:50:53.694EDT [29363] DETAIL:  1 transaction needs tofinish.
 

Is this expected?  I can get it working from two different clusters.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +



Re: [HACKERS] Logical replication in the same cluster

From
Petr Jelinek
Date:
On 26/04/17 18:59, Bruce Momjian wrote:
> I tried setting up logical replication on the same server between two
> different databases, and got, from database test:
> 
>     test=> CREATE TABLE test (x INT PRIMARY KEY);
>     CREATE TABLE
>     test=>
>     test=> INSERT INTO test VALUES (1);
>     INSERT 0 1
>     test=> CREATE PUBLICATION mypub FOR TABLE test;
>     CREATE PUBLICATION
> 
> then from database test2:
> 
>     test2=> CREATE TABLE test (x INT PRIMARY KEY);
>     CREATE TABLE
>     test2=> CREATE SUBSCRIPTION mysub CONNECTION 'dbname=test port=5432'
>     PUBLICATION mypub;
>     NOTICE:  synchronized table states
> 
> and it just hangs.  My server logs say:
> 
>     2017-04-26 12:50:53.694 EDT [29363] LOG:  logical decoding found initial
>     starting point at 0/15FF3E0
>     2017-04-26 12:50:53.694 EDT [29363] DETAIL:  1 transaction needs to
>     finish.
> 
> Is this expected?  I can get it working from two different clusters.
> 

Yes that's result of how logical replication slots work, the transaction
that needs to finish is your transaction. It can be worked around by
creating the slot manually via the SQL interface for example and create
the subscription using WITH (NOCREATE SLOT, SLOT NAME = 'your slot') .

--  Petr Jelinek                  http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training &
Services



Re: [HACKERS] Logical replication in the same cluster

From
Bruce Momjian
Date:
On Wed, Apr 26, 2017 at 11:41:51PM +0200, Petr Jelinek wrote:
> On 26/04/17 18:59, Bruce Momjian wrote:
> > I tried setting up logical replication on the same server between two
> > different databases, and got, from database test:
> > 
> >     test=> CREATE TABLE test (x INT PRIMARY KEY);
> >     CREATE TABLE
> >     test=>
> >     test=> INSERT INTO test VALUES (1);
> >     INSERT 0 1
> >     test=> CREATE PUBLICATION mypub FOR TABLE test;
> >     CREATE PUBLICATION
> > 
> > then from database test2:
> > 
> >     test2=> CREATE TABLE test (x INT PRIMARY KEY);
> >     CREATE TABLE
> >     test2=> CREATE SUBSCRIPTION mysub CONNECTION 'dbname=test port=5432'
> >     PUBLICATION mypub;
> >     NOTICE:  synchronized table states
> > 
> > and it just hangs.  My server logs say:
> > 
> >     2017-04-26 12:50:53.694 EDT [29363] LOG:  logical decoding found initial
> >     starting point at 0/15FF3E0
> >     2017-04-26 12:50:53.694 EDT [29363] DETAIL:  1 transaction needs to
> >     finish.
> > 
> > Is this expected?  I can get it working from two different clusters.
> > 
> 
> Yes that's result of how logical replication slots work, the transaction
> that needs to finish is your transaction. It can be worked around by
> creating the slot manually via the SQL interface for example and create
> the subscription using WITH (NOCREATE SLOT, SLOT NAME = 'your slot') .

Oh, OK.  Is there a way we can give users a hint do that if they try
what I did?

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +



Re: [HACKERS] Logical replication in the same cluster

From
Tom Lane
Date:
Petr Jelinek <petr.jelinek@2ndquadrant.com> writes:
> On 26/04/17 18:59, Bruce Momjian wrote:
>> ... it just hangs.  My server logs say:

> Yes that's result of how logical replication slots work, the transaction
> that needs to finish is your transaction. It can be worked around by
> creating the slot manually via the SQL interface for example and create
> the subscription using WITH (NOCREATE SLOT, SLOT NAME = 'your slot') .

If that's a predictable deadlock, I think a minimum expectation is that
the system should notice it and throw an error, not just hang.  (Then
the error could give a hint about how to work around it.)  But the case
Bruce has in mind doesn't seem like a crazy use-case to me.  Can't we
make it "just work"?
        regards, tom lane



Re: [HACKERS] Logical replication in the same cluster

From
Michael Paquier
Date:
On Thu, Apr 27, 2017 at 7:02 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Petr Jelinek <petr.jelinek@2ndquadrant.com> writes:
>> On 26/04/17 18:59, Bruce Momjian wrote:
>>> ... it just hangs.  My server logs say:
>
>> Yes that's result of how logical replication slots work, the transaction
>> that needs to finish is your transaction. It can be worked around by
>> creating the slot manually via the SQL interface for example and create
>> the subscription using WITH (NOCREATE SLOT, SLOT NAME = 'your slot') .
>
> If that's a predictable deadlock, I think a minimum expectation is that
> the system should notice it and throw an error, not just hang.  (Then
> the error could give a hint about how to work around it.)  But the case
> Bruce has in mind doesn't seem like a crazy use-case to me.  Can't we
> make it "just work"?

Perhaps using some detection with the replication origins? Just an
instinctive idea.. The current behavior is confusing for users, I have
fallen into this trap a couple of times already.
-- 
Michael



Re: [HACKERS] Logical replication in the same cluster

From
Robert Haas
Date:
On Wed, Apr 26, 2017 at 6:02 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Petr Jelinek <petr.jelinek@2ndquadrant.com> writes:
>> On 26/04/17 18:59, Bruce Momjian wrote:
>>> ... it just hangs.  My server logs say:
>
>> Yes that's result of how logical replication slots work, the transaction
>> that needs to finish is your transaction. It can be worked around by
>> creating the slot manually via the SQL interface for example and create
>> the subscription using WITH (NOCREATE SLOT, SLOT NAME = 'your slot') .
>
> If that's a predictable deadlock, I think a minimum expectation is that
> the system should notice it and throw an error, not just hang.  (Then
> the error could give a hint about how to work around it.)  But the case
> Bruce has in mind doesn't seem like a crazy use-case to me.  Can't we
> make it "just work"?

+1.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Logical replication in the same cluster

From
Peter Eisentraut
Date:
On 4/26/17 19:18, Michael Paquier wrote:
>> If that's a predictable deadlock, I think a minimum expectation is that
>> the system should notice it and throw an error, not just hang.  (Then
>> the error could give a hint about how to work around it.)  But the case
>> Bruce has in mind doesn't seem like a crazy use-case to me.  Can't we
>> make it "just work"?
> 
> Perhaps using some detection with the replication origins? Just an
> instinctive idea.. The current behavior is confusing for users, I have
> fallen into this trap a couple of times already.

We had some discussions early on about detecting connections to the same
server, but it's not entirely clear how to do that and it didn't seem
worth it at the time.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] Logical replication in the same cluster

From
Tom Lane
Date:
Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:
>>> If that's a predictable deadlock, I think a minimum expectation is that
>>> the system should notice it and throw an error, not just hang.

> We had some discussions early on about detecting connections to the same
> server, but it's not entirely clear how to do that and it didn't seem
> worth it at the time.

I wonder whether we actually need to detect connections to the same
server per se.  I'm thinking about the one end taking some special
heavyweight lock, and the other end taking the same lock, which would
generally be free as long as the two ends aren't on the same server.
Cascading replication might be a problem though ...
        regards, tom lane



Re: [HACKERS] Logical replication in the same cluster

From
Petr Jelinek
Date:
On 27/04/17 04:50, Tom Lane wrote:
> Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:
>>>> If that's a predictable deadlock, I think a minimum expectation is that
>>>> the system should notice it and throw an error, not just hang.
> 
>> We had some discussions early on about detecting connections to the same
>> server, but it's not entirely clear how to do that and it didn't seem
>> worth it at the time.
> 
> I wonder whether we actually need to detect connections to the same
> server per se.  I'm thinking about the one end taking some special
> heavyweight lock, and the other end taking the same lock, which would
> generally be free as long as the two ends aren't on the same server.
> Cascading replication might be a problem though ...
> 

Well cascading might not be problem. I mean this issue exists only
during the slot creation which is one time operation. Which is why the
workaround solves it. But I don't see what we could lock that's common
between publisher and subscriber unless we invent some database object
specifically for this purpose.

My idea in the original thread was to put the info about xid and sysid
somewhere in shmem when creating subscription and checking that on the
other side if the sysid is same as local one and the xid is active. It
would serialize the subscription creation but I don't see that as big
issue, it's not like it's common to create thousands of them in parallel
nor it is something where we care about shaving microseconds of runtime.

Back when writing the original patch set, I was also playing with the
idea of having CREATE SUBSCRIPTION do multiple committed steps in
similar fashion to CREATE INDEX CONCURRENTLY but that leaves mess behind
on failure which also wasn't very popular outcome. I wonder how bad it
would be if we created all the stuff for subscription but in disabled
form, then committed, then created slot outside of tx (slot creation is
not transactional anyway) and then switched the subscription to enabled
(if needed) in next tx. It would still leave subscription behind on
failure but a) user would see the failure, b) the subscription would be
inactive so no active harm from it. We also already prevent running
CREATE SUBSCRIPTION inside transaction block when automatic slot
creation is chosen so there is no difference from that perspective.

Just for info, in pglogical, we solve this by having the worker create
slot, not the user command, so then it just works. The reason why I
didn't do this in core is that from practice it does not seem to be very
user friendly in case there are errors (not enough slots free,
connecting not only to same server but same db, etc) because they will
only see the errors in logs after the fact (and often they don't look).
I am already unhappy about the fact that we have no facility for
bgworker to save the last error before dying into place that is
accessible via SQL and I'd rather not hide even more errors in the log.

Note that the workaround for all of this is not all that complex, you do
same thing (create slot manually) you'd do for physical replication with
slots.

Thoughts?

--  Petr Jelinek                  http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training &
Services



Re: [HACKERS] Logical replication in the same cluster

From
Peter Eisentraut
Date:
On 4/27/17 04:08, Petr Jelinek wrote:
> Note that the workaround for all of this is not all that complex, you do
> same thing (create slot manually) you'd do for physical replication with
> slots.

Maybe we should just document this issue for now.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] Logical replication in the same cluster

From
Robert Haas
Date:
On Thu, Apr 27, 2017 at 4:08 AM, Petr Jelinek
<petr.jelinek@2ndquadrant.com> wrote:
> Back when writing the original patch set, I was also playing with the
> idea of having CREATE SUBSCRIPTION do multiple committed steps in
> similar fashion to CREATE INDEX CONCURRENTLY but that leaves mess behind
> on failure which also wasn't very popular outcome. I wonder how bad it
> would be if we created all the stuff for subscription but in disabled
> form, then committed, then created slot outside of tx (slot creation is
> not transactional anyway) and then switched the subscription to enabled
> (if needed) in next tx. It would still leave subscription behind on
> failure but a) user would see the failure, b) the subscription would be
> inactive so no active harm from it. We also already prevent running
> CREATE SUBSCRIPTION inside transaction block when automatic slot
> creation is chosen so there is no difference from that perspective.

Sounds like a solid approach.  There's no way to end up with a remote
object without also ending up with a logical object, which seems like
it greatly reduces the chances of confusion and error.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Logical replication in the same cluster

From
Andres Freund
Date:
On 2017-04-26 23:41:51 +0200, Petr Jelinek wrote:
> Yes that's result of how logical replication slots work, the transaction
> that needs to finish is your transaction. It can be worked around by
> creating the slot manually via the SQL interface for example and create
> the subscription using WITH (NOCREATE SLOT, SLOT NAME = 'your slot') .

Is there any chance the creation of the slot could be moved ahead, to
before an xid has been assigned?

- Andres



Re: [HACKERS] Logical replication in the same cluster

From
Peter Geoghegan
Date:
On Fri, Apr 28, 2017 at 9:28 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Thu, Apr 27, 2017 at 4:08 AM, Petr Jelinek
> <petr.jelinek@2ndquadrant.com> wrote:
>> Back when writing the original patch set, I was also playing with the
>> idea of having CREATE SUBSCRIPTION do multiple committed steps in
>> similar fashion to CREATE INDEX CONCURRENTLY but that leaves mess behind
>> on failure which also wasn't very popular outcome.

There is no inherent reason why the CREATE INDEX CONCURRENTLY style of
using multiple transactions makes it necessary to leave a mess behind
in the event of an error or hard crash. Is someone going to get around
to fixing the problem for CREATE INDEX CONCURRENTLY (e.g., having
extra steps to drop the useless index during recovery)? IIRC, this was
always the plan.


-- 
Peter Geoghegan

VMware vCenter Server
https://www.vmware.com/



Re: [HACKERS] Logical replication in the same cluster

From
Andres Freund
Date:
On 2017-05-01 11:22:47 -0700, Peter Geoghegan wrote:
> On Fri, Apr 28, 2017 at 9:28 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> > On Thu, Apr 27, 2017 at 4:08 AM, Petr Jelinek
> > <petr.jelinek@2ndquadrant.com> wrote:
> >> Back when writing the original patch set, I was also playing with the
> >> idea of having CREATE SUBSCRIPTION do multiple committed steps in
> >> similar fashion to CREATE INDEX CONCURRENTLY but that leaves mess behind
> >> on failure which also wasn't very popular outcome.
> 
> There is no inherent reason why the CREATE INDEX CONCURRENTLY style of
> using multiple transactions makes it necessary to leave a mess behind
> in the event of an error or hard crash. Is someone going to get around
> to fixing the problem for CREATE INDEX CONCURRENTLY (e.g., having
> extra steps to drop the useless index during recovery)? IIRC, this was
> always the plan.

Doing catalog changes in recovery is frought with problems. Essentially
requires starting one worker per database, before allowing access.

- Andres



Re: [HACKERS] Logical replication in the same cluster

From
Peter Geoghegan
Date:
On Mon, May 1, 2017 at 11:24 AM, Andres Freund <andres@anarazel.de> wrote:
> Doing catalog changes in recovery is frought with problems. Essentially
> requires starting one worker per database, before allowing access.

Do you think it's worth just covering the case where you get an error,
such as a duplicate violation? I imagine that that's the much more
common case.

-- 
Peter Geoghegan

VMware vCenter Server
https://www.vmware.com/



Re: [HACKERS] Logical replication in the same cluster

From
Andres Freund
Date:
On 2017-05-01 11:31:53 -0700, Peter Geoghegan wrote:
> On Mon, May 1, 2017 at 11:24 AM, Andres Freund <andres@anarazel.de> wrote:
> > Doing catalog changes in recovery is frought with problems. Essentially
> > requires starting one worker per database, before allowing access.
> 
> Do you think it's worth just covering the case where you get an error,
> such as a duplicate violation? I imagine that that's the much more
> common case.

What exactly are you proposing to do?  You mean catching errors in the
creating backend, if it didn't crash?  That doesn't strike me as a good
idea, because it'll push down the likelihood of the issue below where
people will see it, but it'll still be likely enough for it to create
problems.

- Andres



Re: [HACKERS] Logical replication in the same cluster

From
Peter Geoghegan
Date:
On Mon, May 1, 2017 at 11:37 AM, Andres Freund <andres@anarazel.de> wrote:
> What exactly are you proposing to do?  You mean catching errors in the
> creating backend, if it didn't crash?

That is what I meant, though I'm not actually proposing to do anything.

> That doesn't strike me as a good
> idea, because it'll push down the likelihood of the issue below where
> people will see it, but it'll still be likely enough for it to create
> problems.

I was concerned about that too. I have a hard time defending changes
like this to myself, but it doesn't hurt to ask.

-- 
Peter Geoghegan

VMware vCenter Server
https://www.vmware.com/



Re: [HACKERS] Logical replication in the same cluster

From
Greg Stark
Date:
On 1 May 2017 at 19:24, Andres Freund <andres@anarazel.de> wrote:
>> There is no inherent reason why the CREATE INDEX CONCURRENTLY style of
>> using multiple transactions makes it necessary to leave a mess behind
>> in the event of an error or hard crash. Is someone going to get around
>> to fixing the problem for CREATE INDEX CONCURRENTLY (e.g., having
>> extra steps to drop the useless index during recovery)? IIRC, this was
>> always the plan.
>
> Doing catalog changes in recovery is frought with problems. Essentially
> requires starting one worker per database, before allowing access.

The "plan" was to add more layers PG_TRY and transactions so that if
there was an error during building the index all the remnants of the
failed index build got cleaned up. But when I went tried to actually
do it the problem seemed to metastatize and it was going to require
two or three layers of messy nested PG_TRY and extra transactions.
Perhaps there's a cleaner way to structure it and I should look again.

I don't recall ever having a plan to do anything in recovery. I think
we did talk about why it was hard to mark hash indexes invalid during
recovery which was probably the same problem.

-- 
greg



Re: [HACKERS] Logical replication in the same cluster

From
Peter Eisentraut
Date:
On 4/30/17 20:31, Andres Freund wrote:
> On 2017-04-26 23:41:51 +0200, Petr Jelinek wrote:
>> Yes that's result of how logical replication slots work, the transaction
>> that needs to finish is your transaction. It can be worked around by
>> creating the slot manually via the SQL interface for example and create
>> the subscription using WITH (NOCREATE SLOT, SLOT NAME = 'your slot') .
> 
> Is there any chance the creation of the slot could be moved ahead, to
> before an xid has been assigned?

The trend has rather been to do most of the stuff before creating the
slot, so that all the error checking is done.  See for example
dcb39c37c1d3b90115e1501af8efb7af59c341c3.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] Logical replication in the same cluster

From
Petr Jelinek
Date:
On 02/05/17 04:14, Peter Eisentraut wrote:
> On 4/30/17 20:31, Andres Freund wrote:
>> On 2017-04-26 23:41:51 +0200, Petr Jelinek wrote:
>>> Yes that's result of how logical replication slots work, the transaction
>>> that needs to finish is your transaction. It can be worked around by
>>> creating the slot manually via the SQL interface for example and create
>>> the subscription using WITH (NOCREATE SLOT, SLOT NAME = 'your slot') .
>>
>> Is there any chance the creation of the slot could be moved ahead, to
>> before an xid has been assigned?
> 
> The trend has rather been to do most of the stuff before creating the
> slot, so that all the error checking is done.  See for example
> dcb39c37c1d3b90115e1501af8efb7af59c341c3.
> 

Yes because otherwise we risk leaving slot on the upstream if the
command fails downstream. We could move the slot creation outside of tx
as I proposed, then we risk leaving subscription downstream without slot
upstream though but I think that's less of a issues from user
friendliness and mainly safety (slot reserves wal and global catalog
xmin while subscription does not reserve anything except the name of the
subscription).

--  Petr Jelinek                  http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training &
Services



Re: [HACKERS] Logical replication in the same cluster

From
Andres Freund
Date:
On 2017-05-02 09:17:27 +0200, Petr Jelinek wrote:
> Yes because otherwise we risk leaving slot on the upstream if the
> command fails downstream.

Shouldn't temporary slots be able to solve that concern?  Create it as
temporary at the beginning, mark it as permanent at the end?

Greetings,

Andres Freund



Re: [HACKERS] Logical replication in the same cluster

From
Petr Jelinek
Date:
On 02/05/17 16:37, Andres Freund wrote:
> On 2017-05-02 09:17:27 +0200, Petr Jelinek wrote:
>> Yes because otherwise we risk leaving slot on the upstream if the
>> command fails downstream.
> 
> Shouldn't temporary slots be able to solve that concern?  Create it as
> temporary at the beginning, mark it as permanent at the end?
> 

So we need ALTER_REPLICATION_SLOT? :)

But that aside, based on the conversation nearby [1], we'll see if we
even want to create slots in CREATE SUBSCRIPTION.

[1]

https://www.postgresql.org/message-id/flat/CA%2BTgmoZmkbpAWRzVKDVcHnTBkYjJEFS8%3D09RL-G3zgdozCLFHQ%40mail.gmail.com#CA+TgmoZmkbpAWRzVKDVcHnTBkYjJEFS8=09RL-G3zgdozCLFHQ@mail.gmail.com

--  Petr Jelinek                  http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training &
Services



Re: [HACKERS] Logical replication in the same cluster

From
Peter Eisentraut
Date:
On 4/27/17 15:06, Peter Eisentraut wrote:
> On 4/27/17 04:08, Petr Jelinek wrote:
>> Note that the workaround for all of this is not all that complex, you do
>> same thing (create slot manually) you'd do for physical replication with
>> slots.
> 
> Maybe we should just document this issue for now.

done

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] Logical replication in the same cluster

From
Albe Laurenz
Date:
Tom Lane wrote:
> Petr Jelinek <petr.jelinek@2ndquadrant.com> writes:
>> On 26/04/17 18:59, Bruce Momjian wrote:
>>> ... it just hangs.  My server logs say:
> 
>> Yes that's result of how logical replication slots work, the transaction
>> that needs to finish is your transaction. It can be worked around by
>> creating the slot manually via the SQL interface for example and create
>> the subscription using WITH (NOCREATE SLOT, SLOT NAME = 'your slot') .
> 
> If that's a predictable deadlock, I think a minimum expectation is that
> the system should notice it and throw an error, not just hang.  (Then
> the error could give a hint about how to work around it.)  But the case
> Bruce has in mind doesn't seem like a crazy use-case to me.  Can't we
> make it "just work"?

+1

I think that many people who start experimenting with logical replication
will run into this (like I did).

If nothing else, it's bad PR.
People will get the first impression that logical replication doesn't
work well.

Yours,
Laurenz Albe