Thread: [HACKERS] Logical replication in the same cluster
I tried setting up logical replication on the same server between two
different databases, and got, from database test:

    test=> CREATE TABLE test (x INT PRIMARY KEY);
    CREATE TABLE
    test=>
    test=> INSERT INTO test VALUES (1);
    INSERT 0 1
    test=> CREATE PUBLICATION mypub FOR TABLE test;
    CREATE PUBLICATION

then from database test2:

    test2=> CREATE TABLE test (x INT PRIMARY KEY);
    CREATE TABLE
    test2=> CREATE SUBSCRIPTION mysub CONNECTION 'dbname=test port=5432'
    PUBLICATION mypub;
    NOTICE: synchronized table states

and it just hangs.  My server logs say:

    2017-04-26 12:50:53.694 EDT [29363] LOG: logical decoding found initial
    starting point at 0/15FF3E0
    2017-04-26 12:50:53.694 EDT [29363] DETAIL: 1 transaction needs to
    finish.

Is this expected?  I can get it working from two different clusters.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +
On 26/04/17 18:59, Bruce Momjian wrote:
> I tried setting up logical replication on the same server between two
> different databases, and got, from database test:
>
> test=> CREATE TABLE test (x INT PRIMARY KEY);
> CREATE TABLE
> test=>
> test=> INSERT INTO test VALUES (1);
> INSERT 0 1
> test=> CREATE PUBLICATION mypub FOR TABLE test;
> CREATE PUBLICATION
>
> then from database test2:
>
> test2=> CREATE TABLE test (x INT PRIMARY KEY);
> CREATE TABLE
> test2=> CREATE SUBSCRIPTION mysub CONNECTION 'dbname=test port=5432'
> PUBLICATION mypub;
> NOTICE: synchronized table states
>
> and it just hangs.  My server logs say:
>
> 2017-04-26 12:50:53.694 EDT [29363] LOG: logical decoding found initial
> starting point at 0/15FF3E0
> 2017-04-26 12:50:53.694 EDT [29363] DETAIL: 1 transaction needs to
> finish.
>
> Is this expected?  I can get it working from two different clusters.

Yes, that's a result of how logical replication slots work: the transaction
that needs to finish is your own transaction.  It can be worked around by
creating the slot manually via the SQL interface, for example, and creating
the subscription using WITH (NOCREATE SLOT, SLOT NAME = 'your slot').

--
  Petr Jelinek                  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services
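For concreteness, a sketch of the workaround Petr describes, using the
publication, subscription, and connection string from Bruce's example.  The
CREATE SUBSCRIPTION option spelling follows the development-era grammar
quoted above and may differ from the released syntax.  The idea is that the
slot is created from an ordinary session rather than from inside the CREATE
SUBSCRIPTION transaction, so there is no pending transaction of your own for
the slot's snapshot builder to wait on:

    -- on the publisher database (test): create the logical slot up front,
    -- using the built-in pgoutput plugin
    SELECT pg_create_logical_replication_slot('mysub', 'pgoutput');

    -- on the subscriber database (test2): attach to the pre-created slot
    -- instead of asking CREATE SUBSCRIPTION to create one
    CREATE SUBSCRIPTION mysub
        CONNECTION 'dbname=test port=5432'
        PUBLICATION mypub
        WITH (NOCREATE SLOT, SLOT NAME = 'mysub');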
On Wed, Apr 26, 2017 at 11:41:51PM +0200, Petr Jelinek wrote:
> On 26/04/17 18:59, Bruce Momjian wrote:
> > I tried setting up logical replication on the same server between two
> > different databases, and got, from database test:
> >
> > test=> CREATE TABLE test (x INT PRIMARY KEY);
> > CREATE TABLE
> > test=>
> > test=> INSERT INTO test VALUES (1);
> > INSERT 0 1
> > test=> CREATE PUBLICATION mypub FOR TABLE test;
> > CREATE PUBLICATION
> >
> > then from database test2:
> >
> > test2=> CREATE TABLE test (x INT PRIMARY KEY);
> > CREATE TABLE
> > test2=> CREATE SUBSCRIPTION mysub CONNECTION 'dbname=test port=5432'
> > PUBLICATION mypub;
> > NOTICE: synchronized table states
> >
> > and it just hangs.  My server logs say:
> >
> > 2017-04-26 12:50:53.694 EDT [29363] LOG: logical decoding found initial
> > starting point at 0/15FF3E0
> > 2017-04-26 12:50:53.694 EDT [29363] DETAIL: 1 transaction needs to
> > finish.
> >
> > Is this expected?  I can get it working from two different clusters.
>
> Yes, that's a result of how logical replication slots work: the transaction
> that needs to finish is your own transaction.  It can be worked around by
> creating the slot manually via the SQL interface, for example, and creating
> the subscription using WITH (NOCREATE SLOT, SLOT NAME = 'your slot').

Oh, OK.  Is there a way we can give users a hint to do that if they try
what I did?

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +
Petr Jelinek <petr.jelinek@2ndquadrant.com> writes:
> On 26/04/17 18:59, Bruce Momjian wrote:
>> ... it just hangs.  My server logs say:

> Yes, that's a result of how logical replication slots work: the transaction
> that needs to finish is your own transaction.  It can be worked around by
> creating the slot manually via the SQL interface, for example, and creating
> the subscription using WITH (NOCREATE SLOT, SLOT NAME = 'your slot').

If that's a predictable deadlock, I think a minimum expectation is that
the system should notice it and throw an error, not just hang.  (Then the
error could give a hint about how to work around it.)  But the case Bruce
has in mind doesn't seem like a crazy use-case to me.  Can't we make it
"just work"?

			regards, tom lane
On Thu, Apr 27, 2017 at 7:02 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Petr Jelinek <petr.jelinek@2ndquadrant.com> writes:
>> On 26/04/17 18:59, Bruce Momjian wrote:
>>> ... it just hangs.  My server logs say:
>
>> Yes, that's a result of how logical replication slots work: the transaction
>> that needs to finish is your own transaction.  It can be worked around by
>> creating the slot manually via the SQL interface, for example, and creating
>> the subscription using WITH (NOCREATE SLOT, SLOT NAME = 'your slot').
>
> If that's a predictable deadlock, I think a minimum expectation is that
> the system should notice it and throw an error, not just hang.  (Then the
> error could give a hint about how to work around it.)  But the case Bruce
> has in mind doesn't seem like a crazy use-case to me.  Can't we make it
> "just work"?

Perhaps using some detection with the replication origins?  Just an
instinctive idea.  The current behavior is confusing for users; I have
fallen into this trap a couple of times already.
--
Michael
On Wed, Apr 26, 2017 at 6:02 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Petr Jelinek <petr.jelinek@2ndquadrant.com> writes:
>> On 26/04/17 18:59, Bruce Momjian wrote:
>>> ... it just hangs.  My server logs say:
>
>> Yes, that's a result of how logical replication slots work: the transaction
>> that needs to finish is your own transaction.  It can be worked around by
>> creating the slot manually via the SQL interface, for example, and creating
>> the subscription using WITH (NOCREATE SLOT, SLOT NAME = 'your slot').
>
> If that's a predictable deadlock, I think a minimum expectation is that
> the system should notice it and throw an error, not just hang.  (Then the
> error could give a hint about how to work around it.)  But the case Bruce
> has in mind doesn't seem like a crazy use-case to me.  Can't we make it
> "just work"?

+1.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 4/26/17 19:18, Michael Paquier wrote:
>> If that's a predictable deadlock, I think a minimum expectation is that
>> the system should notice it and throw an error, not just hang.  (Then the
>> error could give a hint about how to work around it.)  But the case Bruce
>> has in mind doesn't seem like a crazy use-case to me.  Can't we make it
>> "just work"?
>
> Perhaps using some detection with the replication origins?  Just an
> instinctive idea.  The current behavior is confusing for users; I have
> fallen into this trap a couple of times already.

We had some discussions early on about detecting connections to the same
server, but it's not entirely clear how to do that, and it didn't seem
worth it at the time.

--
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:
>>> If that's a predictable deadlock, I think a minimum expectation is that
>>> the system should notice it and throw an error, not just hang.

> We had some discussions early on about detecting connections to the same
> server, but it's not entirely clear how to do that, and it didn't seem
> worth it at the time.

I wonder whether we actually need to detect connections to the same server
per se.  I'm thinking about the one end taking some special heavyweight
lock, and the other end taking the same lock, which would generally be free
as long as the two ends aren't on the same server.  Cascading replication
might be a problem though ...

			regards, tom lane
On 27/04/17 04:50, Tom Lane wrote:
> Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:
>>>> If that's a predictable deadlock, I think a minimum expectation is that
>>>> the system should notice it and throw an error, not just hang.
>
>> We had some discussions early on about detecting connections to the same
>> server, but it's not entirely clear how to do that, and it didn't seem
>> worth it at the time.
>
> I wonder whether we actually need to detect connections to the same server
> per se.  I'm thinking about the one end taking some special heavyweight
> lock, and the other end taking the same lock, which would generally be free
> as long as the two ends aren't on the same server.  Cascading replication
> might be a problem though ...

Well, cascading might not be a problem: this issue exists only during slot
creation, which is a one-time operation.  That's why the workaround solves
it.  But I don't see what we could lock that's common between publisher and
subscriber, unless we invent some database object specifically for this
purpose.

My idea in the original thread was to put the xid and sysid somewhere in
shared memory when creating a subscription, and to check on the other side
whether the sysid is the same as the local one and the xid is still active.
It would serialize subscription creation, but I don't see that as a big
issue; it's not common to create thousands of subscriptions in parallel,
nor is this something where we care about shaving microseconds of runtime.

Back when writing the original patch set, I was also playing with the idea
of having CREATE SUBSCRIPTION do multiple committed steps, in similar
fashion to CREATE INDEX CONCURRENTLY, but that leaves a mess behind on
failure, which also wasn't a very popular outcome.  I wonder how bad it
would be if we created all the stuff for the subscription but in disabled
form, then committed, then created the slot outside of a transaction (slot
creation is not transactional anyway), and then switched the subscription
to enabled (if needed) in the next transaction.  It would still leave the
subscription behind on failure, but a) the user would see the failure, and
b) the subscription would be inactive, so no active harm from it.  We also
already prevent running CREATE SUBSCRIPTION inside a transaction block when
automatic slot creation is chosen, so there is no difference from that
perspective.

Just for info, in pglogical we solve this by having the worker create the
slot, not the user command, so it just works.  The reason I didn't do that
in core is that in practice it does not seem very user friendly when there
are errors (not enough free slots, connecting not just to the same server
but to the same database, etc.), because users will only see the errors in
the logs after the fact (and often they don't look).  I am already unhappy
that we have no facility for a background worker to save its last error,
before dying, somewhere accessible via SQL, and I'd rather not hide even
more errors in the log.

Note that the workaround for all of this is not all that complex; you do
the same thing (create the slot manually) that you'd do for physical
replication with slots.

Thoughts?

--
  Petr Jelinek                  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services
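For reference, the two pieces of state this proposal would publish in shared
memory, the cluster's system identifier and the creating transaction's xid,
are already visible at the SQL level in recent releases.  This is only an
illustration; the actual check would live in the server code on the
walsender side, not in SQL:

    -- the cluster identity the subscriber-side backend would record
    SELECT system_identifier FROM pg_control_system();

    -- the xid of the current transaction, if one has been assigned
    SELECT txid_current_if_assigned();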
On 4/27/17 04:08, Petr Jelinek wrote:
> Note that the workaround for all of this is not all that complex; you do
> the same thing (create the slot manually) that you'd do for physical
> replication with slots.

Maybe we should just document this issue for now.

--
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Thu, Apr 27, 2017 at 4:08 AM, Petr Jelinek
<petr.jelinek@2ndquadrant.com> wrote:
> Back when writing the original patch set, I was also playing with the idea
> of having CREATE SUBSCRIPTION do multiple committed steps, in similar
> fashion to CREATE INDEX CONCURRENTLY, but that leaves a mess behind on
> failure, which also wasn't a very popular outcome.  I wonder how bad it
> would be if we created all the stuff for the subscription but in disabled
> form, then committed, then created the slot outside of a transaction (slot
> creation is not transactional anyway), and then switched the subscription
> to enabled (if needed) in the next transaction.  It would still leave the
> subscription behind on failure, but a) the user would see the failure, and
> b) the subscription would be inactive, so no active harm from it.  We also
> already prevent running CREATE SUBSCRIPTION inside a transaction block when
> automatic slot creation is chosen, so there is no difference from that
> perspective.

Sounds like a solid approach.  There's no way to end up with a remote object
without also ending up with a logical object, which seems like it greatly
reduces the chances of confusion and error.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 2017-04-26 23:41:51 +0200, Petr Jelinek wrote:
> Yes, that's a result of how logical replication slots work: the transaction
> that needs to finish is your own transaction.  It can be worked around by
> creating the slot manually via the SQL interface, for example, and creating
> the subscription using WITH (NOCREATE SLOT, SLOT NAME = 'your slot').

Is there any chance the creation of the slot could be moved ahead, to
before an xid has been assigned?

- Andres
On Fri, Apr 28, 2017 at 9:28 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Thu, Apr 27, 2017 at 4:08 AM, Petr Jelinek
> <petr.jelinek@2ndquadrant.com> wrote:
>> Back when writing the original patch set, I was also playing with the idea
>> of having CREATE SUBSCRIPTION do multiple committed steps, in similar
>> fashion to CREATE INDEX CONCURRENTLY, but that leaves a mess behind on
>> failure, which also wasn't a very popular outcome.

There is no inherent reason why the CREATE INDEX CONCURRENTLY style of
using multiple transactions makes it necessary to leave a mess behind in
the event of an error or hard crash.  Is someone going to get around to
fixing the problem for CREATE INDEX CONCURRENTLY (e.g., having extra steps
to drop the useless index during recovery)?  IIRC, this was always the plan.

--
Peter Geoghegan
VMware vCenter Server
https://www.vmware.com/
On 2017-05-01 11:22:47 -0700, Peter Geoghegan wrote:
> On Fri, Apr 28, 2017 at 9:28 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> > On Thu, Apr 27, 2017 at 4:08 AM, Petr Jelinek
> > <petr.jelinek@2ndquadrant.com> wrote:
> >> Back when writing the original patch set, I was also playing with the idea
> >> of having CREATE SUBSCRIPTION do multiple committed steps, in similar
> >> fashion to CREATE INDEX CONCURRENTLY, but that leaves a mess behind on
> >> failure, which also wasn't a very popular outcome.
>
> There is no inherent reason why the CREATE INDEX CONCURRENTLY style of
> using multiple transactions makes it necessary to leave a mess behind in
> the event of an error or hard crash.  Is someone going to get around to
> fixing the problem for CREATE INDEX CONCURRENTLY (e.g., having extra steps
> to drop the useless index during recovery)?  IIRC, this was always the plan.

Doing catalog changes in recovery is fraught with problems.  Essentially it
requires starting one worker per database before allowing access.

- Andres
On Mon, May 1, 2017 at 11:24 AM, Andres Freund <andres@anarazel.de> wrote:
> Doing catalog changes in recovery is fraught with problems.  Essentially it
> requires starting one worker per database before allowing access.

Do you think it's worth just covering the case where you get an error, such
as a duplicate violation?  I imagine that that's the much more common case.

--
Peter Geoghegan
VMware vCenter Server
https://www.vmware.com/
On 2017-05-01 11:31:53 -0700, Peter Geoghegan wrote:
> On Mon, May 1, 2017 at 11:24 AM, Andres Freund <andres@anarazel.de> wrote:
> > Doing catalog changes in recovery is fraught with problems.  Essentially it
> > requires starting one worker per database before allowing access.
>
> Do you think it's worth just covering the case where you get an error, such
> as a duplicate violation?  I imagine that that's the much more common case.

What exactly are you proposing to do?  You mean catching errors in the
creating backend, if it didn't crash?  That doesn't strike me as a good
idea, because it'll push the likelihood of the issue down below where
people will see it, but it'll still be likely enough to create problems.

- Andres
On Mon, May 1, 2017 at 11:37 AM, Andres Freund <andres@anarazel.de> wrote:
> What exactly are you proposing to do?  You mean catching errors in the
> creating backend, if it didn't crash?

That is what I meant, though I'm not actually proposing to do anything.

> That doesn't strike me as a good idea, because it'll push the likelihood
> of the issue down below where people will see it, but it'll still be
> likely enough to create problems.

I was concerned about that too.  I have a hard time defending changes like
this to myself, but it doesn't hurt to ask.

--
Peter Geoghegan
VMware vCenter Server
https://www.vmware.com/
On 1 May 2017 at 19:24, Andres Freund <andres@anarazel.de> wrote:
>> There is no inherent reason why the CREATE INDEX CONCURRENTLY style of
>> using multiple transactions makes it necessary to leave a mess behind in
>> the event of an error or hard crash.  Is someone going to get around to
>> fixing the problem for CREATE INDEX CONCURRENTLY (e.g., having extra steps
>> to drop the useless index during recovery)?  IIRC, this was always the plan.
>
> Doing catalog changes in recovery is fraught with problems.  Essentially it
> requires starting one worker per database before allowing access.

The "plan" was to add more layers of PG_TRY and transactions so that if
there was an error during the index build, all the remnants of the failed
build got cleaned up.  But when I actually tried to do it, the problem
seemed to metastasize, and it was going to require two or three layers of
messy nested PG_TRY and extra transactions.  Perhaps there's a cleaner way
to structure it and I should look again.

I don't recall ever having a plan to do anything in recovery.  I think we
did talk about why it was hard to mark hash indexes invalid during
recovery, which was probably the same problem.

--
greg
On 4/30/17 20:31, Andres Freund wrote:
> On 2017-04-26 23:41:51 +0200, Petr Jelinek wrote:
>> Yes, that's a result of how logical replication slots work: the transaction
>> that needs to finish is your own transaction.  It can be worked around by
>> creating the slot manually via the SQL interface, for example, and creating
>> the subscription using WITH (NOCREATE SLOT, SLOT NAME = 'your slot').
>
> Is there any chance the creation of the slot could be moved ahead, to
> before an xid has been assigned?

The trend has rather been to do most of the stuff before creating the slot,
so that all the error checking is done.  See for example
dcb39c37c1d3b90115e1501af8efb7af59c341c3.

--
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 02/05/17 04:14, Peter Eisentraut wrote:
> On 4/30/17 20:31, Andres Freund wrote:
>> On 2017-04-26 23:41:51 +0200, Petr Jelinek wrote:
>>> Yes, that's a result of how logical replication slots work: the transaction
>>> that needs to finish is your own transaction.  It can be worked around by
>>> creating the slot manually via the SQL interface, for example, and creating
>>> the subscription using WITH (NOCREATE SLOT, SLOT NAME = 'your slot').
>>
>> Is there any chance the creation of the slot could be moved ahead, to
>> before an xid has been assigned?
>
> The trend has rather been to do most of the stuff before creating the slot,
> so that all the error checking is done.  See for example
> dcb39c37c1d3b90115e1501af8efb7af59c341c3.

Yes, because otherwise we risk leaving a slot on the upstream if the
command fails downstream.  We could move the slot creation outside of the
transaction, as I proposed, but then we risk leaving a subscription
downstream without a slot upstream.  I think that's less of an issue,
though, both for user friendliness and, mainly, for safety: a slot reserves
WAL and the global catalog xmin, while a subscription does not reserve
anything except the subscription name.

--
  Petr Jelinek                  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services
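To make that asymmetry concrete: what a slot pins on the publisher is
visible in the pg_replication_slots view, while a subscription by itself
reserves nothing comparable on the remote side.  A quick way to inspect the
WAL position and catalog horizon a slot holds back:

    -- on the publisher: each slot shows the WAL and catalog xmin it retains
    SELECT slot_name, slot_type, restart_lsn, catalog_xmin
    FROM pg_replication_slots;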
On 2017-05-02 09:17:27 +0200, Petr Jelinek wrote:
> Yes, because otherwise we risk leaving a slot on the upstream if the
> command fails downstream.

Shouldn't temporary slots be able to solve that concern?  Create it as
temporary at the beginning, mark it as permanent at the end?

Greetings,

Andres Freund
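At the SQL level, only the first half of that idea is reachable today:
pg_create_logical_replication_slot() takes a boolean third argument that
makes the slot temporary (it is dropped automatically at session end or on
error), but there is no SQL-visible way to flip a temporary slot to
permanent afterwards, which is what the reply below alludes to.  A minimal
sketch, with an illustrative slot name:

    -- create a temporary logical slot; it disappears if the creating
    -- session errors out or disconnects before setup completes
    SELECT pg_create_logical_replication_slot('mysub_tmp', 'pgoutput', true);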
On 02/05/17 16:37, Andres Freund wrote:
> On 2017-05-02 09:17:27 +0200, Petr Jelinek wrote:
>> Yes, because otherwise we risk leaving a slot on the upstream if the
>> command fails downstream.
>
> Shouldn't temporary slots be able to solve that concern?  Create it as
> temporary at the beginning, mark it as permanent at the end?

So we need ALTER_REPLICATION_SLOT? :)

But that aside, based on the conversation nearby [1], we'll see if we even
want to create slots in CREATE SUBSCRIPTION.

[1] https://www.postgresql.org/message-id/flat/CA%2BTgmoZmkbpAWRzVKDVcHnTBkYjJEFS8%3D09RL-G3zgdozCLFHQ%40mail.gmail.com#CA+TgmoZmkbpAWRzVKDVcHnTBkYjJEFS8=09RL-G3zgdozCLFHQ@mail.gmail.com

--
  Petr Jelinek                  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services
On 4/27/17 15:06, Peter Eisentraut wrote:
> On 4/27/17 04:08, Petr Jelinek wrote:
>> Note that the workaround for all of this is not all that complex; you do
>> the same thing (create the slot manually) that you'd do for physical
>> replication with slots.
>
> Maybe we should just document this issue for now.

done

--
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Tom Lane wrote:
> Petr Jelinek <petr.jelinek@2ndquadrant.com> writes:
>> On 26/04/17 18:59, Bruce Momjian wrote:
>>> ... it just hangs.  My server logs say:
>
>> Yes, that's a result of how logical replication slots work: the transaction
>> that needs to finish is your own transaction.  It can be worked around by
>> creating the slot manually via the SQL interface, for example, and creating
>> the subscription using WITH (NOCREATE SLOT, SLOT NAME = 'your slot').
>
> If that's a predictable deadlock, I think a minimum expectation is that
> the system should notice it and throw an error, not just hang.  (Then the
> error could give a hint about how to work around it.)  But the case Bruce
> has in mind doesn't seem like a crazy use-case to me.  Can't we make it
> "just work"?

+1

I think that many people who start experimenting with logical replication
will run into this (like I did).  If nothing else, it's bad PR.  People
will get the first impression that logical replication doesn't work well.

Yours,
Laurenz Albe