Thread: Strange issues with 9.2 pg_basebackup & replication
Doing some beta testing, I managed to produce this issue using the daily snapshot from Tuesday:

1. Created the master server, loaded it with a couple of dummy databases.

2. Created the standby server.

3. Did pg_basebackup -x stream on the standby server.

4. Started the standby server.

5. Realized I'd forgotten to create a recovery.conf. Shut down the standby server, wrote a recovery.conf, and restarted it.

6. The standby server looked normal and appeared to be replicating. The master server showed it as replicating:

postgres=# select * from pg_stat_replication;
 pid  | usesysid |  usename   | application_name |  client_addr   | client_hostname | client_port |         backend_start         |   state   | sent_location | write_location | flush_location | replay_location | sync_priority | sync_state
------+----------+------------+------------------+----------------+-----------------+-------------+-------------------------------+-----------+---------------+----------------+----------------+-----------------+---------------+------------
 1278 |    16393 | replicator | walreceiver      | ###.###.61.227 |                 |       45391 | 2012-05-13 18:44:18.603122+00 | streaming | 0/70000B8     | 0/70000B8      | 0/70000B8      | 0/70000E0       |             0 | async

7. Did a "create table" on the master server, creating an empty table.

8. Got this fatal error on the standby server:

LOG:  record with incorrect prev-link 0/70000B8 at 0/70000E0
LOG:  record with incorrect prev-link 0/70000B8 at 0/70000E0

... this error message repeated every 5s.

Either the swap of the standby into proper standby mode should have been OK (since there were no writes on the master or the standby in that time), or it should have failed immediately. Clearly there's something broken here.

Note that I did more or less the same test on 9.1, and it didn't break in this way.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
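[Editor's note: for reference, the recovery.conf that was missing in step 5 would be a minimal file along these lines. The host, port, and user here are illustrative placeholders, not taken from Josh's setup.]

```ini
# Minimal 9.2-style recovery.conf for a streaming standby (illustrative values)
standby_mode = 'on'
primary_conninfo = 'host=192.0.2.10 port=5432 user=replicator'
recovery_target_timeline = 'latest'
```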
More issues: the pg_basebackup -x stream on the cascading replica won't complete until the xlog rotates on the master. (Again, this is Tuesday's snapshot.)

Servers:
.226 == master-master, the writeable master
.227 == master-replica, a direct replica of master-master
.228 == replica-replica, a cascading replica of master-replica

1. Recreated master-master, loaded it with some dummy databases.

2. Created master-replica.

3. Took a pg_basebackup -x stream of master-master on master-replica.

4. Edited recovery.conf and started master-replica. It started normally.

5. Created a table on master-master. The change replicated to master-replica.

6. Created replica-replica. Started a pg_basebackup -x stream from master-replica.

7. pg_basebackup hung forever. Output of -v:

xlog start point: 0/A000020
pg_basebackup: starting background WAL receiver
xlog end point: 0/A01C188
pg_basebackup: waiting for background process to finish streaming...

8. Tried creating a table on master-master to generate a write. This had no effect (although the table did replicate to master-replica).

Here's pg_stat_replication on master-master while the basebackup is hung:

 1385 | 16393 | replicator | walreceiver | ###.###.61.227 | | 45396 | 2012-05-13 19:05:11.972471+00 | streaming | 0/A024F50 | 0/A024F50 | 0/A024F50 | 0/A024F50 | 0 | async

Here's pg_stat_replication on master-replica while the basebackup is hung:

 1243 | 16393 | replicator | pg_basebackup | ###.###.61.228 | | 49218 | 2012-05-13 19:06:07.606378+00 | startup   | 0/0       | | | | 0 | async
 1244 | 16393 | replicator | pg_basebackup | ###.###.61.228 | | 49219 | 2012-05-13 19:06:07.611996+00 | streaming | 0/A024F50 | | | | 0 | async

9. Ran pg_switch_xlog() on the master.

10. The basebackup completed on replica-replica.

11. Edited recovery.conf on replica-replica and started it. It works fine.

12. Created a new table on master-master; the change replicated to replica-replica.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
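[Editor's note: the forced WAL rotation in step 9 can be issued from psql on the master; note that the function's actual name in 9.x is pg_switch_xlog().]

```sql
-- Run on the master: closes out the current WAL segment and starts a new
-- one, which is what allowed the hung basebackup to finish in this report.
SELECT pg_switch_xlog();
```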
More issues: promoting an intermediate standby breaks replication.

To be a bit blunt here, has anyone tested cascading replication *at all* before this?

So, same setup as the previous message.

1. Shut down master-master.

2. pg_ctl promote master-replica

3. Replication breaks. Error message on replica-replica:

FATAL:  timeline 2 of the primary does not match recovery target timeline 1

4. No amount of adjustment on replica-replica will get it replicating again.

Note that replica-replica was configured with:

recovery_target_timeline = 'latest'

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
On 13 May 2012 20:23, Josh Berkus <josh@agliodbs.com> wrote:
> More issues: the pg_basebackup -x stream on the cascading replica won't
> complete until the xlog rotates on the master. (Again, this is
> Tuesday's snapshot.)

This is already on the open items list:
http://wiki.postgresql.org/wiki/PostgreSQL_9.2_Open_Items#pg_basebackup.2Fpg_receivexlog

--
Thom
On May 13, 2012, at 3:08 PM, Josh Berkus wrote:
> More issues: promoting intermediate standby breaks replication.
>
> To be a bit blunt here, has anyone tested cascading replication *at all*
> before this?

Josh, do you have scripts that you're using to do this testing? If so, can you post them somewhere?

AFAIK we don't have any regression tests for all this replication stuff, but ISTM that we need some...

--
Jim C. Nasby, Database Architect       jim@nasby.net
512.569.9461 (cell)                    http://jim.nasby.net
Jim,

I didn't get as far as running any tests, actually. All I did was try to set up 3 servers in cascading replication. Then I tried shutting down master-master and promoting master-replica. That's it.

----- Original Message -----
> On May 13, 2012, at 3:08 PM, Josh Berkus wrote:
> > More issues: promoting intermediate standby breaks replication.
> >
> > To be a bit blunt here, has anyone tested cascading replication *at all*
> > before this?
>
> Josh, do you have scripts that you're using to do this testing? If so
> can you post them somewhere?
>
> AFAIK we don't have any regression tests for all this replication
> stuff, but ISTM that we need some...
> --
> Jim C. Nasby, Database Architect       jim@nasby.net
> 512.569.9461 (cell)                    http://jim.nasby.net
On 13 May 2012 16:08, Josh Berkus <josh@agliodbs.com> wrote:
> More issues: promoting intermediate standby breaks replication.
>
> To be a bit blunt here, has anyone tested cascading replication *at all*
> before this?
>
> So, same setup as previous message.
>
> 1. Shut down master-master.
>
> 2. pg_ctl promote master-replica
>
> 3. Replication breaks. Error message on replica-replica:
>
> FATAL:  timeline 2 of the primary does not match recovery target timeline 1
>
> 4. No amount of adjustment on replica-replica will get it replicating
> again.
>
> Note that replica-replica was configured with:
>
> recovery_target_timeline = 'latest'

I can recreate this "issue", although the docs say:

"Promoting a cascading standby terminates the immediate downstream replication connections which it serves. This is because the timeline becomes different between standbys, and they can no longer continue replication. The affected standby(s) may reconnect to reestablish streaming replication."
(http://www.postgresql.org/docs/9.2/static/warm-standby.html#CASCADING-REPLICATION)

However, this isn't true even when I restart the standby. I've been informed that this should work fine if a WAL archive has been configured (which should be used anyway).

But one new problem I appear to have is that once I set up archiving and restart, then try pg_basebackup, it gets stuck and never shows any progress. If I terminate pg_basebackup in this state and attempt to restart it more times than max_wal_senders, it can no longer run at all, because pg_basebackup didn't disconnect its stream and so ends up using all the senders. And these connections still show up in pg_stat_replication. My theory is that if archiving is enabled, and you restart postgres and then generate enough WAL that there is a file or two in the archive, pg_basebackup can't stream anything. Once I restart the server, it's fine and continues as normal. This has the same symptoms as the "pg_basebackup from running standby with streaming" issue.
Steps to recreate:

1) initdb a new cluster
2) Start the new cluster
3) Make an archive dir (in my case, /tmp/arch) and set the following:

wal_level = hot_standby
max_wal_senders = 3
archive_mode = on
archive_command = 'cp %p /tmp/arch/%f'

4) Set pg_hba.conf to allow streaming replication connections
5) Restart the cluster
6) Create a table and insert a few hundred thousand rows until /tmp/arch shows some WAL files
7) Run: pg_basebackup -x stream -D s1 -Pv

This actually does finish eventually, but it appears to need some encouragement by generating some WAL and issuing a checkpoint:

thom@swift:~/Development$ time pg_basebackup -x stream -D s1 -Pv
xlog start point: 0/4000020
pg_basebackup: starting background WAL receiver
53951/53951 kB (100%), 1/1 tablespace
xlog end point: 0/5DE15E0
pg_basebackup: waiting for background process to finish streaming...
pg_basebackup: base backup completed

real    2m37.456s
user    0m0.016s
sys     0m0.724s

If I terminate pg_basebackup and restart it without generating additional WAL, it doesn't appear to release the streaming connection ever (or at least not within my patience limit of a few minutes). And I can't free these connections without restarting the cluster.

But once I get the standby up and running and acting as a hot standby, and ignore the current issue with it getting stuck creating a standby from a standby, I still get the mismatched-timeline issue, so the addition of WAL archiving didn't appear to resolve this for me.

--
Thom
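[Editor's note: step 4's pg_hba.conf change would be an entry along these lines; the role name and network are illustrative placeholders, not taken from Thom's setup.]

```
# pg_hba.conf: allow the replicator role to make streaming replication
# connections from the standby's network (illustrative values)
host    replication    replicator    192.168.61.0/24    md5
```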
On Wed, May 16, 2012 at 1:36 AM, Thom Brown <thom@linux.com> wrote:
> However, this isn't true when I restart the standby. I've been
> informed that this should work fine if a WAL archive has been
> configured (which should be used anyway).

The WAL archive should be shared by master-replica and replica-replica, and recovery_target_timeline should be set to 'latest' on replica-replica. If you configure it that way, replica-replica will successfully reconnect to master-replica with no need to restart it.

> But one new problem I appear to have is that once I set up archiving
> and restart, then try pg_basebackup, it gets stuck and never shows any
> progress. If I terminate pg_basebackup in this state and attempt to
> restart it more times than max_wal_senders, it can no longer run, as
> pg_basebackup didn't disconnect the stream, so ends up using all
> senders. And these show up in pg_stat_replication. I have a theory
> that if archiving is enabled, restart postgres then generate some WAL
> to the point there is a file or two in the archive, pg_basebackup
> can't stream anything. Once I restart the server, it's fine and
> continues as normal. This has the same symptoms of the "pg_basebackup
> from running standby with streaming" issue.

This seems to be caused by the spread checkpoint which is requested by pg_basebackup. IOW, this looks like normal behavior rather than a bug or an issue. What if you specify the "-c fast" option to pg_basebackup?

Regards,

--
Fujii Masao
On Mon, May 14, 2012 at 4:04 AM, Josh Berkus <josh@agliodbs.com> wrote:
> Doing some beta testing, managed to produce this issue using the daily
> snapshot from Tuesday:
>
> 1. Created master server, loaded it with a couple dummy databases.
>
> 2. Created standby server.
>
> 3. Did pg_basebackup -x stream on standby server
>
> 4. Started standby server.
>
> 5. Realized I'd forgotten to create a recovery.conf. Shut down the
> standby server, wrote a recovery.conf, and restarted it.

Before restarting it, you need to run pg_basebackup and take a base backup onto the standby again. Since you started the standby without a recovery.conf, the series of WAL on the standby has become inconsistent with that on the master. So you need a fresh backup before restarting the standby.

Regards,

--
Fujii Masao
On 15 May 2012 13:15, Fujii Masao <masao.fujii@gmail.com> wrote:
> On Wed, May 16, 2012 at 1:36 AM, Thom Brown <thom@linux.com> wrote:
>> However, this isn't true when I restart the standby. I've been
>> informed that this should work fine if a WAL archive has been
>> configured (which should be used anyway).
>
> The WAL archive should be shared by master-replica and replica-replica,
> and recovery_target_timeline should be set to latest in replica-replica.
> If you configure that way, replica-replica would successfully reconnect to
> master-replica with no need to restart it.

I had set archive_command on the primary, then produced a base backup which would have copied the archive settings, but I also added a corresponding restore_command setting, so everything was pointing at the same archive.

>> But one new problem I appear to have is that once I set up archiving
>> and restart, then try pg_basebackup, it gets stuck and never shows any
>> progress. If I terminate pg_basebackup in this state and attempt to
>> restart it more times than max_wal_senders, it can no longer run, as
>> pg_basebackup didn't disconnect the stream, so ends up using all
>> senders. And these show up in pg_stat_replication. I have a theory
>> that if archiving is enabled, restart postgres then generate some WAL
>> to the point there is a file or two in the archive, pg_basebackup
>> can't stream anything. Once I restart the server, it's fine and
>> continues as normal. This has the same symptoms of the "pg_basebackup
>> from running standby with streaming" issue.
>
> This seems to be caused by spread checkpoint which is requested by
> pg_basebackup. IOW, this looks a normal behavior rather than a bug
> or an issue. What if you specify "-c fast" option in pg_basebackup?

Yes, it works fine with that option. And it appears this isn't to do with there being an archive, as I get the same symptoms without setting one up.

But in any case, shouldn't the replication connection be terminated when pg_basebackup is terminated?

--
Thom
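[Editor's note: for reference, the fast-checkpoint invocation discussed above would look like this; the -D target directory is the one from Thom's earlier repro steps.]

```
# Request an immediate checkpoint instead of a spread one, so the base
# backup starts without waiting out checkpoint_timeout:
pg_basebackup -x stream -c fast -D s1 -Pv
```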
Fujii,

Wait, are you telling me that we *still* can't remaster from streaming replication? Why wasn't that fixed in 9.2?

And: if we still have to ship logs, what's the point in even having cascading replication?

----- Original Message -----
> On Wed, May 16, 2012 at 1:36 AM, Thom Brown <thom@linux.com> wrote:
> > However, this isn't true when I restart the standby. I've been
> > informed that this should work fine if a WAL archive has been
> > configured (which should be used anyway).
>
> The WAL archive should be shared by master-replica and replica-replica,
> and recovery_target_timeline should be set to latest in replica-replica.
> If you configure that way, replica-replica would successfully reconnect
> to master-replica with no need to restart it.
>
> > But one new problem I appear to have is that once I set up archiving
> > and restart, then try pg_basebackup, it gets stuck and never shows any
> > progress. If I terminate pg_basebackup in this state and attempt to
> > restart it more times than max_wal_senders, it can no longer run, as
> > pg_basebackup didn't disconnect the stream, so ends up using all
> > senders. And these show up in pg_stat_replication. I have a theory
> > that if archiving is enabled, restart postgres then generate some WAL
> > to the point there is a file or two in the archive, pg_basebackup
> > can't stream anything. Once I restart the server, it's fine and
> > continues as normal. This has the same symptoms of the "pg_basebackup
> > from running standby with streaming" issue.
>
> This seems to be caused by spread checkpoint which is requested by
> pg_basebackup. IOW, this looks a normal behavior rather than a bug
> or an issue. What if you specify "-c fast" option in pg_basebackup?
>
> Regards,
>
> --
> Fujii Masao
> Before restarting it, you need to do pg_basebackup and make a base backup
> onto the standby again. Since you started the standby without recovery.conf,
> a series of WAL in the standby has gotten inconsistent with that in the master.
> So you need a fresh backup to restart the standby.

You're not understanding the bug. The problem is that the standby came up and reported that it was replicating OK, when clearly it wasn't.

--Josh
On Wed, May 16, 2012 at 2:29 AM, Thom Brown <thom@linux.com> wrote:
> On 15 May 2012 13:15, Fujii Masao <masao.fujii@gmail.com> wrote:
>> On Wed, May 16, 2012 at 1:36 AM, Thom Brown <thom@linux.com> wrote:
>>> However, this isn't true when I restart the standby. I've been
>>> informed that this should work fine if a WAL archive has been
>>> configured (which should be used anyway).
>>
>> The WAL archive should be shared by master-replica and replica-replica,
>> and recovery_target_timeline should be set to latest in replica-replica.
>> If you configure that way, replica-replica would successfully reconnect to
>> master-replica with no need to restart it.
>
> I had set archive_command on the primary, then produced a base
> backup which would have copied the archive settings, but I also added
> a corresponding restore_command setting, so everything was pointing
> at the same archive.

Hmm... when I do the same, replica-replica successfully reconnects to master-replica after I shut down master-master and promote master-replica. archive_command is the same on all three servers, restore_command is the same on the two standby servers (i.e., master-replica and replica-replica), and recovery_target_timeline is set to 'latest' on the two standbys.

>>> But one new problem I appear to have is that once I set up archiving
>>> and restart, then try pg_basebackup, it gets stuck and never shows any
>>> progress. If I terminate pg_basebackup in this state and attempt to
>>> restart it more times than max_wal_senders, it can no longer run, as
>>> pg_basebackup didn't disconnect the stream, so ends up using all
>>> senders. And these show up in pg_stat_replication. I have a theory
>>> that if archiving is enabled, restart postgres then generate some WAL
>>> to the point there is a file or two in the archive, pg_basebackup
>>> can't stream anything. Once I restart the server, it's fine and
>>> continues as normal. This has the same symptoms of the "pg_basebackup
>>> from running standby with streaming" issue.
>>
>> This seems to be caused by spread checkpoint which is requested by
>> pg_basebackup. IOW, this looks a normal behavior rather than a bug
>> or an issue. What if you specify "-c fast" option in pg_basebackup?
>
> Yes, it works fine with that option. And it appears this isn't to do
> with there being an archive as I get the same symptoms without setting
> one up.

Yes.

> But in any case, shouldn't the replication connection be
> terminated when pg_basebackup is terminated?

+1. To do this, we would need to define a SIGINT signal handler and make it send a QueryCancel packet when Ctrl-C is typed.

Regards,

--
Fujii Masao
On Wed, May 16, 2012 at 3:42 AM, Joshua Berkus <josh@agliodbs.com> wrote:
> Fujii,
>
> Wait, are you telling me that we *still* can't remaster from streaming replication?

What do you mean by "remaster"?

> And: if we still have to ship logs, what's the point in even having cascading replication?

At least cascading replication (1) allows you to adopt a more flexible configuration of servers, (2) reduces the number of standby servers which connect directly to the master, which reduces the overhead on the master, and (3) provides the infrastructure for the standby-only base backup feature.

Regards,

--
Fujii Masao
On Wed, May 16, 2012 at 3:43 AM, Joshua Berkus <josh@agliodbs.com> wrote:
>> Before restarting it, you need to do pg_basebackup and make a base backup
>> onto the standby again. Since you started the standby without recovery.conf,
>> a series of WAL in the standby has gotten inconsistent with that in the master.
>> So you need a fresh backup to restart the standby.
>
> You're not understanding the bug. The problem is that the standby came up
> and reported that it was replicating OK, when clearly it wasn't.

> 8. Got this fatal error on the standby server:
>
> LOG:  record with incorrect prev-link 0/70000B8 at 0/70000E0
> LOG:  record with incorrect prev-link 0/70000B8 at 0/70000E0
>
> ... this error message repeated every 5s.

According to your first report, ISTM you did get error messages.

Regards,

--
Fujii Masao
On 16 May 2012 11:36, Fujii Masao <masao.fujii@gmail.com> wrote:
> On Wed, May 16, 2012 at 2:29 AM, Thom Brown <thom@linux.com> wrote:
>> On 15 May 2012 13:15, Fujii Masao <masao.fujii@gmail.com> wrote:
>>> On Wed, May 16, 2012 at 1:36 AM, Thom Brown <thom@linux.com> wrote:
>>>> However, this isn't true when I restart the standby. I've been
>>>> informed that this should work fine if a WAL archive has been
>>>> configured (which should be used anyway).
>>>
>>> The WAL archive should be shared by master-replica and replica-replica,
>>> and recovery_target_timeline should be set to latest in replica-replica.
>>> If you configure that way, replica-replica would successfully reconnect to
>>> master-replica with no need to restart it.
>>
>> I had set archive_command on the primary, then produced a base
>> backup which would have copied the archive settings, but I also added
>> a corresponding restore_command setting, so everything was pointing
>> at the same archive.
>
> Hmm.. when doing the same, the replica-replica successfully reconnected
> to the master-replica after I shutdown the master-master and promoted the
> master-replica. archive_command is the same in three servers,
> restore_command is the same in two standby servers (i.e., master-replica
> and replica-replica), and recovery_target_timeline is set to 'latest' in two
> standby servers.

I didn't shut down the master-master, but I didn't expect to need to.

I also had recovery_target_timeline set to 'latest'. I also tried explicitly setting it to the new timeline, and got an error saying there was no such timeline.

>> But in any case, shouldn't the replication connection be
>> terminated when pg_basebackup is terminated?
>
> +1. To do this, we would need to define a SIGINT signal handler and make it
> send a QueryCancel packet when Ctrl-C is typed.

Also, could we provide some feedback when using the -c spread option when there isn't progress within a short period of time? Something like "Waiting for checkpoint. This can take up to %checkpoint_timeout%", or something similar, rather than seeing nothing happening and wondering if something has gone wrong. And also a note in the documentation saying that, on "quiet" clusters, it may take some time before the base backup commences.

In fact, since pg_start_backup will exhibit the same behaviour (i.e. no feedback when waiting for a checkpoint), maybe it should return a notice (if there are dirty pages) stating that it will complete when the next checkpoint occurs.

--
Thom
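[Editor's note: on the pg_start_backup point, the function already takes a second boolean argument that requests an immediate checkpoint, which avoids the wait being discussed.]

```
-- The second argument requests an immediate (fast) checkpoint rather than
-- a spread one, so the call returns without waiting for the next
-- scheduled checkpoint.
SELECT pg_start_backup('label', true);
```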
Well, that is a form of testing. :)

My point was that we need some kind of regression tests around all the new replication stuff, and if you had some scripts, they would be a useful starting point. But it sounds like you haven't gotten that far with it, so...

On 5/15/12 10:12 AM, Joshua Berkus wrote:
> Jim,
>
> I didn't get as far as running any tests, actually. All I did was try to
> set up 3 servers in cascading replication. Then I tried shutting down
> master-master and promoting master-replica. That's it.
>
> ----- Original Message -----
>> On May 13, 2012, at 3:08 PM, Josh Berkus wrote:
>>> More issues: promoting intermediate standby breaks replication.
>>>
>>> To be a bit blunt here, has anyone tested cascading replication *at all*
>>> before this?
>>
>> Josh, do you have scripts that you're using to do this testing? If so
>> can you post them somewhere?
>>
>> AFAIK we don't have any regression tests for all this replication
>> stuff, but ISTM that we need some...

--
Jim C. Nasby, Database Architect       jim@nasby.net
512.569.9461 (cell)                    http://jim.nasby.net
On 5/16/12 10:53 AM, Fujii Masao wrote:
> On Wed, May 16, 2012 at 3:43 AM, Joshua Berkus <josh@agliodbs.com> wrote:
>>> Before restarting it, you need to do pg_basebackup and make a base backup
>>> onto the standby again. Since you started the standby without recovery.conf,
>>> a series of WAL in the standby has gotten inconsistent with that in the master.
>>> So you need a fresh backup to restart the standby.
>>
>> You're not understanding the bug. The problem is that the standby came up
>> and reported that it was replicating OK, when clearly it wasn't.
>
>> 8. Got this fatal error on the standby server:
>>
>> LOG:  record with incorrect prev-link 0/70000B8 at 0/70000E0
>> LOG:  record with incorrect prev-link 0/70000B8 at 0/70000E0
>>
>> ... this error message repeated every 5s.
>
> According to your first report, ISTM you got error messages.

Only *after* it was correctly set up.

Josh's point is that if you flub the configuration, you should get an error, which is not what's happening now. Right now it just comes up and acts as if nothing's wrong.

--
Jim C. Nasby, Database Architect       jim@nasby.net
512.569.9461 (cell)                    http://jim.nasby.net
On Thu, May 17, 2012 at 1:07 AM, Thom Brown <thom@linux.com> wrote:
> On 16 May 2012 11:36, Fujii Masao <masao.fujii@gmail.com> wrote:
>> On Wed, May 16, 2012 at 2:29 AM, Thom Brown <thom@linux.com> wrote:
>>> On 15 May 2012 13:15, Fujii Masao <masao.fujii@gmail.com> wrote:
>>>> On Wed, May 16, 2012 at 1:36 AM, Thom Brown <thom@linux.com> wrote:
>>>>> However, this isn't true when I restart the standby. I've been
>>>>> informed that this should work fine if a WAL archive has been
>>>>> configured (which should be used anyway).
>>>>
>>>> The WAL archive should be shared by master-replica and replica-replica,
>>>> and recovery_target_timeline should be set to latest in replica-replica.
>>>> If you configure that way, replica-replica would successfully reconnect to
>>>> master-replica with no need to restart it.
>>>
>>> I had set archive_command on the primary, then produced a base
>>> backup which would have copied the archive settings, but I also added
>>> a corresponding restore_command setting, so everything was pointing
>>> at the same archive.
>>
>> Hmm.. when doing the same, the replica-replica successfully reconnected
>> to the master-replica after I shutdown the master-master and promoted the
>> master-replica. archive_command is the same in three servers,
>> restore_command is the same in two standby servers (i.e., master-replica
>> and replica-replica), and recovery_target_timeline is set to 'latest' in two
>> standby servers.
>
> I didn't shut down the master-master, but I didn't expect to need to.
>
> I also had recovery_target_timeline set to latest. I also tried
> explicitly setting it to the new timeline, and got an error saying
> there was no such timeline.

What did replica-replica do after you got that error? Repeat the error? Emit a PANIC and exit? Get stuck? Successfully reconnect to master-replica? ...

In theory, the timeline gap should be resolved as follows:

1. Promote master-replica, which terminates cascade replication.

2. While replica-replica is repeatedly trying to reconnect to master-replica, it checks the archive for a new timeline history file.

3. As a result of the promotion, master-replica increments its timeline, creates the timeline history file, and archives it.

4. Finally, replica-replica finds the new timeline history file in the archive, adjusts its timeline to the new one, and successfully reconnects to master-replica.

Note that you might see the timeline-mismatch error a few times before replication is successfully restarted, because of the timing.

>>> But in any case, shouldn't the replication connection be
>>> terminated when pg_basebackup is terminated?
>>
>> +1. To do this, we would need to define a SIGINT signal handler and make it
>> send a QueryCancel packet when Ctrl-C is typed.
>
> Also could we provide some feedback when using the -c spread option,
> when there isn't progress within a short period of time? Something
> like "Waiting for checkpoint. This can take up to
> %checkpoint_timeout%", or something similar, rather than seeing
> nothing happening and wondering if something has gone wrong.

+1, at least for the case where the -P option is specified to pg_basebackup.

Regards,

--
Fujii Masao
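[Editor's note: the resolution sequence above can be watched from the shell. After the promotion, the new timeline's history file should land in the shared archive, where the downstream standby's restore_command can pick it up. This assumes the /tmp/arch archive directory from Thom's earlier repro steps.]

```
# After promoting master-replica, look for the new timeline history file
# in the shared WAL archive (the filename is timeline-dependent, e.g.
# 00000002.history for timeline 2):
ls /tmp/arch/*.history
```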
>> And: if we still have to ship logs, what's the point in even having
>> cascading replication?
>
> At least cascading replication (1) allows you to adopt more flexible
> configuration of servers,

I'm just pretty shocked. The last time we talked about this, at the end of the 9.1 development cycle, you almost had remastering using streaming-only replication working; you just ran out of time. Now it appears that you've abandoned working on that completely. What's going on?
Jim, Fujii,

Even more fun:

1) Set up a server as a cascading replica (e.g. max_wal_senders = 3, standby_mode = on).

2) Connect the server to *itself* as a replica.

3) This will work and report success, up until you do your first write.

4) Then ... segfault!

----- Original Message -----
> On 5/16/12 10:53 AM, Fujii Masao wrote:
> > On Wed, May 16, 2012 at 3:43 AM, Joshua Berkus <josh@agliodbs.com> wrote:
> >>> Before restarting it, you need to do pg_basebackup and make a base
> >>> backup onto the standby again. Since you started the standby without
> >>> recovery.conf, a series of WAL in the standby has gotten inconsistent
> >>> with that in the master. So you need a fresh backup to restart the
> >>> standby.
> >>
> >> You're not understanding the bug. The problem is that the standby
> >> came up and reported that it was replicating OK, when clearly it
> >> wasn't.
> >
> >> 8. Got this fatal error on the standby server:
> >>
> >> LOG:  record with incorrect prev-link 0/70000B8 at 0/70000E0
> >> LOG:  record with incorrect prev-link 0/70000B8 at 0/70000E0
> >>
> >> ... this error message repeated every 5s.
> >
> > According to your first report, ISTM you got error messages.
>
> Only *after* it was correctly set up.
>
> Josh's point is that if you flub the configuration, you should get an
> error, which is not what's happening now. Right now it just comes up
> and acts as if nothing's wrong.
> --
> Jim C. Nasby, Database Architect       jim@nasby.net
> 512.569.9461 (cell)                    http://jim.nasby.net
On Thu, May 17, 2012 at 3:42 PM, Joshua Berkus <josh@agliodbs.com> wrote:
> Even more fun:
>
> 1) Set up a server as a cascading replica (e.g. max_wal_senders = 3, standby_mode = on)
>
> 2) Connect the server to *itself* as a replica.
>
> 3) This will work and report success, up until you do your first write.
>
> 4) Then ... segfault!

I cannot reproduce this. Attached is the script that I use for cascade replication testing. With it I can see the replica connecting to itself, but no segfault.

Ants Aasma
--
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de
On Thu, May 17, 2012 at 12:01 PM, Joshua Berkus <josh@agliodbs.com> wrote:
>>> And: if we still have to ship logs, what's the point in even having
>>> cascading replication?
>>
>> At least cascading replication (1) allows you to adopt more flexible
>> configuration of servers,
>
> I'm just pretty shocked. The last time we talked about this, at the end of
> the 9.1 development cycle, you almost had remastering using streaming-only
> replication working; you just ran out of time. Now it appears that you've
> abandoned working on that completely. What's going on?

You mean that "remastering" is, after promoting one of the standby servers, making the remaining standbys reconnect to the new master and resolving the timeline gap without a shared archive? Yep, that's one of my TODO items, but I'm not sure if I'll have enough time to implement it for 9.3...

Regards,

--
Fujii Masao
On Thu, May 17, 2012 at 10:42 PM, Ants Aasma <ants@cybertec.at> wrote:
> On Thu, May 17, 2012 at 3:42 PM, Joshua Berkus <josh@agliodbs.com> wrote:
>> Even more fun:
>>
>> 1) Set up a server as a cascading replica (e.g. max_wal_senders = 3, standby_mode = on)
>>
>> 2) Connect the server to *itself* as a replica.
>>
>> 3) This will work and report success, up until you do your first write.
>>
>> 4) Then ... segfault!
>
> I cannot reproduce this.

Me neither.

Josh, could you show me the detailed procedure to reproduce the problem?

Regards,

--
Fujii Masao
Yeah, I don't know how I produced the crash in the first place, because of course the self-replica should block all writes, and retesting it I can't get it to accept a write.

So the bug is just that you can connect a server to itself as its own replica. Since I can't think of any good reason to do this, we should simply error out on startup if someone sets things up that way. How can we detect that we've connected streaming replication to the same server?

----- Original Message -----
> On Thu, May 17, 2012 at 10:42 PM, Ants Aasma <ants@cybertec.at> wrote:
> > On Thu, May 17, 2012 at 3:42 PM, Joshua Berkus <josh@agliodbs.com> wrote:
> >> Even more fun:
> >>
> >> 1) Set up a server as a cascading replica (e.g. max_wal_senders = 3,
> >> standby_mode = on)
> >>
> >> 2) Connect the server to *itself* as a replica.
> >>
> >> 3) This will work and report success, up until you do your first write.
> >>
> >> 4) Then ... segfault!
> >
> > I cannot reproduce this.
>
> Me neither.
>
> Josh, could you show me the detailed procedure to reproduce the
> problem?
>
> Regards,
>
> --
> Fujii Masao
On Fri, May 18, 2012 at 3:57 AM, Joshua Berkus <josh@agliodbs.com> wrote:
> Yeah, I don't know how I produced the crash in the first place, because of
> course the self-replica should block all writes, and retesting it I can't
> get it to accept a write.
>
> So the bug is just that you can connect a server to itself as its own
> replica. Since I can't think of any good reason to do this, we should
> simply error out on startup if someone sets things up that way. How can
> we detect that we've connected streaming replication to the same server?

It might be easy to detect the situation where the standby has connected to itself, e.g. by assigning an ID to each instance and checking whether the IDs of the two servers are the same. But it seems much harder to detect two or more standbys connected in a circle.

Regards,

--
Fujii Masao
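[Editor's note: a minimal sketch of the simplest check, catching only the direct self-connection case: compare the host/port in primary_conninfo against the server's own address and port. This is purely illustrative shell, not PostgreSQL code; the conninfo string and addresses are made up, and a real check would live in the walreceiver startup path.]

```shell
# Hypothetical self-connection check, sketched in shell. Parse host= and
# port= out of a conninfo string and compare them to this server's own
# listen address and port; refuse to proceed on a match.
conninfo="host=127.0.0.1 port=5432 user=replicator"
local_host="127.0.0.1"
local_port="5432"

conn_host=$(printf '%s\n' "$conninfo" | tr ' ' '\n' | sed -n 's/^host=//p')
conn_port=$(printf '%s\n' "$conninfo" | tr ' ' '\n' | sed -n 's/^port=//p')

if [ "$conn_host" = "$local_host" ] && [ "$conn_port" = "$local_port" ]; then
    echo "FATAL: primary_conninfo points back at this server"
else
    echo "self-check ok"
fi
```

As noted, this would only catch a standby connected directly to itself; a chain of two or more standbys connected in a circle needs something like the per-instance ID idea above.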
> It might be easy to detect the situation where the standby has
> connected to itself, e.g. by assigning an ID to each instance and
> checking whether the IDs of the two servers are the same. But it seems
> much harder to detect two or more standbys connected in a circle.

Well, I think it would be fine not to worry about circles for now.
Fujii,

> You mean that "remastering" is, after promoting one of the standby servers,
> making the remaining standbys reconnect to the new master and resolving the
> timeline gap without a shared archive? Yep, that's one of my TODO items,
> but I'm not sure if I'll have enough time to implement it for 9.3...

Well, not being able to remaster from the stream is the single largest usability obstacle for streaming replication, and it severely limits the utility of cascading replication. Is there any way you could get it done for 9.3? I'm happy to spend lots of time testing it, if necessary.

--Josh Berkus