Re: Strange issues with 9.2 pg_basebackup & replication - Mailing list pgsql-hackers
From | Fujii Masao |
---|---|
Subject | Re: Strange issues with 9.2 pg_basebackup & replication |
Date | |
Msg-id | CAHGQGwEVHVy0xBri8hNDvL4GSrE4Cjo+OrURZnWYjVo5SO7FWw@mail.gmail.com |
In response to | Re: Strange issues with 9.2 pg_basebackup & replication (Thom Brown <thom@linux.com>) |
Responses | Re: Strange issues with 9.2 pg_basebackup & replication |
List | pgsql-hackers |
On Wed, May 16, 2012 at 2:29 AM, Thom Brown <thom@linux.com> wrote:
> On 15 May 2012 13:15, Fujii Masao <masao.fujii@gmail.com> wrote:
>> On Wed, May 16, 2012 at 1:36 AM, Thom Brown <thom@linux.com> wrote:
>>> However, this isn't true when I restart the standby. I've been
>>> informed that this should work fine if a WAL archive has been
>>> configured (which should be used anyway).
>>
>> The WAL archive should be shared by master-replica and replica-replica,
>> and recovery_target_timeline should be set to latest in replica-replica.
>> If you configure that way, replica-replica would successfully reconnect to
>> master-replica with no need to restart it.
>
> I had set the archive_command on the primary, then produced a base
> backup which would have copied the archive settings, but I also added
> a corresponding recovery_command setting, so everything was pointing
> at the same archive.

Hmm.. when doing the same, the replica-replica successfully reconnected to
the master-replica after I shutdown the master-master and promoted the
master-replica. archive_command is the same in three servers, restore_command
is the same in two standby servers (i.e., master-replica and replica-replica),
and recovery_target_timeline is set to 'latest' in two standby servers.

>>> But one new problem I appear to have is that once I set up archiving
>>> and restart, then try pg_basebackup, it gets stuck and never shows any
>>> progress. If I terminate pg_basebackup in this state and attempt to
>>> restart it more times than max_wal_senders, it can no longer run, as
>>> pg_basebackup didn't disconnect the stream, so ends up using all
>>> senders. And these show up in pg_stat_replication. I have a theory
>>> that if archiving is enabled, restart postgres then generate some WAL
>>> to the point there is a file or two in the archive, pg_basebackup
>>> can't stream anything. Once I restart the server, it's fine and
>>> continues as normal. This has the same symptoms of the "pg_basebackup
>>> from running standby with streaming" issue.
>>
>> This seems to be caused by spread checkpoint which is requested by
>> pg_basebackup. IOW, this looks a normal behavior rather than a bug
>> or an issue. What if you specify "-c fast" option in pg_basebackup?
>
> Yes, it works fine with that option. And it appears this isn't to do
> with there being an archive as I get the same symptoms without setting
> one up.

Yes.

> But in any case, shouldn't the replication connection be
> terminated when pg_basebackup is terminated?

+1

To do this, we would need to define SIGINT signal handler and make it
send QueryCancel packet when Ctrl-C is typed.

Regards,

--
Fujii Masao
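
For illustration only, the SIGINT idea above could be sketched roughly as
follows. This is not pg_basebackup code. The names handle_sigint,
install_sigint_handler, send_cancel_request and receive_backup are
hypothetical, and send_cancel_request is a stub that only counts calls. A real
client would presumably issue the cancel through libpq (for example
PQgetCancel and PQcancel) instead. The handler only sets a flag, and the
receive loop notices the flag and sends the cancel before giving up.

```c
#include <assert.h>
#include <signal.h>
#include <stdio.h>

/* Set by the signal handler, polled by the receive loop. */
static volatile sig_atomic_t cancel_requested = 0;

/* Counts how many cancel requests the stub below has "sent". */
static int cancels_sent = 0;

/* Do as little as possible in signal context: just record the request. */
static void handle_sigint(int signo)
{
    (void) signo;
    cancel_requested = 1;
}

/* Returns 0 on success, -1 if the handler could not be installed. */
static int install_sigint_handler(void)
{
    return signal(SIGINT, handle_sigint) == SIG_ERR ? -1 : 0;
}

/*
 * Hypothetical stand-in for telling the server to stop the backup.
 * Assumption: a real implementation would send a cancel request over
 * the replication connection here.
 */
static void send_cancel_request(void)
{
    cancels_sent++;
    fprintf(stderr, "cancel request sent\n");
}

/*
 * Pretend to receive up to max_chunks pieces of the base backup.
 * Returns the number of chunks received before completion or cancel.
 */
static int receive_backup(int max_chunks)
{
    int received = 0;

    assert(max_chunks >= 0);
    while (received < max_chunks)
    {
        if (cancel_requested)
        {
            /* Ctrl-C was typed: notify the server, then stop. */
            send_cancel_request();
            break;
        }
        received++;             /* placeholder for reading one chunk */
    }
    return received;
}
```

A caller would run install_sigint_handler() once at startup and then
receive_backup(). The point of the design is that the server side is told to
stop, so the walsender slot is released rather than left counting against
max_wal_senders.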