Re: 'replication checkpoint has wrong magic' on the newly cloned replicas - Mailing list pgsql-admin

From Alex Kliukin
Subject Re: 'replication checkpoint has wrong magic' on the newly cloned replicas
Date
Msg-id F6E25B76-0385-4A82-B26C-D69B5F88EE54@fastmail.com
In response to Re: 'replication checkpoint has wrong magic' on the newly cloned replicas  (Stephen Frost <sfrost@snowman.net>)
Responses Re: 'replication checkpoint has wrong magic' on the newly cloned replicas  (Stephen Frost <sfrost@snowman.net>)
List pgsql-admin
> On 29. Nov 2017, at 18:52, Stephen Frost wrote:
>
> Greetings,
>
> On Wed, Nov 29, 2017 at 12:41 Oleksii Kliukin wrote:
> Hi Stephen,
>
> > On 29. Nov 2017, at 15:54, Stephen Frost wrote:
> >
> > Greetings,
> >
> > * Alex Kliukin (alexk@hintbits.com) wrote:
> >> The cloning itself is done by copying a compressed image via ssh,
> >> running the following command from the replica:
> >>
> >> """ssh {master} 'cd {master_datadir} && tar -lcp --exclude "*.conf" \
> >> --exclude "recovery.done" \
> >> --exclude "pacemaker_instanz" \
> >> --exclude "dont_start" \
> >> --exclude "pg_log" \
> >> --exclude "pg_xlog" \
> >> --exclude "postmaster.pid" \
> >> --exclude "recovery.done" \
> >> * | pigz -1 -p 4' | pigz -d -p 4 | tar -xpmUv -C
> >> {slave_datadir}""
> >>
> >> The WAL archiving starts before the copy starts, as the script that
> >> clones the replica checks that WAL archiving is running before the
> >> cloning.
> >
> > Maybe you're doing it and haven't mentioned it, but you have to use
> > pg_start/stop_backup
>
> Sorry for not mentioning it, as it seemed obvious, but we are calling
> pg_start_backup and pg_stop_backup at the right time.
>
> Ah, not something I can assume, heh.
>
> Then it depends on which version of PG and if you’re able to run
> start/stop on the replica or not. If you can’t run it on the replica
> and have to run it on the primary (prior to 9.6) then you need to make
> sure to wait for things to happen on the primary and for that to be
> replicated before you can start.

We are using exclusive backups from the master. First, the script checks that WAL files are shipped to the NFS share where the replica expects to find them (we compare the md5 checksum of the file to make sure that the NFS actually delivers the file that the master has archived). Then pg_start_backup runs on the master and its status is checked. On success, the copy command runs. When the copy command finishes, pg_stop_backup is executed.

Once pg_stop_backup finishes successfully, the replica configuration files (postgresql.conf, pg_hba.conf, pg_ident.conf) are linked from their location in the repository and the replica is started. This is a fairly typical procedure which, I believe, is also well described in the docs.

> If you’re on 9.6 and using non-exclusive backup, you need to be sure
> to capture the contents of the stop backup and write it into
> backup_label before you start the system back up.

We don’t use non-exclusive backups at all.

Cheers,
Alex
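
The sequence Alex describes (verify that the NFS delivers the archived WAL file, call pg_start_backup, copy, call pg_stop_backup) can be sketched roughly as below. This is a minimal illustration, not the actual cloning script: the function names `md5_of`, `archive_delivers` and `clone_replica` are hypothetical, and the backup and copy steps are passed in as callables rather than implemented. Running pg_stop_backup even when the copy fails is an assumption of this sketch; the message only describes the successful path.

```python
import hashlib
from pathlib import Path
from typing import Callable


def md5_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def archive_delivers(master_md5: str, nfs_copy: Path) -> bool:
    """True if the WAL file visible on NFS matches the checksum
    computed on the master for the file it archived."""
    return nfs_copy.is_file() and md5_of(nfs_copy) == master_md5.lower()


def clone_replica(
    archive_ok: Callable[[], bool],
    start_backup: Callable[[], bool],
    copy_datadir: Callable[[], None],
    stop_backup: Callable[[], None],
) -> None:
    """Run the steps in the order given in the message.

    All four callables are placeholders for site-specific code
    (the checksum check, pg_start_backup, the ssh/tar copy,
    pg_stop_backup).
    """
    if not archive_ok():
        raise RuntimeError("WAL archive check failed; backup not started")
    if not start_backup():
        raise RuntimeError("pg_start_backup did not succeed")
    try:
        copy_datadir()
    finally:
        # Assumption of this sketch: leave backup mode even if the
        # copy fails, so the master is not stuck in an exclusive backup.
        stop_backup()
```

Linking the configuration files and starting the replica would follow only after `clone_replica` returns without raising.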
