Re: Wal archive way behind in streaming replication - Mailing list pgsql-admin
From | Gilberto Castillo |
---|---|
Subject | Re: Wal archive way behind in streaming replication |
Date | |
Msg-id | 43325.192.168.207.54.1403718135.squirrel@webmail.etecsa.cu |
In response to | Re: Wal archive way behind in streaming replication (John Scalia <jayknowsunix@gmail.com>) |
List | pgsql-admin |
Stop the master and copy the pg_xlog folder to your slave. That has worked for me.

> A little examination of the pgarch.c file showed what the archive process
> on the primary is doing. Anyway, to ensure that the primary knows that it
> has transmitted all the up-to-date WALs, I went into the primary's
> data/pg_xlog/archive_status directory and performed
> "touch 00000003000000900000036.ready" and repeated this command for the
> other WALs up to *44.ready. This really shouldn't have been a problem, as
> the most recent WAL file in pg_xlog was *45. The archiver then picked up
> all those WAL files and transmitted them to the standbys. At least I saw
> them appear on the standby in the directory specified in the
> recovery.conf file.
>
> Now, what I really don't understand is the standby's behavior. After the
> WALs arrived, I saw nothing in today's pg_log/Wed.log file showing it saw
> them. I then issued a service postgresql-9.3 restart and this is what was
> spit out in the log:
>
> LOG: entering standby mode
> LOG: restored log file "00000000300000000900000035" from archive
> LOG: unexpected pageaddr 9/1B000000 in log segment 00000000300000000900000035, offset 0
> LOG: started streaming WAL from primary at 9/35000000 on timeline 3
> FATAL: the database system is starting up
> LOG: consistent recovery state reached at 9/350000C8
> LOG: redo starts at 9/350000C8
> LOG: database system is ready to accept read only connections
>
> Two things stand out here. First, the standby didn't seem to process the
> newly arrived WAL files, and second, what's with the FATAL: in the
> logfile?
> --
> Jay
>
> On 6/24/2014 2:52 PM, Andrew Krause wrote:
>> You shouldn’t have to touch the files as long as they aren’t compressed.
>> You may have to restart the standby instance to get the recovery to
>> begin though. I’d suggest tailing your instance log and restarting the
>> standby instance. It should show that the logs from the gap are
>> applying right away at startup.
>>
>> Andrew Krause
>>
>> On Jun 24, 2014, at 1:19 PM, John Scalia <jayknowsunix@gmail.com> wrote:
>>
>>> Ok, I did the copy from the pg_xlog directory into the directory
>>> specified in recovery.conf. The standby servers seem fine with that;
>>> however, just copying does not inform the primary that the copy has
>>> happened. The archive_status directory under pg_xlog on the primary
>>> still thinks the last WAL sent was *B7 and yet it's now writing *C9.
>>> When I did the copy it was only up to *C7 and nothing else has shown
>>> in the standby's directory.
>>>
>>> Now, the *.done files in archive_status are just zero length, but I'm a
>>> bit hesitant to just do a touch for the ones I manually copied, as I
>>> don't know if this is from an in-memory queue or if PostgreSQL reads
>>> the contents of this directory regularly in order to decide what to
>>> copy.
>>>
>>> Is that safe to do?
>>>
>>> On 6/24/2014 9:56 AM, Andrew Krause wrote:
>>>> You can copy all of the WAL logs from your gap to the standby. If you
>>>> place them in the correct location (directory designated for restore)
>>>> the instance will automatically apply them all.
>>>>
>>>> Andrew Krause
>>>>
>>>> On Jun 23, 2014, at 9:24 AM, John Scalia <jayknowsunix@gmail.com>
>>>> wrote:
>>>>
>>>>> Came in this morning to numerous complaints from pgpool about the
>>>>> standby servers being behind the primary. Looking into it, no
>>>>> WAL files had been transferred since late Friday. All I did was
>>>>> restart the primary and the WAL archiving resumed; however, looking
>>>>> at the WAL files on the standby servers, this is never going to
>>>>> catch up. Now, I've got archive_timeout on the primary = 600, or 10
>>>>> minutes, and I see WAL files in pg_xlog every 10 minutes. As they
>>>>> show up on the standby servers, they're also 10 minutes apart, but
>>>>> the primary is writing *21 and the standbys are only up to *10.
>>>>> Now, like I said before, with there being 10 minutes (600 seconds)
>>>>> between transfers (the same pace as the WALs are generated) it will
>>>>> never catch up. Is this really the intended behavior? How would I
>>>>> get the additional WAL files over to the standbys without waiting
>>>>> 10 minutes to copy them one at a time?
>>>>> --
>>>>> Jay
>>>>>
>>>>> --
>>>>> Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
>>>>> To make changes to your subscription:
>>>>> http://www.postgresql.org/mailpref/pgsql-admin

Saludos,
Gilberto Castillo
La Habana, Cuba
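
The thread turns on two details: how WAL segment files are named, and how the .ready/.done markers in pg_xlog/archive_status tell the archiver what is still waiting to be shipped. Below is a minimal Python sketch of both, not anything taken from the posters' setup. The function names `wal_segment_name` and `archive_backlog` are made up for illustration, and the 16 MB segment size is an assumption (it is the default, but it is a build-time setting).

```python
import os

# Assumption: default 16 MB WAL segments (a build-time setting).
WAL_SEGMENT_SIZE = 16 * 1024 * 1024


def wal_segment_name(timeline, lsn, segment_size=WAL_SEGMENT_SIZE):
    """Return the 24-hex-digit WAL file name that contains the given LSN.

    The name is three 8-digit hex fields: timeline, the high part of the
    segment number, and the segment's index within that part.
    """
    high, low = (int(part, 16) for part in lsn.split("/"))
    segno = ((high << 32) | low) // segment_size
    segments_per_xlogid = 0x100000000 // segment_size
    return "%08X%08X%08X" % (
        timeline,
        segno // segments_per_xlogid,
        segno % segments_per_xlogid,
    )


def archive_backlog(archive_status_dir):
    """Group WAL names by status-file suffix.

    A "<name>.ready" file marks a segment waiting for archive_command;
    the archiver renames it to "<name>.done" once the command succeeds.
    """
    backlog = {"ready": [], "done": []}
    for entry in sorted(os.listdir(archive_status_dir)):
        name, _, suffix = entry.rpartition(".")
        if name and suffix in backlog:
            backlog[suffix].append(name)
    return backlog


if __name__ == "__main__":
    # Timeline 3 and the streaming start position from the log in the thread.
    print(wal_segment_name(3, "9/35000000"))  # 000000030000000900000035
```

Calling `archive_backlog("/path/to/data/pg_xlog/archive_status")` (the path is a placeholder) on the primary would list, under "ready", the segments the archiver has not yet shipped. A long "ready" list that keeps growing is the kind of backlog described in the first message of the thread.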