Re: [HACKERS] Broken hint bits (freeze) - Mailing list pgsql-hackers

From Sergey Burladyan
Subject Re: [HACKERS] Broken hint bits (freeze)
Msg-id 8760fq7pfe.fsf@seb.koffice.internal
In response to Re: [HACKERS] Broken hint bits (freeze)  (Vladimir Borodin <root@simply.name>)
Responses Re: [HACKERS] Broken hint bits (freeze)  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
Amit Kapila <amit.kapila16@gmail.com> writes:

> On Tue, Jun 20, 2017 at 3:40 PM, Sergey Burladyan <eshkinkot@gmail.com> wrote:
> > Bruce Momjian <bruce@momjian.us> writes:
> >
> >> On Mon, Jun 19, 2017 at 10:59:19PM -0400, Bruce Momjian wrote:
> >> > On Tue, Jun 20, 2017 at 03:50:29AM +0300, Sergey Burladyan wrote:
> >> > > On Jun 20, 2017 at 1:21, "Bruce Momjian" <bruce@momjian.us> wrote:
> >> > >
> >> > >
> >> > >     We are saying that Log-Shipping should match "Latest checkpoint
> >> > >     location", but the WAL for that will not be sent to the standby, so it
> >> > >     will not match, but that is OK since the only thing in the non-shipped
> >> > >     WAL file is the checkpoint record.  How should we modify the wording on
> >> > >     this?
> >> > >
> >> > >
> >> > > I am afraid that without this checkpoint record the standby cannot
> >> > > make a restartpoint, and without a restartpoint it does not sync
> >> > > shared buffers to disk at shutdown.
> >> >
>
> It seems to me at shutdown time on standby servers we specifically
> make restart points.  See below code in ShutdownXLOG()
>
> ..
> if (RecoveryInProgress())
> CreateRestartPoint(CHECKPOINT_IS_SHUTDOWN | CHECKPOINT_IMMEDIATE);
> ..
>
> Do you have something else in mind?

Which buffers will this restartpoint write to disk? I think it can
only write buffers with an LSN lower than or equal to "Latest checkpoint
location". Buffers with an LSN between "Minimum recovery ending
location" and "Latest checkpoint location" will not be written at all.

I set log_min_messages=debug2, and it makes it clearer what happened here:
2017-06-20 13:18:32 GMT LOG:  restartpoint starting: xlog
...
2017-06-20 13:18:33 GMT DEBUG:  postmaster received signal 15
2017-06-20 13:18:33 GMT LOG:  received smart shutdown request
2017-06-20 13:18:33 GMT DEBUG:  updated min recovery point to 0/12000000
2017-06-20 13:18:33 GMT CONTEXT:  writing block 2967 of relation base/16384/16385
2017-06-20 13:18:33 GMT DEBUG:  checkpoint sync: number=1 file=global/12587 time=0.001 msec
2017-06-20 13:18:33 GMT DEBUG:  checkpoint sync: number=2 file=base/16384/12357 time=0.000 msec
2017-06-20 13:18:33 GMT DEBUG:  checkpoint sync: number=3 file=base/16384/16385 time=0.000 msec
2017-06-20 13:18:33 GMT DEBUG:  attempting to remove WAL segments older than log file 00000001000000000000000B
2017-06-20 13:18:33 GMT DEBUG:  recycled transaction log file "00000001000000000000000B"
2017-06-20 13:18:33 GMT DEBUG:  recycled transaction log file "00000001000000000000000A"
2017-06-20 13:18:33 GMT DEBUG:  recycled transaction log file "000000010000000000000009"
2017-06-20 13:18:33 GMT DEBUG:  SlruScanDirectory invoking callback on pg_subtrans/0000
2017-06-20 13:18:33 GMT LOG:  restartpoint complete: wrote 1824 buffers (44.5%); 0 transaction log file(s) added, 0 removed, 3 recycled; write=1.389 s, sync=0.000 s, total=1.389 s; sync files=3, longest=0.000 s, average=0.000 s
2017-06-20 13:18:33 GMT LOG:  recovery restart point at 0/F008D28
2017-06-20 13:18:33 GMT DETAIL:  last completed transaction was at log time 2017-06-20 13:18:29.282645+00
2017-06-20 13:18:33 GMT LOG:  shutting down
2017-06-20 13:18:33 GMT DEBUG:  skipping restartpoint, already performed at 0/F008D28
2017-06-20 13:18:33 GMT LOG:  database system is shut down
========
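
To make the two positions in this log easier to compare, here is a small
sketch that converts the LSNs to integers and subtracts them. The helper name
parse_lsn is made up for illustration. It assumes both LSNs share the same
high part (true here, both start with "0/"), so plain subtraction gives the
byte distance.

```python
def parse_lsn(text):
    """Convert an LSN written as 'hi/lo' (both hexadecimal) to an integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


# Values copied from the log above.
restart_point = parse_lsn("0/F008D28")        # "recovery restart point at"
min_recovery_point = parse_lsn("0/12000000")  # "updated min recovery point to"

# WAL bytes that lie after the restartpoint but before the
# minimum recovery point reported at shutdown.
gap_bytes = min_recovery_point - restart_point
print("restart point is behind min recovery point:", restart_point < min_recovery_point)
print("gap in bytes:", gap_bytes)
```

The gap is the range of WAL positions this message is worried about: pages
changed there are newer than the last restartpoint.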

I am using PostgreSQL 9.2, and the message "skipping restartpoint, already
performed at" comes from src/backend/access/transam/xlog.c:8643.
After logging it, CreateRestartPoint() returns and never reaches line 8687:
CheckPointGuts(lastCheckPoint.redo, flags);
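
The control flow described here can be sketched as a toy model. This is not
the PostgreSQL source: the class, its field names, and the exact skip
condition (no checkpoint record newer than the last restartpoint has been
replayed) are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Standby:
    # Redo pointer of the newest checkpoint record replayed from WAL (hypothetical field).
    last_replayed_checkpoint_redo: int
    # Redo pointer of the last restartpoint already performed (hypothetical field).
    control_file_redo: int
    # Stand-in for "CheckPointGuts() was run".
    flushed_dirty_buffers: bool = False


def create_restart_point(standby):
    """Toy decision: return True only if a restartpoint was actually performed."""
    if standby.last_replayed_checkpoint_redo <= standby.control_file_redo:
        # Models "skipping restartpoint, already performed at ...":
        # return early, so the buffer flush below never happens.
        return False
    standby.flushed_dirty_buffers = True  # models CheckPointGuts(...)
    standby.control_file_redo = standby.last_replayed_checkpoint_redo
    return True


standby = Standby(last_replayed_checkpoint_redo=0x0F008D28, control_file_redo=0)
first = create_restart_point(standby)    # the "xlog" restartpoint in the log

# More WAL is replayed and dirties buffers, but no new checkpoint record arrives.
standby.flushed_dirty_buffers = False
second = create_restart_point(standby)   # the restartpoint attempted at shutdown

print(first, second, standby.flushed_dirty_buffers)
```

In this model the second call is skipped, so whatever was dirtied after the
first restartpoint is not flushed by it.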

> >> > Uh, as I understand it the rsync is going to copy the missing WAL file
> >> > from the new master to the standby, right, and I think pg_controldata
> >> > too, so it should be fine.  Have you tested to see if it fails?
> >
> > It needs the old WAL files from the old version to restore the heap
> > files correctly. New WAL files from the new version do not have this
> > information.
> >
>
> So in such a case can we run rsync once before pg_upgrade?

I just copied the last WAL file from the stopped old master to the running
old standby before shutting it down, and waited until it was replayed. After
that, the standby can issue a restartpoint at the same location as in the
stopped master.
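
One way to check that both servers ended up at the same location is to compare
the "Latest checkpoint location" line of pg_controldata output from each. A
minimal sketch, assuming the output has already been captured as text. The
function names are made up, and the master value in the sample is invented
for illustration (only the standby value comes from my log).

```python
def parse_controldata(text):
    """Turn 'Name:   value' lines into a dict, splitting at the first colon."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def same_checkpoint(master_text, standby_text):
    """True if both outputs report the same 'Latest checkpoint location'."""
    key = "Latest checkpoint location"
    master = parse_controldata(master_text)
    standby = parse_controldata(standby_text)
    return key in master and master[key] == standby.get(key)


# Illustrative sample text, not real output from my servers.
master_out = "Latest checkpoint location:           0/12000028\n"
standby_out = "Latest checkpoint location:           0/F008D28\n"

print(same_checkpoint(master_out, standby_out))  # standby is behind the master
print(same_checkpoint(master_out, master_out))   # identical locations
```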

I am not sure about rsync: on my production server I have, for example,
111 GB in pg_xlog, and if I run rsync on pg_xlog I think it would have to
send ~40 GB of new WAL.

--
Sergey Burladyan


