Re: Changing WAL Header to reduce contention during ReserveXLogInsertLocation() - Mailing list pgsql-hackers

From Andrew Dunstan
Subject Re: Changing WAL Header to reduce contention during ReserveXLogInsertLocation()
Date
Msg-id CAA8=A78qz_1xqpLumGiVWOXC+LK57xYwuGH-00RgDQmQZjc4=Q@mail.gmail.com
In response to Re: Changing WAL Header to reduce contention during ReserveXLogInsertLocation()  (Pavan Deolasee <pavan.deolasee@gmail.com>)
List pgsql-hackers
On Wed, Mar 28, 2018 at 1:21 AM, Pavan Deolasee
<pavan.deolasee@gmail.com> wrote:
>
>
> TBH I still don't see why this does not provide the same guarantee that the
> current code provides, but given the concerns expressed by others, I am not
> gonna pursue beyond a point. But one last time :-)
>
> The current code uses xl_prev to cross-verify that record B, read after
> record A, indeed follows A and has a valid back-link to A. This deals with
> problems where B might actually be an old WAL record, carried over from a
> stale WAL file.
>
> Now if we store xl_curr, we can definitely guarantee that B is ahead of A
> because B->xl_curr will be greater than A->xl_curr (otherwise we bail out).
> So that deals with the problem of stale WAL records. In addition, we also
> know where A ends (we can deduce that even for XLOG_SWITCH records knowing
> where the next record will start after the switch) and hence we know where B
> should start. So we read at B and also confirm that B->xl_curr matches B's
> position. If it does not, we declare end-of-WAL and bail out. So where is
> the problem?
>


This seems to have got a bit lost in subsequent discussion.
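
For reference, the two checks described in the quoted paragraph can be
sketched roughly as below. This is only an illustration of the argument,
not code from the patch: SketchRecord and sketch_record_follows are
hypothetical names, the struct holds nothing but the proposed xl_curr
field, and XLogRecPtr is redefined locally so the fragment stands alone.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-in for a WAL byte position (assumption: 64-bit, as an LSN). */
typedef uint64_t XLogRecPtr;

/* Hypothetical, stripped-down record header: only the proposed field. */
typedef struct SketchRecord
{
    XLogRecPtr  xl_curr;    /* position the record claims to live at */
} SketchRecord;

/*
 * Decide whether record B, read at read_pos (the position where A ends
 * and the next record is expected to start), is a valid successor of A.
 * A false result corresponds to "declare end-of-WAL and bail out".
 */
static bool
sketch_record_follows(const SketchRecord *a, const SketchRecord *b,
                      XLogRecPtr read_pos)
{
    /* A stale record from a recycled WAL file is not ahead of A. */
    if (b->xl_curr <= a->xl_curr)
        return false;

    /* B must claim exactly the position we read it from. */
    if (b->xl_curr != read_pos)
        return false;

    return true;
}
```

A stale record fails the first test; a record that is ahead of A but does
not sit where it claims to fails the second.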


>>
>> > 2. Does the new logic in pg_rewind to search backward for a checkpoint
>> > work reliably, and will it be slow?
>>
>> If you have to search backwards, this breaks it.  Full stop.
>
>
> We don't really need to fetch the previous record. We really need to find
> the last checkpoint prior to a given LSN. That can be done by reading WAL
> segments forward. It can be a little slow, but hopefully not a whole lot.
>
> A <- B <- C <- CHKPT <- D <- E <- F <- G
>
> So today, if we want to find the last checkpoint prior to G, we go through
> the back-links until we find the first checkpoint record. In the proposed
> code, we read the current WAL segment forward, remember the last CHKPT
> record seen, and once we see G, we know we have found the prior checkpoint.
> If the checkpoint does not exist in the current WAL segment, we read the
> previous WAL segment forward and return the last checkpoint record in it,
> and so on. So in the worst case, we might read one extra WAL segment before
> we find the checkpoint record. That's not ideal but not too bad given that
> only pg_rewind needs this, and only once.
>


Some degree of slowdown in pg_rewind seems an acceptable price to pay
as long as it doesn't introduce errors.
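
To make the forward-scan idea concrete, here is a minimal sketch under
simplifying assumptions: each WAL segment is modelled as an in-memory
array of records ordered by LSN, and all of the names (ScanRecord,
ScanSegment, last_checkpoint_in_segment, find_prior_checkpoint) are
hypothetical. It is not the pg_rewind code from the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-in for a WAL byte position. */
typedef uint64_t XLogRecPtr;

/* Hypothetical in-memory view of one WAL record. */
typedef struct ScanRecord
{
    XLogRecPtr  lsn;
    bool        is_checkpoint;
} ScanRecord;

/* Hypothetical in-memory view of one WAL segment, records in LSN order. */
typedef struct ScanSegment
{
    const ScanRecord *records;
    size_t      nrecords;
} ScanSegment;

/*
 * Read one segment forward, remembering the last checkpoint seen, and
 * stop once the target LSN is reached. Returns true if a checkpoint
 * before the target was found in this segment.
 */
static bool
last_checkpoint_in_segment(const ScanSegment *seg, XLogRecPtr target,
                           XLogRecPtr *found)
{
    bool        seen = false;

    for (size_t i = 0; i < seg->nrecords; i++)
    {
        const ScanRecord *rec = &seg->records[i];

        if (rec->lsn >= target)
            break;              /* reached G: stop scanning */
        if (rec->is_checkpoint)
        {
            *found = rec->lsn;  /* remember the most recent CHKPT */
            seen = true;
        }
    }
    return seen;
}

/*
 * Start with the segment containing the target; if it holds no earlier
 * checkpoint, fall back to the previous segment, and so on.
 */
static bool
find_prior_checkpoint(const ScanSegment *segs, size_t target_seg,
                      XLogRecPtr target, XLogRecPtr *found)
{
    for (size_t s = target_seg + 1; s-- > 0;)
    {
        if (last_checkpoint_in_segment(&segs[s], target, found))
            return true;
    }
    return false;
}
```

Each segment is still only ever read forward; the only backward step is
choosing which segment to read next.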

cheers

andrew

-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

