Re: [HACKERS] WIP: long transactions on hot standby feedback replica/ proof of concept - Mailing list pgsql-hackers

From Masahiko Sawada
Subject Re: [HACKERS] WIP: long transactions on hot standby feedback replica/ proof of concept
Msg-id CAD21AoCWY9LWOrGx57UNOFG7bc-yd4TXcVRSWpQcvJs_kRYZtQ@mail.gmail.com
In response to Re: [HACKERS] WIP: long transactions on hot standby feedback replica/ proof of concept  (Alexander Korotkov <a.korotkov@postgrespro.ru>)
Responses Re: [HACKERS] WIP: long transactions on hot standby feedback replica/ proof of concept  (Ivan Kartyshov <i.kartyshov@postgrespro.ru>)
List pgsql-hackers
On Sat, Nov 4, 2017 at 7:04 AM, Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:
> On Wed, Nov 1, 2017 at 5:55 AM, Masahiko Sawada <sawada.mshk@gmail.com>
> wrote:
>>
>> On Tue, Oct 31, 2017 at 6:17 PM, Alexander Korotkov
>> <a.korotkov@postgrespro.ru> wrote:
>> > On Tue, Oct 31, 2017 at 5:16 AM, Masahiko Sawada <sawada.mshk@gmail.com>
>> > wrote:
>> >>
>> >> On Mon, Oct 30, 2017 at 10:16 PM, Robert Haas <robertmhaas@gmail.com>
>> >> wrote:
>> >> > On Tue, Oct 24, 2017 at 1:26 PM, Ivan Kartyshov
>> >> > <i.kartyshov@postgrespro.ru> wrote:
>> >> >> Hello. I made some bugfixes and rewrote the patch.
>> >> >
>> >> > I don't think it's a good idea to deliberately leave the state of the
>> >> > standby different from the state of the master on the theory that it
>> >> > won't matter.  I feel like that's something that's likely to come back
>> >> > to bite us.
>> >>
>> >> I agree with Robert. What happens if we intentionally don't apply the
>> >> truncation WAL and then switch over? If we insert a tuple on the new
>> >> master server into a block that has been truncated on the old master,
>> >> will WAL apply on the new standby fail? I guess there are corner cases
>> >> like this that cause WAL replay to fail after a switchover.
>> >
>> >
>> > Yes, that looks dangerous.  One approach to cope with that could be
>> > teaching the heap redo function to handle such situations.  But I expect
>> > this approach to be criticized for bad design.  And I understand the
>> > fairness of this criticism.
>> >
>> > However, from the user's point of view, the current behavior of
>> > hot_standby_feedback is just broken, because it both increases bloat and
>> > doesn't guarantee that a read-only query on the standby won't be cancelled
>> > because of vacuum.  Therefore, we should be looking for a solution: if one
>> > approach isn't good enough, then we should look for another approach.
>> >
>> > I can propose the following alternative approach: teach read-only queries
>> > on hot standby to tolerate concurrent relation truncation.  Then, when a
>> > non-existent heap page is accessed on hot standby, we know that it was
>> > deleted by a concurrent truncation and should be assumed to be empty.  Any
>> > thoughts?
>> >
>>
>> Do you also mean that applying WAL for AccessExclusiveLock would always be
>> skipped on standby servers to allow scans to access the relation?
>
>
> Definitely not every AccessExclusiveLock WAL record should be skipped, but
> only those that were emitted during heap truncation.  There are other cases
> when AccessExclusiveLock WAL records are emitted, for instance, during DDL
> operations.  But I'd like to focus on AccessExclusiveLock WAL records
> caused by VACUUM for now.  It's kind of understandable for users that DDL
> might cancel a read-only query on the standby.  So, if you're running a long
> report query, then you should wait with your DDL.  But VACUUM is a different
> story.  It runs automatically when you do normal DML queries.
>
> AccessExclusiveLock WAL records from VACUUM could either not be emitted, or
> be somehow distinguished and skipped on the standby.  I haven't thought it
> out to that level of detail yet.
>

Understood. My concern is that, so far, we cannot tell whether an
AccessExclusiveLock WAL record came from vacuum truncation or from some
other operation that requires AccessExclusiveLock. So I agree that we
need to invent a way to distinguish them.
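
To make the idea concrete, here is a toy model of what "distinguishing"
could mean. It is only a sketch in Python, not PostgreSQL code: the
LockOrigin tag, the LockRecord class, and both functions are hypothetical
names invented for illustration, and the record layout is not the real
one. It assumes the primary could tag each AccessExclusiveLock record
with the reason it was emitted, so that replay on the standby could skip
only the ones that come from vacuum truncation.

```python
from dataclasses import dataclass
from enum import Enum, auto


class LockOrigin(Enum):
    """Hypothetical tag saying why an AccessExclusiveLock record was emitted."""
    VACUUM_TRUNCATE = auto()
    OTHER = auto()  # DDL and anything else that needs the lock


@dataclass(frozen=True)
class LockRecord:
    """Toy stand-in for an AccessExclusiveLock WAL record (not the real layout)."""
    relation: str
    origin: LockOrigin


def standby_takes_lock(record: LockRecord) -> bool:
    """Return True if replay should acquire the lock on the standby."""
    # Only locks taken for vacuum truncation are candidates for skipping;
    # records from DDL and other operations are still replayed as locks.
    return record.origin is not LockOrigin.VACUUM_TRUNCATE


def replay(records):
    """Return the relations that replay would lock on the standby."""
    return [r.relation for r in records if standby_takes_lock(r)]
```

In this model, replaying one vacuum-truncation record and one DDL record
locks only the relation named in the DDL record. Whether skipping the
lock is actually safe still depends on the open question in this thread,
namely how scans on the standby handle pages removed by the truncation.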

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


