Re: Why latestRemovedXid|cuteoff_xid are always sent? - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: Why latestRemovedXid|cuteoff_xid are always sent?
Date
Msg-id CAH2-Wzmgh8VXs3i=zoTQKTdPD7tA83mPf87noEuOUGXbKkbkFA@mail.gmail.com
Whole thread Raw
In response to Why latestRemovedXid|cuteoff_xid are always sent?  (Michail Nikolaev <michail.nikolaev@gmail.com>)
Responses Re: Why latestRemovedXid|cuteoff_xid are always sent?  (Peter Geoghegan <pg@bowt.ie>)
List pgsql-hackers
On Sat, Jan 2, 2021 at 8:00 AM Michail Nikolaev
<michail.nikolaev@gmail.com> wrote:
> Working on some stuff, I realized I do not understand why
> latestRemovedXid|cuteoff_xid (in different types of WAL records) are
> sent every time they appear on the primary side.
>
> latestRemovedXid|cuteoff_xid is used to call
> ResolveRecoveryConflictWithSnapshot and cancel conflicting backend on
> Standby. In some of the cases, snapshot conflict resolving is the only
> work REDO does (heap_xlog_cleanup_info
>  or btree_xlog_reuse_page, for example).

But you can say the same thing about fields from certain WAL record
types, too. It's not that uncommon for code to make a conceptually
optional piece of information into a normal WAL record struct field,
even though that approach has unnecessary space overhead in the cases
that don't need the information. Often this makes hardly any
difference due to factors like alignment and the simple fact that we
don't expect very many WAL records (with or without the optional
information) in practice.

Of course, it's possible that the question of whether or not it's
worth it has been misjudged for any given case. And maybe these
particular WAL records are one such case where somebody got it wrong,
affecting a real workload (I am ignoring the complexity of making it
work for latestRemovedXid in particular for now). But I tend to doubt
that the space saving would be noticeable, from what I've seen with
pg_waldump.

> Could we try to somehow optimistically advance the latest sent
> latestRemovedXid value in shared memory on the primary and skip
> sending it if the newer xid was sent already? In such a way we could
> reduce the number of ResolveRecoveryConflictWithSnapshot calls on
> Standby and even skip some WAL records.
>
> At least we could do the same optimization on the standby side
> (skipping ResolveRecoveryConflictWithSnapshot if it was called with
> newer xid already).

Maybe it makes sense to add a fast path to
ResolveRecoveryConflictWithSnapshot() so that it falls out early
without scanning the proc array (in cases where it will still do so
today, since of course ResolveRecoveryConflictWithSnapshot() has the
obvious InvalidTransactionId fast path already).

> Is it a sane idea or I have missed something huge?

It seems like two almost distinct ideas to me. Though the important
thing in both cases is the savings in real world conditions.

-- 
Peter Geoghegan



pgsql-hackers by date:

Previous
From: Tomas Vondra
Date:
Subject: Re: WIP: BRIN multi-range indexes
Next
From: Michael Paquier
Date:
Subject: Re: doc review for v14