Re: Thoughts on "killed tuples" index hint bits support on standby - Mailing list pgsql-hackers

From Michail Nikolaev
Subject Re: Thoughts on "killed tuples" index hint bits support on standby
Date
Msg-id CANtu0ojqqL7de9W7vTea12Dnn64fHjCbAWswnZeag_gEYFFe2Q@mail.gmail.com
In response to Re: Thoughts on "killed tuples" index hint bits support on standby  (Peter Geoghegan <pg@bowt.ie>)
Responses Re: Thoughts on "killed tuples" index hint bits support on standby  (Michail Nikolaev <michail.nikolaev@gmail.com>)
List pgsql-hackers
Hello, Peter.

> Let me make sure I understand your position:

> You're particularly concerned about cases where there are relatively
> few page splits, and the standby has to wait for VACUUM to run on the
> primary before dead index tuples get cleaned up. The primary itself
> probably has no problem with setting LP_DEAD bits to avoid having
> index scans visiting the heap unnecessarily. Or maybe the queries are
> different on the standby anyway, so it matters to the standby that
> certain index pages get LP_DEAD bits set quickly, though not to the
> primary (at least not for the same pages). Setting the LP_DEAD bits on
> the standby (in about the same way as we can already on the primary)
> is a "night and day" level difference.
> Right?

Yes, exactly.

My initial attempts were too naive (the first and second letters) - but you and
Andres gave me some hints on how to make it reliable.

The main goal is to let the standby use and set LP_DEAD bits almost as the
primary does. Of course, the standby could receive LP_DEAD bits in an FPI from
the primary at any moment - so some kind of cancellation logic is required.
Also, we should keep the frequency of query cancellation at the same level -
for that reason, LP_DEAD bits are better used only by standbys with
hot_standby_feedback enabled. So, I am just repeating myself from the previous
letter here.

> And we're willing to account
> for FPIs on the primary (and the LP_DEAD bits set there) just to be
> able to also set LP_DEAD bits on the standby.

Yes. Metaphorically speaking, the master sends a WAL record containing a letter:
"Attention, it is possible to receive an FPI from me with LP_DEAD set for a
tuple with xmax=ABCD, so if you are using LP_DEAD, your xmin should be greater,
or you should cancel yourself". And such a letter is required only if this
horizon has moved forward.
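
To make the "letter" idea concrete, here is a rough sketch of the rule as
described above. It is not code from the patch: the names primary_state,
primary_note_killed_xmax and standby_must_cancel are hypothetical, and XIDs are
treated as plain integers (real PostgreSQL XIDs wrap around and need modular
comparison, which is ignored here).

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified transaction id (assumption: no wraparound handling). */
typedef uint32_t xid_t;

/* Hypothetical primary-side state: the newest xmax already announced to
 * standbys as "an LP_DEAD bit for this may arrive in an FPI".
 * 0 means nothing has been announced yet. */
typedef struct {
    xid_t announced_horizon;
} primary_state;

/* Called when the primary sets LP_DEAD for a tuple with the given xmax.
 * Returns true only when the horizon moves forward, i.e. only then does
 * a new WAL record (the "letter") have to be emitted. */
static bool
primary_note_killed_xmax(primary_state *p, xid_t xmax)
{
    if (xmax <= p->announced_horizon)
        return false;           /* horizon unchanged, no extra WAL */
    p->announced_horizon = xmax;
    return true;                /* horizon advanced, send the letter */
}

/* Standby side: a query that relies on LP_DEAD bits is safe only if its
 * snapshot xmin is greater than the announced horizon; otherwise it has
 * to cancel itself. Queries that ignore LP_DEAD bits are unaffected. */
static bool
standby_must_cancel(bool uses_lp_dead, xid_t snapshot_xmin, xid_t horizon)
{
    return uses_lp_dead && snapshot_xmin <= horizon;
}
```

The point of the first function is that additional WAL is produced only when
the horizon actually advances, which is why the extra traffic stays low.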

And... it looks like it works - queries are much faster, results look correct,
additional WAL traffic is low, cancellation stays at the same level... As far
as I can see, the basic concept is correct and effective (but of course, I
could be missing something).

The patch is hard to review - I'll try to split it into several patches
later. And of course, a lot of polishing is required (and there are a few
places I am not sure about yet).

Thanks,
Michail.


