Re: index-only scans vs. Hot Standby, round two - Mailing list pgsql-hackers

From Simon Riggs
Subject Re: index-only scans vs. Hot Standby, round two
Date
Msg-id CA+U5nML5wLmP9SHqKr5meTzW-5riWKg3=+xrV-YZhERUs2EGhg@mail.gmail.com
Whole thread Raw
In response to Re: index-only scans vs. Hot Standby, round two  (Noah Misch <noah@leadboat.com>)
Responses Re: index-only scans vs. Hot Standby, round two
List pgsql-hackers
On Mon, Apr 16, 2012 at 8:02 AM, Noah Misch <noah@leadboat.com> wrote:
> On Fri, Apr 13, 2012 at 12:33:06PM -0400, Robert Haas wrote:
>> In the department of query cancellations, I believe Noah argued
>> previously that this wasn't really going to cause a problem.  And,
>> indeed, if the master has a mix of inserts, updates, and deletes, then
>> it seems likely that any recovery conflicts generated this way would
>> be hitting about the same place in the XID space as the ones caused by
>> pruning away dead tuples.  What will be different, if we go this
>> route, is that an insert-only master, which right now only generates
>> conflicts at freezing time, will instead generate conflicts much
>> sooner, as soon as the relation is vacuumed.  I do not use Hot Standby
>> enough to know whether this is a problem, and I'm not trying to block
>> this approach, but I do want to make sure that everyone agrees that
>> it's acceptable before we go do it, because inevitably somebody is
>> going to get bit.
>
> Given sufficient doubt, we could add a GUC, say hot_standby_use_all_visible.
> A standby with the GUC "off" would ignore all-visible indicators and also
> decline to generate conflicts when flipping them on.

I'm against adding a new GUC, especially for that fairly thin reason.


>> 1. vacuum on master sets the page-level bit and the visibility map bit
>> 2. the heap page with the bit is written to disk BEFORE the WAL entry
>> generated in (1) gets flushed; this is allowable, since it's not an
>> error for the page-level bit to be set while the visibility-map bit is
>> unset, only the other way around
>> 3. crash (still before the WAL entry is flushed)
>> 4. some operation on the master emits an FPW for the page without
>> first rendering it not-all-visible
>>
>> At present, I'm not sure there's any real problem here, since normally
>> an all-visible heap page is only going to get another WAL entry if
>> somebody does an insert, update, or delete on it, in which case the
>> visibility map bit is going to get cleared anyway.  Freezing the page
>> might emit a new FPW, but that's going to generate conflicts anyway,
>> so no problem there.  So I think there's no real problem here, but it
>> doesn't seem totally future-proof - any future type of WAL record that
>> might modify the page without rendering it not-all-visible would
>> create a very subtle hazard for Hot Standby.  We could dodge the whole
>> issue by leaving the hack in heapgetpage() intact and just be happy
>> with making index-only scans work, or maybe somebody has a more clever
>> idea.
>
> Good point.  We could finagle things so the standby notices end-of-recovery
> checkpoints and blocks the optimization for older snapshots.  For example,
> maintain an integer count of end-of-recovery checkpoints seen and store that
> in each Snapshot instead of takenDuringRecovery; use 0 on a master.  When the
> global value is greater than the snapshot's value, disable the optimization
> for that snapshot.  I don't know whether the optimization is powerful enough
> to justify that level of trouble, but it's an option to consider.
>
> For a different approach, targeting the fragility, we could add assertions to
> detect unexpected cases of a page bearing PD_ALL_VISIBLE submitted as a
> full-page image.

We don't need a future proof solution, especially not at this stage of
the release cycle. Every time you add a WAL record, we need to
consider the possible conflict impact.

We just need a simple and clear solution. I'll work on that.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


pgsql-hackers by date:

Previous
From: "Etsuro Fujita"
Date:
Subject: Re: not null validation option in contrib/file_fdw
Next
From: Simon Riggs
Date:
Subject: Re: 9.3 Pre-proposal: Range Merge Join