Re: index-only scans vs. Hot Standby, round two - Mailing list pgsql-hackers

From Robert Haas
Subject Re: index-only scans vs. Hot Standby, round two
Date
Msg-id CA+TgmobEJCm=0oXcdeDernCYV4+82yzBttYi1p8UC=tcE+efJQ@mail.gmail.com
Whole thread Raw
In response to Re: index-only scans vs. Hot Standby, round two  (Noah Misch <noah@leadboat.com>)
List pgsql-hackers
On Mon, Apr 16, 2012 at 3:02 AM, Noah Misch <noah@leadboat.com> wrote:
> Do you refer to PD_ALL_VISIBLE as "not merely a hint" due to the requirement
> to prevent a page from simultaneously having a negative PD_ALL_VISIBLE and a
> positive visibility map bit?  That is to say, PD_ALL_VISIBLE is fully a hint
> in its role as a witness of tuple visibility, but it is not a hint in its role
> as a witness of the visibility map status?

Exactly.

> The large comment in SetBufferCommitInfoNeedsSave() seems incorrect.  On
> systems with weaker memory ordering, the FlushBuffer() process's clearing of
> BM_JUST_DIRTIED may not be visible to the SetBufferCommitInfoNeedsSave()
> process until some time after the former retrieved buffer contents from shared
> memory.

True.

> Memory barriers could make the comment true, but we should probably
> just update the comment to explain that a race condition may eat the update
> entirely.

Agreed.

> Incidentally, is there a good reason for MarkBufferDirty() to
> increment pgBufferUsage.shared_blks_dirtied and SetBufferCommitInfoNeedsSave()
> not to do so?  Looks like an oversight.

Also agreed.

>> 2. More seriously, lazy_scan_heap() releases the buffer lock before
>> setting the all-visible bit on the heap.  This looks just plain wrong
>> (and it's my fault, to be clear).  Somebody else can come along after
>> we've released the buffer lock and mark the page not-all-visible.
>> That concurrent process will check the visibility map bit, find it
>> already clear, and go on its merry way.  Then, we set the visibility
>> map bit and go on our merry way.  Now we have a not-all-visible page
>> with the all-visible bit set in the visibility map.   Oops.
>
> Hmm, indeed.  This deserves its own open item.

Also agreed.

> Good point.  We could finagle things so the standby notices end-of-recovery
> checkpoints and blocks the optimization for older snapshots.  For example,
> maintain an integer count of end-of-recovery checkpoints seen and store that
> in each Snapshot instead of takenDuringRecovery; use 0 on a master.  When the
> global value is greater than the snapshot's value, disable the optimization
> for that snapshot.  I don't know whether the optimization is powerful enough
> to justify that level of trouble, but it's an option to consider.

I suspect not.  It seems we can make index-only scans work without
doing something like this; it's only the sequential-scan optimization
we might lose.  But nobody's ever really complained about not having
that, to my knowledge.

> For a different approach, targeting the fragility, we could add assertions to
> detect unexpected cases of a page bearing PD_ALL_VISIBLE submitted as a
> full-page image.

The holes are narrow enough that I fear such cases would be detected
only on high-velocity production systems, which is not exactly where
you want to find out about such problems.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: index-only scans vs. Hot Standby, round two
Next
From: Jeff Davis
Date:
Subject: Re: 9.3 Pre-proposal: Range Merge Join