Index-only scans and non-MVCC snapshots - Mailing list pgsql-hackers

From Ryan Johnson
Subject Index-only scans and non-MVCC snapshots
Date
Msg-id 53ACF773.50200@cs.utoronto.ca
Whole thread Raw
Responses Re: Index-only scans and non-MVCC snapshots  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Re: Index-only scans and non-MVCC snapshots  (Andres Freund <andres@2ndquadrant.com>)
List pgsql-hackers
Hi,

As part of a research project, I'm trying to change Read Committed 
isolation to use HeapTupleSatisfiesNow rather than acquiring a new 
snapshot at every command [1]. Things appear to have gone reasonably 
well so far, except certain queries fail with "ERROR:  non-MVCC 
snapshots are not supported in index-only scans."

I'm using v9.3.2, and the docs claim that index-only scans work without 
MVCC, but require some extra locking to avoid races [2]. Is this not 
actually implemented? If that is the case, shouldn't the query optimizer 
avoid selecting index-only scans for non-MVCC snapshots?

I realize I'm playing with fire here, but any pointers to sections of 
code I might look at to either work around or fix this issue would be 
greatly appreciated. I've been looking around in index_fetch_heap 
(indexam.c) as well as other locations that use scan->xs_continue_hot; 
there seems to be code in place to detect when a non-MVCC snapshot is in 
use, as if that were nothing out of the ordinary, but nothing prevents 
the error from arising if a hot chain is actually encountered.

Thanks,
Ryan

[1] Right now, Read Committed is significantly *slower* than Repeatable 
Read---for transactions involving multiple short commands---because the 
former acquires multiple snapshots per transaction and causes a lwlock 
bottleneck on my 12-core machine.

[2] http://www.postgresql.org/docs/9.3/static/index-locking.html:
> with a non-MVCC-compliant snapshot (such as SnapshotNow), it would be 
> possible to accept and return a row that does not in fact match the 
> scan keys ... [so] we use a pin on an index page as a proxy to 
> indicate that the reader might still be "in flight" from the index 
> entry to the matching heap entry. Making ambulkdelete block on such a 
> pin ensures that VACUUM cannot delete the heap entry before the reader 
> is done with it. ... This solution requires that index scans be 
> "synchronous": we have to fetch each heap tuple immediately after 
> scanning the corresponding index entry. This is expensive for a number 
> of reasons. An "asynchronous" scan in which we collect many TIDs from 
> the index, and only visit the heap tuples sometime later, requires 
> much less index locking overhead and can allow a more efficient heap 
> access pattern. Per the above analysis, we must use the synchronous 
> approach for non-MVCC-compliant snapshots. 




pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: ALTER SYSTEM RESET?
Next
From: Alvaro Herrera
Date:
Subject: Re: Index-only scans and non-MVCC snapshots