On Thu, Jul 21, 2016 at 12:06 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Wed, Jul 20, 2016 at 9:15 PM, Noah Misch <noah@leadboat.com> wrote:
>> This PostgreSQL 9.6 open item now needs a permanent owner. Would any other
>> committer like to take ownership? If this role interests you, please read
>> this thread and the policy linked above, then send an initial status update
>> bearing a date for your subsequent status update. If the item does not have a
>> permanent owner by 2016-07-24 02:00 UTC, I will resolve the item by reverting
>> commit 848ef42 and followups.
>
> I will adopt this item. I will provide an initial patch for this
> issue, or convince someone else to do so, within one week. Therefore,
> expect a further status update from me on or before July 28th. I
> expect that the patch will be based on ideas from these emails:
>
> https://www.postgresql.org/message-id/1AB8F80A-D16E-4154-9497-98FBB164253D@anarazel.de
> https://www.postgresql.org/message-id/20160720181213.f4io7gc6lyc377sw@alap3.anarazel.de
Here is a patch. Please review. I'm not suffering from an
overwhelming excess of confidence that this is correct. In
particular, I'm concerned about the fact that there isn't always a
registered snapshot - if you make it ERROR out when that happens
instead of doing what it actually does, the regression tests fail in
CLUSTER and CREATE VIEW. Also, I wonder why it's right to use
pairingheap_first() instead of looking at the oldest snapshot on the
active snapshot stack, or conversely why that is right and this is
not. Or maybe we need to check both.
The basic principle of testing both MVCC and TOAST snapshots for
snapshot-too-old problems seems sound. Furthermore, it seems clear
enough that if we're ever looking up a datum in the TOAST table whose
xmin is no longer included in our MyPgXact->xmin, then we're
vulnerable to having TOAST chunks vacuumed away under us even without
snapshot_too_old enabled. But there's a bit of spooky action at a
distance: if we don't see any snapshots, then we have to assume that
the scan is being performed with a non-MVCC snapshot and therefore we
don't need to worry. But there's no way to verify that via an
assertion, because the connection between the relevant MVCC snapshot
and the corresponding TOAST snapshot is pretty indirect, mediated only
by snapmgr.c. It's a bit like a 9-1-1 operator (or whatever your
local emergency number is) saying "hey, thanks for calling. if I
don't hear back from you about this issue again, I'll just assume
everything's OK." That might actually the correct approach in some
cases, but it certainly comes with a bit of risk.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company