Re: Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold < - Mailing list pgsql-hackers

From Robert Haas
Subject Re: Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold <
Date
Msg-id CA+Tgmobx6V2SmvomxcrnGy59WXSGQMV2aYLeXYGSAXDBvmrz+g@mail.gmail.com
Whole thread Raw
In response to Re: Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold <  (Alvaro Herrera <alvherre@2ndquadrant.com>)
List pgsql-hackers
On Wed, Jun 15, 2016 at 2:59 PM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:
> Kevin Grittner wrote:
>> On Wed, Jun 15, 2016 at 1:50 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>>
>> > The expression index case is the one to worry about; if there is a
>> > problem, that's where it is.  What bothers me is that a function used
>> > in an expression index could do anything at all - it can read any
>> > table in the database.
>>
>> It *can*, but then you are lying to the database when you call it
>> IMMUTABLE.  Such an index can easily become corrupted through
>> normal DML.  Without DML the ANALYZE has no problem.  So you seem
>> to be concerned that if someone is lying to the database engine to
>> force it accept a function as IMMUTABLE when it actually isn't, and
>> then updating the referenced rows (which is very likely to render
>> the index corrupted), that statistics might also become stale.
>
> We actually go quite some lengths to support this case, even when it's
> the opinion of many that we shouldn't.  For example VACUUM doesn't try
> to find index entries using the values in each deleted tuple; instead we
> remember the TIDs and then scan the indexes (possibly many times) to
> find entries that match those TIDs -- which is much slower.  Yet we do
> it this way to protect the case that somebody is doing the
> not-really-IMMUTABLE function.
>
> In other words, I don't think we consider the position you argued as
> acceptable.

Well, I actually don't think there's a giant problem here.  I'm just
trying to follow the chain of the argument to its (illogical)
conclusion.  I mean, if ANALYZE itself fails to see a tuple subjected
to early pruning, that should be fine.  And if some query run by a
supposedly-but-not-actually immutable function errors out because
snapshot_too_old is set, that also seems more or less fine.  The
statistics might not get updated, but oh well: either make your
supposedly-immutable function actually immutable, or else turn off
snapshot_too_old, or else live with the fact that ANALYZE will fail
some percentage of the time.  Supporting people who cheat and do
things that technically aren't allowed is one thing; saying that every
new feature must never have any downsides for such people is something
else.  If we took the latter approach, parallel query would be right
out, because you sure can break things by mislabeling functions as
PARALLEL SAFE.  I *do* think it *must* be possible to get an ANALYZE
to do something funky - either error out, or give wrong answers - if
you set up a strange enough set of circumstances, but I don't think
that's necessarily something we need to do anything about.

I think this whole discussion of snapshot too old has provoked far too
many bitter emails -- on all sides.  I entirely believe that there are
legitimate reasons to have concerns about this feature, and I entirely
suspect that it has problems we haven't found yet, and I also entirely
believe that there will be some future bugs that stem from this
feature that we would not have had otherwise.  I think it is entirely
legitimate to have concerns about those things.  On the other hand, I
*also* believe that Kevin is a pretty careful guy and that he's done
his best to make this patch safe and that he did not just go off and
commit something half-baked without due reflection.  We have to expect
that if people who are committers don't get much review of their
patches, they will eventually commit them anyway.  The "I can't
believe you committed this" reactions seem out of line to me.  This
feature is not perfect.  Nor is it the worst thing anybody's ever
committed.  But it's definitely provoked more ire than most.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


pgsql-hackers by date:

Previous
From: Bruce Momjian
Date:
Subject: Choosing the cheapest optimizer cost
Next
From: PostgreSQL - Hans-Jürgen Schönig
Date:
Subject: Re: WIP: Data at rest encryption