Re: snapshot too old, configured by time - Mailing list pgsql-hackers

From Alvaro Herrera
Subject Re: snapshot too old, configured by time
Date
Msg-id 20160330192223.GA988171@alvherre.pgsql
Whole thread Raw
In response to Re: snapshot too old, configured by time  (Kevin Grittner <kgrittn@gmail.com>)
Responses Re: snapshot too old, configured by time
List pgsql-hackers
Kevin Grittner wrote:
> On Wed, Mar 30, 2016 at 11:37 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> >> I think a safer proposition would be to replace all current
> >> BufferGetPage() calls (there are about 500) by adding the necessary
> >> arguments: buffer, snapshot, rel, and an integer "flags".  All this
> >> without adding the feature.  Then a subsequent commit would add the
> >> TestForOldSnapshot inside BufferGetPage, *except* when a
> >> BUFFER_NO_SNAPSHOT_TEST flag is passed.  That way, new code always get
> >> the snapshot test by default.
> >
> > That seems awfully invasive,
> 
> That's the argument I made, which Álvaro described as "dismissing"
> his suggestion.  In this post from October of 2015, I pointed out
> that there are 36 calls where we need a snapshot and 450 where we
> don't.
> 
> http://www.postgresql.org/message-id/56479263.1140984.1444945639606.JavaMail.yahoo@mail.yahoo.com

I understand the invasiveness argument, but to me the danger of
introducing new bugs trumps that.  The problem is not the current code,
but future patches: it is just too easy to make the mistake of not
checking the snapshot in new additions of BufferGetPage.  And you said
that the result of missing a check is silent wrong results from queries
that should instead be cancelled, which seems fairly bad to me.  My
impression was that you were actually considering doing something about
that -- sorry for the lack of clarity.

We have made similarly invasive changes in the past -- the
SearchSysCache API for instance.

> > not to mention performance-killing if the expectation is that
> > most such calls are going to need a snapshot check.
> 
> This patch is one which has allowed a customer where we could not
> meet their performance requirements to pass them.  It is the
> opposite of performance-killing.

I think Tom misunderstood what I said and you misunderstood what Tom
said.  Let me attempt to set things straight.

I said that we should change BufferGetPage into having the snapshot
check built-in, except in the cases where a flag is passed; and the flag
would be passed in all cases except those 30-something you identified.
In other words, the behavior in all the current callsites would be
identical to what's there today; we could have a macro do the first
check so that we don't introduce the overhead of a function call in the
450 cases where it's not needed.

Tom said that my proposal would be performance-killing, not that your
patch would be performance-killing.  But as I argue above, with my
proposal performance would stay the same, so we're actually okay.


I don't think nobody disputes that your patch is good in general.
I would be happy with it in 9.6, but I have my reservations about the
aforementioned problem.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



pgsql-hackers by date:

Previous
From: Peter Geoghegan
Date:
Subject: Re: snapshot too old, configured by time
Next
From: Peter Geoghegan
Date:
Subject: Re: snapshot too old, configured by time