Home > mailing lists

SetQuerySnapshot redesign: yet another try - Mailing list pgsql-hackers

From	Tom Lane
Subject	SetQuerySnapshot redesign: yet another try
Date	September 6, 2004 22:16:01
Msg-id	18928.1094505313@sss.pgh.pa.us Whole thread Raw
List	pgsql-hackers

Tree view

I know you've heard about this before, for instance in
http://archives.postgresql.org/pgsql-hackers/2002-06/msg00452.php
http://archives.postgresql.org/pgsql-bugs/2004-08/msg00304.php
http://archives.postgresql.org/pgsql-general/2004-04/msg00555.php
but here is another try at redefining what we want to do with snapshot
updates.  I think that it would be a good idea to take whatever backwards
compatibility hit is needed in 8.0, rather than some later release, simply
because people are more prepared to expect such things in a front-version-
bump release.  So even though we're into beta, I'm willing to think
about changing this.

Keep in mind this all matters only for READ COMMITTED mode, since in
SERIALIZABLE mode we take only one snapshot per transaction; that behavior
won't change.

In read committed mode I think it is *essential* that we take a fresh
snapshot for each query, or at least after every data-modifying query,
since otherwise you get the sorts of bizarre behaviors exhibited in the
above bug reports.  That has to hold true within PL functions as well as
for interactive commands.  I think further that we should make the PL
behavior as much like the interactive behavior as possible --- that means
that I'm not really interested in ideas like, say, skipping the snapshot
update between SELECTs.  So I think what we want is a SetQuerySnapshot
before every query and a CommandCounterIncrement after, just like it's
done at the interactive level.  I would propose doing this inside SPI
so that it automatically affects the behavior of all PLs.

But having said that, it might be nice to provide some way of
approximating the old behavior (no SQS) in case anyone has functions that
will be broken by this change.  Also, in the thread following the first
message linked to above, Hiroshi raised the question of whether it's
possible to implement a truly "stable" PL function --- that is, a function
that makes a database inquiry but is guaranteed to return the same result
throughout the lifespan of its calling query.  At present PL functions are
stable in this sense with respect to changes made by other transactions,
though not with respect to changes made by the calling query or other
functions that it calls.  With the SQS change a PL function doing a query
would not be stable at all.

Hiroshi had suggested removing the initial CommandCounterIncrement that
is currently done by the various SPI operations.  I think this is a
reasonable thing to do on efficiency grounds --- it should be sufficient
to do it after each operation, as we are already doing, and that would be
more like the behavior for the interactive cases.  However this is not
by itself sufficient to solve the stability problem, since CCI calls done
by other operations (eg other functions) would still break the stability
of the would-be stable function.

I would like to propose that we solve the stability issue for real, and as
a byproduct offer a sort-of-backwards-compatible mode.  What I have in
mind is this: PL functions that are declared STABLE or IMMUTABLE will not
do *either* SetQuerySnapshot or CommandCounterIncrement, but will execute
all their queries against the same snapshot being used by the calling
query.  This would imply disallowing database modifications inside such
functions, since without CommandCounterIncrement database changes wouldn't
behave very sensibly (you couldn't see them in following queries).
I think that's a reasonable restriction --- it's already true that you
shouldn't declare a function STABLE or IMMUTABLE if it has side effects,
since you are risking having the side effects optimized away when you make
such a declaration.

An alternative that people might find more palatable is to drive this
behavior off a new function property called READ ONLY, or some such.
I'm not sure that I see the value of that though --- is there any use
in a READ ONLY VOLATILE or non-READ-ONLY STABLE function?  Adding a new
function property would force an initdb to add a pg_proc column, so I'd
prefer not to do this unless there's a strong consensus that we need a
separate knob for it.

I haven't fully finished designing the implementation for this, but what
I'm envisioning is that SPI would offer the option to execute queries as
"read only", and when that's set it would get the required snapshot data
from the current ActivePortal, rather than doing internal SetQuerySnapshot
calls as we will make it do in the default case.  The functions in
pquery.c would stash the active snapshot into the Portal data structure
to make it possible to retrieve it.  This would require minor rejiggering
of the API for ExecutorStart and nearby operations, but nothing extensive.

Comments?
        regards, tom lane

pgsql-hackers by date:

From: Gaetano Mendola
Date: 06 September 2004, 21:06:25
Subject: Re: huge execution time difference with almost same plan

From: Tom Lane
Date: 06 September 2004, 22:56:52
Subject: Breakage in trigger.c

SetQuerySnapshot redesign: yet another try - Mailing list pgsql-hackers

Previous

Next