Thread: Re: [COMMITTERS] pgsql: Account for catalog snapshot in PGXACT->xmin updates.

Re: [COMMITTERS] pgsql: Account for catalog snapshot in PGXACT->xmin updates.

From
Robert Haas
Date:
On Tue, Nov 15, 2016 at 3:55 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Account for catalog snapshot in PGXACT->xmin updates.
>
> The CatalogSnapshot was not plugged into SnapshotResetXmin()'s accounting
> for whether MyPgXact->xmin could be cleared or advanced.  In normal
> transactions this was masked by the fact that the transaction snapshot
> would be older, but during backend startup and certain utility commands
> it was possible to re-use the CatalogSnapshot after MyPgXact->xmin had
> been cleared, meaning that recently-deleted rows could be pruned even
> though this snapshot could still see them, causing unexpected catalog
> lookup failures.  This effect appears to be the explanation for a recent
> failure on buildfarm member piculet.
>
> To fix, add the CatalogSnapshot to the RegisteredSnapshots heap whenever
> it is valid.

It seems to me that this makes it possible for
ThereAreNoPriorRegisteredSnapshots() to fail spuriously.  The only
core code that calls that function is in copy.c, and apparently we
never reach that point with the catalog snapshot set.  But that seems
fragile.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Robert Haas <robertmhaas@gmail.com> writes:
> On Tue, Nov 15, 2016 at 3:55 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Account for catalog snapshot in PGXACT->xmin updates.

> It seems to me that this makes it possible for
> ThereAreNoPriorRegisteredSnapshots() to fail spuriously.  The only
> core code that calls that function is in copy.c, and apparently we
> never reach that point with the catalog snapshot set.  But that seems
> fragile.

Hm.  If there were some documentation explaining what
ThereAreNoPriorRegisteredSnapshots is supposed to do, it would be possible
for us to have a considered argument as to whether this patch broke it or
not.  As things stand, it is not obvious whether it ought to be ignoring
the catalog snapshot or not; one could construct plausible arguments
either way.  It's not even very obvious why both 0 and 1 registered
snapshots should be considered okay; that looks a lot more like a hack
than reasoned-out behavior.  So I'm satisfied if COPY FREEZE does what
it's supposed to do, which it still appears to do.

IOW, I'm not interested in touching this unless someone first provides
adequate documentation for the pre-existing kluge here.  There is no
API spec at all on the function itself, and the comment in front of the
one call site is too muddle-headed to provide any insight.
        regards, tom lane



Re: [COMMITTERS] pgsql: Account for catalog snapshot in PGXACT->xmin updates.

From
Robert Haas
Date:
On Tue, Dec 6, 2016 at 1:36 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Tue, Nov 15, 2016 at 3:55 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Account for catalog snapshot in PGXACT->xmin updates.
>
>> It seems to me that this makes it possible for
>> ThereAreNoPriorRegisteredSnapshots() to fail spuriously.  The only
>> core code that calls that function is in copy.c, and apparently we
>> never reach that point with the catalog snapshot set.  But that seems
>> fragile.
>
> Hm.  If there were some documentation explaining what
> ThereAreNoPriorRegisteredSnapshots is supposed to do, it would be possible
> for us to have a considered argument as to whether this patch broke it or
> not.  As things stand, it is not obvious whether it ought to be ignoring
> the catalog snapshot or not; one could construct plausible arguments
> either way.  It's not even very obvious why both 0 and 1 registered
> snapshots should be considered okay; that looks a lot more like a hack
> than reasoned-out behavior.  So I'm satisfied if COPY FREEZE does what
> it's supposed to do, which it still appears to do.
>
> IOW, I'm not interested in touching this unless someone first provides
> adequate documentation for the pre-existing kluge here.  There is no
> API spec at all on the function itself, and the comment in front of the
> one call site is too muddle-headed to provide any insight.

COPY FREEZE is designed to be usable only in situations where the new
tuples won't be visible earlier than would otherwise be the case.
Thus, copy.c has a check that the relfilenode was created by our
(sub)transaction, which guards against the possibility that other
transactions might see the data early, and also a check that there are
no other registered snapshots or ready portals, which guarantees that,
for example, a cursor belonging to the current subtransaction but with
a lower command-ID doesn't see the rows early.  I am fuzzy why we need
to check for both snapshots and portals, but the overall intent seems
pretty clear to me.  What are you confused about?  It seems quite
obvious to me that no matter how old the catalog snapshot is, it's
nothing that we need to worry about here because it'll get invalidated
and refreshed before use if anything changes meanwhile.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



On Tue, Dec 06, 2016 at 01:59:18PM -0500, Robert Haas wrote:
> On Tue, Dec 6, 2016 at 1:36 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Robert Haas <robertmhaas@gmail.com> writes:
> >> On Tue, Nov 15, 2016 at 3:55 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >>> Account for catalog snapshot in PGXACT->xmin updates.
> >
> >> It seems to me that this makes it possible for
> >> ThereAreNoPriorRegisteredSnapshots() to fail spuriously.  The only
> >> core code that calls that function is in copy.c, and apparently we
> >> never reach that point with the catalog snapshot set.  But that seems
> >> fragile.

I recently noticed this by way of the copy2 regression test failing, in 9.4
and later, under REPEATABLE READ and SERIALIZABLE.  That failure started with
$SUBJECT.  Minimal example:

CREATE TABLE vistest (a text);
BEGIN ISOLATION LEVEL REPEATABLE READ;
TRUNCATE vistest;
COPY vistest FROM stdin CSV FREEZE;
x
\.

Before $SUBJECT, the registered snapshot count was one.  Since $SUBJECT, the
catalog snapshot raises the count to two.

> > Hm.  If there were some documentation explaining what
> > ThereAreNoPriorRegisteredSnapshots is supposed to do, it would be possible
> > for us to have a considered argument as to whether this patch broke it or
> > not.  As things stand, it is not obvious whether it ought to be ignoring
> > the catalog snapshot or not; one could construct plausible arguments

The rows COPY FREEZE writes will be visible (until deleted) to every possible
catalog snapshot, so whether CatalogSnapshot==NULL doesn't matter.  It may be
useful to error out if a catalog scan is in-flight, but the registration for
CatalogSnapshot doesn't confirm or refute that.

> > either way.  It's not even very obvious why both 0 and 1 registered
> > snapshots should be considered okay; that looks a lot more like a hack
> > than reasoned-out behavior.  So I'm satisfied if COPY FREEZE does what

Fair summary.  ThereAreNoPriorRegisteredSnapshots() has functioned that way
since an early submission[1], and I don't see that we ever discussed it
explicitly.  Adding Simon for his recollection.  I think the goal was to raise
an error if any scan of the COPY-target table might use an older CommandId;
this prevents a way of seeing tuples that the scan would not see after COPY
without FREEZE.  Allowing one registered snapshot was a satisfactory heuristic
at the time.  It rejected running plain queries, which have been registering
two snapshots.  It did not reject otherwise-idle REPEATABLE READ transactions,
which register FirstXactSnapshot once; that has been okay, because any use of
FirstXactSnapshot in a query pushes or registers a copy.  I think the code was
wrong to consider registered snapshots and ignore ActiveSnapshot, though.  (We
must ignore COPY's own ActiveSnapshot, but it should be the only one.)  I
don't blame $SUBJECT for changing this landscape; COPY should not attribute
meaning to a particular nonzero registered snapshot count.

> > it's supposed to do, which it still appears to do.
> >
> > IOW, I'm not interested in touching this unless someone first provides
> > adequate documentation for the pre-existing kluge here.  There is no
> > API spec at all on the function itself, and the comment in front of the
> > one call site is too muddle-headed to provide any insight.
> 
> COPY FREEZE is designed to be usable only in situations where the new
> tuples won't be visible earlier than would otherwise be the case.
> Thus, copy.c has a check that the relfilenode was created by our
> (sub)transaction, which guards against the possibility that other
> transactions might see the data early, and also a check that there are
> no other registered snapshots or ready portals, which guarantees that,
> for example, a cursor belonging to the current subtransaction but with
> a lower command-ID doesn't see the rows early.  I am fuzzy why we need
> to check for both snapshots and portals, but the overall intent seems

The snapshot check helps in this case:

CREATE TABLE vistest (a text);
BEGIN ISOLATION LEVEL READ COMMITTED;
CREATE FUNCTION copy_vistest() RETURNS void LANGUAGE plpgsql AS $$BEGIN COPY vistest FROM '/dev/null' CSV FREEZE;
END$$;
CREATE FUNCTION count_vistest() RETURNS int LANGUAGE plpgsql STABLE AS $$BEGIN RETURN (SELECT count(*) FROM vistest);
END$$;
TRUNCATE vistest;
SELECT copy_vistest(), count_vistest();
ROLLBACK;

Other sessions still see rows they ideally would not see[2], the same kind of
anomaly the portal and snapshot tests exist to prevent.  It's odd that we
protect visibility solely within the COPY transaction, the transaction most
likely to know what it's doing.  I'm therefore skeptical of having these tests
at all, but I hesitate to remove them in back branches.  Even in master, we
may want them later if we add visibility protection for other transactions.

I'm tempted to add this hack, ...

--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -2445,2 +2445,3 @@ CopyFrom(CopyState cstate)    {
+        InvalidateCatalogSnapshot();        if (!ThereAreNoPriorRegisteredSnapshots() || !ThereAreNoReadyPortals())

... along with comments and renaming ThereAreNoPriorRegisteredSnapshots().
That restores the pre-$SUBJECT semantics.  I suspect those semantics allow
COPY FREEZE more than intended, but again that is mostly harmless.  If we want
to make the restrictions more principled, ThereAreNoPriorRegisteredSnapshots()
could give way to a test like this (pseudocode):

n_relevant_registered_snap = cardinality(RegisteredSnapshots)
/* transaction/catalog snapshots are pushed/registered before use in query */
if (FirstXactSnapshot) n_relevant_registered_snap-- 
if (CatalogSnapshot)   n_relevant_registered_snap--
/* expect one ActiveSnapshot for COPY itself */
can_freeze = cardinality(ActiveSnapshot) <= 1 && n_relevant_registered_snap == 0

Preferences?

[1] https://postgr.es/m/CA+U5nMJ8CdApEcm0+Ln0WXVx194k3wJHihb=-0xDHen1v0hhDg@mail.gmail.com
[2] https://postgr.es/m/20121206191823.GW5162@tamriel.snowman.net

> pretty clear to me.  What are you confused about?  It seems quite
> obvious to me that no matter how old the catalog snapshot is, it's
> nothing that we need to worry about here because it'll get invalidated
> and refreshed before use if anything changes meanwhile.



On 19 August 2017 at 20:54, Noah Misch <noah@leadboat.com> wrote:
> On Tue, Dec 06, 2016 at 01:59:18PM -0500, Robert Haas wrote:
>> On Tue, Dec 6, 2016 at 1:36 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> > Robert Haas <robertmhaas@gmail.com> writes:
>> >> On Tue, Nov 15, 2016 at 3:55 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> >>> Account for catalog snapshot in PGXACT->xmin updates.
>> >
>> >> It seems to me that this makes it possible for
>> >> ThereAreNoPriorRegisteredSnapshots() to fail spuriously.  The only
>> >> core code that calls that function is in copy.c, and apparently we
>> >> never reach that point with the catalog snapshot set.  But that seems
>> >> fragile.
>
> I recently noticed this by way of the copy2 regression test failing, in 9.4
> and later, under REPEATABLE READ and SERIALIZABLE.  That failure started with
> $SUBJECT.  Minimal example:
>
> CREATE TABLE vistest (a text);
> BEGIN ISOLATION LEVEL REPEATABLE READ;
> TRUNCATE vistest;
> COPY vistest FROM stdin CSV FREEZE;
> x
> \.
>
> Before $SUBJECT, the registered snapshot count was one.  Since $SUBJECT, the
> catalog snapshot raises the count to two.
>
>> > Hm.  If there were some documentation explaining what
>> > ThereAreNoPriorRegisteredSnapshots is supposed to do, it would be possible
>> > for us to have a considered argument as to whether this patch broke it or
>> > not.  As things stand, it is not obvious whether it ought to be ignoring
>> > the catalog snapshot or not; one could construct plausible arguments
>
> The rows COPY FREEZE writes will be visible (until deleted) to every possible
> catalog snapshot, so whether CatalogSnapshot==NULL doesn't matter.  It may be
> useful to error out if a catalog scan is in-flight, but the registration for
> CatalogSnapshot doesn't confirm or refute that.
>
>> > either way.  It's not even very obvious why both 0 and 1 registered
>> > snapshots should be considered okay; that looks a lot more like a hack
>> > than reasoned-out behavior.  So I'm satisfied if COPY FREEZE does what
>
> Fair summary.  ThereAreNoPriorRegisteredSnapshots() has functioned that way
> since an early submission[1], and I don't see that we ever discussed it
> explicitly.  Adding Simon for his recollection.

The intention, IIRC, was to allow the current snapshot (i.e. 1) but no
earlier (i.e. prior) snapshots (so not >=2)

The name was chosen so it was (hopefully) clear what it did --
ThereAreNoPriorRegisteredSnapshots
...but that was before we had MVCC scans on catalogs, which changed
things in unforeseen ways.

The right fix is surely to do a more principled approach and renaming
of the function so that it reflects the current snapshot types.

As Noah mentions there are potential anomalies in other transactions
but this was understood and hence why we have a specific command
keyword to acknowledge user acceptance of these behaviours.

-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



On Mon, Aug 21, 2017 at 07:41:52AM +0100, Simon Riggs wrote:
> On 19 August 2017 at 20:54, Noah Misch <noah@leadboat.com> wrote:
> > On Tue, Dec 06, 2016 at 01:59:18PM -0500, Robert Haas wrote:
> >> On Tue, Dec 6, 2016 at 1:36 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> > Robert Haas <robertmhaas@gmail.com> writes:
> >> >> On Tue, Nov 15, 2016 at 3:55 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> >>> Account for catalog snapshot in PGXACT->xmin updates.
> >> >
> >> >> It seems to me that this makes it possible for
> >> >> ThereAreNoPriorRegisteredSnapshots() to fail spuriously.  The only
> >> >> core code that calls that function is in copy.c, and apparently we
> >> >> never reach that point with the catalog snapshot set.  But that seems
> >> >> fragile.
> >
> > I recently noticed this by way of the copy2 regression test failing, in 9.4
> > and later, under REPEATABLE READ and SERIALIZABLE.  That failure started with
> > $SUBJECT.

> > Fair summary.  ThereAreNoPriorRegisteredSnapshots() has functioned that way
> > since an early submission[1], and I don't see that we ever discussed it
> > explicitly.  Adding Simon for his recollection.
> 
> The intention, IIRC, was to allow the current snapshot (i.e. 1) but no
> earlier (i.e. prior) snapshots (so not >=2)
> 
> The name was chosen so it was (hopefully) clear what it did --
> ThereAreNoPriorRegisteredSnapshots

I now understand the function name, so I added a comment but didn't rename it.
I plan to use the attached patch after the minor release tags land.  If
there's significant support, I could instead push before the wrap.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Attachment
Noah Misch <noah@leadboat.com> writes:
> I plan to use the attached patch after the minor release tags land.  If
> there's significant support, I could instead push before the wrap.

This looks fine to me --- I think you should push now.

(which reminds me, I'd better get on with making release notes.)
        regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

On Sat, Nov 04, 2017 at 12:23:36PM -0400, Tom Lane wrote:
> Noah Misch <noah@leadboat.com> writes:
> > I plan to use the attached patch after the minor release tags land.  If
> > there's significant support, I could instead push before the wrap.
> 
> This looks fine to me --- I think you should push now.

Done.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers