Thread: AtEOXact_Snapshot timing

AtEOXact_Snapshot timing

From
Robert Haas
Date:
Way back in 2011, commit 57eb009092684e6e1788dd0dae641ccee1668b10
moved AbortTransaction's AtEOXact_Snapshot call to CleanupTransaction
to fix a problem when a ROLLBACK statement was prepared at the
protocol level and executed in a transaction with REPEATABLE READ or
higher isolation. RevalidateCachedQuery would attempt to obtain a
snapshot and end up failing.  At the time, it was judged that
plancache.c was behaving correctly and this logic was rejiggered to
make that coding pattern safe. However, commit
ac63dca607e8e22247defbc8fe03b6baa3628c42 subsequently taught
RevalidateCachedQuery not to obtain a snapshot for such commands after
all while fixing an unrelated bug, and there now seems to be no case
in which we obtain a snapshot in an aborted transaction.

I'd like to propose that we upgrade that practice to a formal rule.
We've already taken some steps in this direction; notably, commit
42c80c696e9c8323841180029cc62741c21bd356 added an assertion to the
effect that we never perform catcache lookups outside of a valid,
non-aborted transaction. However, right now, if you made the mistake
of trying to access the database through some means other than a
catcache lookup in an aborted transaction, it might appear to work.
Actually, it would be terribly unsafe, because (1) you might've failed
after inserting a heap tuple and before inserting all the
corresponding index tuples and (2) any random subset of the tuples
inserted by prior commands in your transaction might have been pruned
by other backends after you removed your XID from the ProcArray, while
others would remain visible.  In short, such a snapshot is not really
a snapshot at all.

The best way that I've been able to come up with to enforce this rule
after a little bit of thought is to add Assert(IsTransactionState())
to a bunch of functions in snapmgr.c, most notably
GetTransactionSnapshot and GetCatalogSnapshot. The attached patch does
that.  It also makes the comments in RevalidateCachedQuery more
explicit about this issue, and it moves the AtEOXact_Snapshot call
back to AbortTransaction, on the theory (or hope?) that it's better to
dispose of resources sooner, especially resources that might look
superficially valid but really are not.

You may (or may not) wonder why I'm poking at this apparently-obscure
topic.  The answer is "undo."  Without getting into the gory details,
it's better for undo if as much of the cleanup work as possible
happens at AbortTransaction() time and as little as possible is left
until CleanupTransaction().  That seems like a good idea on general
principle too, though, so I'm proposing this as an independent
cleanup.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment

Re: AtEOXact_Snapshot timing

From
Andres Freund
Date:
Hi,

On 2019-09-05 14:50:50 -0400, Robert Haas wrote:
> The best way that I've been able to come up with to enforce this rule
> after a little bit of thought is to add Assert(IsTransactionState())
> to a bunch of functions in snapmgr.c, most notably
> GetTransactionSnapshot and GetCatalogSnapshot.

Wonder if there's a risk that callers might still have a snapshot
around, in independent memory?  It could make sense to add such an
assert to a visibility routine or two, maybe?

I suspect that we should add non-assert condition to a central place
like GetSnapshotData(). It's not hard to imagine extensions just using
that directly, and that we'd never notice that with assert only
testing. It's also hard to imagine a single if
(unlikely(IsTransactionState())) to be expensive enough to matter
compared to GetSnapshotData()'s own cost.

I wonder, not just because of the previous paragraph, whether it could
be worthwhile to expose enough xact.c state to allow
IsTransactionState() to be done without a function call. ISMT a few
Assert(IsTransactionState()) type checks really are important enough to
be done in production builds too. Some of the relevant scenarios aren't
even close to be covered fully, and you'll get bad results if there's
such a problem.

> @@ -2732,6 +2732,18 @@ AbortTransaction(void)
>          pgstat_report_xact_timestamp(0);
>      }
>  
> +    /*
> +     * Any snapshots taken by this transaction were unsafe to use even at the
> +     * time when we entered AbortTransaction(), since we might have, for
> +     * example, inserted a heap tuple and failed while inserting index tuples.
> +     * They were even more unsafe after ProcArrayEndTransaction(), since after
> +     * that point tuples we inserted could be pruned by other backends.
> +     * However, we postpone the cleanup until this point in the sequence
> +     * because it has to be done after ResourceOwnerRelease() has finishing
> +     * unregistering snapshots.
> +     */
> +    AtEOXact_Snapshot(false, true);
> +
>      /*
>       * State remains TRANS_ABORT until CleanupTransaction().
>       */

Hm. This means that
        if (is_parallel_worker)
            CallXactCallbacks(XACT_EVENT_PARALLEL_ABORT);
        else
            CallXactCallbacks(XACT_EVENT_ABORT);
which, together with a few of the other functions, could plausibly try
to use snapshot related logic, may end up continuing to use an existing
snapshot without us detecting the problem? I think?  Especially with the
asserts present ISTM that we really should kill the existing snapshots
directly adjacent to ProcArrayEndTransaction(). As you say, after that
the snapshots aren't correct anymore.  And with the right assertions we
should be safe againsts accidental reintroduction of catalog access in
the following code?


Greetings,

Andres Freund



Re: AtEOXact_Snapshot timing

From
Robert Haas
Date:
On Thu, Sep 5, 2019 at 5:32 PM Andres Freund <andres@anarazel.de> wrote:
> On 2019-09-05 14:50:50 -0400, Robert Haas wrote:
> > The best way that I've been able to come up with to enforce this rule
> > after a little bit of thought is to add Assert(IsTransactionState())
> > to a bunch of functions in snapmgr.c, most notably
> > GetTransactionSnapshot and GetCatalogSnapshot.
>
> Wonder if there's a risk that callers might still have a snapshot
> around, in independent memory?  It could make sense to add such an
> assert to a visibility routine or two, maybe?
>
> I suspect that we should add non-assert condition to a central place
> like GetSnapshotData(). It's not hard to imagine extensions just using
> that directly, and that we'd never notice that with assert only
> testing. It's also hard to imagine a single if
> (unlikely(IsTransactionState())) to be expensive enough to matter
> compared to GetSnapshotData()'s own cost.

I guess we could do that, but I feel like it might be overkill.  It
seems unlikely to me that an extension would call GetSnapshotData()
directly, but if it does, it's probably some kind of advanced wizardry
and I'm happy to trust that the extension author knows what she's
doing (or she can keep both pieces if it breaks). For my $0.02,
calling GetTransactionSnapshot() in an aborted transaction is the kind
of thing that's much more likely to be an unintentional goof.

> > @@ -2732,6 +2732,18 @@ AbortTransaction(void)
> >               pgstat_report_xact_timestamp(0);
> >       }
> >
> > +     /*
> > +      * Any snapshots taken by this transaction were unsafe to use even at the
> > +      * time when we entered AbortTransaction(), since we might have, for
> > +      * example, inserted a heap tuple and failed while inserting index tuples.
> > +      * They were even more unsafe after ProcArrayEndTransaction(), since after
> > +      * that point tuples we inserted could be pruned by other backends.
> > +      * However, we postpone the cleanup until this point in the sequence
> > +      * because it has to be done after ResourceOwnerRelease() has finishing
> > +      * unregistering snapshots.
> > +      */
> > +     AtEOXact_Snapshot(false, true);
> > +
> >       /*
> >        * State remains TRANS_ABORT until CleanupTransaction().
> >        */
>
> Hm. This means that
>                 if (is_parallel_worker)
>                         CallXactCallbacks(XACT_EVENT_PARALLEL_ABORT);
>                 else
>                         CallXactCallbacks(XACT_EVENT_ABORT);
> which, together with a few of the other functions, could plausibly try
> to use snapshot related logic, may end up continuing to use an existing
> snapshot without us detecting the problem? I think?  Especially with the
> asserts present ISTM that we really should kill the existing snapshots
> directly adjacent to ProcArrayEndTransaction(). As you say, after that
> the snapshots aren't correct anymore.  And with the right assertions we
> should be safe againsts accidental reintroduction of catalog access in
> the following code?

Well, I don't really see how to make those things precisely adjacent
without a lot more rejiggering of the code, and that doesn't seem
worth it to me.  I think the main risk here is that someone tries to
use a snapshot after AbortTransaction() and before
CleanupTransaction(), and that's what I want to try to block.  The
risk that somebody's going to try to use a snapshot within
AbortTransaction() seems lower to me, although obviously not zero.
You can't do anything that will plausibly fail in this code path, and
that's generally a tighter restriction than the one we're trying to
enforce here.

I agree with you that it would be nicer if we could put the killing of
the old snapshots directly adjacent to ProcArrayEndTransaction(), and
that was the first thing I tried, but it doesn't work, because
resource owner cleanup has to run first.  It's possible that we could
split AtEOXact_Snapshot() into two pieces that run at different times,
but I think we'd be guarding against a vulnerability that is mostly
theoretical, and I don't really see the point.  The only kind of
access that somebody's likely to attempt during AbortTransaction() is
catalog access, and that's likely to trip either the assertion I added
in GetCatalogSnapshot() or the existing one in SearchCatCache().
Non-catalog access would probably fail the assertion in
GetTransactionSnapshot() or GetLatestSnapshot().  I guess we should
probably add a similar check in GetActiveSnapshot(); the attached
patch adds that.

I might be missing something here, so feel free to point out to me if
there's a plausible coding pattern I'm missing where the things you
are proposing would actually be a problem.  To me, it just feels like
you're worried about a scenario that would require writing the code in
a very unnatural way, like saving a snapshot in a global variable or
calling GetSnapshotData() directly. The fact that there's no existing
code that does those things should be enough to warn you off of doing
them.  If it's not, I doubt that a mere assertion is going to stand in
your way...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment

Re: AtEOXact_Snapshot timing

From
Andres Freund
Date:
Hi,

On 2019-09-06 10:25:22 -0400, Robert Haas wrote:
> On Thu, Sep 5, 2019 at 5:32 PM Andres Freund <andres@anarazel.de> wrote:
> > On 2019-09-05 14:50:50 -0400, Robert Haas wrote:
> > > The best way that I've been able to come up with to enforce this rule
> > > after a little bit of thought is to add Assert(IsTransactionState())
> > > to a bunch of functions in snapmgr.c, most notably
> > > GetTransactionSnapshot and GetCatalogSnapshot.
> >
> > Wonder if there's a risk that callers might still have a snapshot
> > around, in independent memory?  It could make sense to add such an
> > assert to a visibility routine or two, maybe?
> >
> > I suspect that we should add non-assert condition to a central place
> > like GetSnapshotData(). It's not hard to imagine extensions just using
> > that directly, and that we'd never notice that with assert only
> > testing. It's also hard to imagine a single if
> > (unlikely(IsTransactionState())) to be expensive enough to matter
> > compared to GetSnapshotData()'s own cost.
>
> I guess we could do that, but I feel like it might be overkill.  It
> seems unlikely to me that an extension would call GetSnapshotData()
> directly, but if it does, it's probably some kind of advanced wizardry
> and I'm happy to trust that the extension author knows what she's
> doing (or she can keep both pieces if it breaks).

Hm. I feel like there's plenty reasons to get a snapshot in extensions -
there's plenty APIs one cannot really call without doing so? What I'm
worried about is not primarily that GetSnapshotData() is being called
directly, but that $author got a snapshot previously, and then tries to
use it in an xact callback or such.

I'd add asserts to at least PushActiveSnapshot(), and I don't see the
harm in adding one to GetSnapshotData().


> For my $0.02, calling GetTransactionSnapshot() in an aborted
> transaction is the kind of thing that's much more likely to be an
> unintentional goof.

Based on what I've seen people do in xact callback handlers... Didn't PG
itself e.g. have code doing syscache lookups in aborted transactions a
couple times?

The danger of using a previously acquired snapshot in that stage is why
I was pondering adding an assert to visibility functions
themselves. E.g. just adding one to HeapTupleSatisfiesVisibility() might
already add a good bit of coverage.


> I agree with you that it would be nicer if we could put the killing of
> the old snapshots directly adjacent to ProcArrayEndTransaction(), and
> that was the first thing I tried, but it doesn't work, because
> resource owner cleanup has to run first.

Hm. I'd even say that it actually belongs to *before* the
ProcArrayEndTransaction() call.


For a bit I wondered if the resowner snapshot cleanup ought to be at
least moved to RESOURCE_RELEASE_BEFORE_LOCKS. Not that it actually
addresses this issue, but it seems to belong there "thematically". But
then I honestly don't understand why most of the resowner managed
resources in the abort sequence are released where they are.  The only
really explanatory comment is:

     * The ordering of operations is not entirely random.  The idea is:
     * release resources visible to other backends (eg, files, buffer pins);
     * then release locks; then release backend-local resources. We want to
     * release locks at the point where any backend waiting for us will see
     * our transaction as being fully cleaned up.

but that doesn't explain why we e.g. process relcache references, jit
contexts (arguably it does for dsm segments), at that stage. And
definitely not why we do abort's relcache inval processing between
RESOURCE_RELEASE_BEFORE_LOCKS and RESOURCE_RELEASE_LOCKS - that can be
quite expensive whe needing to scan the whole relcache.

Anyway, this is grumbling about things far beyond the scope of this
patch.


> It's possible that we could
> split AtEOXact_Snapshot() into two pieces that run at different times,
> but I think we'd be guarding against a vulnerability that is mostly
> theoretical, and I don't really see the point.  The only kind of
> access that somebody's likely to attempt during AbortTransaction() is
> catalog access, and that's likely to trip either the assertion I added
> in GetCatalogSnapshot() or the existing one in SearchCatCache().

I'm not sure that's true - I've certainly seen extensions logging the
transaction state into a table, for example... Even in aborted xacts.


> Non-catalog access would probably fail the assertion in
> GetTransactionSnapshot() or GetLatestSnapshot().

Not this patch's fault, obviously, but none of this appears to catch
DML...


> I might be missing something here, so feel free to point out to me if
> there's a plausible coding pattern I'm missing where the things you
> are proposing would actually be a problem.  To me, it just feels like
> you're worried about a scenario that would require writing the code in
> a very unnatural way, like saving a snapshot in a global variable or
> calling GetSnapshotData() directly.

Well, if you actually want to look at state as it was at the beginning
of the database, it seems not unreasonable to get a snapshot at the
start of the transaction, push it onto the stack, and then use it at the
end of the transaction, too.


Just to be clear: While I think an assert or two more seem like a good
idea, that's musing around the edges, not a fundamental concern.


> diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
> index f594d33e7a..9993251607 100644
> --- a/src/backend/access/transam/xact.c
> +++ b/src/backend/access/transam/xact.c
> @@ -2732,6 +2732,18 @@ AbortTransaction(void)
>          pgstat_report_xact_timestamp(0);
>      }
>
> +    /*
> +     * Any snapshots taken by this transaction were unsafe to use even at the
> +     * time when we entered AbortTransaction(), since we might have, for
> +     * example, inserted a heap tuple and failed while inserting index tuples.
> +     * They were even more unsafe after ProcArrayEndTransaction(), since after
> +     * that point tuples we inserted could be pruned by other backends.
> +     * However, we postpone the cleanup until this point in the sequence
> +     * because it has to be done after ResourceOwnerRelease() has finishing
> +     * unregistering snapshots.
> +     */
> +    AtEOXact_Snapshot(false, true);

One thing that bothers me a bit here is that the other cleanup calls are
within

    /*
     * Post-abort cleanup.  See notes in CommitTransaction() concerning
     * ordering.  We can skip all of it if the transaction failed before
     * creating a resource owner.
     */
    if (TopTransactionResourceOwner != NULL)

and adding AtEOXact_Snapshot() below that block kind of makes that
comment wrong. I don't think we actually can make the call conditional
in the same way, however.


>      /*
>       * State remains TRANS_ABORT until CleanupTransaction().
>       */
> @@ -2757,7 +2769,6 @@ CleanupTransaction(void)
>       * do abort cleanup processing
>       */
>      AtCleanup_Portals();        /* now safe to release portal memory */
> -    AtEOXact_Snapshot(false, true); /* and release the transaction's snapshots */

Hm. I don't quite get why we tell AtEOXact_Snapshot() to clean up
->xmin, when we just called ProcArrayEndTransaction(). Again, not this
patch's fault...


Greetings,

Andres Freund



Re: AtEOXact_Snapshot timing

From
Robert Haas
Date:
On Mon, Nov 11, 2019 at 2:12 PM Andres Freund <andres@anarazel.de> wrote:
> Hm. I feel like there's plenty reasons to get a snapshot in extensions -
> there's plenty APIs one cannot really call without doing so?

Sure, I don't disagree with that.

> What I'm
> worried about is not primarily that GetSnapshotData() is being called
> directly, but that $author got a snapshot previously, and then tries to
> use it in an xact callback or such.

Yeah, I guess that's possible. Registering a new snapshot would trip
one of the assertions I added, but using an old one wouldn't.

> I'd add asserts to at least PushActiveSnapshot(), and I don't see the
> harm in adding one to GetSnapshotData().

OK. I think there's some risk that someone will have a purpose for
calling GetSansphotData() which is actually tolerably safe but now
won't work, so I thin there's a chance we'll get a complaint. But if
we do, we can always reconsider whether to take that particular
assertion back out again.

> Based on what I've seen people do in xact callback handlers... Didn't PG
> itself e.g. have code doing syscache lookups in aborted transactions a
> couple times?

Sure -- but the assertions I had already added would catch that
anyway, assuming that it actually attempted catalog access, and we now
also have assertions that will catch it even if it would have been
satisfied from the cache.

> The danger of using a previously acquired snapshot in that stage is why
> I was pondering adding an assert to visibility functions
> themselves. E.g. just adding one to HeapTupleSatisfiesVisibility() might
> already add a good bit of coverage.

Yeah, but I really hate to do that; those functions are super-hot. And
I don't think we need to go overboard in protecting people from
themselves. The assertions I'm proposing to add should already catch
quite a bit of stuff that is unchecked today, and with little or no
possible downside. There's no rule that we can't add more later, nor
are more assertions always better than fewer.

> > I agree with you that it would be nicer if we could put the killing of
> > the old snapshots directly adjacent to ProcArrayEndTransaction(), and
> > that was the first thing I tried, but it doesn't work, because
> > resource owner cleanup has to run first.
>
> Hm. I'd even say that it actually belongs to *before* the
> ProcArrayEndTransaction() call.
>
> For a bit I wondered if the resowner snapshot cleanup ought to be at
> least moved to RESOURCE_RELEASE_BEFORE_LOCKS. Not that it actually
> addresses this issue, but it seems to belong there "thematically". But
> then I honestly don't understand why most of the resowner managed
> resources in the abort sequence are released where they are.  The only
> really explanatory comment is:
>
>          * The ordering of operations is not entirely random.  The idea is:
>          * release resources visible to other backends (eg, files, buffer pins);
>          * then release locks; then release backend-local resources. We want to
>          * release locks at the point where any backend waiting for us will see
>          * our transaction as being fully cleaned up.
>
> but that doesn't explain why we e.g. process relcache references, jit
> contexts (arguably it does for dsm segments), at that stage. And
> definitely not why we do abort's relcache inval processing between
> RESOURCE_RELEASE_BEFORE_LOCKS and RESOURCE_RELEASE_LOCKS - that can be
> quite expensive whe needing to scan the whole relcache.
>
> Anyway, this is grumbling about things far beyond the scope of this
> patch.

Yeah. I do agree with you that a lot of that stuff isn't very well
explained.  Nor is there much of an explanation of why some things go
through resowner.c and other things have bespoke cleanup code. But, as
you say, that's out of scope.

> I'm not sure that's true - I've certainly seen extensions logging the
> transaction state into a table, for example... Even in aborted xacts.

Whoa.

> > Non-catalog access would probably fail the assertion in
> > GetTransactionSnapshot() or GetLatestSnapshot().
>
> Not this patch's fault, obviously, but none of this appears to catch
> DML...

Really? It's hard to imagine that DML wouldn't attempt catalog access.

> Just to be clear: While I think an assert or two more seem like a good
> idea, that's musing around the edges, not a fundamental concern.

All right, here's another version with an assert or two more. :-)

> > diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
> > index f594d33e7a..9993251607 100644
> > --- a/src/backend/access/transam/xact.c
> > +++ b/src/backend/access/transam/xact.c
> > @@ -2732,6 +2732,18 @@ AbortTransaction(void)
> >               pgstat_report_xact_timestamp(0);
> >       }
> >
> > +     /*
> > +      * Any snapshots taken by this transaction were unsafe to use even at the
> > +      * time when we entered AbortTransaction(), since we might have, for
> > +      * example, inserted a heap tuple and failed while inserting index tuples.
> > +      * They were even more unsafe after ProcArrayEndTransaction(), since after
> > +      * that point tuples we inserted could be pruned by other backends.
> > +      * However, we postpone the cleanup until this point in the sequence
> > +      * because it has to be done after ResourceOwnerRelease() has finishing
> > +      * unregistering snapshots.
> > +      */
> > +     AtEOXact_Snapshot(false, true);
>
> One thing that bothers me a bit here is that the other cleanup calls are
> within
>
>         /*
>          * Post-abort cleanup.  See notes in CommitTransaction() concerning
>          * ordering.  We can skip all of it if the transaction failed before
>          * creating a resource owner.
>          */
>         if (TopTransactionResourceOwner != NULL)
>
> and adding AtEOXact_Snapshot() below that block kind of makes that
> comment wrong. I don't think we actually can make the call conditional
> in the same way, however.

I guess I read that statement as referring to the contents of the
if-block, not everything below that in the function. But we could
reword the comment to, e.g. We can skip quite a bit of work if the
transaction failed before creating a resource owner. Done in the
attached version.

>
>
> >       /*
> >        * State remains TRANS_ABORT until CleanupTransaction().
> >        */
> > @@ -2757,7 +2769,6 @@ CleanupTransaction(void)
> >        * do abort cleanup processing
> >        */
> >       AtCleanup_Portals();            /* now safe to release portal memory */
> > -     AtEOXact_Snapshot(false, true); /* and release the transaction's snapshots */
>
> Hm. I don't quite get why we tell AtEOXact_Snapshot() to clean up
> ->xmin, when we just called ProcArrayEndTransaction(). Again, not this
> patch's fault...

So let's leave that for another time.

v3 attached.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment