Thread: MultiXact bugs

MultiXact bugs

From

Andres Freund

Date:

24 November 2013, 00:02:25

Hi,

The attached pgbench testcase can reproduce two issues:
1) (takes a bit)
TRAP: FailedAssertion("!(((xid) >= ((TransactionId) 3)))", File:/pruneheap.c", Line: 601)

That's because HeapTupleHeaderGetUpdateXid() ignores aborted updaters
and returns InvalidTransactionId in that case, but
HeapTupleSatisfiesVacuum() returns HEAPTUPLE_DELETE_IN_PROGRESS...

Looks like a 9.3+ issue, and shouldn't have a ll that high impact in
non-assert builds, page pruning will be delayed a bit.

2) we frequently error out in heap_lock_updated_tuple_rec() with
ERROR:  unable to fetch updated version of tuple

That's because when we're following a ctid chain, it's perfectly
possible for the updated version of a tuple to already have been
vacuumed/pruned away if the the updating transaction has aborted.

Also looks like a 9.3+ issues to me, slightly higher impact, but in the
end you're just getting some errors under concurrency.

Greetings,

Andres Freund

--
 Andres Freund                       http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Attachment

Re: MultiXact bugs

From

Alvaro Herrera

Date:

25 November 2013, 15:53:02

Andres Freund wrote:

Hi,

> The attached pgbench testcase can reproduce two issues:

Thanks.

> 2) we frequently error out in heap_lock_updated_tuple_rec() with
> ERROR:  unable to fetch updated version of tuple
>
> That's because when we're following a ctid chain, it's perfectly
> possible for the updated version of a tuple to already have been
> vacuumed/pruned away if the the updating transaction has aborted.
>
> Also looks like a 9.3+ issues to me, slightly higher impact, but in the
> end you're just getting some errors under concurrency.

Yes, I think this is 9.3 only.  First attachment shows my proposed
patch, which is just to report success to caller in case the tuple
cannot be acquired by heap_fetch.  This is OK because if by the time we
check the updated version of a tuple it is gone, it means there is no
further update chain to follow and lock.

> 1) (takes a bit)
> TRAP: FailedAssertion("!(((xid) >= ((TransactionId) 3)))", File:/pruneheap.c", Line: 601)
>
> That's because HeapTupleHeaderGetUpdateXid() ignores aborted updaters
> and returns InvalidTransactionId in that case, but
> HeapTupleSatisfiesVacuum() returns HEAPTUPLE_DELETE_IN_PROGRESS...

Interesting.  This is a very narrow race condition: when we call
HeapTupleSafisfiesVacuum the updater is still running, so it returns
HEAPTUPLE_DELETE_IN_PROGRESS; but it aborts just before we read the
tuple's update Xid.  So that one returns InvalidXid and that causes the
failure.  I was able to hit this race condition very quickly by adding a
pg_usleep(1000) in heap_prune_chain(), in the
HEAPTUPLE_DELETE_IN_PROGRESS case, before trying to acquire the update
Xid.

There is no way to close the window, but there is no need; if the
updater aborted, we don't need to mark the page prunable in the first
place.  So we can just check the return value of
HeapTupleHeaderGetUpdateXid and if it's InvalidXid, don't set the
prunable bit.  The second attachment below fixes the bug that way.

I checked for other cases where the update Xid is checked after
HeapTupleSatisfiesVacuum returns HEAPTUPLE_DELETE_IN_PROGRESS.  As far
as I can tell, the only one that would be affected is the one in
predicate.c.  It is far from clear to me what is the right thing to do
in these cases; the simplest idea is to return without reporting a
failure if the updater aborted, just as above; but I wonder if this
needs to be conditional on "visible".  I added a pg_usleep() before
acquiring the update Xid in the relevant case, but the isolation test
cases didn't hit the problem (I presume there is no update/delete in
these test cases, but I didn't check).  I defer to Kevin on this issue.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment

Re: MultiXact bugs

From

Andres Freund

Date:

25 November 2013, 16:27:08

On 2013-11-25 12:36:19 -0300, Alvaro Herrera wrote:
> > 2) we frequently error out in heap_lock_updated_tuple_rec() with
> > ERROR:  unable to fetch updated version of tuple
> > 
> > That's because when we're following a ctid chain, it's perfectly
> > possible for the updated version of a tuple to already have been
> > vacuumed/pruned away if the the updating transaction has aborted.
> > 
> > Also looks like a 9.3+ issues to me, slightly higher impact, but in the
> > end you're just getting some errors under concurrency.
> 
> Yes, I think this is 9.3 only.  First attachment shows my proposed
> patch, which is just to report success to caller in case the tuple
> cannot be acquired by heap_fetch.  This is OK because if by the time we
> check the updated version of a tuple it is gone, it means there is no
> further update chain to follow and lock.

Looks good.

> > 1) (takes a bit)
> > TRAP: FailedAssertion("!(((xid) >= ((TransactionId) 3)))", File:/pruneheap.c", Line: 601)
> > 
> > That's because HeapTupleHeaderGetUpdateXid() ignores aborted updaters
> > and returns InvalidTransactionId in that case, but
> > HeapTupleSatisfiesVacuum() returns HEAPTUPLE_DELETE_IN_PROGRESS...
> 
> Interesting.  This is a very narrow race condition: when we call
> HeapTupleSafisfiesVacuum the updater is still running, so it returns
> HEAPTUPLE_DELETE_IN_PROGRESS; but it aborts just before we read the
> tuple's update Xid.

Well, it's not *that* narrow - remember that a transaction is marked as
aborted in the clog *before* it is removed from the proc array.

> There is no way to close the window, but there is no need; if the
> updater aborted, we don't need to mark the page prunable in the first
> place.  So we can just check the return value of
> HeapTupleHeaderGetUpdateXid and if it's InvalidXid, don't set the
> prunable bit.  The second attachment below fixes the bug that way.

I am not sure I like the fact that HeapTupleHeaderGetUpdateXid() checks
for aborted transactions in the first place. Why is that a good idea?
ISTM that wanders off a fair bit from the other HeapTupleHeaderGet*
macros.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services

Re: MultiXact bugs

From

Alvaro Herrera

Date:

25 November 2013, 16:46:03

Andres Freund wrote:
> On 2013-11-25 12:36:19 -0300, Alvaro Herrera wrote:

> > There is no way to close the window, but there is no need; if the
> > updater aborted, we don't need to mark the page prunable in the first
> > place.  So we can just check the return value of
> > HeapTupleHeaderGetUpdateXid and if it's InvalidXid, don't set the
> > prunable bit.  The second attachment below fixes the bug that way.
> 
> I am not sure I like the fact that HeapTupleHeaderGetUpdateXid() checks
> for aborted transactions in the first place. Why is that a good idea?
> ISTM that wanders off a fair bit from the other HeapTupleHeaderGet*
> macros.

Originally it didn't, which caused various bugs.  I recall it turned out
to be cleaner to do the check inside it than putting it out to its
callers.

I have thoughts that this design might break other things such as the
priorXmax checking while traversing HOT chains.  Not seeing how: surely
if there's an aborted updater in a tuple, there can't be a followup HOT
chain elsewhere involving the same tuple.  A HOT chain would require
another updater Xid in the MultiXact (and we ensure there can only be
one updater in a multi).  I might be missing something.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Re: MultiXact bugs

From

Andres Freund

Date:

25 November 2013, 16:58:39

On 2013-11-25 13:45:53 -0300, Alvaro Herrera wrote:
> Andres Freund wrote:
> > On 2013-11-25 12:36:19 -0300, Alvaro Herrera wrote:
> 
> > > There is no way to close the window, but there is no need; if the
> > > updater aborted, we don't need to mark the page prunable in the first
> > > place.  So we can just check the return value of
> > > HeapTupleHeaderGetUpdateXid and if it's InvalidXid, don't set the
> > > prunable bit.  The second attachment below fixes the bug that way.
> > 
> > I am not sure I like the fact that HeapTupleHeaderGetUpdateXid() checks
> > for aborted transactions in the first place. Why is that a good idea?
> > ISTM that wanders off a fair bit from the other HeapTupleHeaderGet*
> > macros.
> 
> Originally it didn't, which caused various bugs.  I recall it turned out
> to be cleaner to do the check inside it than putting it out to its
> callers.

This seems strange to me because we do *not* do those checks for
!IS_MULTI xmax's. So there surely shouldn't be too many callers caring
about that?

> I have thoughts that this design might break other things such as the
> priorXmax checking while traversing HOT chains.  Not seeing how:

Hm. Are you arguing with yourself about this?

> surely
> if there's an aborted updater in a tuple, there can't be a followup HOT
> chain elsewhere involving the same tuple.  A HOT chain would require
> another updater Xid in the MultiXact (and we ensure there can only be
> one updater in a multi).  I might be missing something.

I don't see dangers that way either. I think there might be some strange
behaviour because callers need to check IsRunning() first though, which
MultiXactIdGetUpdateXid() doesn't.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services

Re: MultiXact bugs

From

Kevin Grittner

Date:

25 November 2013, 22:10:08

Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> Andres Freund wrote:

>> That's because HeapTupleHeaderGetUpdateXid() ignores aborted
>> updaters and returns InvalidTransactionId in that case, but
>> HeapTupleSatisfiesVacuum() returns
>> HEAPTUPLE_DELETE_IN_PROGRESS...

> I checked for other cases where the update Xid is checked after
> HeapTupleSatisfiesVacuum returns HEAPTUPLE_DELETE_IN_PROGRESS.
> As far as I can tell, the only one that would be affected is the
> one in predicate.c.  It is far from clear to me what is the right
> thing to do in these cases; the simplest idea is to return
> without reporting a failure if the updater aborted, just as
> above; but I wonder if this needs to be conditional on "visible".
> I added a pg_usleep() before acquiring the update Xid in the
> relevant case, but the isolation test cases didn't hit the
> problem (I presume there is no update/delete in these test cases,
> but I didn't check).  I defer to Kevin on this issue.

Right now if HeapTupleSatisfiesVacuum() returns
HEAPTUPLE_DELETE_IN_PROGRESS we call
HeapTupleHeaderGetUpdateXid(tuple->t_data) and Assert() that the
result is valid.  It sounds like we should do something like the
attached, maybe?

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment

multi-xact-predicate-fix-v1.diff

Re: MultiXact bugs

From

Kevin Grittner

Date:

27 November 2013, 23:14:16

Andres Freund <andres@2ndquadrant.com> wrote:

> HeapTupleHeaderGetUpdateXid() ignores aborted updaters
> and returns InvalidTransactionId in that case, but
> HeapTupleSatisfiesVacuum() returns HEAPTUPLE_DELETE_IN_PROGRESS...

That sure *sounds* like it should cause a problem for this code in
CheckForSerializableConflictOut():

    htsvResult = HeapTupleSatisfiesVacuum(tuple, TransactionXmin, buffer);
    switch (htsvResult)
    {
        [ ... ]
        case HEAPTUPLE_DELETE_IN_PROGRESS:
            xid = HeapTupleHeaderGetUpdateXid(tuple->t_data);
            break;
        [ ... ]
    }
    Assert(TransactionIdIsValid(xid));

... however, I have not been able to trigger that Assert even with
gdb breakpoints at what I think are the right spots.  Any
suggestions?  How far back is it true that the above
HeapTupleSatisfiesVacuum() can return HEAPTUPLE_DELETE_IN_PROGRESS
but HeapTupleHeaderGetUpdateXid(tuple->t_data) on the exact same
tuple structure can return InvalidTransactionId?  Is there some
other condition (besides a ROLLBACK of an UPDATE on the tuple being
read) which needs to be met?  Is any particular timing necessary?

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: MultiXact bugs

From

Andres Freund

Date:

27 November 2013, 23:24:19

On 2013-11-27 15:14:11 -0800, Kevin Grittner wrote:
> Andres Freund <andres@2ndquadrant.com> wrote:
>
> > HeapTupleHeaderGetUpdateXid() ignores aborted updaters
> > and returns InvalidTransactionId in that case, but
> > HeapTupleSatisfiesVacuum() returns HEAPTUPLE_DELETE_IN_PROGRESS...
>
> That sure *sounds* like it should cause a problem for this code in
> CheckForSerializableConflictOut():

Yea. IMNSHO the current state is a API design flaw. We really should be
returning the aborted xid since that's what happens for non-multixact
ids.

>     htsvResult = HeapTupleSatisfiesVacuum(tuple, TransactionXmin, buffer);
>     switch (htsvResult)
>     {
>         [ ... ]
>         case HEAPTUPLE_DELETE_IN_PROGRESS:
>             xid = HeapTupleHeaderGetUpdateXid(tuple->t_data);
>             break;
>         [ ... ]
>     }
>     Assert(TransactionIdIsValid(xid));
>
> ... however, I have not been able to trigger that Assert even with
> gdb breakpoints at what I think are the right spots.  Any
> suggestions?  How far back is it true that the above
> HeapTupleSatisfiesVacuum() can return HEAPTUPLE_DELETE_IN_PROGRESS
> but HeapTupleHeaderGetUpdateXid(tuple->t_data) on the exact same
> tuple structure can return InvalidTransactionId?  Is ther

What do you mean with "how far back"?

> e some
> other condition (besides a ROLLBACK of an UPDATE on the tuple being
> read) which needs to be met?  Is any particular timing necessary?

Afaics you need a multixact consisting out of a) the updater and b) a
lock. That's probably easiest to get if you update a row in one session
without changing the primary key, and then key-share lock it in
another. Or the other way round.
Then abort the updater.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services

Re: MultiXact bugs

From

Kevin Grittner

Date:

27 November 2013, 23:42:23

Andres Freund <andres@2ndquadrant.com> wrote:
> On 2013-11-27 15:14:11 -0800, Kevin Grittner wrote:

>> ... however, I have not been able to trigger that Assert even with
>> gdb breakpoints at what I think are the right spots.  Any
>> suggestions?  How far back is it true that the above
>> HeapTupleSatisfiesVacuum() can return HEAPTUPLE_DELETE_IN_PROGRESS
>> but HeapTupleHeaderGetUpdateXid(tuple->t_data) on the exact same
>> tuple structure can return InvalidTransactionId?
>
> What do you mean with "how far back"?

What back-patching will be needed for a fix?  It sounds like 9.3?

> Afaics you need a multixact consisting out of a) the updater and b) a
> lock. That's probably easiest to get if you update a row in one session
> without changing the primary key, and then key-share lock it in
> another. Or the other way round.
> Then abort the updater.

Thanks!  I'll keep trying to generate a failure at that point.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: MultiXact bugs

From

Andres Freund

Date:

27 November 2013, 23:44:05

On 2013-11-27 15:42:11 -0800, Kevin Grittner wrote:
> Andres Freund <andres@2ndquadrant.com> wrote:
> > What do you mean with "how far back"?
>
> What back-patching will be needed for a fix?  It sounds like 9.3?

Yep.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services

Re: MultiXact bugs

From

Kevin Grittner

Date:

29 November 2013, 21:14:14

Andres Freund <andres@2ndquadrant.com> wrote:
> On 2013-11-27 15:42:11 -0800, Kevin Grittner wrote:

>> What back-patching will be needed for a fix?  It sounds like
>> 9.3?
>
> Yep.

In going over this, I found pre-existing bugs when a tuple was both
inserted and deleted by concurrent transactions, but fixing that is
too invasive to consider for Monday's minor release lockdown.  The
attached seems very safe to me, and protects against some new
hazards related to the subtransaction changes (mostly just for an
assert-enabled build, but still worth fixing).  It includes a lot
of work on the comments, to guide the subsequent fixes or other
work in that area.

If nobody objects, I will push it to master and 9.3 tomorrow.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment

multi-xact-predicate-fix-v2.patch

Re: MultiXact bugs

From

Andres Freund

Date:

29 November 2013, 21:27:26

On 2013-11-29 13:14:06 -0800, Kevin Grittner wrote:
> Andres Freund <andres@2ndquadrant.com> wrote:
> > On 2013-11-27 15:42:11 -0800, Kevin Grittner wrote:
>
> >> What back-patching will be needed for a fix?  It sounds like
> >> 9.3?
> >
> > Yep.
>
> In going over this, I found pre-existing bugs when a tuple was both
> inserted and deleted by concurrent transactions, but fixing that is
> too invasive to consider for Monday's minor release lockdown.  The
> attached seems very safe to me, and protects against some new
> hazards related to the subtransaction changes (mostly just for an
> assert-enabled build, but still worth fixing).  It includes a lot
> of work on the comments, to guide the subsequent fixes or other
> work in that area.

> If nobody objects, I will push it to master and 9.3 tomorrow.

Alvaro is about to commit a patch that will remove the behaviour of
GetUpdateXid() to check for IdDidAbort() because it makes other fixes
really complicated and it's a really suprising behaviour. But most of
your patch shouldn't be affected by that.
Check
http://archives.postgresql.org/message-id/20131129193008.GP5513%40eldon.alvh.no-ip.org
for the current state of the series.

Looking at predicate.c I think I see a bigger problem though: Isn't its
usage of HeapTupleSatisfiesVacuum() quite dangerous? It passes
TransactionXmin to HeapTupleSatisfiesVacuum(). But since that's just the
transaction's own cutoff, not the global cutoff that will cause wrong
hint bits to be set. Or am I missing something?
HTSV's comment says:** OldestXmin is a cutoff XID (obtained from GetOldestXmin()).    Tuples* deleted by XIDs >=
OldestXminare deemed "recently dead"; they might* still be visible to some open transaction, so we can't remove them,*
evenif we see that the deleting transaction has committed.*/ 

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services

Re: MultiXact bugs

From

Andres Freund

Date:

29 November 2013, 21:34:51

On 2013-11-29 22:27:16 +0100, Andres Freund wrote:
> Looking at predicate.c I think I see a bigger problem though: Isn't its
> usage of HeapTupleSatisfiesVacuum() quite dangerous? It passes
> TransactionXmin to HeapTupleSatisfiesVacuum(). But since that's just the
> transaction's own cutoff, not the global cutoff that will cause wrong
> hint bits to be set. Or am I missing something?
> HTSV's comment says:
>  *
>  * OldestXmin is a cutoff XID (obtained from GetOldestXmin()).    Tuples
>  * deleted by XIDs >= OldestXmin are deemed "recently dead"; they might
>  * still be visible to some open transaction, so we can't remove them,
>  * even if we see that the deleting transaction has committed.
>  */

Strike that, sorry for the noise. I was thinking there was some
conditional hint bit setting based on OldestXmin in there, but I was
misremembering.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services

Re: MultiXact bugs

From

Kevin Grittner

Date:

29 November 2013, 22:00:16

Andres Freund <andres@2ndquadrant.com> wrote:

> Looking at predicate.c I think I see a bigger problem though: Isn't its
> usage of HeapTupleSatisfiesVacuum() quite dangerous? It passes
> TransactionXmin to HeapTupleSatisfiesVacuum(). But since that's just the
> transaction's own cutoff, not the global cutoff that will cause wrong
> hint bits to be set. Or am I missing something?

I don't see where that parameter has anything to do with setting
hint bits; it only seems to affect the return code for the caller.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: MultiXact bugs

From

Kevin Grittner

Date:

30 November 2013, 16:41:37

Kevin Grittner <kgrittn@ymail.com> wrote:

> In going over this, I found pre-existing bugs when a tuple was both
> inserted and deleted by concurrent transactions, but fixing that is
> too invasive to consider for Monday's minor release lockdown.  The
> attached seems very safe to me, and protects against some new
> hazards related to the subtransaction changes (mostly just for an
> assert-enabled build, but still worth fixing).  It includes a lot
> of work on the comments, to guide the subsequent fixes or other
> work in that area.
>
> If nobody objects, I will push it to master and 9.3 tomorrow.

Given the recent MultiXact patches, I no longer see any bugs in SSI
which have not been there since the start, and these are in such
remote corner cases that there have so far been no reports of
anyone hitting them.  I will wait until after this emergency
release to patch those; the risk/benefit ratio of doing something
so quickly just doesn't seem good.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company