Thread: BUG #8470: 9.3 locking/subtransaction performance regression

BUG #8470: 9.3 locking/subtransaction performance regression

From
os@ohmu.fi
Date:
The following bug has been logged on the website:

Bug reference:      8470
Logged by:          Oskari Saarenmaa
Email address:      os@ohmu.fi
PostgreSQL version: 9.3.0
Operating system:   Linux
Description:

The following code performs a lot slower on PostgreSQL 9.3.0 than on
PostgreSQL 9.2.4:


DROP TABLE IF EXISTS tmp;
CREATE TABLE tmp (id BIGSERIAL, vals BIGINT[]);
DO $$
DECLARE
    r_id BIGINT;
    n BIGINT;
BEGIN
    FOR n IN 1..1000 LOOP
        BEGIN
            SELECT id INTO r_id FROM tmp WHERE array_length(vals, 1) < 100
                LIMIT 1 FOR UPDATE NOWAIT;
        EXCEPTION WHEN lock_not_available THEN
            r_id := NULL;
        END;
        IF r_id IS NULL THEN
            INSERT INTO tmp (vals) VALUES (ARRAY[n]::BIGINT[]);
        ELSE
            UPDATE tmp SET vals = array_append(vals, n::BIGINT)
                WHERE id = r_id;
        END IF;
    END LOOP;
END;
$$;


PostgreSQL 9.3.0:
Time: 7278.910 ms


PostgreSQL 9.2.4:
Time: 128.008 ms


Removing the BEGIN/EXCEPTION/END block and just doing a 'SELECT FOR UPDATE'
for a suitable row is significantly slower in 9.3.0 (314.765 ms vs 118.894
ms on 9.2.4).  A 'SELECT' without a FOR UPDATE and BEGIN/EXCEPTION/END has
the same performance on 9.2.4 and 9.3.0.
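
A sketch of that first variant (the same script with the exception block
removed; NOWAIT is dropped as well, since without the handler a lock failure
would abort the whole loop, and the test runs in a single session anyway):

DROP TABLE IF EXISTS tmp;
CREATE TABLE tmp (id BIGSERIAL, vals BIGINT[]);
\timing on
DO $$
DECLARE
    r_id BIGINT;
    n BIGINT;
BEGIN
    FOR n IN 1..1000 LOOP
        -- no BEGIN/EXCEPTION/END, so no subtransaction per iteration
        SELECT id INTO r_id FROM tmp WHERE array_length(vals, 1) < 100
            LIMIT 1 FOR UPDATE;
        IF r_id IS NULL THEN
            INSERT INTO tmp (vals) VALUES (ARRAY[n]::BIGINT[]);
        ELSE
            UPDATE tmp SET vals = array_append(vals, n::BIGINT)
                WHERE id = r_id;
        END IF;
    END LOOP;
END;
$$;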


I'm running 9.2.4 and 9.3.0 packages from apt.postgresql.org on a Debian
Squeeze host.

Re: BUG #8470: 9.3 locking/subtransaction performance regression

From
Alvaro Herrera
Date:
os@ohmu.fi wrote:

> The following code performs a lot slower on PostgreSQL 9.3.0 than on
> PostgreSQL 9.2.4:
>
> DROP TABLE IF EXISTS tmp;
> CREATE TABLE tmp (id BIGSERIAL, vals BIGINT[]);
> DO $$
> DECLARE
>     r_id BIGINT;
>     n BIGINT;
> BEGIN
>     FOR n IN 1..1000 LOOP
>         BEGIN
>             SELECT id INTO r_id FROM tmp WHERE array_length(vals, 1) < 100
>                 LIMIT 1 FOR UPDATE NOWAIT;

Most likely, this is caused by the new tuple lock code in 9.3.  I can't
look at this immediately but I will give it a look as soon as I'm able.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Re: BUG #8470: 9.3 locking/subtransaction performance regression

From
Alvaro Herrera
Date:
AFAICS the problem here is that this test doesn't use MultiXactIds at
all in 9.2, but it does in 9.3.  I vaguely recall Noah tried to convince
me to put in an optimization which would have avoided this issue; I will
give that a thought.  I don't think I will be able to get it done for
9.3.1 though.
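
One way to see this from SQL is the pgrowlocks contrib module, whose "multi"
column reports whether a row's locker is a MultiXactId.  A sketch, assuming
the tmp table from the report already holds a row with id = 1 (a hypothetical
value), and using a second session for the inspection while the first keeps
its transaction open:

-- session 1: mimic one iteration of the test, and leave the xact open
BEGIN;
SAVEPOINT s;                                 -- stands in for the implicit subxact
SELECT id FROM tmp WHERE id = 1 FOR UPDATE;  -- lock acquired by the subxact
RELEASE SAVEPOINT s;
UPDATE tmp SET vals = array_append(vals, 0::BIGINT) WHERE id = 1;

-- session 2: inspect the lockers; for this pattern one would expect
-- multi = t on 9.3 and multi = f on 9.2
CREATE EXTENSION IF NOT EXISTS pgrowlocks;
SELECT * FROM pgrowlocks('tmp');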

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Re: BUG #8470: 9.3 locking/subtransaction performance regression

From
Oskari Saarenmaa
Date:
05.10.2013 01:31, Alvaro Herrera wrote:
> AFAICS the problem here is that this test doesn't use MultiXactIds at
> all in 9.2, but it does in 9.3.  I vaguely recall Noah tried to convince
> me to put in an optimization which would have avoided this issue; I will
> give that a thought.  I don't think I will be able to get it done for
> 9.3.1 though.

Is the schedule for 9.3.1 visible somewhere?  Please reconsider trying
to get a fix for this in, as it's a pretty big regression from 9.2.  My
application test case that triggered this ran in roughly 3 seconds with
9.2 but timed out after 3 minutes on 9.3.0.  And even without a loop
that amplifies this, just a single 'select for update' inside a
subtransaction is significantly slower in 9.3.
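
A minimal sketch of that single-statement case, assuming the tmp table from
my report with a (hypothetical) row id = 1; the savepoint stands in for the
implicit subtransaction of a BEGIN/EXCEPTION/END block, and the expected
timings follow from the numbers above:

\timing on

-- lock in a subtransaction, then touch the row from the parent:
-- noticeably slower on 9.3.0
BEGIN;
SAVEPOINT s;
SELECT id FROM tmp WHERE id = 1 FOR UPDATE;
RELEASE SAVEPOINT s;
UPDATE tmp SET vals = array_append(vals, 1::BIGINT) WHERE id = 1;
COMMIT;

-- the same without the savepoint: comparable on 9.2.4 and 9.3.0
BEGIN;
SELECT id FROM tmp WHERE id = 1 FOR UPDATE;
UPDATE tmp SET vals = array_append(vals, 2::BIGINT) WHERE id = 1;
COMMIT;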

Thanks,
Oskari

Re: BUG #8470: 9.3 locking/subtransaction performance regression

From
Alvaro Herrera
Date:
Oskari Saarenmaa wrote:
> 05.10.2013 01:31, Alvaro Herrera wrote:
> > AFAICS the problem here is that this test doesn't use MultiXactIds at
> > all in 9.2, but it does in 9.3.  I vaguely recall Noah tried to convince
> > me to put in an optimization which would have avoided this issue; I will
> > give that a thought.  I don't think I will be able to get it done for
> > 9.3.1 though.
>
> Is the schedule for 9.3.1 visible somewhere?

Not sure.  I just happen to know it'll be git-tagged early next week.

> Please reconsider trying to get a fix for this in, as it's a pretty big
> regression from 9.2.

I understand the pressure.  Sadly, I have other things scheduled right
now that I can't postpone.  I might still be able to get to it over the
weekend; if someone else has time and can submit and/or test a patch,
that would perhaps allow me to get it done in time.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Re: BUG #8470: 9.3 locking/subtransaction performance regression

From
Alvaro Herrera
Date:
Oskari Saarenmaa wrote:
> 05.10.2013 01:31, Alvaro Herrera wrote:
> > AFAICS the problem here is that this test doesn't use MultiXactIds at
> > all in 9.2, but it does in 9.3.  I vaguely recall Noah tried to convince
> > me to put in an optimization which would have avoided this issue; I will
> > give that a thought.  I don't think I will be able to get it done for
> > 9.3.1 though.
>
> Is the schedule for 9.3.1 visible somewhere?

http://www.postgresql.org/message-id/CA+OCxozG4PhbFphbg+ehpJk3TgLFDVyMuzwy+wyf=x1Utt-WAA@mail.gmail.com

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Re: BUG #8470: 9.3 locking/subtransaction performance regression

From
Alvaro Herrera
Date:
Alvaro Herrera wrote:
> AFAICS the problem here is that this test doesn't use MultiXactIds at
> all in 9.2, but it does in 9.3.  I vaguely recall Noah tried to convince
> me to put in an optimization which would have avoided this issue; I will
> give that a thought.  I don't think I will be able to get it done for
> 9.3.1 though.

I gave this a deeper look yesterday.  Sadly, that optimization would not
work here; we give stronger guarantees now than 9.2 did, and that's the
reason for the slowness.  Previously, we just overwrote the lock held by
the ancestor transaction if a subxact grabbed a stronger lock; we no
longer do that, and instead create a multi comprising both of those
locks.  That way if the subxact rolls back, we don't lose the previous
lock we held.

It seems that we could avoid creating a new multixact in a few more
cases, by reusing an already existing multixact; or, if we're forced to
create a new one, omit the member from the old one that is superseded by
our own member with the stronger lock.  However, there is no way this
will affect this test case *at all*.

I see another way forward: if an ancestor takes a lock of a certain
strength, and then a subxact takes a stronger lock, we could record this
as "ancestor taking the stronger lock", so this could be stored as a
plain Xid and not a Multi.  That way, (1) we do not incur a new
multixact, and (2) the lock would not be lost if the subxact aborts.
This would come at the cost that if the subxact rolls back, the stronger
lock would not be released.
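
At the SQL level, the scenario is roughly the following sketch (assuming a
table tmp with a hypothetical row id = 1); under this proposal the FOR UPDATE
would be recorded as if the outer transaction had taken it:

BEGIN;
SELECT id FROM tmp WHERE id = 1 FOR SHARE;   -- ancestor takes a weaker lock
SAVEPOINT s;
SELECT id FROM tmp WHERE id = 1 FOR UPDATE;  -- subxact upgrades the lock;
                                             -- today this creates a MultiXactId
ROLLBACK TO SAVEPOINT s;                     -- under the proposal, the stronger
                                             -- FOR UPDATE lock would survive this
COMMIT;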

Another thing we could look at is whether we can optimize for the case
of sub-committed subtransactions, i.e. we know these won't roll back in
relation to the current xact.  It seems that would apply here because
when you finish the BEGIN/EXCEPTION/END block, that automatic subxact is
marked sub-committed.  It seems to me we could collapse the locks
acquired by those subxacts to appear as locks owned by their currently
open ancestor.
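
This matters for the test case because each pass through the
BEGIN/EXCEPTION/END block behaves roughly like the following explicit SQL
sketch (PL/pgSQL manages the savepoint implicitly):

BEGIN;
SAVEPOINT b;                        -- entering the BEGIN ... EXCEPTION block
SELECT id FROM tmp WHERE array_length(vals, 1) < 100
    LIMIT 1 FOR UPDATE NOWAIT;
RELEASE SAVEPOINT b;                -- normal exit: the subxact is marked
                                    -- sub-committed, and its lock remains
-- the parent's subsequent INSERT or UPDATE runs here
COMMIT;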

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Re: BUG #8470: 9.3 locking/subtransaction performance regression

From
Alvaro Herrera
Date:
> Removing the BEGIN/EXCEPTION/END block and just doing a 'SELECT FOR UPDATE'
> for a suitable row is significantly slower in 9.3.0 (314.765 ms vs 118.894
> ms on 9.2.4).  A 'SELECT' without a FOR UPDATE and BEGIN/EXCEPTION/END has
> the same performance on 9.2.4 and 9.3.0.

I have come up with the attached patch.  As far as I can tell it
restores performance roughly to the level of 9.2 for this test case;
could you please try it out and see if it fixes things for you?  It'd be
particularly good if you can check not only the test case you submitted
but also the one that made you explore this in the first place.

I haven't pushed it yet because I haven't yet convinced myself that this
is safe, but my intention is to do it shortly so that it will be in
9.3.3.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment

Re: BUG #8470: 9.3 locking/subtransaction performance regression

From
Alvaro Herrera
Date:
After looking at this some more, I am more convinced that we need to
have HeapTupleSatisfiesUpdate return BeingUpdated when a row is locked
by the same xact.  This is more in line with what we do for MultiXacts,
and allows us to treat some cases in a saner fashion: in particular, if
a transaction grabs a lock on a tuple and a subxact grabs a stronger
lock, we now create a multi recording both locks instead of silently
ignoring the second request.

The fallout from this change wasn't as bad as I had feared; I had to add
some coding in heap_update and heap_delete to prevent them from trying
to wait on their own Xid, but it looks pretty innocuous otherwise.

I ran your test case, inflating the loop counters a bit, so that it
shows more exaggerated timings.  The results of running this twice on
9.2 are:
Timing: 6140,506 ms
Timing: 6054,553 ms

I only ran it once in an unpatched 9.3:
Timing: 221204,352

This is patched 9.3:
Timing: 13757,925 ms

Note that patched 9.3 is ~2x slower than 9.2, which is understandable because
it does more work ... yet I think it returns performance to a much more
acceptable level, comparable to 9.2.

I have more confidence in this patch than in the previous version,
despite it being more intrusive.  The fact that it allows us to get rid
of some very strange coding in pgrowlocks is comforting: we now know
that anytime a tuple is locked we will get BeingUpdated from
HeapTupleSatisfiesUpdate, regardless of *who* locked it, and all the
callers are able to identify this case and act accordingly.  This
seems saner in concept than sometimes returning MayBeUpdated, because
that turns each such return into a condition that must be double-checked
by callers.  This means cleaner code in pgrowlocks around a HTSU call,
for instance.  Also, we have discussed and seriously considered this
idea on several occasions previously in connection with other things.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment

Re: BUG #8470: 9.3 locking/subtransaction performance regression

From
Alvaro Herrera
Date:
Alvaro Herrera wrote:

> I have more confidence in this patch than in the previous version,
> despite it being more intrusive.  The fact that it allows us to get rid
> of some very strange coding in pgrowlocks is comforting: we now know
> that anytime a tuple is locked we will get BeingUpdated from
> HeapTupleSatisfiesUpdate, regardless of *who* locked it, and all the
> callers are able to identify this case and act accordingly.  This
> seems saner in concept than sometimes returning MayBeUpdated, because
> that turns each such return into a condition that must be double-checked
> by callers.  This means cleaner code in pgrowlocks around a HTSU call,
> for instance.  Also, we have discussed and seriously considered this
> idea on several occasions previously in connection with other things.

... in particular, this change allows us to revert the addition of
MultiXactHasRunningRemoteMembers(), and simply use
MultiXactIdIsRunning() in the two places that required it.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Re: BUG #8470: 9.3 locking/subtransaction performance regression

From
Alvaro Herrera
Date:
Alvaro Herrera wrote:

> ... in particular, this change allows us to revert the addition of
> MultiXactHasRunningRemoteMembers(), and simply use
> MultiXactIdIsRunning() in the two places that required it.

And doing that enables simplifying this patch a bit, so here it is one
more time.  Hopefully this is the final version -- I intend to push it
to 9.3 and master.

Please *note the behavior change* of HeapTupleSatisfiesUpdate: it will
now return HeapTupleBeingUpdated when a tuple is locked by an Xid that's
a subtransaction of the current top transaction.  That case previously
returned HeapTupleBeingUpdated.  If someone wants to say that we
shouldn't change that in 9.3, please speak up.  But do note that it was
already returning that value when the lock was a multixact, so the new
behavior can be said to be more consistent.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment

Re: BUG #8470: 9.3 locking/subtransaction performance regression

From
Alvaro Herrera
Date:
Alvaro Herrera wrote:

> Please *note the behavior change* of HeapTupleSatisfiesUpdate: it will
> now return HeapTupleBeingUpdated when a tuple is locked by an Xid that's
> a subtransaction of the current top transaction.  That case previously
> returned HeapTupleBeingUpdated.  If someone wants to say that we
> shouldn't change that in 9.3, please speak up.  But do note that it was
> already returning that value when the lock was a multixact, so the new
> behavior can be said to be more consistent.

I hate to keep replying to myself, but above I intended to say "that
case previously returned HeapTupleMayBeUpdated".

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Re: BUG #8470: 9.3 locking/subtransaction performance regression

From
Andres Freund
Date:
On 2013-12-23 16:53:00 -0300, Alvaro Herrera wrote:
> +        /*
> +         * If any subtransaction of the current top transaction already holds
> +         * a lock on this tuple, (and no one else does,) we must skip sleeping
> +         * on the xwait; that would raise an assert about sleeping on our own
> +         * Xid (or sleep indefinitely in a non-asserts-enabled build.)
> +         *
> +         * Note we don't need to do anything about this when the Xmax is a
> +         * multixact, because that code is already prepared to ignore
> +         * subtransactions of the current top transaction; also, trying to do
> +         * anything other than sleeping during a delete when someone else is
> +         * holding a lock on this tuple would be wrong.

I don't really understand why those two issues are in the same
paragraph; what's their relation?  Just that a deleting xact shouldn't be
a multi?

> +         * This must be done before acquiring our tuple lock, to avoid
> +         * deadlocks with other transaction that are already waiting on the
> +         * lock we hold.
> +         */
> +        if (!(tp.t_data->t_infomask & HEAP_XMAX_IS_MULTI) &&
> +            TransactionIdIsCurrentTransactionId(HeapTupleHeaderGetRawXmax(tp.t_data)))
> +        {
> +            Assert(HEAP_XMAX_IS_LOCKED_ONLY(tp.t_data->t_infomask));
> +
> +            goto continue_deletion;
> +        }

Man, the control flow in heapam.c isn't getting simpler :/. Could you
rename the label to perform_deletion or such?  It's not really continuing
it from my POV.

> +        /*
> +         * If any subtransaction of the current top transaction already holds
> +         * a lock on this tuple, (and no one else does,) we must skip sleeping
> +         * on the xwait; that would raise an assert about sleeping on our own
> +         * Xid (or sleep indefinitely in a non-assertion enabled build.)

I'd reformulate this to note that we'd wait for ourselves - perhaps
noting that that fact would be caught by an assert in assert-enabled
builds.  The assert isn't the real reason we cannot do this.

>      /*
> -     * We might already hold the desired lock (or stronger), possibly under a
> -     * different subtransaction of the current top transaction.  If so, there
> -     * is no need to change state or issue a WAL record.  We already handled
> -     * the case where this is true for xmax being a MultiXactId, so now check
> -     * for cases where it is a plain TransactionId.
> -     *
> -     * Note in particular that this covers the case where we already hold
> -     * exclusive lock on the tuple and the caller only wants key share or
> -     * share lock. It would certainly not do to give up the exclusive lock.
> -     */
> -    if (!(old_infomask & (HEAP_XMAX_INVALID |
> -                          HEAP_XMAX_COMMITTED |
> -                          HEAP_XMAX_IS_MULTI)) &&
> -        (mode == LockTupleKeyShare ?
> -         (HEAP_XMAX_IS_KEYSHR_LOCKED(old_infomask) ||
> -          HEAP_XMAX_IS_SHR_LOCKED(old_infomask) ||
> -          HEAP_XMAX_IS_EXCL_LOCKED(old_infomask)) :
> -         mode == LockTupleShare ?
> -         (HEAP_XMAX_IS_SHR_LOCKED(old_infomask) ||
> -          HEAP_XMAX_IS_EXCL_LOCKED(old_infomask)) :
> -         (HEAP_XMAX_IS_EXCL_LOCKED(old_infomask))) &&
> -        TransactionIdIsCurrentTransactionId(xmax))
> -    {
> -        LockBuffer(*buffer, BUFFER_LOCK_UNLOCK);
> -        /* Probably can't hold tuple lock here, but may as well check */
> -        if (have_tuple_lock)
> -            UnlockTupleTuplock(relation, tid, mode);
> -        return HeapTupleMayBeUpdated;
> -    }

Are you sure transforming this hunk the way you did is functionally
equivalent?

>          if (!MultiXactIdIsRunning(xmax))
>          {
> -            if (HEAP_XMAX_IS_LOCKED_ONLY(old_infomask) ||
> -                TransactionIdDidAbort(MultiXactIdGetUpdateXid(xmax,
> -                                                              old_infomask)))
> +            if ((HEAP_XMAX_IS_LOCKED_ONLY(old_infomask) ||
> +                 !TransactionIdDidCommit(MultiXactIdGetUpdateXid(xmax,
> +                                                                 old_infomask))))

Heh, that's actually an (independent) bug...

What's the test coverage for these cases? It better be 110%...

Greetings,

Andres Freund

Re: BUG #8470: 9.3 locking/subtransaction performance regression

From
Alvaro Herrera
Date:
Andres Freund wrote:
> On 2013-12-23 16:53:00 -0300, Alvaro Herrera wrote:

> > +              * This must be done before acquiring our tuple lock, to avoid
> > +              * deadlocks with other transaction that are already waiting on the
> > +              * lock we hold.
> > +              */
> > +             if (!(tp.t_data->t_infomask & HEAP_XMAX_IS_MULTI) &&
> > +                     TransactionIdIsCurrentTransactionId(HeapTupleHeaderGetRawXmax(tp.t_data)))
> > +             {
> > +                     Assert(HEAP_XMAX_IS_LOCKED_ONLY(tp.t_data->t_infomask));
> > +
> > +                     goto continue_deletion;
> > +             }
>
> Man, the control flow in heapam.c isn't getting simpler :/. Could you
> rename the label to perform_deletion or such?  It's not really continuing
> it from my POV.

I considered the idea of putting the rest of that block inside a new
"else" block.  That would have the same effect.  From a control flow POV
that might be simpler, not sure.  Any opinions on that?  I renamed the
labels for now.

> >      /*
> > -     * We might already hold the desired lock (or stronger), possibly under a
> > -     * different subtransaction of the current top transaction.  If so, there
> > -     * is no need to change state or issue a WAL record.  We already handled
> > -     * the case where this is true for xmax being a MultiXactId, so now check
> > -     * for cases where it is a plain TransactionId.
> > -     *
> > -     * Note in particular that this covers the case where we already hold
> > -     * exclusive lock on the tuple and the caller only wants key share or
> > -     * share lock. It would certainly not do to give up the exclusive lock.
> > -     */
> > -    if (!(old_infomask & (HEAP_XMAX_INVALID |
> > -                          HEAP_XMAX_COMMITTED |
> > -                          HEAP_XMAX_IS_MULTI)) &&
> > -        (mode == LockTupleKeyShare ?
> > -         (HEAP_XMAX_IS_KEYSHR_LOCKED(old_infomask) ||
> > -          HEAP_XMAX_IS_SHR_LOCKED(old_infomask) ||
> > -          HEAP_XMAX_IS_EXCL_LOCKED(old_infomask)) :
> > -         mode == LockTupleShare ?
> > -         (HEAP_XMAX_IS_SHR_LOCKED(old_infomask) ||
> > -          HEAP_XMAX_IS_EXCL_LOCKED(old_infomask)) :
> > -         (HEAP_XMAX_IS_EXCL_LOCKED(old_infomask))) &&
> > -        TransactionIdIsCurrentTransactionId(xmax))
> > -    {
> > -        LockBuffer(*buffer, BUFFER_LOCK_UNLOCK);
> > -        /* Probably can't hold tuple lock here, but may as well check */
> > -        if (have_tuple_lock)
> > -            UnlockTupleTuplock(relation, tid, mode);
> > -        return HeapTupleMayBeUpdated;
> > -    }
>
> Are you sure transforming this hunk the way you did is functionally
> equivalent?

It's not, actually -- with the new code, we will go to all the trouble
of setting OldestMulti, marking the buffer dirty and doing some
WAL-logging, whereas the old code wouldn't do any of that.  But the new
code handles more cases and is easier to understand (at least IMV).

I considered the idea of checking whether the new Xmax we would be
setting on the tuple is the same as the Xmax the tuple already had, and
skip doing those things if so.  But it didn't seem enough of a win to me
to warrant the complexity of checking (I think we'd need to check the
lock/update bits also, not just the xmax value itself.)

> >          if (!MultiXactIdIsRunning(xmax))
> >          {
> > -            if (HEAP_XMAX_IS_LOCKED_ONLY(old_infomask) ||
> > -                TransactionIdDidAbort(MultiXactIdGetUpdateXid(xmax,
> > -                                                              old_infomask)))
> > +            if ((HEAP_XMAX_IS_LOCKED_ONLY(old_infomask) ||
> > +                 !TransactionIdDidCommit(MultiXactIdGetUpdateXid(xmax,
> > +                                                                 old_infomask))))
>
> Heh, that's actually an (independent) bug...

I'm not sure it's a bug fix ... this is just additionally considering
optimizing (avoiding a multi) in the case that a transaction has crashed
without aborting; I don't think there would be any functional
difference.  We would end up spuriously calling MultiXactIdExpand, but
that routine already ignores crashed updaters so it would end up
creating a singleton multi.

This doesn't seem worth a separate commit, if that's what you're
thinking.  Do you disagree?

> What's the test coverage for these cases? It better be 110%...

110% is rather hard to achieve, particularly because of the cases
involving crashed transactions.  Other than that I think I tested all
those paths.  I will go through it again.


Thanks for the review; here's an updated version.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment

Re: BUG #8470: 9.3 locking/subtransaction performance regression

From
Alvaro Herrera
Date:
Alvaro Herrera wrote:

> > What's the test coverage for these cases? It better be 110%...
>
> 110% is rather hard to achieve, particularly because of the cases
> involving crashed transactions.  Other than that I think I tested all
> those paths.  I will go through it again.

Actually, there was a nasty bug in that formulation of the patch; in
heap_lock_tuple I mixed the "is multi" case in with the "do we own a
lock" check, which are completely unrelated; this caused a number of
regressions in isolation tests.  Here's a reworked version.  I removed
those new labels and gotos; seems better this way.

Note there are some new blocks that need to be reflowed and pgindented;
also the diff contains some spurious whitespace changes that are undoing
some pgindent decisions.  There are also some spurious changes which
have to do with a later patch I have -- the one that optimizes for
LOCK_ONLY multis.  I tried to avoid including unnecessary whitespace
changes so that the actual code changes are easier to spot, but if I were
to remove them all I wouldn't have been able to submit the patch this
year, so please bear with me.

I generated the "make coverage" report after having run initdb, the
isolation tests and the regular regression tests; it shows almost all
the interesting paths here are being taken.  (There are only a couple of
cases not being tested in compute_new_xmax_infomask, but they are
simpler ones.  I will see about modifying one of the existing isolation
tests to cover those cases as well.)

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment

Re: BUG #8470: 9.3 locking/subtransaction performance regression

From
Oskari Saarenmaa
Date:
Thanks for looking at this and sorry about the late reply, I finally got
around to testing your latest patch.

20.12.2013 22:47, Alvaro Herrera wrote:
>> Removing the BEGIN/EXCEPTION/END block and just doing a 'SELECT FOR
>> UPDATE' for a suitable row is significantly slower in 9.3.0 (314.765 ms
>> vs 118.894 ms on 9.2.4).  A 'SELECT' without a FOR UPDATE and
>> BEGIN/EXCEPTION/END has the same performance on 9.2.4 and 9.3.0.
>
> I have come up with the attached patch.  As far as I can tell it
> restores performance roughly to the level of 9.2 for this test case;
> could you please try it out and see if it fixes things for you?  It'd be
> particularly good if you can check not only the test case you submitted
> but also the one that made you explore this in the first place.

I didn't try this patch, but I built today's REL9_3_STABLE with the patch
from your mail on the 31st of Dec
(http://github.com/saaros/postgres/tree/alvaro-multixact-optimize-9.3) and
ran the older version of my application's test suite, which has a case that
times out after 3 minutes with unpatched 9.3.  With the current patch the
test case successfully completes in ~20 seconds, which is a major improvement
although it's still slower than 9.2.  It also passes all other test cases in
my test suite.

Do you think it'll make it to 9.3.3?

Thanks,
Oskari

Re: BUG #8470: 9.3 locking/subtransaction performance regression

From
Alvaro Herrera
Date:
Oskari Saarenmaa wrote:
> Thanks for looking at this and sorry about the late reply, I finally got
> around to testing your latest patch.
>
> 20.12.2013 22:47, Alvaro Herrera wrote:
> >> Removing the BEGIN/EXCEPTION/END block and just doing a 'SELECT FOR
> >> UPDATE' for a suitable row is significantly slower in 9.3.0 (314.765 ms
> >> vs 118.894 ms on 9.2.4).  A 'SELECT' without a FOR UPDATE and
> >> BEGIN/EXCEPTION/END has the same performance on 9.2.4 and 9.3.0.
> >
> > I have come up with the attached patch.  As far as I can tell it
> > restores performance roughly to the level of 9.2 for this test case;
> > could you please try it out and see if it fixes things for you?  It'd be
> > particularly good if you can check not only the test case you submitted
> > but also the one that made you explore this in the first place.
>
> I didn't try this patch, but I built today's REL9_3_STABLE with the patch
> from your mail on the 31st of Dec
> (http://github.com/saaros/postgres/tree/alvaro-multixact-optimize-9.3) and
> ran the older version of my application's test suite, which has a case that
> times out after 3 minutes with unpatched 9.3.  With the current patch the
> test case successfully completes in ~20 seconds, which is a major improvement
> although it's still slower than 9.2.  It also passes all other test cases in
> my test suite.
>
> Do you think it'll make it to 9.3.3?

Well, I gave this patch a long, careful look and eventually noticed that
there was a rather subtle but serious design bug in it, and I had to
ditch it.  I have the intention to come back to this problem again
later, but right now I cannot.

I'm not sure whether I will have something for 9.3.3, but it's not
likely because that'll happen only in a week or two.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Re: BUG #8470: 9.3 locking/subtransaction performance regression

From
Tomonari Katsumata
Date:
Hi,

I don't fully understand this problem, but
it seems it is still present in 93_STABLE(*).
(*) f1e522696fbc01b5c8a41dbd45d6cbae229be425

9.3.2    : 4.225(s)
93_STABLE: 2.819(s)
9.2.6    : 0.135(s)

93_STABLE is faster than 9.3.2, but it's apparently
slower than 9.2.6.

9.3.3 is going to be released next week.
Will any fixes be included in 9.3.3?

regards,
-----------
Tomonari Katsumata

(2014/01/14 22:09), Alvaro Herrera wrote:
 > Oskari Saarenmaa wrote:
 >> Thanks for looking at this and sorry about the late reply, I finally got
 >> around to testing your latest patch.
 >>
 >> 20.12.2013 22:47, Alvaro Herrera wrote:
 >>>> Removing the BEGIN/EXCEPTION/END block and just doing a 'SELECT FOR
 >>>> UPDATE' for a suitable row is significantly slower in 9.3.0 (314.765 ms
 >>>> vs 118.894 ms on 9.2.4).  A 'SELECT' without a FOR UPDATE and
 >>>> BEGIN/EXCEPTION/END has the same performance on 9.2.4 and 9.3.0.
 >>>
 >>> I have come up with the attached patch.  As far as I can tell it
 >>> restores performance roughly to the level of 9.2 for this test case;
 >>> could you please try it out and see if it fixes things for you?  It'd be
 >>> particularly good if you can check not only the test case you submitted
 >>> but also the one that made you explore this in the first place.
 >>
 >> I didn't try this patch, but I built today's REL9_3_STABLE with the patch
 >> from your mail on the 31st of Dec
 >> (http://github.com/saaros/postgres/tree/alvaro-multixact-optimize-9.3) and
 >> ran the older version of my application's test suite, which has a case that
 >> times out after 3 minutes with unpatched 9.3.  With the current patch the
 >> test case successfully completes in ~20 seconds, which is a major improvement
 >> although it's still slower than 9.2.  It also passes all other test cases in
 >> my test suite.
 >>
 >> Do you think it'll make it to 9.3.3?
 >
 > Well, I gave this patch a long, careful look and eventually noticed that
 > there was a rather subtle but serious design bug in it, and I had to
 > ditch it.  I have the intention to come back to this problem again
 > later, but right now I cannot.
 >
 > I'm not sure whether I will have something for 9.3.3, but it's not
 > likely because that'll happen only in a week or two.

Re: BUG #8470: 9.3 locking/subtransaction performance regression

From
Alvaro Herrera
Date:
I just got around to merging this patch to 9.5 sources.  I'm glad I
did it because in the course of doing so I noticed a bug in a recent
patch, which led to commit d5e3d1e969d2f65009f718d3100d6565f47f9112
(back-patched to 9.3).

I'm now more confident in this code and will probably push this a few
days, but only to 9.5 at least for now.  It probably won't apply cleanly
to 9.3 due to other changes in the area, such as 05315498012530d44cd89a2
and df630b0dd5ea2de52972d456f5978a012436115e and others.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

Re: BUG #8470: 9.3 locking/subtransaction performance regression

From
Bruce Momjian
Date:
On Mon, Jan  5, 2015 at 02:23:18PM -0300, Alvaro Herrera wrote:
> I just got around to merging this patch to 9.5 sources.  I'm glad I
> did it because in the course of doing so I noticed a bug in a recent
> patch, which led to commit d5e3d1e969d2f65009f718d3100d6565f47f9112
> (back-patched to 9.3).
>
> I'm now more confident in this code and will probably push it in a few
> days, but only to 9.5, at least for now.  It probably won't apply cleanly
> to 9.3 due to other changes in the area, such as 05315498012530d44cd89a2
> and df630b0dd5ea2de52972d456f5978a012436115e and others.

Where are we on this?

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + Everyone has their own god. +

Re: BUG #8470: 9.3 locking/subtransaction performance regression

From
Alvaro Herrera
Date:
Bruce Momjian wrote:
> On Mon, Jan  5, 2015 at 02:23:18PM -0300, Alvaro Herrera wrote:
> > I just got around to merging this patch to 9.5 sources.  I'm glad I
> > did it because in the course of doing so I noticed a bug in a recent
> > patch, which led to commit d5e3d1e969d2f65009f718d3100d6565f47f9112
> > (back-patched to 9.3).
> >
> > I'm now more confident in this code and will probably push it in a few
> > days, but only to 9.5, at least for now.  It probably won't apply cleanly
> > to 9.3 due to other changes in the area, such as 05315498012530d44cd89a2
> > and df630b0dd5ea2de52972d456f5978a012436115e and others.
>
> Where are we on this?

That's indeed the question.  I gave this a look yesterday and came up
with two patches that "fix" the issue.  (I had to fix one additional bug
in the formulation of the complex patch that I posted in January.)  I
leave the details of the bug as an exercise for interested readers
(hint: run the isolation tests).  First, some timings: I ran the script
against unpatched master and against each of the patches five times.  Results:

unpatched:
real    0m8.192s
real    0m8.230s
real    0m8.233s
real    0m8.187s
real    0m8.212s

simple patch:
real    0m0.741s
real    0m0.728s
real    0m0.729s
real    0m0.738s
real    0m0.731s

complex patch:
real    0m0.732s
real    0m0.723s
real    0m0.730s
real    0m0.725s
real    0m0.726s

In 9.2 the time is about 0.150s, so the regression is not completely
resolved, but it's a huge improvement.

The main catch of the "simple" formulation of the patch is that we do
the new GetMultiXactIdMembers call with the buffer lock held, which is a
horrible idea from a concurrency point of view; it will make many cases
where the optimization doesn't apply a lot slower.  I think with some
extra contortions we could fix that problem, but it's already quite ugly
that we have a duplicate are-we-already-a-multixact-locker check, so I
reject the idea that this seemingly simple patch is any good.  I much
prefer the complex formulation, which is what I had to start with, and
makes things a bit less unclear(*).

There was some traction for the idea of backpatching this, but I'm no
longer on board with that.  If somebody wants to do it, I would like some
commitment to a huge testing effort.


(*) On a scale of 1 to 10, with 10 being most unclear, the original code
is about 12-unclear, and with the patch it is 11-unclear.  So it's an
improvement anyhow.

PS: Apologies for the unified diff.  This is one case where filterdiff
dropped some hunks from the patch produced by git diff.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

Re: BUG #8470: 9.3 locking/subtransaction performance regression

From
Alvaro Herrera
Date:
Oskari Saarenmaa wrote:

Hi,

> I didn't try this patch, but I built today's REL9_3_STABLE with the patch
> from your mail on the 31st of Dec
> (http://github.com/saaros/postgres/tree/alvaro-multixact-optimize-9.3) and
> ran the older version of my application's test suite, which has a case that
> times out after 3 minutes with unpatched 9.3.  With the current patch the
> test case successfully completes in ~20 seconds, which is a major improvement
> although it's still slower than 9.2.  It also passes all other test cases in
> my test suite.
>
> Do you think it'll make it to 9.3.3?

I just pushed this to 9.5.  I'm not sure how much value I place on a
backpatch at this point.  There were a number of complaints about the
performance regression, but other than yours I got no report on whether
the fix was good, or on whether it was stable.  If you have other
thoughts, I'm willing to listen.  Anyway, if you could test the patch I
pushed, it'd be great.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: BUG #8470: 9.3 locking/subtransaction performance regression

From
Oskari Saarenmaa
Date:
10.04.2015, 19:57, Alvaro Herrera wrote:
>> I didn't try this patch, but I built today's REL9_3_STABLE with the patch
>> from your mail on the 31st of Dec
>> (http://github.com/saaros/postgres/tree/alvaro-multixact-optimize-9.3) and
>> ran the older version of my application's test suite, which has a case that
>> times out after 3 minutes with unpatched 9.3.  With the current patch the
>> test case successfully completes in ~20 seconds, which is a major improvement
>> although it's still slower than 9.2.  It also passes all other test cases in
>> my test suite.
>>
>> Do you think it'll make it to 9.3.3?
>
> I just pushed this to 9.5.  I'm not sure how much value I place on a
> backpatch at this point.  There were a number of complaints about the
> performance regression, but other than yours I got no report on whether
> the fix was good, or on whether it was stable.  If you have other
> thoughts, I'm willing to listen.  Anyway, if you could test the patch I
> pushed, it'd be great.

The component where this was an issue was totally rewritten a while ago
to allow us to take 9.3 into use, and as I can't run the old version
anymore I can't verify the fix in a real application.  The code
we had there was almost identical to the test case I posted.

I don't think the fix needs to be backpatched.

Thanks,
Oskari

Re: BUG #8470: 9.3 locking/subtransaction performance regression

From
Alvaro Herrera
Date:
Alvaro Herrera wrote:

> The main catch of the "simple" formulation of the patch is that we do
> the new GetMultiXactIdMembers call with the buffer lock held, which is a
> horrible idea from a concurrency point of view; it will make many cases
> where the optimization doesn't apply a lot slower.  I think with some
> extra contortions we could fix that problem, but it's already quite ugly
> that we have a duplicate are-we-already-a-multixact-locker check, so I
> reject the idea that this seemingly simple patch is any good.  I much
> prefer the complex formulation, which is what I had to start with, and
> makes things a bit less unclear(*).

Here's an updated version of this patch.  This applies to 9.3 and 9.4 and
restores performance to what's in master, while being much less invasive
than what was committed to master.  But more importantly, what it does
is reduce the typical size of multixacts.

Since the size of multixacts is so critical with regard to the problem
of member-space overrun, I think it is a good idea to reduce it.  In
master we already have an equivalent of this optimization.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment