Thread: BUG #8470: 9.3 locking/subtransaction performance regression
The following bug has been logged on the website:

Bug reference:      8470
Logged by:          Oskari Saarenmaa
Email address:      os@ohmu.fi
PostgreSQL version: 9.3.0
Operating system:   Linux
Description:

The following code performs a lot slower on PostgreSQL 9.3.0 than on
PostgreSQL 9.2.4:

DROP TABLE IF EXISTS tmp;
CREATE TABLE tmp (id BIGSERIAL, vals BIGINT[]);
DO $$
DECLARE
  r_id BIGINT;
  n BIGINT;
BEGIN
  FOR n IN 1..1000 LOOP
    BEGIN
      SELECT id INTO r_id FROM tmp WHERE array_length(vals, 1) < 100
        LIMIT 1 FOR UPDATE NOWAIT;
    EXCEPTION WHEN lock_not_available THEN
      r_id := NULL;
    END;
    IF r_id IS NULL THEN
      INSERT INTO tmp (vals) VALUES (ARRAY[n]::BIGINT[]);
    ELSE
      UPDATE tmp SET vals = array_append(vals, n::BIGINT) WHERE id = r_id;
    END IF;
  END LOOP;
END;
$$;

PostgreSQL 9.3.0: Time: 7278.910 ms
PostgreSQL 9.2.4: Time: 128.008 ms

Removing the BEGIN/EXCEPTION/END block and just doing a 'SELECT FOR UPDATE'
for a suitable row is still significantly slower in 9.3.0 (314.765 ms vs
118.894 ms on 9.2.4).  A 'SELECT' without a FOR UPDATE and without the
BEGIN/EXCEPTION/END block has the same performance on 9.2.4 and 9.3.0.

I'm running the 9.2.4 and 9.3.0 packages from apt.postgresql.org on a Debian
Squeeze host.
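For reference, the "no subtransaction" variant described above reduces to
roughly the following; this is a sketch reconstructed from the description,
not the reporter's exact script:

-- Sketch of the variant without the BEGIN/EXCEPTION/END block: the
-- SELECT ... FOR UPDATE NOWAIT runs directly in the outer block, so no
-- extra subtransaction is started on each iteration.  Assumes the "tmp"
-- table created by the script above.
DO $$
DECLARE
  r_id BIGINT;
  n BIGINT;
BEGIN
  FOR n IN 1..1000 LOOP
    SELECT id INTO r_id FROM tmp WHERE array_length(vals, 1) < 100
      LIMIT 1 FOR UPDATE NOWAIT;
    IF r_id IS NULL THEN
      INSERT INTO tmp (vals) VALUES (ARRAY[n]::BIGINT[]);
    ELSE
      UPDATE tmp SET vals = array_append(vals, n::BIGINT) WHERE id = r_id;
    END IF;
  END LOOP;
END;
$$;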
os@ohmu.fi wrote:

> The following code performs a lot slower on PostgreSQL 9.3.0 than on
> PostgreSQL 9.2.4:
>
> DROP TABLE IF EXISTS tmp;
> CREATE TABLE tmp (id BIGSERIAL, vals BIGINT[]);
> DO $$
> DECLARE
>   r_id BIGINT;
>   n BIGINT;
> BEGIN
>   FOR n IN 1..1000 LOOP
>     BEGIN
>       SELECT id INTO r_id FROM tmp WHERE array_length(vals, 1) < 100
>         LIMIT 1 FOR UPDATE NOWAIT;

Most likely, this is caused by the new tuple lock code in 9.3.  I can't
look at this immediately, but I will give it a look as soon as I'm able.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
AFAICS the problem here is that this test doesn't use MultiXactIds at all
in 9.2, but it does in 9.3.  I vaguely recall Noah tried to convince me to
put in an optimization which would have avoided this issue; I will give
that a thought.  I don't think I will be able to get it done for 9.3.1,
though.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
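One way to confirm whether the test leaves MultiXactIds behind is to look
at the tuple headers with the pageinspect extension.  A sketch, assuming
0x1000 (4096) is the HEAP_XMAX_IS_MULTI infomask bit from htup_details.h;
that value is not quoted anywhere in this thread:

-- Sketch: show xmax and the "xmax is a MultiXactId" infomask bit for the
-- tuples on block 0 of the test table.  4096 is assumed to be
-- HEAP_XMAX_IS_MULTI per src/include/access/htup_details.h.
CREATE EXTENSION IF NOT EXISTS pageinspect;

SELECT lp, t_xmax,
       (t_infomask & 4096) <> 0 AS xmax_is_multi
FROM heap_page_items(get_raw_page('tmp', 0));

Run right after the test script, the expectation described above is that
the multi bit shows up on 9.3 but not on 9.2.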
05.10.2013 01:31, Alvaro Herrera wrote:
> AFAICS the problem here is that this test doesn't use MultiXactIds at
> all in 9.2, but it does in 9.3.  I vaguely recall Noah tried to convince
> me to put in an optimization which would have avoided this issue; I will
> give that a thought.  I don't think I will be able to get it done for
> 9.3.1 though.

Is the schedule for 9.3.1 visible somewhere?

Please reconsider trying to get a fix for this in, as it's a pretty big
regression from 9.2.  My application test case that triggered this ran in
roughly 3 seconds with 9.2 but timed out after 3 minutes on 9.3.0.  And
even without a loop that amplifies the problem, just a single 'select for
update' inside a subtransaction is significantly slower in 9.3.

Thanks,
Oskari
Oskari Saarenmaa wrote:
> 05.10.2013 01:31, Alvaro Herrera wrote:
> > AFAICS the problem here is that this test doesn't use MultiXactIds at
> > all in 9.2, but it does in 9.3.  I vaguely recall Noah tried to convince
> > me to put in an optimization which would have avoided this issue; I will
> > give that a thought.  I don't think I will be able to get it done for
> > 9.3.1 though.
>
> Is the schedule for 9.3.1 visible somewhere?

Not sure.  I just happen to know it'll be git-tagged early next week.

> Please reconsider trying to get a fix for this in as it's a pretty big
> regression from 9.2.

I understand the pressure.  Sadly, I have other things scheduled right now
that I can't postpone.  I might still be able to get to it over the
weekend; if someone else has time and can submit and/or test a patch, that
would perhaps allow me to get it done in time.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Oskari Saarenmaa wrote:
> 05.10.2013 01:31, Alvaro Herrera wrote:
> > AFAICS the problem here is that this test doesn't use MultiXactIds at
> > all in 9.2, but it does in 9.3.  I vaguely recall Noah tried to convince
> > me to put in an optimization which would have avoided this issue; I will
> > give that a thought.  I don't think I will be able to get it done for
> > 9.3.1 though.
>
> Is the schedule for 9.3.1 visible somewhere?

http://www.postgresql.org/message-id/CA+OCxozG4PhbFphbg+ehpJk3TgLFDVyMuzwy+wyf=x1Utt-WAA@mail.gmail.com

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Alvaro Herrera wrote:
> AFAICS the problem here is that this test doesn't use MultiXactIds at
> all in 9.2, but it does in 9.3.  I vaguely recall Noah tried to convince
> me to put in an optimization which would have avoided this issue; I will
> give that a thought.  I don't think I will be able to get it done for
> 9.3.1 though.

I gave this a deeper look yesterday.  Sadly, that optimization would not
work here; we give stronger guarantees now than 9.2, and that's the reason
for the slowness.  Previously, we just overwrote the lock held by the
ancestor transaction if a subxact grabbed a stronger lock; we no longer do
that, and instead create a multi comprising both of those locks.  That
way, if the subxact rolls back, we don't lose the previous lock we held.

It seems that we could avoid creating a new multixact in a few more cases,
by reusing an already existing multixact; or, if we're forced to create a
new one, omit the member from the old one that is superseded by our own
member with the stronger lock.  However, there is no way this will affect
this test case *at all*.

I see another way forward: if an ancestor takes a lock of a certain
strength, and then a subxact takes a stronger lock, we could record this
as "ancestor taking the stronger lock", so it could be stored as a plain
Xid and not a Multi.  That way, (1) we do not incur a new multixact, and
(2) the lock would not be lost if the subxact aborts.  This would come at
the cost that if the subxact rolls back, the stronger lock would not be
released.

Another thing we could look at is whether we can optimize for the case of
sub-committed subtransactions, i.e. ones we know won't roll back
independently of the current xact.  It seems that would apply here,
because when you finish the BEGIN/EXCEPTION/END block, that automatic
subxact is marked sub-committed.  It seems to me we could collapse the
locks acquired by those subxacts to appear as locks owned by their
currently open ancestor.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
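A plain-SQL sketch of the semantics described above; this is an assumed
reduction of the scenario (the thread's test case reaches the same kind of
situation through the PL/pgSQL exception block rather than explicit
savepoints):

-- Sketch: the ancestor transaction holds a weaker lock and a subtransaction
-- then takes a stronger one.  Per the explanation above, 9.2 simply
-- overwrote xmax with the subtransaction's lock, while 9.3 records both
-- lockers in a new MultiXactId so the ancestor's FOR SHARE lock is not lost
-- by the ROLLBACK TO SAVEPOINT.  Assumes "tmp" contains at least one row.
BEGIN;
SELECT id FROM tmp LIMIT 1 FOR SHARE;      -- ancestor's (weaker) lock

SAVEPOINT s1;
SELECT id FROM tmp LIMIT 1 FOR UPDATE;     -- subtransaction's stronger lock
ROLLBACK TO SAVEPOINT s1;                  -- ancestor's lock must survive this

COMMIT;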
> Removing the BEGIN/EXCEPTION/END block and just doing a 'SELECT FOR UPDATE'
> for a suitable row is significantly slower in 9.3.0 (314.765 ms vs 118.894
> ms on 9.2.4).  A 'SELECT' without a FOR UPDATE and BEGIN/EXCEPTION/END has
> the same performance on 9.2.4 and 9.3.0.

I have come up with the attached patch.  As far as I can tell it restores
performance roughly to the level of 9.2 for this test case; could you
please try it out and see if it fixes things for you?  It'd be
particularly good if you can check not only the test case you submitted
but also the one that made you explore this in the first place.

I haven't pushed it yet because I haven't yet convinced myself that this
is safe, but my intention is to do it shortly so that it will be in 9.3.3.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Attachment
After looking at this some more, I am more convinced that we need to have
HeapTupleSatisfiesUpdate return BeingUpdated when a row is locked by the
same xact.  This is more in line with what we do for MultiXacts, and
allows us to treat some cases in a saner fashion: in particular, if a
transaction grabs a lock on a tuple and a subxact grabs a stronger lock,
we now create a multi recording both locks instead of silently ignoring
the second request.

The fallout from this change wasn't as bad as I had feared; I had to add
some coding in heap_update and heap_delete to prevent them from trying to
wait on their own Xid, but it looks pretty innocuous otherwise.

I ran your test case inflating the loop counters a bit, so that it shows
more exaggerated timings.  The results of running this twice on 9.2 are:

Timing: 6140,506 ms
Timing: 6054,553 ms

I only ran it once on an unpatched 9.3:

Timing: 221204,352 ms

This is patched 9.3:

Timing: 13757,925 ms

Note patched 9.3 is 2x slower than 9.2, which is understandable because it
does more work ... yet I think it returns performance to a much more
acceptable level, comparable to 9.2.

I have more confidence in this patch than in the previous version, despite
it being more intrusive.  The fact that it allows us to get rid of some
very strange coding in pgrowlocks is comforting: we now know that anytime
a tuple is locked we will get BeingUpdated from HeapTupleSatisfiesUpdate,
regardless of *who* locked it, and all the callers are able to identify
this case and act accordingly.  This seems saner in concept than sometimes
returning MayBeUpdated, because that turns each such return into a
condition that must be double-checked by callers.  This means cleaner code
in pgrowlocks around a HTSU call, for instance.  Also, we have discussed
and seriously considered this idea on several occasions previously in
connection with other things.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Attachment
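For reference, pgrowlocks is the contrib module whose coding is simplified
by this change; a usage sketch, run from a second session while a locking
transaction is still open on the test table:

-- Sketch: report the row-level lock state for "tmp".  The "multi" column
-- shows whether the locker recorded in xmax is a MultiXactId, "xids" lists
-- the locking transaction ids, and "modes" the lock modes held.
CREATE EXTENSION IF NOT EXISTS pgrowlocks;

SELECT locked_row, locker, multi, xids, modes
FROM pgrowlocks('tmp');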
Alvaro Herrera wrote:
> I have more confidence in this patch than in the previous version,
> despite it being more intrusive.  The fact that it allows us to get rid
> of some very strange coding in pgrowlocks is comforting: we now know
> that anytime a tuple is locked we will get BeingUpdated from
> HeapTupleSatisfiesUpdate, regardless of *who* locked it, and all the
> callers are able to identify this case and act accordingly.  This seems
> saner in concept than sometimes returning MayBeUpdated, because that
> turns each such return into a condition that must be double-checked by
> callers.  This means cleaner code in pgrowlocks around a HTSU call, for
> instance.  Also, we have discussed and seriously considered this idea on
> several occasions previously in connection with other things.

... in particular, this change allows us to revert the addition of
MultiXactHasRunningRemoteMembers(), and simply use MultiXactIdIsRunning()
in the two places that required it.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Alvaro Herrera wrote:
> ... in particular, this change allows us to revert the addition of
> MultiXactHasRunningRemoteMembers(), and simply use
> MultiXactIdIsRunning() in the two places that required it.

And doing that enables simplifying this patch a bit, so here it is one
more time.  Hopefully this is the final version -- I intend to push it to
9.3 and master.

Please *note the behavior change* of HeapTupleSatisfiesUpdate: it will now
return HeapTupleBeingUpdated when a tuple is locked by an Xid that's a
subtransaction of the current top transaction.  That case previously
returned HeapTupleBeingUpdated.  If someone wants to say that we shouldn't
change that in 9.3, please speak up.  But do note that it was already
returning that value when the lock was a multixact, so the new behavior
can be said to be more consistent.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Attachment
Alvaro Herrera wrote:
> Please *note the behavior change* of HeapTupleSatisfiesUpdate: it will
> now return HeapTupleBeingUpdated when a tuple is locked by an Xid that's
> a subtransaction of the current top transaction.  That case previously
> returned HeapTupleBeingUpdated.  If someone wants to say that we
> shouldn't change that in 9.3, please speak up.  But do note that it was
> already returning that value when the lock was a multixact, so the new
> behavior can be said to be more consistent.

I hate to keep replying to myself, but above I intended to say "that case
previously returned HeapTupleMayBeUpdated".

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 2013-12-23 16:53:00 -0300, Alvaro Herrera wrote:
> +	/*
> +	 * If any subtransaction of the current top transaction already holds
> +	 * a lock on this tuple, (and no one else does,) we must skip sleeping
> +	 * on the xwait; that would raise an assert about sleeping on our own
> +	 * Xid (or sleep indefinitely in a non-asserts-enabled build.)
> +	 *
> +	 * Note we don't need to do anything about this when the Xmax is a
> +	 * multixact, because that code is already prepared to ignore
> +	 * subtransactions of the current top transaction; also, trying to do
> +	 * anything other than sleeping during a delete when someone else is
> +	 * holding a lock on this tuple would be wrong.

I don't really understand why those two issues are in the same paragraph,
what's their relation?  Just that a deleting xact shouldn't be a multi?

> +	 * This must be done before acquiring our tuple lock, to avoid
> +	 * deadlocks with other transaction that are already waiting on the
> +	 * lock we hold.
> +	 */
> +	if (!(tp.t_data->t_infomask & HEAP_XMAX_IS_MULTI) &&
> +		TransactionIdIsCurrentTransactionId(HeapTupleHeaderGetRawXmax(tp.t_data)))
> +	{
> +		Assert(HEAP_XMAX_IS_LOCKED_ONLY(tp.t_data->t_infomask));
> +
> +		goto continue_deletion;
> +	}

Man, the control flow in heapam.c isn't getting simpler :/.  Could you
rename the label to perform_deletion or such?  It's not really continuing
the deletion from my POV.

> +	/*
> +	 * If any subtransaction of the current top transaction already holds
> +	 * a lock on this tuple, (and no one else does,) we must skip sleeping
> +	 * on the xwait; that would raise an assert about sleeping on our own
> +	 * Xid (or sleep indefinitely in a non-assertion enabled build.)

I'd reformulate this to note that we'd wait for ourselves, perhaps noting
that that fact would be caught by an assert in assert-enabled builds.  The
assert isn't the real reason we cannot do this.

> 	/*
> -	 * We might already hold the desired lock (or stronger), possibly under a
> -	 * different subtransaction of the current top transaction.  If so, there
> -	 * is no need to change state or issue a WAL record.  We already handled
> -	 * the case where this is true for xmax being a MultiXactId, so now check
> -	 * for cases where it is a plain TransactionId.
> -	 *
> -	 * Note in particular that this covers the case where we already hold
> -	 * exclusive lock on the tuple and the caller only wants key share or
> -	 * share lock.  It would certainly not do to give up the exclusive lock.
> -	 */
> -	if (!(old_infomask & (HEAP_XMAX_INVALID |
> -						  HEAP_XMAX_COMMITTED |
> -						  HEAP_XMAX_IS_MULTI)) &&
> -		(mode == LockTupleKeyShare ?
> -		 (HEAP_XMAX_IS_KEYSHR_LOCKED(old_infomask) ||
> -		  HEAP_XMAX_IS_SHR_LOCKED(old_infomask) ||
> -		  HEAP_XMAX_IS_EXCL_LOCKED(old_infomask)) :
> -		 mode == LockTupleShare ?
> -		 (HEAP_XMAX_IS_SHR_LOCKED(old_infomask) ||
> -		  HEAP_XMAX_IS_EXCL_LOCKED(old_infomask)) :
> -		 (HEAP_XMAX_IS_EXCL_LOCKED(old_infomask))) &&
> -		TransactionIdIsCurrentTransactionId(xmax))
> -	{
> -		LockBuffer(*buffer, BUFFER_LOCK_UNLOCK);
> -		/* Probably can't hold tuple lock here, but may as well check */
> -		if (have_tuple_lock)
> -			UnlockTupleTuplock(relation, tid, mode);
> -		return HeapTupleMayBeUpdated;
> -	}

Are you sure transforming this hunk the way you did is functionally
equivalent?
> 	if (!MultiXactIdIsRunning(xmax))
> 	{
> -		if (HEAP_XMAX_IS_LOCKED_ONLY(old_infomask) ||
> -			TransactionIdDidAbort(MultiXactIdGetUpdateXid(xmax,
> -														  old_infomask)))
> +		if ((HEAP_XMAX_IS_LOCKED_ONLY(old_infomask) ||
> +			 !TransactionIdDidCommit(MultiXactIdGetUpdateXid(xmax,
> +															 old_infomask))))

Heh, that's actually an (independent) bug...

What's the test coverage for these cases?  It better be 110%...

Greetings,

Andres Freund
Andres Freund wrote:
> On 2013-12-23 16:53:00 -0300, Alvaro Herrera wrote:
> > +	 * This must be done before acquiring our tuple lock, to avoid
> > +	 * deadlocks with other transaction that are already waiting on the
> > +	 * lock we hold.
> > +	 */
> > +	if (!(tp.t_data->t_infomask & HEAP_XMAX_IS_MULTI) &&
> > +		TransactionIdIsCurrentTransactionId(HeapTupleHeaderGetRawXmax(tp.t_data)))
> > +	{
> > +		Assert(HEAP_XMAX_IS_LOCKED_ONLY(tp.t_data->t_infomask));
> > +
> > +		goto continue_deletion;
> > +	}
>
> Man, the control flow in heapam.c isn't getting simpler :/.  Could you
> rename the label to perform_deletion or such?  It's not really continuing
> the deletion from my POV.

I considered the idea of putting the rest of that block inside a new
"else" block.  That would have the same effect.  From a control flow POV
that might be simpler, not sure.  Any opinions on that?  I renamed the
labels for now.

> > 	/*
> > -	 * We might already hold the desired lock (or stronger), possibly under a
> > -	 * different subtransaction of the current top transaction.  If so, there
> > -	 * is no need to change state or issue a WAL record.  We already handled
> > -	 * the case where this is true for xmax being a MultiXactId, so now check
> > -	 * for cases where it is a plain TransactionId.
> > -	 *
> > -	 * Note in particular that this covers the case where we already hold
> > -	 * exclusive lock on the tuple and the caller only wants key share or
> > -	 * share lock.  It would certainly not do to give up the exclusive lock.
> > -	 */
> > -	if (!(old_infomask & (HEAP_XMAX_INVALID |
> > -						  HEAP_XMAX_COMMITTED |
> > -						  HEAP_XMAX_IS_MULTI)) &&
> > -		(mode == LockTupleKeyShare ?
> > -		 (HEAP_XMAX_IS_KEYSHR_LOCKED(old_infomask) ||
> > -		  HEAP_XMAX_IS_SHR_LOCKED(old_infomask) ||
> > -		  HEAP_XMAX_IS_EXCL_LOCKED(old_infomask)) :
> > -		 mode == LockTupleShare ?
> > -		 (HEAP_XMAX_IS_SHR_LOCKED(old_infomask) ||
> > -		  HEAP_XMAX_IS_EXCL_LOCKED(old_infomask)) :
> > -		 (HEAP_XMAX_IS_EXCL_LOCKED(old_infomask))) &&
> > -		TransactionIdIsCurrentTransactionId(xmax))
> > -	{
> > -		LockBuffer(*buffer, BUFFER_LOCK_UNLOCK);
> > -		/* Probably can't hold tuple lock here, but may as well check */
> > -		if (have_tuple_lock)
> > -			UnlockTupleTuplock(relation, tid, mode);
> > -		return HeapTupleMayBeUpdated;
> > -	}
>
> Are you sure transforming this hunk the way you did is functionally
> equivalent?

It's not, actually -- with the new code, we will go to all the trouble of
setting OldestMulti, marking the buffer dirty and doing some WAL-logging,
whereas the old code wouldn't do any of that.  But the new code handles
more cases and is easier to understand (at least IMV).

I considered the idea of checking whether the new Xmax we would be setting
on the tuple is the same as the Xmax the tuple already had, and skipping
those steps if so.  But it didn't seem enough of a win to me to warrant
the complexity of the check (I think we'd need to check the lock/update
bits also, not just the xmax value itself.)

> > 	if (!MultiXactIdIsRunning(xmax))
> > 	{
> > -		if (HEAP_XMAX_IS_LOCKED_ONLY(old_infomask) ||
> > -			TransactionIdDidAbort(MultiXactIdGetUpdateXid(xmax,
> > -														  old_infomask)))
> > +		if ((HEAP_XMAX_IS_LOCKED_ONLY(old_infomask) ||
> > +			 !TransactionIdDidCommit(MultiXactIdGetUpdateXid(xmax,
> > +															 old_infomask))))
>
> Heh, that's actually an (independent) bug...

I'm not sure it's a bug fix ... this is just additionally considering
optimizing (avoiding a multi) in the case that a transaction has crashed
without aborting; I don't think there would be any functional difference.
We would end up spuriously calling MultiXactIdExpand, but that routine
already ignores crashed updaters, so it would end up creating a singleton
multi.  This doesn't seem worth a separate commit, if that's what you're
thinking.  Do you disagree?

> What's the test coverage for these cases?  It better be 110%...

110% is rather hard to achieve, particularly because of the cases
involving crashed transactions.  Other than that, I think I tested all
those paths.  I will go through it again.

Thanks for the review; here's an updated version.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Attachment
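A plain-SQL sketch of the situation the heap_delete hunk above guards
against; this is an assumed reduction, not a case taken from the patch's
regression tests, and it assumes a row with id = 1 exists in "tmp":

-- Sketch: the row's xmax is a lock-only Xid belonging to a (released)
-- subtransaction of the current top-level transaction.  When the same
-- transaction then deletes the row, it must recognize that lock as its own
-- rather than trying to wait on its own Xid.
BEGIN;
SAVEPOINT s1;
SELECT id FROM tmp WHERE id = 1 FOR UPDATE;   -- lock taken by the subtransaction
RELEASE SAVEPOINT s1;

DELETE FROM tmp WHERE id = 1;                 -- same top transaction deletes the locked row
COMMIT;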
Alvaro Herrera wrote:
> > What's the test coverage for these cases?  It better be 110%...
>
> 110% is rather hard to achieve, particularly because of the cases
> involving crashed transactions.  Other than that, I think I tested all
> those paths.  I will go through it again.

Actually, there was a nasty bug in that formulation of the patch; in
heap_lock_tuple I mixed the "is multi" case in with the "do we own a lock"
check, which are completely unrelated, and that caused a number of
regressions in the isolation tests.

Here's a reworked version.  I removed those new labels and gotos; it seems
better this way.  Note there are some new blocks that need to be reflowed
and pgindented; also the diff contains some spurious whitespace changes
that are undoing some pgindent decisions.  There are also some spurious
changes which have to do with a later patch I have -- the one that
optimizes for LOCK_ONLY multis.  I tried to avoid including unnecessary
whitespace changes so that the actual code changes are easier to spot, but
if I were to remove them all I wouldn't have been able to submit the patch
this year, so please bear with me.

I generated the "make coverage" report after having run initdb, the
isolation tests and the regular regression tests; it shows that almost all
the interesting paths here are being taken.  (There are only a couple of
cases not being tested in compute_new_xmax_infomask, but they are simpler
ones.  I will see about modifying one of the existing isolation tests to
cover those cases as well.)

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Attachment
Thanks for looking at this, and sorry about the late reply; I finally got
around to testing your latest patch.

20.12.2013 22:47, Alvaro Herrera wrote:
> > Removing the BEGIN/EXCEPTION/END block and just doing a 'SELECT FOR
> > UPDATE' for a suitable row is significantly slower in 9.3.0 (314.765 ms
> > vs 118.894 ms on 9.2.4).  A 'SELECT' without a FOR UPDATE and
> > BEGIN/EXCEPTION/END has the same performance on 9.2.4 and 9.3.0.
>
> I have come up with the attached patch.  As far as I can tell it
> restores performance roughly to the level of 9.2 for this test case;
> could you please try it out and see if it fixes things for you?  It'd be
> particularly good if you can check not only the test case you submitted
> but also the one that made you explore this in the first place.

I didn't try this patch, but I built today's REL9_3_STABLE with the patch
from your mail on the 31st of Dec
(http://github.com/saaros/postgres/tree/alvaro-multixact-optimize-9.3) and
ran the older version of my application's test suite, which has a case
that times out after 3 minutes with unpatched 9.3.  With the current patch
the test case successfully completes in ~20 seconds, which is a major
improvement, although it's still slower than 9.2.  It also passes all
other test cases in my test suite.

Do you think it'll make it to 9.3.3?

Thanks,
Oskari
Oskari Saarenmaa wrote:
> Thanks for looking at this, and sorry about the late reply; I finally got
> around to testing your latest patch.
>
> 20.12.2013 22:47, Alvaro Herrera wrote:
> >> Removing the BEGIN/EXCEPTION/END block and just doing a 'SELECT FOR
> >> UPDATE' for a suitable row is significantly slower in 9.3.0 (314.765 ms
> >> vs 118.894 ms on 9.2.4).  A 'SELECT' without a FOR UPDATE and
> >> BEGIN/EXCEPTION/END has the same performance on 9.2.4 and 9.3.0.
> >
> > I have come up with the attached patch.  As far as I can tell it
> > restores performance roughly to the level of 9.2 for this test case;
> > could you please try it out and see if it fixes things for you?  It'd be
> > particularly good if you can check not only the test case you submitted
> > but also the one that made you explore this in the first place.
>
> I didn't try this patch, but I built today's REL9_3_STABLE with the patch
> from your mail on the 31st of Dec
> (http://github.com/saaros/postgres/tree/alvaro-multixact-optimize-9.3) and
> ran the older version of my application's test suite, which has a case
> that times out after 3 minutes with unpatched 9.3.  With the current patch
> the test case successfully completes in ~20 seconds, which is a major
> improvement, although it's still slower than 9.2.  It also passes all
> other test cases in my test suite.
>
> Do you think it'll make it to 9.3.3?

Well, I gave this patch a long, careful look and eventually noticed that
there was a rather subtle but serious design bug in it, and I had to ditch
it.  I have the intention to come back to this problem again later, but
right now I cannot.

I'm not sure whether I will have something for 9.3.3, but it's not likely,
because that'll happen only in a week or two.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Hi,

I don't understand this problem in detail, but it seems the problem still
remains in 93_STABLE(*).

(*) f1e522696fbc01b5c8a41dbd45d6cbae229be425

9.3.2     : 4.225 (s)
93_STABLE : 2.819 (s)
9.2.6     : 0.135 (s)

93_STABLE is faster than 9.3.2, but it's apparently slower than 9.2.6.

9.3.3 is going to be released next week.  Will any fixes be included in
9.3.3?

regards,
-----------
Tomonari Katsumata

(2014/01/14 22:09), Alvaro Herrera wrote:
> Oskari Saarenmaa wrote:
>> Thanks for looking at this, and sorry about the late reply; I finally got
>> around to testing your latest patch.
>>
>> 20.12.2013 22:47, Alvaro Herrera wrote:
>>>> Removing the BEGIN/EXCEPTION/END block and just doing a 'SELECT FOR
>>>> UPDATE' for a suitable row is significantly slower in 9.3.0 (314.765 ms
>>>> vs 118.894 ms on 9.2.4).  A 'SELECT' without a FOR UPDATE and
>>>> BEGIN/EXCEPTION/END has the same performance on 9.2.4 and 9.3.0.
>>>
>>> I have come up with the attached patch.  As far as I can tell it
>>> restores performance roughly to the level of 9.2 for this test case;
>>> could you please try it out and see if it fixes things for you?  It'd be
>>> particularly good if you can check not only the test case you submitted
>>> but also the one that made you explore this in the first place.
>>
>> I didn't try this patch, but I built today's REL9_3_STABLE with the patch
>> from your mail on the 31st of Dec
>> (http://github.com/saaros/postgres/tree/alvaro-multixact-optimize-9.3) and
>> ran the older version of my application's test suite, which has a case
>> that times out after 3 minutes with unpatched 9.3.  With the current patch
>> the test case successfully completes in ~20 seconds, which is a major
>> improvement, although it's still slower than 9.2.  It also passes all
>> other test cases in my test suite.
>>
>> Do you think it'll make it to 9.3.3?
>
> Well, I gave this patch a long, careful look and eventually noticed that
> there was a rather subtle but serious design bug in it, and I had to
> ditch it.  I have the intention to come back to this problem again
> later, but right now I cannot.
>
> I'm not sure whether I will have something for 9.3.3, but it's not
> likely, because that'll happen only in a week or two.
I just got around to merging this patch to 9.5 sources.  I'm glad I did it
because in the course of doing so I noticed a bug in a recent patch, which
led to commit d5e3d1e969d2f65009f718d3100d6565f47f9112 (back-patched to
9.3).

I'm now more confident in this code and will probably push it in a few
days, but only to 9.5, at least for now.  It probably won't apply cleanly
to 9.3 due to other changes in the area, such as 05315498012530d44cd89a2
and df630b0dd5ea2de52972d456f5978a012436115e and others.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
On Mon, Jan 5, 2015 at 02:23:18PM -0300, Alvaro Herrera wrote:
> I just got around to merging this patch to 9.5 sources.  I'm glad I
> did it because in the course of doing so I noticed a bug in a recent
> patch, which led to commit d5e3d1e969d2f65009f718d3100d6565f47f9112
> (back-patched to 9.3).
>
> I'm now more confident in this code and will probably push it in a few
> days, but only to 9.5, at least for now.  It probably won't apply cleanly
> to 9.3 due to other changes in the area, such as 05315498012530d44cd89a2
> and df630b0dd5ea2de52972d456f5978a012436115e and others.

Where are we on this?

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + Everyone has their own god. +
Bruce Momjian wrote:
> On Mon, Jan 5, 2015 at 02:23:18PM -0300, Alvaro Herrera wrote:
> > I just got around to merging this patch to 9.5 sources.  I'm glad I
> > did it because in the course of doing so I noticed a bug in a recent
> > patch, which led to commit d5e3d1e969d2f65009f718d3100d6565f47f9112
> > (back-patched to 9.3).
> >
> > I'm now more confident in this code and will probably push it in a few
> > days, but only to 9.5, at least for now.  It probably won't apply cleanly
> > to 9.3 due to other changes in the area, such as 05315498012530d44cd89a2
> > and df630b0dd5ea2de52972d456f5978a012436115e and others.
>
> Where are we on this?

That's indeed the question.  I gave this a look yesterday and came up with
two patches that "fix" the issue.  (I had to fix one additional bug in the
formulation of the complex patch that I posted in January.)  I leave the
details of the bug as an exercise for the interested readers (hint: run
the isolation tests).

First, some timings.  I ran the script against unpatched master and each of
the patches five times.  Results:

unpatched:
real    0m8.192s
real    0m8.230s
real    0m8.233s
real    0m8.187s
real    0m8.212s

simple patch:
real    0m0.741s
real    0m0.728s
real    0m0.729s
real    0m0.738s
real    0m0.731s

complex patch:
real    0m0.732s
real    0m0.723s
real    0m0.730s
real    0m0.725s
real    0m0.726s

In 9.2 the time is about 0.150s, so the regression is not completely
resolved, but it's a huge improvement.

The main catch of the "simple" formulation of the patch is that we do the
new GetMultiXactIdMembers call with the buffer lock held, which is a
horrible idea from a concurrency point of view; it will make many cases
where the optimization doesn't apply a lot slower.  I think with some
extra contortions we could fix that problem, but it's already quite ugly
that we have a duplicate check for the are-we-already-a-multixact-locker
case, so I reject the idea that this seemingly simple patch is any good.
I much prefer the complex formulation, which is what I had to start with,
and makes things a bit less unclear(*).

There was some traction to the idea of backpatching this, but I'm no
longer on board with that.  If somebody wants to, I would like some
commitment to a huge testing effort.

(*) On a scale of 1 to 10 with 10 being most unclear, the original code is
about 12-unclear, and with the patch it is 11-unclear.  So it's an
improvement anyhow.

PS: Apologies for the unified diff.  This is one case where filterdiff
dropped some hunks from the patch produced by git diff.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
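The exact benchmark script is not included in this thread; a sketch of how
equivalent per-run numbers can be collected from psql, where the file name
and the inflated loop count are assumptions:

-- Sketch: run the reporter's test case with client-side timing enabled;
-- "Time: ..." lines like the ones quoted earlier in this thread come from
-- psql's \timing.  Repeat the \i to collect several runs.
\timing on
DROP TABLE IF EXISTS tmp;
CREATE TABLE tmp (id BIGSERIAL, vals BIGINT[]);
\i bug8470_testcase.sql   -- hypothetical file holding the DO block from the original report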
Oskari Saarenmaa wrote:

Hi,

> I didn't try this patch, but I built today's REL9_3_STABLE with the patch
> from your mail on the 31st of Dec
> (http://github.com/saaros/postgres/tree/alvaro-multixact-optimize-9.3) and
> ran the older version of my application's test suite, which has a case
> that times out after 3 minutes with unpatched 9.3.  With the current patch
> the test case successfully completes in ~20 seconds, which is a major
> improvement, although it's still slower than 9.2.  It also passes all
> other test cases in my test suite.
>
> Do you think it'll make it to 9.3.3?

I just pushed this to 9.5.  I'm not sure how much value I place on a
backpatch at this point.  There were a number of complaints about the
performance regression, but other than yours I got no report on whether
the fix was good, or on whether it was stable.  If you have other
thoughts, I'm willing to listen.  Anyway, if you could test the patch I
pushed, it'd be great.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
10.04.2015, 19:57, Alvaro Herrera wrote:
>> I didn't try this patch, but I built today's REL9_3_STABLE with the patch
>> from your mail on the 31st of Dec
>> (http://github.com/saaros/postgres/tree/alvaro-multixact-optimize-9.3) and
>> ran the older version of my application's test suite, which has a case
>> that times out after 3 minutes with unpatched 9.3.  With the current patch
>> the test case successfully completes in ~20 seconds, which is a major
>> improvement, although it's still slower than 9.2.  It also passes all
>> other test cases in my test suite.
>>
>> Do you think it'll make it to 9.3.3?
>
> I just pushed this to 9.5.  I'm not sure how much value I place on a
> backpatch at this point.  There were a number of complaints about the
> performance regression, but other than yours I got no report on whether
> the fix was good, or on whether it was stable.  If you have other
> thoughts, I'm willing to listen.  Anyway, if you could test the patch I
> pushed, it'd be great.

The component where this was an issue was totally rewritten a while ago to
allow us to take 9.3 into use, and as I can't run the old version anymore
I can't verify the fix in a real application.  The code we had there was
almost identical to the test case I posted.  I don't think the fix needs
to be backpatched.

Thanks,
Oskari
Alvaro Herrera wrote:

> The main catch of the "simple" formulation of the patch is that we do
> the new GetMultiXactIdMembers call with the buffer lock held, which is a
> horrible idea from a concurrency point of view; it will make many cases
> where the optimization doesn't apply a lot slower.  I think with some
> extra contortions we could fix that problem, but it's already quite ugly
> that we have a duplicate check for the are-we-already-a-multixact-locker
> case, so I reject the idea that this seemingly simple patch is any good.
> I much prefer the complex formulation, which is what I had to start with,
> and makes things a bit less unclear(*).

Here's an updated version of this patch.  This applies to 9.3 and 9.4 and
restores performance to what's in master, while being much less invasive
than what was committed to master.

But more importantly, what it does is reduce the typical size of
multixacts.  Since the size of multixacts is so critical with regard to
the problems of members overrun, I think it is a good idea to reduce it.
In master we already have an equivalent of this optimization.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services