Re: reducing the overhead of frequent table locks - now, with WIP patch

From: Robert Haas
Subject: Re: reducing the overhead of frequent table locks - now, with WIP patch
Msg-id: BANLkTimFkPJB_mL=b2noGXg55f1v7FObDw@mail.gmail.com
In response to: Re: reducing the overhead of frequent table locks - now, with WIP patch (Stefan Kaltenbrunner <stefan@kaltenbrunner.cc>)
Responses: Re: reducing the overhead of frequent table locks - now, with WIP patch
List: pgsql-hackers
On Sun, Jun 5, 2011 at 4:01 PM, Stefan Kaltenbrunner
<stefan@kaltenbrunner.cc> wrote:
> On 06/05/2011 09:12 PM, Heikki Linnakangas wrote:
>> On 05.06.2011 22:04, Stefan Kaltenbrunner wrote:
>>> and one for the -j80 case (also patched).
>>>
>>> 485798 48.9667 postgres s_lock
>>>  60327  6.0808 postgres LWLockAcquire
>>>  57049  5.7503 postgres LWLockRelease
>>>  18357  1.8503 postgres hash_search_with_hash_value
>>>  17033  1.7169 postgres GetSnapshotData
>>>  14763  1.4881 postgres base_yyparse
>>>  14460  1.4575 postgres SearchCatCache
>>>  13975  1.4086 postgres AllocSetAlloc
>>>   6416  0.6467 postgres PinBuffer
>>>   5024  0.5064 postgres SIGetDataEntries
>>>   4704  0.4741 postgres core_yylex
>>>   4625  0.4662 postgres _bt_compare
>>
>> Hmm, does that mean that it's spending 50% of the time spinning on a
>> spinlock? That's bad. It's one thing to be contended on a lock and have
>> a lot of idle time because of that, but it's even worse to spend a lot
>> of time spinning, because that CPU time won't be spent doing more
>> useful work, even if there is some other process on the system that
>> could make use of it.
>
> Well, yeah - we are broken right now, being able to use only ~20% of
> the CPU on a modern mid-range box, but using 80% CPU (4x, as in the
> case above) while getting less than 2x the performance seems wrong as
> well. I also wonder if we are still missing something fundamental,
> because even with the current patch we are quite far from linear
> scaling and light-years from some of our competitors...

Could you compile with LWLOCK_STATS, rerun these tests, total up the
"blk" numbers by LWLockId, and post the results? (Actually, totalling
up the shacq and exacq numbers would be useful as well, if you wouldn't
mind.)

Unless I very much miss my guess, we're going to see zero contention on
the new structures introduced by this patch. Rather, I suspect we're
going to find that, with the hideous contention on one particular lock
manager partition lock removed, there is a more spread-out contention
problem, likely involving the lock manager partition locks, the buffer
mapping locks, and possibly other LWLocks as well. The fact that the
system is busy-waiting rather than leaving the CPU idle probably means
that the remaining contention is more diffuse than the contention this
patch removes: we no longer have everything pile up on a single LWLock
(as happens in git master), but we do spend a lot of time fighting
cache lines away from other CPUs. Or at any rate, that's my guess; we
need some real numbers to know for sure.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
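[Editor's note, not part of the original thread: for anyone reproducing
the aggregation Robert asks for above, here is a minimal sketch that
totals the per-backend LWLOCK_STATS counters by LWLockId. It assumes
the 9.1-era output format, in which each backend prints lines like
"PID 12345 lwlock 7: shacq 123 exacq 45 blk 6" to stderr at exit; the
script name and exact field layout are assumptions, so adjust the regex
if your build prints something different.]

    #!/usr/bin/env python3
    # total_lwlock_stats.py (hypothetical name): sum LWLOCK_STATS
    # counters per LWLockId across all backends, assuming lines of
    # the form:  PID 12345 lwlock 7: shacq 123 exacq 45 blk 6
    import re
    import sys
    from collections import defaultdict

    # Per-LWLockId running totals: [shacq, exacq, blk].
    totals = defaultdict(lambda: [0, 0, 0])

    stats_line = re.compile(
        r"PID \d+ lwlock (\d+): shacq (\d+) exacq (\d+) blk (\d+)")

    for line in sys.stdin:
        m = stats_line.search(line)
        if not m:
            continue  # ignore ordinary log lines mixed into stderr
        lockid, shacq, exacq, blk = map(int, m.groups())
        totals[lockid][0] += shacq
        totals[lockid][1] += exacq
        totals[lockid][2] += blk

    # Most blocked-on locks first, since "blk" is the number in question.
    for lockid, (shacq, exacq, blk) in sorted(
            totals.items(), key=lambda kv: kv[1][2], reverse=True):
        print(f"lwlock {lockid:4d}: shacq {shacq:12d} "
              f"exacq {exacq:12d} blk {blk:9d}")

[Run it against the server's stderr log, e.g.
"python3 total_lwlock_stats.py < postmaster.log"; a handful of lock
manager partition or buffer mapping locks dominating the blk column
would support the spread-out-contention theory above.]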