On Sun, Feb 2, 2014 at 6:00 AM, Andres Freund <andres@2ndquadrant.com> wrote:
> The changed algorithm for lwlock imo is an *algorithmic* improvement,
> not one for a particular architecture. The advantage being that locking
> a lwlock which is primarily taken in shared mode will never need to
> wait or loop.
I agree. My point was only that the messaging ought to be that this is
something that those with multi-socket Intel systems should take note
of.
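
For anyone following along, here is a rough sketch of the general idea as I
understand it from the description above. It is not the patch itself: the
names (sketch_lock, sketch_try_shared, EXCLUSIVE_FLAG) are made up for
illustration, it uses C11 <stdatomic.h> rather than any Postgres atomics
layer, and it leaves out the wait queue entirely. The point it tries to show
is that a shared acquire is a single atomic add, with no compare-and-swap
retry loop, as long as no exclusive holder is present.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout: top bit marks an exclusive holder,
 * the low bits count shared holders. */
#define EXCLUSIVE_FLAG ((uint32_t) 1 << 31)

typedef struct
{
    _Atomic uint32_t state;
} sketch_lock;

/* Shared acquire: one unconditional atomic add. If no exclusive
 * holder was present, the lock is now held; nothing to retry. */
static bool
sketch_try_shared(sketch_lock *lock)
{
    uint32_t old = atomic_fetch_add(&lock->state, 1);

    if (old & EXCLUSIVE_FLAG)
    {
        /* An exclusive holder exists: undo our increment and report
         * failure. A real lock would queue and sleep here. */
        atomic_fetch_sub(&lock->state, 1);
        return false;
    }
    return true;
}

static void
sketch_release_shared(sketch_lock *lock)
{
    atomic_fetch_sub(&lock->state, 1);
}

/* Exclusive acquire: succeeds only if the lock is completely free. */
static bool
sketch_try_exclusive(sketch_lock *lock)
{
    uint32_t expected = 0;

    return atomic_compare_exchange_strong(&lock->state, &expected,
                                          EXCLUSIVE_FLAG);
}

static void
sketch_release_exclusive(sketch_lock *lock)
{
    /* Subtract rather than store 0, so a concurrent shared attempt
     * that has incremented but not yet backed out is not lost. */
    atomic_fetch_sub(&lock->state, EXCLUSIVE_FLAG);
}
```

With a lock that is almost always taken in shared mode, the failure branch
in sketch_try_shared is rarely hit, so readers do one atomic operation each
instead of contending on a spinlock that protects the lock state.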
> Yes, that branch is used by some of them. But to make that clear to all
> that are still reading, I have *first* presented the patch & findings to
> -hackers and *then* backported it, and I have referenced the existence
> of the patch for 9.2 on list before. This isn't some kind of "secret
> sauce" deal...
No, of course not. I certainly didn't mean to imply that. My point was
only that anyone who is affected to the same degree as the party with
the 4 socket server might be left with a very poor impression of
Postgres if we failed to fix the problem. It clearly rises to the
level of a bugfix.
> That might be something to do later, as it *really* can hurt in
> practice. We had one server go from load 240 to 11...
Well, we have to commit something on master first. But it should be a
priority to avoid having this hurt users further, since the problems
are highly predictable for certain types of servers.
> But I think we should first focus on getting the patch ready for
> master, then we can see where it's going. At the very least I'd like to
> split off the part modifying the current spinlocks to use the atomics,
> that seems far too invasive.
Agreed.
> I unfortunately can't tell you that much more, not because it's private,
> but because it mostly was diagnosed by remote hand debugging, limiting
> insights considerably.
Of course.
--
Peter Geoghegan