Tom,
> Strictly a WAG ... but what this sounds like to me is disastrously bad
> behavior of the spinlock code under heavy contention. We thought we'd
> fixed the spinlock code for SMP machines a while ago, but maybe
> hyperthreading opens some new vistas for misbehavior ...
Yeah, I thought of that based on the discussion on -Hackers. But we tried
turning off hyperthreading, with no change in behavior.
> If you can't try 7.4, or want to gather more data first, it would be
> good to try to confirm or disprove the theory that the context switches
> are coming from spinlock delays. If they are, they'd be coming from the
> select() calls in s_lock() in s_lock.c. Can you strace or something to
> see what kernel calls the context switches occur on?
Might be worth it ... will suggest that. Will also try 7.4.
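
For anyone following along, here is a rough sketch of the pattern under
discussion: a lock that spins briefly and then sleeps via select() with a
timeout. This is NOT the actual s_lock.c code. The names (demo_lock,
demo_lock_acquire) and the constants SPINS_BEFORE_SLEEP and SLEEP_USEC are
made up for illustration. The point is only that every fallback to select()
is a kernel call that gives up the CPU, so heavy contention would show up
as a stream of select() calls in strace.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <sys/select.h>

/* Hypothetical tuning values, not taken from PostgreSQL. */
#define SPINS_BEFORE_SLEEP 100
#define SLEEP_USEC         10000   /* 10 ms */

typedef struct {
    atomic_int   locked;   /* 0 = free, 1 = held */
    atomic_ulong sleeps;   /* times a waiter fell back to select() */
} demo_lock;

static void demo_lock_init(demo_lock *l)
{
    atomic_init(&l->locked, 0);
    atomic_init(&l->sleeps, 0);
}

/* One test-and-set attempt; returns 1 if the lock was acquired. */
static int demo_try_lock(demo_lock *l)
{
    return atomic_exchange_explicit(&l->locked, 1,
                                    memory_order_acquire) == 0;
}

static void demo_lock_release(demo_lock *l)
{
    atomic_store_explicit(&l->locked, 0, memory_order_release);
}

/* Spin a bounded number of times, then sleep in the kernel and retry. */
static void demo_lock_acquire(demo_lock *l)
{
    int spins = 0;

    while (!demo_try_lock(l)) {
        if (++spins >= SPINS_BEFORE_SLEEP) {
            struct timeval tv = { 0, SLEEP_USEC };

            /* select() with no descriptors acts as a timed sleep.
             * Each call here yields the CPU, i.e. a context switch. */
            select(0, NULL, NULL, NULL, &tv);
            atomic_fetch_add(&l->sleeps, 1);
            spins = 0;
        }
    }
}
```

If the theory holds, the sleeps counter in a sketch like this would climb
quickly under contention, and strace on a busy backend would show mostly
select() calls.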
--
-Josh Berkus
Aglio Database Solutions
San Francisco