Tom Lane wrote:
> Kris Kennaway <kris@obsecurity.org> forwards:
> > Yes, but there are still a lot of wakeups to be avoided in the current
> > System V semaphore code. More specifically, not only do we wake up all
> > the processes waiting on a single semaphore every time something changes,
> > but we also wake up all processes waiting on *any* of the semaphores in
> > the semaphore *set*, whatever the reason we're sleeping.
Thanks for forwarding my mail, Kris! To Tom: if you can get my mails
to reach pgsql-hackers@ somehow that would be just great :-).
> Ohhhh ... *that's* the problem. Ugh. Although we have a separate
> semaphore for each PG backend, they're grouped into semaphore sets
> (I think 16 active semaphores per set). So a wakeup intended for one
> process would uselessly send up to 15 others through the semop code.
Yes.
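To make the set/index relationship concrete, here is a minimal sketch
in C using the standard System V calls. It is not PostgreSQL's actual
code: the helper names and the set size of 16 (taken from Tom's
estimate above) are assumptions for illustration. The point is that
semop() addresses one semaphore by (set id, index), even though, as
described above, the kernel currently wakes every sleeper on the set.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

#define SEMS_PER_SET 16   /* assumed set size, per the estimate quoted above */

/* Hypothetical helper: create one private set of SEMS_PER_SET semaphores.
 * Returns the set id, or -1 on failure. */
int create_backend_set(void)
{
    return semget(IPC_PRIVATE, SEMS_PER_SET, IPC_CREAT | 0600);
}

/* Hypothetical helper: apply delta to exactly one semaphore in the set.
 * Only sem_num = idx is named in the operation; any other process woken
 * as a result is a side effect of the kernel's per-set wakeup. */
int adjust_backend_sem(int semid, unsigned short idx, short delta)
{
    struct sembuf op;

    assert(idx < SEMS_PER_SET);
    op.sem_num = idx;
    op.sem_op = delta;
    op.sem_flg = 0;
    return semop(semid, &op, 1);
}

/* Hypothetical helper: remove the whole set from the system. */
int destroy_backend_set(int semid)
{
    return semctl(semid, 0, IPC_RMID);
}
```

A backend sleeping on index 4 has no interest in a post to index 3,
yet with the current code it is sent back through semop anyway.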
> The only thing we could do to fix that from our end would be to use
> a smaller sema-set size on *BSD platforms. Is the overhead per sema set
> small enough to make this a sane thing to do? Will we be likely to
> run into system limits on the number of sets?
I'm not familiar enough with the PostgreSQL code to know what impact
such a change could have, but since the problem is clearly on our
side here, I would advise against making changes in PostgreSQL that
are likely to complicate the code for little gain. We still haven't
even fully measured how much the useless wakeups cost us, since we're
running into other contention problems with my patch that removes
them. And, as you point out, there are complications with respect
to system limits (we already ask users to bump them when they
install PostgreSQL).
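The trade-off between set size and the number of sets is simple
arithmetic, sketched below in C. The function names and the figure of
100 backends are assumptions for illustration only. The result is what
would have to fit under the kernel's limit on semaphore sets (SEMMNI,
which I believe is kern.ipc.semmni on FreeBSD).

```c
#include <assert.h>

/* Number of semaphore sets needed so that every backend gets its own
 * semaphore: ceil(max_backends / sems_per_set). */
int sets_needed(int max_backends, int sems_per_set)
{
    assert(max_backends >= 0 && sems_per_set > 0);
    return (max_backends + sems_per_set - 1) / sems_per_set;
}

/* Upper bound on the other sleepers in the same set that one wakeup
 * can needlessly send through semop. */
int max_spurious_wakeups(int sems_per_set)
{
    assert(sems_per_set > 0);
    return sems_per_set - 1;
}
```

For a hypothetical 100 backends, sets of 16 need 7 sets with up to 15
useless wakeups each, whereas sets of 1 need 100 sets and cause none.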
I'm looking forward to fixing/rewriting all of the FreeBSD SysV
semaphore code and am just waiting for a green light from my boss
before doing so. Maybe someone will beat me to it, since it isn't
such a big change.
I think the high number of setproctitle() calls is more problematic
for us at the moment; Kris can comment on that.
Cheers,
Maxime