Jakub Ouhrabka <kuba@comgate.cz> writes:
> Yes, I can confirm that it's triggered by SIGUSR1 signals.
OK, that confirms the theory that it's sinval-queue contention.
> If I understand it correctly we have following choices now:
> 1) Use only 2 cores (out of 8 cores)
> 2) Lower the number of idle backends - at least force backends to do
> something at different times to eliminate spikes - is "select 1" enough
> to force processing the queue?
Yeah, if you could get your clients to issue trivial queries every few
seconds (not all at the same time) the spikes should go away.
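A minimal client-side sketch of that idea, in Python. The function names, the 5-second base interval, and the 2-second jitter are all illustrative assumptions, not values from this thread; `execute` stands for whatever call your driver uses to send SQL on an otherwise idle connection. The random jitter is what keeps the clients from all waking at the same moment.

```python
import random
import time


def next_delay(base_seconds=5.0, jitter_seconds=2.0, rng=random):
    """Return a randomized delay so idle clients do not all wake together."""
    return base_seconds + rng.uniform(-jitter_seconds, jitter_seconds)


def keepalive(execute, rounds, base_seconds=5.0, jitter_seconds=2.0,
              sleep=time.sleep, rng=random):
    """Send a trivial query `rounds` times, sleeping a jittered delay before each.

    `execute` is any callable that takes an SQL string (hypothetical hook;
    for example the execute method of a DB-API cursor).
    """
    for _ in range(rounds):
        sleep(next_delay(base_seconds, jitter_seconds, rng))
        execute("SELECT 1")
```

In a real client this would run only while the connection is idle, and each client should use its own random seed so the delays are not correlated.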
If you don't want to change your clients, one possible amelioration is
to reduce the signaling threshold in SIInsertDataEntry --- instead of
70% of MAXNUMMESSAGES, maybe signal at 10% or 20%. That would make the
spikes more frequent but smaller, which might help ... or not.
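To make the "more frequent but smaller" trade-off concrete, here is a toy model in Python. It is not the PostgreSQL code: it assumes every backend is idle until signaled and then drains the whole backlog at once, and the queue size passed in is just an example value for MAXNUMMESSAGES. All function names are hypothetical.

```python
def signal_threshold(max_messages, percent):
    """Backlog size at which idle backends get signaled (at least 1)."""
    return max(1, max_messages * percent // 100)


def simulate(total_messages, max_messages, percent):
    """Return the backlog size handled at each signal (one entry per spike).

    Toy assumption: nobody reads the queue until the signal fires, and
    then everyone catches up completely, resetting the backlog to zero.
    """
    threshold = signal_threshold(max_messages, percent)
    backlog = 0
    spikes = []
    for _ in range(total_messages):
        backlog += 1
        if backlog >= threshold:
            spikes.append(backlog)  # work each idle backend does in this spike
            backlog = 0
    return spikes
```

With an assumed queue size of 4096 and 10000 inserted messages, the 70% setting gives 3 spikes of 2867 messages each in this model, while 10% gives 24 spikes of 409 each. Whether the smaller spikes actually hurt less depends on contention effects this model does not capture.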
> 3) Is there any chance of this being fixed/improved in 8.3 or even 8.2?
I doubt we'd risk destabilizing 8.3 at this point, for a problem that
affects so few people; let alone back-patching into 8.2. There are some
other known performance problems in the sinval signaling (for instance,
that a queue overflow results in cache resets for all backends, not only
the slowest), so I think addressing all of them at once would be the
thing to do. That would be a large enough patch that it would certainly
need to go through beta testing before I'd want to inflict it on the
world...
This discussion has raised the priority of the problem in my mind,
so I'm thinking it should be worked on in 8.4; but it's too late for
8.3.
regards, tom lane