On Tue, 2025-03-18 at 13:32 +0530, Motog Plus wrote:
> We are doing performance testing of our applications. During testing our postgres db
> got crashed with the error checkpointer was killed by signal 9. When checked system
> logs we found it was OOM issue.
> We have postgreSQL on independent node with 32GB RAM, multiple DBs are there witl
> multiple schemas.
>
> Our current configuration is as follows:
> Shared buffers 12 GB(increased from 10)
> Work_mem 6 MB
> Maintenance work mem 1 gb
> Active connections around 1500-2000.
>
> While analysing issue we made below changes:
> Increased shared buffers from. 10 to 12 gb as buffers_alloc was getting increased
> Bgwriter_delay was 200 ms, reduced to 100ms
> Bgwriter multiplier increased from 2 to 4 and then reduced to 3
> Bgwriter max pages increased from 100 to 1000
> We changed above bgwriter parameters to make bgwriter more aggressive but still we see
> most of the writes are being done by checkpointer as indicated by buffers_checkpoint
> and then by backends indicated by buffers_backend and then by bgwriter indicated by
> buffers_clean.
That's exactly the way it should be.
> Can you please advise on how to make bgwriter more active or am I missing to validate anything.
I don't think that is indicated.
> Also please advise can we tune any parameters to fix the OOM error that I mentioned in
> the starting, apart from looking at the queries.
Reduce the memory parameters: shared_buffers, work_mem, maintenance_work_mem
But most of all, get a working connection pool and reduce the number of
connections to something like 50. That way, you can be more generous with
the memory parameters and still not run OOM.
> Also if I want to check what queries might have caused the memory issue, that would
> be queries just above the checkpointer killed error message in the postgres logs?
Yes.
Yours,
Laurenz Albe