The shared buffers challenge - Mailing list pgsql-performance

From Merlin Moncure
Subject The shared buffers challenge
Date
Msg-id BANLkTimPC-K_o8XNn1hK4ZgBrTGO_Z6RDw@mail.gmail.com
List pgsql-performance
Hello performers, I've long been unhappy with the standard advice
given for setting shared buffers.  This includes the stupendously
vague comments in the standard documentation, which suggest certain
settings in order to get 'good performance'.  Performance of what?
Connection negotiation speed?  Not that the advice is necessarily wrong,
but ISTM it is based too much on speculative or anecdotal information.
I'd like to see the lore around this setting clarified, especially so we
can refine the advice to: 'if you are seeing symptoms x,y,z, change
shared_buffers from a to b to get a symptom reduction of k'.  I've never seen a
database blow up from setting them too low, but over the years I've
helped several people with bad i/o situations or outright OOM
conditions from setting them too high.

My general understanding of shared_buffers is that they are a little
bit faster than filesystem buffering (everything these days is
ultimately based on mmap AIUI, so there's no reason to suspect
anything else).  Where they are most helpful is for masking of i/o if
a page gets dirtied more than once before it's written out to the heap,
but seeing any benefit from that at all is going to be very workload
dependent.  There are also downsides to keeping pages in shared buffers
instead of on the heap, and the number of buffers you have influences
checkpoint behavior.  So things are complex.
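
To make the "masking of i/o" point concrete, here is a rough toy model, not a description of how PostgreSQL actually manages its buffers. It assumes a plain LRU write-back cache in which a dirty page costs one physical write when it is evicted or finally flushed; the function name and the cache policy are made up for illustration.

```python
from collections import OrderedDict


def count_physical_writes(page_writes, buffer_pages):
    """Count writes reaching storage under a toy LRU write-back cache.

    page_writes  -- sequence of page ids, one entry per logical write
    buffer_pages -- number of pages the cache can hold (assumed >= 1)

    This is an illustrative assumption, not PostgreSQL's real
    replacement or checkpoint logic.
    """
    if buffer_pages < 1:
        raise ValueError("buffer_pages must be at least 1")

    cache = OrderedDict()  # dirty pages, least recently used first
    physical = 0
    for page in page_writes:
        if page in cache:
            # Page is already dirty in memory: the repeat write is absorbed.
            cache.move_to_end(page)
            continue
        if len(cache) >= buffer_pages:
            # Evict the least recently used dirty page: one physical write.
            cache.popitem(last=False)
            physical += 1
        cache[page] = True
    # Whatever is still dirty at the end gets flushed once.
    return physical + len(cache)


# Six logical writes cycling over three pages.
workload = [1, 2, 3, 1, 2, 3]
print(count_physical_writes(workload, 2))  # 6: the cache is too small to absorb anything
print(count_physical_writes(workload, 3))  # 3: each page is written out once
```

In this model the saving only shows up once the cache covers the set of pages being re-dirtied, which is one way to read "very workload dependent".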

So, the challenge is this: I'd like to see repeatable test cases that
demonstrate regular performance gains > 20%.  Double bonus points for
cases that show gains > 50%.  No points given for anecdotal or
unverifiable data.  Not only will this help grow the body of knowledge
regarding the setting, but it will help produce benchmarking metrics
against which we can measure multiple interesting buffer-related
patches in the pipeline.  Anybody up for it?
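
One possible way to score a submission against these thresholds is sketched below. It assumes the metric is throughput where higher is better (for example transactions per second from a benchmark tool), that each configuration is run several times, and that the median is taken to damp run-to-run noise. The function names and the use of the median are assumptions for illustration, not part of the challenge as stated.

```python
from statistics import median


def percent_gain(baseline_runs, candidate_runs):
    """Percent change in median throughput, candidate vs. baseline.

    Both arguments are lists of throughput measurements (higher is
    better) from repeated runs of the same test case.
    """
    base = median(baseline_runs)
    cand = median(candidate_runs)
    if base <= 0:
        raise ValueError("baseline throughput must be positive")
    return (cand - base) * 100.0 / base


def challenge_score(gain_pct):
    """Map a gain to the thresholds named in the challenge."""
    if gain_pct > 50:
        return "double bonus"
    if gain_pct > 20:
        return "qualifies"
    return "no points"


# Placeholder numbers invented for this example, not real measurements.
low_setting = [100, 102, 98]
high_setting = [125, 130, 120]
gain = percent_gain(low_setting, high_setting)
print(gain, challenge_score(gain))  # 25.0 qualifies
```

For the result to count as repeatable, the run lists would need to come with the schema, data size, hardware, and both shared_buffers values so that someone else can reproduce them.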

merlin
