Re: shared_buffers performance - Mailing list pgsql-performance

From Gregory Stark
Subject Re: shared_buffers performance
Date
Msg-id 87r6d8hfgw.fsf@oxford.xeocode.com
In response to shared_buffers performance  (Gaetano Mendola <mendola@gmail.com>)
Responses Re: shared_buffers performance  (Richard Huxton <dev@archonet.com>)
Re: shared_buffers performance  (Greg Smith <gsmith@gregsmith.com>)
Re: shared_buffers performance  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-performance
"Gaetano Mendola" <mendola@gmail.com> writes:

> The following graph reports the results:
>
> http://img84.imageshack.us/my.php?image=totalid7.png

That's a *fascinating* graph.

It seems there are basically three domains.

The small domain where the database fits in shared buffers -- though actually
this domain seems to hold until the accounts table is about 1G so maybe it's
more that the *indexes* fit in memory. Here larger shared buffers do clearly
win.

The transition domain where performance drops dramatically as the database
starts to not fit in shared buffers but does still fit in filesystem cache.
Here every megabyte stolen from the filesystem cache makes a *huge*
difference. At a scale factor of 120 or so you're talking about a factor of 4
between each of the shared buffer sizes.

The large domain where the database doesn't fit in filesystem cache. Here it
doesn't make a large difference but the more buffers duplicated between
postgres and the filesystem cache the lower the overall cache effectiveness.
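
To make the three domains concrete, here is a rough sketch of the reasoning as a toy model. Everything in it is an assumption for illustration: the function names are hypothetical, it treats all RAM not given to shared buffers as filesystem cache, and it ignores the index-versus-table distinction noted above.

```python
def effective_cache_mb(shared_buffers_mb, ram_mb, overlap=1.0):
    """Distinct data (MB) cached across shared buffers and the OS cache.

    overlap is the assumed fraction of shared-buffer pages that are also
    sitting in the filesystem cache (1.0 = fully double-buffered).
    """
    # Assumption: whatever RAM is not shared buffers is filesystem cache.
    fs_cache_mb = ram_mb - shared_buffers_mb
    # Duplicated pages cannot exceed the size of either cache.
    duplicated_mb = overlap * min(shared_buffers_mb, fs_cache_mb)
    return shared_buffers_mb + fs_cache_mb - duplicated_mb


def domain(db_mb, shared_buffers_mb, ram_mb):
    """Classify a database size into one of the three domains."""
    if db_mb <= shared_buffers_mb:
        return "small"        # fits in shared buffers
    if db_mb <= ram_mb - shared_buffers_mb:
        return "transition"   # misses shared buffers, fits in fs cache
    return "large"            # does not fit in fs cache either
```

With 4096 MB of RAM and 1024 MB of shared buffers, full duplication leaves only 3072 MB of distinct cached data in this model, while zero duplication would give the full 4096 MB. That gap is the cost of double buffering.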

If we used something like either mmap or directio to avoid the double
buffering we would be able to squeeze these into a single curve, as well as
push the dropoff slightly to the right. In theory.

In practice it would depend on the OS's ability to handle page faults
efficiently in the mmap case, and our ability to do read-ahead and cache
management in the directio case. And it would be a huge increase in complexity
for Postgres and a push into a direction which isn't our "core competency". We
might find that while in theory it should perform better our code just can't
keep up with Linux's and it doesn't.
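
As an illustration only of what the mmap approach means, here is a minimal sketch that reads one block of a file through a memory mapping, so the kernel's page cache is the only cache involved and faults pages in on access. The helper name is hypothetical, the 8192-byte block size is assumed (the PostgreSQL default), and this is not how Postgres reads its data files.

```python
import mmap

BLOCK_SIZE = 8192  # assumed block size (PostgreSQL's default)


def read_block_mmap(path, block_no, block_size=BLOCK_SIZE):
    """Return one block of a file, read through a read-only mapping.

    There is no user-space buffer pool here: touching the mapped range
    makes the kernel fault the pages in from its own page cache or disk.
    """
    with open(path, "rb") as f:
        # Length 0 maps the whole file; this fails on an empty file.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            start = block_no * block_size
            # Slicing copies the bytes out of the mapping.
            return m[start:start + block_size]
```

Every access that misses the page cache becomes a page fault, which is why the OS's fault-handling efficiency matters so much in this scheme.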

I'm curious about the total database size for each of the scaling factors
as well as the total of the index sizes. And how much memory Linux says is
being used for filesystem buffers.
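
For the Linux side, one way to get that number is to add the Buffers and Cached fields from /proc/meminfo. A small sketch follows; the function name is hypothetical and the sample figures are made up for illustration, not measurements from this test.

```python
def page_cache_kb(meminfo_text):
    """Sum the Buffers and Cached fields (in kB) of /proc/meminfo text."""
    fields = {}
    for line in meminfo_text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip anything that is not a "Name: value" line
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields.get("Buffers", 0) + fields.get("Cached", 0)


# Made-up sample in the /proc/meminfo layout.
SAMPLE = """MemTotal:        4045952 kB
MemFree:          123456 kB
Buffers:           65536 kB
Cached:          2097152 kB
"""

print(page_cache_kb(SAMPLE))  # 65536 + 2097152 = 2162688
```

On a live system, pass in the contents of /proc/meminfo instead of SAMPLE, for example `page_cache_kb(open("/proc/meminfo").read())`.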

--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com
  Ask me about EnterpriseDB's PostGIS support!
