Thread: [performance] fast reads on a busy server
I've read this:
http://wiki.postgresql.org/wiki/Prioritizing_databases_by_separating_into_multiple_clusters
But it doesn't really say anything about memory.
If I can fit an extra cluster into its shared buffers, it should have fast reads, right?
Even if I don't have separate spindles and the disks are busy.
This is on a Debian server, PostgreSQL 8.4.
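(For concreteness, the first check would be whether the data actually fits; 'reports' is just a placeholder name for the extra cluster's database:)

    -- total on-disk size of the database that should stay cached
    SELECT pg_size_pretty(pg_database_size('reports'));
    -- this cluster's buffer cache size
    SHOW shared_buffers;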
Cheers,
WBL
--
"Quality comes from focus and clarity of purpose" -- Mark Shuttleworth
On Wed, 2012-06-27 at 00:16 +0200, Willy-Bas Loos wrote:
> I've read this:
> http://wiki.postgresql.org/wiki/Prioritizing_databases_by_separating_into_multiple_clusters
>
> But it doesn't really say anything about memory.
> If I can fit an extra cluster into its shared buffers, it should have
> fast reads, right?
> Even if I don't have separate spindles and the disks are busy.

Check if you are CPU-bound. On a database which fits fully in memory you may
already be.

> This is on a Debian server, PostgreSQL 8.4.

And if possible, upgrade to the latest PostgreSQL (9.1). On some operations
this alone may give you a considerable performance boost.

--
-------
Hannu Krosing
PostgreSQL Unlimited Scalability and Performance Consultant
2ndQuadrant Nordic
PG Admin Book: http://www.2ndQuadrant.com/books/
> Check if you are CPU-bound. On a database which fits fully in memory you may
> already be.
That makes your answer a yes, to me.
Only I'm afraid that this solution is not optimal, because I am stealing more resources from the (already busy) rest of the server than necessary. The file cache will also be filled, at least partially, with data that this cluster uses, unnecessarily: I'm not going to read from the file cache, because the data will be in shared_buffers as soon as it has been read from disk.
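(As an aside: pg_stat_database gives a rough idea of how many block requests are already answered from shared_buffers. Note that blks_read only means the block was requested from the kernel; it may still have come from the OS file cache rather than from disk.)

    SELECT datname,
           blks_hit,
           blks_read,
           round(blks_hit::numeric / nullif(blks_hit + blks_read, 0), 3) AS hit_ratio
    FROM pg_stat_database
    ORDER BY blks_read DESC;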
So then, would it be better to use 80% of memory for the shared buffers of the combined clusters?
I've read that 25% is a good value and 40% is the maximum because of the file cache, but that doesn't make much sense to me.
Greg Smith writes (here, page 12):
* PostgreSQL is designed to rely heavily on the operating
system cache, because portable software like PostgreSQL can’t
know enough about the filesystem or disk layout to make
optimal decisions about how to read and write files
* The shared buffer cache is really duplicating what the
operating system is already doing: caching popular file blocks
* In many cases, you’ll find exactly the same blocks cached by
both the buffer cache and the OS page cache
* This makes it a bad idea to give PostgreSQL too much
memory to manage
I cannot follow that reasoning completely. Who needs the OS-level file cache when Postgres's shared_buffers is better? The efficiency should go up again once shared_buffers passes 50% of memory, because below that point you would be caching everything twice.
The only problem I see is that work_mem and such will end up in swap if there isn't enough memory left over to allocate.
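(For what it's worth, the pg_buffercache contrib module shows what is actually held in shared_buffers; something along these lines, assuming the module is installed in the database in question:)

    SELECT c.relname, count(*) AS buffers
    FROM pg_buffercache b
    JOIN pg_class c
      ON b.relfilenode = c.relfilenode
     AND b.reldatabase IN (0, (SELECT oid FROM pg_database
                               WHERE datname = current_database()))
    GROUP BY c.relname
    ORDER BY buffers DESC
    LIMIT 10;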
Cheers,
WBL
--
"Quality comes from focus and clarity of purpose" -- Mark Shuttleworth
> I cannot follow that reasoning completely. Who needs the OS-level file cache when Postgres's shared_buffers is better? The efficiency should go up again once shared_buffers passes 50% of memory, because below that point you would be caching everything twice.
> The only problem I see is that work_mem and such will end up in swap if there isn't enough memory left over to allocate.
That is, 25% probably works best when there is only one cluster.
I'm just wondering about this particular case:
* more than 1 cluster on the machine, no separate file systems.
* need fast reads on one cluster, so steal some memory to fit that DB in shared_buffers
* now there is useless data in the OS file cache
Should I use a larger shared_buffers for the other cluster(s) too, so that I bypass the inefficient OS file cache?
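(To make that concrete, a made-up sketch using Debian's per-cluster config layout; the cluster names and sizes are invented:)

    # /etc/postgresql/8.4/main/postgresql.conf    (the busy cluster)
    shared_buffers = 2GB    # conventional ~25% of its share of RAM
    # /etc/postgresql/8.4/reports/postgresql.conf (the cluster to keep fully in memory)
    shared_buffers = 6GB    # large enough to hold the whole database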
Cheers,
WBL
"Quality comes from focus and clarity of purpose" -- Mark Shuttleworth
* need fast writes on one cluster, so steal some memory to fit the DB in shared_buffers
--
"Quality comes from focus and clarity of purpose" -- Mark Shuttleworth
On Jun 27, 2012 2:29 PM, "Willy-Bas Loos" <willybas@gmail.com> wrote:
> Should I use a larger shared_buffers for the other cluster(s) too, so that I bypass the inefficient OS file cache?
Once the in-memory cluster has filled its shared buffers, the pages go cold for the OS cache and get replaced with pages of other clusters that are actually referenced.
Ants Aasma
Thank you.
Cheers,
WBL