Home > mailing lists

Re: Huge number of INSERTs - Mailing list pgsql-general

From	Tomas Vondra
Subject	Re: Huge number of INSERTs
Date	November 19, 2011 19:25:29
Msg-id	52b72a09b13364635dc4f9a4165e95db.squirrel@sq.gransy.com Whole thread Raw
In response to	Re: Huge number of INSERTs (Phoenix Kiula <phoenix.kiula@gmail.com>)
List	pgsql-general

Tree view

On 19 Listopad 2011, 20:27, Phoenix Kiula wrote:
> On Fri, Nov 18, 2011 at 10:41 AM, Tomas Vondra <tv@fuzzy.cz> wrote:
>>
>> That has nothing to do with the inserts, it means the number of
>> connection
>> requests exceeds the max_connections. You've set it to 350, and that
>> seems
>> too high - the processes are going to struggle for resources (CPU, I/O
>> and
>> memory) and the scheduling gets expensive too.
>>
>> A good starting point is usually 2*(number of cores + number of drives)
>> which is 16 or 24 (not sure what a "dual server" is - probably dual
>> CPU).
>> You may increase that if the database more or less fits into memory (so
>> less I/O is needed).
>
>
>
> Ok, there's just too much conflicting info on the web.
>
> If I reduce the max_connections to 16, how does this reflect on the
> Apache MaxClients? There's a school of thought that recommends that
> MaxClients in Apache should be the same as max_connection in PGSQL.
> But 16 for MaxClients with a prefork MPM would be disastrous. No?

See, the max_connections is the maximum number of allowed connections. So
if there are 16 open connections and someone asks for another one, he'll
receive "too many clients". So decreasing the max_connections without
MaxClients would make your problem even worse.

I'm not sure about the Apache prefork worker - why are you using it
instead the threaded worker? Anyway as I asked before, do you have a proof
the current MaxClient value provides the best performance? It seems to me
you've just used some very high values in belief that it will give better
performance. Have you performed some stress test to verify the settings.
I'm not saying you should set MaxClients to 16, but 350 probably is too
high?

But if you can't set MaxClients to the same value as max_connections (or
actually a bit lower, because there are connections reserved for superuser
etc.), that's exactly the proof that you need a pooler - see pgbouncer.

>
> Anyway, even if I do try 16 as the number, what about these settings:
>
> work_mem
> shared_buffers
> effective_cache_size
>
> With nginx and apache, and some other sundries, I think about 4-5GB is
> left for PGSQL. This is 9.0.5. And all other details - Centos 5 on 64
> bit, SCSI disks with RAID 10, 3Ware RAID controller...etc.

The phrase "I think" suggests that you actually don't know how much memory
is available. Suggestions with this number of components are going to be a
bit unreliable. Can you post a few lines of "vmstat 5" and "free"?

The recommended shared_buffers size is usually 25% of RAM, that's about
1GB of RAM. I see you've originally set it to 256MB - have you checked the
cache hit ratio, i.e. how many requests were resolved using the cache?

SELECT datname, (100 * blks_hit) / (blks_hit + blks_read + 1) hit_ratio
  FROM pg_stat_database;

Again, this is a point where an application benchmark would really help.
What is the database size, anyway?

It's difficult to recommend a work_mem without deeper knowledge of your
queries and how much memory is available. Using less connections actually
allows you to grant more memory to each of them, i.e. setting higher
work_mem.

Say you have 1GB available, 20 connections - that's about 50MB per
connection. The work_mem is actually per node, so if a query sorts twice
it may allocate 2x work_mem etc. So a conservative estimate would be
work_mem=20MB or something, so that even if all the connections start
sorting at the same time you won't get OOM. But is that really enough or
too much for your queries? I have no idea.

I recommend to set a conservative work_mem value (e.g. 4MB), log slow
queries and check if they'd benefit from higher work_mem values.

Regarding the effective_cache_size - this is just a hint how much data
might be cached. What does "free" says about the cache size? I see you've
decreased the size from 1500MB to 1400MB on (Nov 11 2011). Why?

Tomas

pgsql-general by date:

From: "Andrus"
Date: 19 November 2011, 19:13:57
Subject: Re: How to install latest stable postgresql on Debian

From: Scott Marlowe
Date: 19 November 2011, 20:48:28
Subject: Re: How to install latest stable postgresql on Debian

Re: Huge number of INSERTs - Mailing list pgsql-general

Previous

Next