Re: Huge number of INSERTs - Mailing list pgsql-general

From Tomas Vondra
Subject Re: Huge number of INSERTs
Date
Msg-id b250b65d1c5908e0de0d9a24593943dc.squirrel@sq.gransy.com
Whole thread Raw
In response to Huge number of INSERTs  (Phoenix Kiula <phoenix.kiula@gmail.com>)
Responses Re: Huge number of INSERTs  (Phoenix Kiula <phoenix.kiula@gmail.com>)
List pgsql-general
Hi, there's a pretty wiki page about tuning PostgreSQL databases:

http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server

On 18 Listopad 2011, 1:44, Phoenix Kiula wrote:
> Hi. I have a massive traffic website.
>
> I keep getting "FATAL: Sorry, too many clients already" problems.

That has nothing to do with the inserts, it means the number of connection
requests exceeds the max_connections. You've set it to 350, and that seems
too high - the processes are going to struggle for resources (CPU, I/O and
memory) and the scheduling gets expensive too.

A good starting point is usually 2*(number of cores + number of drives)
which is 16 or 24 (not sure what a "dual server" is - probably dual CPU).
You may increase that if the database more or less fits into memory (so
less I/O is needed).

But go step by step - by 10 connections or something like that. The
problem is that each connection can allocate memory (work_mem), and if you
have too many connections doing that at the same time you'll get OOM or a
swapping system (and that's not what you want). And you should give the
sessions enough memory, because otherwise they're going to do on-disk
sort.

So you have to keep in mind these "rules"

(1) give each session enough memory to perform the operations in RAM
(enough work_mem to sort in memory etc.), but not more

(2) don't use too many connections - watch the I/O utilization and don't
overload it (you won't get higher throughput, just higher latencies)

BTW the same rule holds for the number of Apache workers - how many are
you using? Is that on the same machine or on a dedicated one? The fact
that you're receiving "too many clients" suggests that you have MaxClients
higher than 350. Have you actually tested this to see if it gives better
performance than 50? If the clients actually need to connect / query the
database, there's probably no point in having more than max_connections of
them.

> It's a Quad core machine with dual servers, 4 SCSI disks with RAID 10,
> with RAM of 8GB.
>
> Server is Nginx backed by Apache for the php.
>
> Postgresql just has to do about 1000 SELECTs a minute, and about 200
> INSERTs a minute. Maybe 10-20 UPDATEs.

That's completely information-less description. Those SELECTs may be a
simple "fetch by PK" queries or complex queries aggregating data from 20
tables. So the fact that you need to execute 1000 of them is useless.

The same for UPDATEs and INSERTs.

Post an example of the queries with EXPLAIN ANALYZE for each of them (use
explain.depesz.com to post it).

> My conf file is below. My vmstat + top are below too.

A static (single line) of vmstat is not very useful - we need a few lines
of "vmstat 1" (say 30) collected when the application is in use.

> What else can I do?

1) Decrease the number of connections to a reasonable value.

2) Use a connection pooler. You may also use persistent connections in PHP
too, but you have to set MaxClients in apache config to the same value
(otherwise you'll get "too many clients"). The connection pooler can
handle this for you - it will wait until a connection is available.

And the most important thing - prepare a simple stress script (a few HTTP
requests, performed by a typical client) and use it to stress test the
application. Start with low max_connections / MaxClients (say 20),
increase them gradually and watch the performance (throughput).

The usual behavior is that at the beginning the throughput scales linearly
(2 clients give you 2x the throughput of 1 client, with the same latency).
Then this growth stops and the throughput does not grow anymore - adding
more clients just increases the latency. Then the throughput usually goes
down.

Tomas


pgsql-general by date:

Previous
From: Steve Crawford
Date:
Subject: Re: Huge number of INSERTs
Next
From: Bill Thoen
Date:
Subject: Using the internal data dictionary