Thread: 500 requests per second
Hi guys,

I'm looking for a database+hardware solution that should be able to handle up to 500 requests per second. The requests will consist of:
- single-row updates in indexed tables (the WHERE clauses will use the index(es); the updated column(s) will not be indexed);
- inserts into the same kind of tables;
- selects with approximately the same WHERE clauses as the update statements.

So nothing very special about these requests; it's only the throughput that matters.

Can anyone give me an idea about the hardware requirements, the type of clustering (at the Postgres level or the OS level), and possibly about the OS (ideally Linux) that I could use to get something like this in place?

Thanks!
--
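For concreteness, the workload described above might look something like this (a minimal sketch; the table and column names are made up for illustration):

    -- hypothetical table: WHERE clauses hit the primary key index,
    -- the updated column is not itself indexed
    CREATE TABLE sessions (
        session_id  bigint PRIMARY KEY,
        last_seen   timestamptz,
        payload     text
    );

    -- the three request types, roughly:
    UPDATE sessions SET last_seen = now() WHERE session_id = 12345;
    INSERT INTO sessions VALUES (12346, now(), '...');
    SELECT payload FROM sessions WHERE session_id = 12345;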
Tarhon-Onu Victor wrote:
>
> Hi guys,
>
> I'm looking for a database+hardware solution that should be able to
> handle up to 500 requests per second.

Crucial questions:

1. Is this one client making 500 requests, or 500 clients making one request per second?
2. Do you expect the indexes at least to fit in RAM?

--
Richard Huxton
Archonet Ltd
On Mon, 14 May 2007, Richard Huxton wrote:

> 1. Is this one client making 500 requests, or 500 clients making one
> request per second?

Up to 250 clients will make up to 500 requests per second.

> 2. Do you expect the indexes at least to fit in RAM?

Not entirely... or not all of them.

--
Tarhon-Onu Victor wrote:
> On Mon, 14 May 2007, Richard Huxton wrote:
>
>> 1. Is this one client making 500 requests, or 500 clients making one
>> request per second?
>
> Up to 250 clients will make up to 500 requests per second.

Well, PG is pretty good at handling multiple clients. But if I'm understanding you here, you're talking about potentially 250*500 = 125,000 updates per second. If each update writes 1KB to disk, that's 125MB/sec continuously written. Are these figures roughly correct?

>> 2. Do you expect the indexes at least to fit in RAM?
>
> Not entirely... or not all of them.

Hmm - OK. So you're going to have index reads hitting disk as well. Exactly how big a dataset are you looking at here? Will it be constantly growing? Can you partition the large table(s) by date or similar?

The two things I'd expect to slow you down are:
1. Contention updating index blocks
2. Disk I/O as you balance updates and selects

Since you're constantly updating, you'll want the WAL on a separate set of disks from the rest of your database, a battery-backed cache on your RAID controller, etc. Check the mailing list archives for recent discussions of good/bad controllers. You'll also want to substantially increase the checkpoint limits, of course.

If you can cope with a delay, you might want to look at replication (e.g. Slony) so reads can happen on a separate machine from the writes. That may well not be possible in your case.

--
Richard Huxton
Archonet Ltd
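To make "increase the checkpoint limits" concrete, a minimal postgresql.conf sketch for an 8.2-era server (the values are illustrative starting points, not recommendations; tune against your own write load):

    # postgresql.conf -- illustrative values, assuming PostgreSQL 8.2
    checkpoint_segments = 32        # default is 3; heavy update loads need far more
    checkpoint_timeout  = 15min     # default is 5min
    wal_buffers         = 1MB       # default is only 64kB in this era

Raising checkpoint_segments spreads checkpoint I/O out and avoids the stalls that frequent forced checkpoints cause under a constant update stream.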
On Tue, May 15, 2007 at 11:47:29AM +0100, Richard Huxton wrote:
> Tarhon-Onu Victor wrote:
> > On Mon, 14 May 2007, Richard Huxton wrote:
> >
> >> 1. Is this one client making 500 requests, or 500 clients making one
> >> request per second?
> >
> > Up to 250 clients will make up to 500 requests per second.
>
> Well, PG is pretty good at handling multiple clients. But if I'm
> understanding you here, you're talking about potentially 250*500 =
> 125,000 updates per second. If each update writes 1KB to disk, that's
> 125MB/sec continuously written. Are these figures roughly correct?

I'm guessing it's 500 TPS overall, not per connection. It'd be rather challenging just to do 125,000 network round trips per second.

> >> 2. Do you expect the indexes at least to fit in RAM?
> >
> > Not entirely... or not all of them.
>
> Hmm - OK. So you're going to have index reads hitting disk as well.
> Exactly how big a dataset are you looking at here? Will it be
> constantly growing? Can you partition the large table(s) by date or
> similar?
>
> The two things I'd expect to slow you down are:
> 1. Contention updating index blocks
> 2. Disk I/O as you balance updates and selects
>
> Since you're constantly updating, you'll want the WAL on a separate
> set of disks from the rest of your database, a battery-backed cache on
> your RAID controller, etc. Check the mailing list archives for recent
> discussions of good/bad controllers. You'll also want to substantially
> increase the checkpoint limits, of course.
>
> If you can cope with a delay, you might want to look at replication
> (e.g. Slony) so reads can happen on a separate machine from the
> writes. That may well not be possible in your case.

Just as a data point, I've worked with some folks who are doing ~250 TPS on a disk array with around 20-30 drives. IIRC a good amount of their working set fit into memory, but not all of it.

Your biggest constraint is really going to be I/O operations per second. If 90% of your data is in cache then you'll need to do a minimum of 50 IOPS (realistically you'd probably have to double that). If only 50% of your working set fits in cache, you'd be looking at 250 IOPS, which is a pretty serious rate.

I very strongly encourage you to benchmark, so you get a feel for how your system performs on a given set of hardware and have some idea of where you need to get to. You should also be looking hard at your application and system architecture for ways to cut down on your throughput. There may be some things you can do that would reduce the amount of database hardware you need to buy.

--
Jim Nasby  decibel@decibel.org
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)
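Making the arithmetic above explicit (a back-of-the-envelope sketch using the assumed 500 TPS figure; real workloads mix reads and writes, so treat these as floors):

    500 TPS * (1 - 0.90 cache hit rate) =  50 random IOPS minimum
    500 TPS * (1 - 0.50 cache hit rate) = 250 random IOPS

A single 7200rpm drive delivers very roughly 75-100 random IOPS, so the second case already implies several spindles before any safety margin or write traffic is counted.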
On 5/12/07, Tarhon-Onu Victor <mituc@iasi.rdsnet.ro> wrote:
>
> Hi guys,
>
> I'm looking for a database+hardware solution that should be able
> to handle up to 500 requests per second. The requests will consist of:
> - single-row updates in indexed tables (the WHERE clauses will use
> the index(es); the updated column(s) will not be indexed);
> - inserts into the same kind of tables;
> - selects with approximately the same WHERE clauses as the update
> statements.
>
> Can anyone give me an idea about the hardware requirements, the type
> of clustering (at the Postgres level or the OS level), and possibly
> about the OS (ideally Linux) that I could use to get something like
> this in place?

I work on a system about like you describe... 400 TPS constant, 24/7. The major challenges are routine maintenance and locking. Autovacuum is your friend, but you will need to schedule a full vacuum once in a while because of transaction ID wraparound. If you allow AV to do this, none of your other tables gets vacuumed until it completes... heh! If you lock the wrong table, transactions will accumulate rapidly and the system will grind to a halt rather quickly (this can be mitigated somewhat by smart code on the client).

Other general advice:
* Reserve plenty of space for the WAL and keep the WAL volume separate from the data volume... during a long-running transaction WAL files accumulate rapidly and will panic the server if it runs out of space.
* Set a reasonable statement_timeout.
* Back up with PITR. pg_dump is a headache on extremely busy servers.
* Get a good I/O system for your box. Start with a 6-disk RAID 10 and go from there.
* Spend some time reading about the bgwriter settings, commit_delay, etc.
* Keep an eye out for PostgreSQL HOT (heap-only tuples, hopefully coming in 8.3) and make allowances for it in your design if possible.
* Normalize your database and think of vacuum as a dangerous enemy.

good luck! :-)

merlin
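By way of illustration, the statement_timeout and PITR points might translate into settings like these (a sketch only; the archive path is hypothetical, and in this era statement_timeout is specified in milliseconds):

    # postgresql.conf -- illustrative values
    statement_timeout = 30000                         # 30s; aborts runaway queries
    archive_command   = 'cp %p /mnt/wal_archive/%f'   # hypothetical archive volume

Pointing archive_command at a volume separate from both data and active WAL keeps the archive copies from competing with transaction I/O.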
>> I'm looking for a database+hardware solution that should be
>> able to handle up to 500 requests per second.

What is the proportion of reads to writes in those 500 TPS? (If you have 450 selects and 50 insert/update transactions, your hardware requirements will be different than for the reverse proportion.)

What is the estimated size of your data, and of your hot working set?
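Once a prototype schema is loaded, the data-size half of that question can be answered with the built-in size functions (a sketch; these functions appeared around 8.1):

    -- ten largest tables, including their indexes and TOAST data
    SELECT relname,
           pg_size_pretty(pg_total_relation_size(oid)) AS total_size
    FROM   pg_class
    WHERE  relkind = 'r'
    ORDER  BY pg_total_relation_size(oid) DESC
    LIMIT  10;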
On Mon, May 21, 2007 at 03:50:27PM -0400, Merlin Moncure wrote:
> I work on a system about like you describe... 400 TPS constant, 24/7.
> The major challenges are routine maintenance and locking. Autovacuum
> is your friend, but you will need to schedule a full vacuum once in a
> while because of transaction ID wraparound. If you allow AV to do
> this, none of your other tables gets vacuumed until it completes...
> heh!

BTW, that has changed in either 8.2 or 8.3: freeze information is now tracked on a per-table basis instead of per-database, so autovacuum isn't forced into freezing everything in the database at once.

--
Jim Nasby  decibel@decibel.org
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)
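For reference, on a version new enough to track freeze state per table (pg_class.relfrozenxid; 8.2, if memory serves), you can watch which tables are approaching a forced freeze with something like:

    -- tables closest to their next anti-wraparound vacuum
    SELECT relname, age(relfrozenxid) AS xid_age
    FROM   pg_class
    WHERE  relkind = 'r'
    ORDER  BY age(relfrozenxid) DESC
    LIMIT  5;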
> * Set a reasonable statement_timeout.
> * Back up with PITR. pg_dump is a headache on extremely busy servers.

Where do you put your PITR WAL logs so that they don't take up extra I/O?

> * Get a good I/O system for your box. Start with a 6-disk RAID 10 and
> go from there.
> * Spend some time reading about the bgwriter settings, commit_delay,
> etc.
> * Keep an eye out for PostgreSQL HOT (heap-only tuples, hopefully
> coming in 8.3) and make allowances for it in your design if possible.
> * Normalize your database and think of vacuum as a dangerous enemy.
>
> good luck! :-)
>
> merlin