Thread: 500 requests per second
Hi guys,

I'm looking for a database+hardware solution that should be able to handle up to 500 requests per second. The requests will consist of:
- single-row updates in indexed tables (the WHERE clauses will use the index(es); the updated column(s) will not be indexed);
- inserts into the same kind of tables;
- selects with approximately the same WHERE clauses as the update statements.

So nothing very special about these requests; it's only the throughput that matters.

Can anyone give me an idea about the hardware requirements, the type of clustering (at the Postgres level or the OS level), and possibly about the OS (ideally Linux) that I could use to get something like this in place?

Thanks!
--
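For concreteness, the workload described above might look something like this (a minimal sketch; the table and column names are made up for illustration):

    -- hypothetical table: WHERE clauses hit the primary key index,
    -- the updated column is not itself indexed
    CREATE TABLE sessions (
        session_id  bigint PRIMARY KEY,
        last_seen   timestamptz,
        payload     text
    );

    -- the three request types, roughly:
    UPDATE sessions SET last_seen = now() WHERE session_id = 12345;
    INSERT INTO sessions VALUES (12346, now(), '...');
    SELECT payload FROM sessions WHERE session_id = 12345;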
Tarhon-Onu Victor wrote:
>
> Hi guys,
>
> I'm looking for a database+hardware solution that should be able to
> handle up to 500 requests per second.

Crucial questions:

1. Is this one client making 500 requests, or 500 clients making one request per second?
2. Do you expect the indexes at least to fit in RAM?

--
Richard Huxton
Archonet Ltd
On Mon, 14 May 2007, Richard Huxton wrote:

> 1. Is this one client making 500 requests, or 500 clients making one
> request per second?

Up to 250 clients will make up to 500 requests per second.

> 2. Do you expect the indexes at least to fit in RAM?

Not entirely... or not all of them.

--
Tarhon-Onu Victor wrote:
> On Mon, 14 May 2007, Richard Huxton wrote:
>
>> 1. Is this one client making 500 requests, or 500 clients making one
>> request per second?
>
> Up to 250 clients will make up to 500 requests per second.

Well, PG is pretty good at handling multiple clients. But if I'm understanding you here, you're talking about potentially 250*500 = 125,000 updates per second. If each update writes 1KB to disk, that's 125MB/sec continuously written. Are these figures roughly correct?

>> 2. Do you expect the indexes at least to fit in RAM?
>
> Not entirely... or not all of them.

Hmm - OK. So you're going to have index reads hitting disk as well. Exactly how big a dataset are you looking at here? Will it be constantly growing? Can you partition the large table(s) by date or similar?

The two things I'd expect to slow you down are:
1. Contention updating index blocks
2. Disk I/O as you balance updates and selects

Since you're constantly updating, you'll want the WAL on a separate set of disks from the rest of your database, a battery-backed cache on your RAID controller, etc. Check the mailing list archives for recent discussions of good/bad controllers. You'll also want to substantially increase the checkpoint limits, of course.

If you can cope with a delay, you might want to look at replication (e.g. Slony) so reads can happen on a separate machine from the writes. That may well not be possible in your case.

--
Richard Huxton
Archonet Ltd
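To make "increase the checkpoint limits" concrete, a minimal postgresql.conf sketch for an 8.2-era server (the values are illustrative starting points, not recommendations; tune against your own write load):

    # postgresql.conf -- illustrative values, assuming PostgreSQL 8.2
    checkpoint_segments = 32        # default is 3; heavy update loads need far more
    checkpoint_timeout  = 15min     # default is 5min
    wal_buffers         = 1MB       # default is only 64kB in this era

Raising checkpoint_segments spreads checkpoint I/O out and avoids the stalls that frequent forced checkpoints cause under a constant update stream.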
On Tue, May 15, 2007 at 11:47:29AM +0100, Richard Huxton wrote:
> Tarhon-Onu Victor wrote:
> > On Mon, 14 May 2007, Richard Huxton wrote:
> >
> >> 1. Is this one client making 500 requests, or 500 clients making one
> >> request per second?
> >
> > Up to 250 clients will make up to 500 requests per second.
>
> Well, PG is pretty good at handling multiple clients. But if I'm
> understanding you here, you're talking about potentially 250*500 =
> 125,000 updates per second. If each update writes 1KB to disk, that's
> 125MB/sec continuously written. Are these figures roughly correct?

I'm guessing it's 500 TPS overall, not per connection. It'd be rather challenging just to do 125,000 network round trips per second.

> >> 2. Do you expect the indexes at least to fit in RAM?
> >
> > Not entirely... or not all of them.
>
> Hmm - OK. So you're going to have index reads hitting disk as well.
> Exactly how big a dataset are you looking at here? Will it be
> constantly growing? Can you partition the large table(s) by date or
> similar?
>
> The two things I'd expect to slow you down are:
> 1. Contention updating index blocks
> 2. Disk I/O as you balance updates and selects
>
> Since you're constantly updating, you'll want the WAL on a separate
> set of disks from the rest of your database, a battery-backed cache on
> your RAID controller, etc. Check the mailing list archives for recent
> discussions of good/bad controllers. You'll also want to substantially
> increase the checkpoint limits, of course.
>
> If you can cope with a delay, you might want to look at replication
> (e.g. Slony) so reads can happen on a separate machine from the
> writes. That may well not be possible in your case.

Just as a data point, I've worked with some folks who are doing ~250 TPS on a disk array with around 20-30 drives. IIRC a good amount of their working set fit into memory, but not all of it.

Your biggest constraint is really going to be I/O operations per second. If 90% of your data is in cache then you'll need to do a minimum of 50 IOPS (realistically you'd probably have to double that). If only 50% of your working set fits in cache, you'd be looking at 250 IOPS, which is a pretty serious rate.

I very strongly encourage you to benchmark, so you get a feel for how your system performs on a given set of hardware and have some idea of where you need to get to. You should also be looking hard at your application and system architecture for ways to cut down on your throughput. There may be some things you can do that would reduce the amount of database hardware you need to buy.

--
Jim Nasby  decibel@decibel.org
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)
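Making the arithmetic above explicit (a back-of-the-envelope sketch using the assumed 500 TPS figure; real workloads mix reads and writes, so treat these as floors):

    500 TPS * (1 - 0.90 cache hit rate) =  50 random IOPS minimum
    500 TPS * (1 - 0.50 cache hit rate) = 250 random IOPS

A single 7200rpm drive delivers very roughly 75-100 random IOPS, so the second case already implies several spindles before any safety margin or write traffic is counted.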
On 5/12/07, Tarhon-Onu Victor <mituc@iasi.rdsnet.ro> wrote:
>
> Hi guys,
>
> I'm looking for a database+hardware solution that should be able
> to handle up to 500 requests per second. The requests will consist of:
> - single-row updates in indexed tables (the WHERE clauses will use
> the index(es); the updated column(s) will not be indexed);
> - inserts into the same kind of tables;
> - selects with approximately the same WHERE clauses as the update
> statements.
>
> Can anyone give me an idea about the hardware requirements, the type
> of clustering (at the Postgres level or the OS level), and possibly
> about the OS (ideally Linux) that I could use to get something like
> this in place?

I work on a system about like you describe... 400 TPS constant, 24/7. The major challenges are routine maintenance and locking. Autovacuum is your friend, but you will need to schedule a full vacuum once in a while because of transaction ID wraparound. If you allow AV to do this, none of your other tables gets vacuumed until it completes... heh! If you lock the wrong table, transactions will accumulate rapidly and the system will grind to a halt rather quickly (this can be mitigated somewhat by smart code on the client).

Other general advice:
* Reserve plenty of space for the WAL and keep the WAL volume separate from the data volume... during a long-running transaction WAL files accumulate rapidly and will panic the server if it runs out of space.
* Set a reasonable statement_timeout.
* Back up with PITR. pg_dump is a headache on extremely busy servers.
* Get a good I/O system for your box. Start with a 6-disk RAID 10 and go from there.
* Spend some time reading about the bgwriter settings, commit_delay, etc.
* Keep an eye out for PostgreSQL HOT (heap-only tuples, hopefully coming in 8.3) and make allowances for it in your design if possible.
* Normalize your database and think of vacuum as a dangerous enemy.

good luck! :-)

merlin
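By way of illustration, the statement_timeout and PITR points might translate into settings like these (a sketch only; the archive path is hypothetical, and in this era statement_timeout is specified in milliseconds):

    # postgresql.conf -- illustrative values
    statement_timeout = 30000                         # 30s; aborts runaway queries
    archive_command   = 'cp %p /mnt/wal_archive/%f'   # hypothetical archive volume

Pointing archive_command at a volume separate from both data and active WAL keeps the archive copies from competing with transaction I/O.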
>> I'm looking for a database+hardware solution that should be
>> able to handle up to 500 requests per second.

What is the proportion of reads to writes in those 500 TPS? (If you have 450 selects and 50 insert/update transactions, your hardware requirements will be different than for the reverse proportion.)

What is the estimated size of your data, and of your hot working set?
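Once a prototype schema is loaded, the data-size half of that question can be answered with the built-in size functions (a sketch; these functions appeared around 8.1):

    -- ten largest tables, including their indexes and TOAST data
    SELECT relname,
           pg_size_pretty(pg_total_relation_size(oid)) AS total_size
    FROM   pg_class
    WHERE  relkind = 'r'
    ORDER  BY pg_total_relation_size(oid) DESC
    LIMIT  10;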
On Mon, May 21, 2007 at 03:50:27PM -0400, Merlin Moncure wrote:
> I work on a system about like you describe... 400 TPS constant, 24/7.
> The major challenges are routine maintenance and locking. Autovacuum
> is your friend, but you will need to schedule a full vacuum once in a
> while because of transaction ID wraparound. If you allow AV to do
> this, none of your other tables gets vacuumed until it completes...
> heh!

BTW, that has changed in either 8.2 or 8.3: freeze information is now tracked on a per-table basis instead of per-database, so autovacuum isn't forced into freezing everything in the database at once.

--
Jim Nasby  decibel@decibel.org
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)
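For reference, on a version new enough to track freeze state per table (pg_class.relfrozenxid; 8.2, if memory serves), you can watch which tables are approaching a forced freeze with something like:

    -- tables closest to their next anti-wraparound vacuum
    SELECT relname, age(relfrozenxid) AS xid_age
    FROM   pg_class
    WHERE  relkind = 'r'
    ORDER  BY age(relfrozenxid) DESC
    LIMIT  5;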
> * Set a reasonable statement_timeout.
> * Back up with PITR. pg_dump is a headache on extremely busy servers.

Where do you put your PITR WAL logs so that they don't take up extra I/O?

> * Get a good I/O system for your box. Start with a 6-disk RAID 10 and
> go from there.
> * Spend some time reading about the bgwriter settings, commit_delay,
> etc.
> * Keep an eye out for PostgreSQL HOT (heap-only tuples, hopefully
> coming in 8.3) and make allowances for it in your design if possible.
> * Normalize your database and think of vacuum as a dangerous enemy.
>
> good luck! :-)
>
> merlin