Thread: How much can Postgres scale up?
I have a function in the PL/pgSQL language. The function does some SELECTs against some tables to verify some conditions, and then does one INSERT into a table with NO index. No UPDATEs are performed in the function.

With 1 client connected, Postgres does 180 executions per second.
With 2 clients connected, Postgres does 110 executions per second each.
With 3 clients connected, Postgres does 90 executions per second each.
Finally, with 6 clients connected, Postgres does 60 executions per second each (360 executions per second in total).

While testing I monitored disk, memory and CPU, and found no overload anywhere.

I know that with this information you can only figure out so much, but under normal conditions: is this per-connection performance degradation normal as connections are added? Or should I expect 180 on the first connection and something similar on the second, maybe 170?

The server is a dual quad-core Xeon with 16 GB of RAM and very fast storage. The OS is Windows 2008 R2 x64.

Thanks,
Anibal
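For reference, here is a minimal sketch of the kind of function described. The original schema was never posted, so the table, column, and function names below are made up for illustration:

    CREATE OR REPLACE FUNCTION record_event(p_item_id integer)
    RETURNS void AS $$
    BEGIN
        -- A SELECT against another table to verify a condition
        PERFORM 1 FROM items WHERE item_id = p_item_id AND active;
        IF NOT FOUND THEN
            RETURN;
        END IF;

        -- One INSERT into a table with no indexes; no UPDATEs anywhere
        INSERT INTO events_log (item_id, created_at)
        VALUES (p_item_id, now());
    END;
    $$ LANGUAGE plpgsql;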
> I have a function in the PL/pgSQL language. The function does some
> SELECTs against some tables to verify some conditions, and then does
> one INSERT into a table with NO index. No UPDATEs are performed in
> the function.
>
> With 1 client connected, Postgres does 180 executions per second.
> With 2 clients connected, Postgres does 110 executions per second each.
> With 3 clients connected, Postgres does 90 executions per second each.
> Finally, with 6 clients connected, Postgres does 60 executions per
> second each (360 executions per second in total).
>
> While testing I monitored disk, memory and CPU, and found no overload
> anywhere.

There's always a bottleneck - otherwise the system would run faster (and eventually hit another bottleneck). It might be CPU, I/O, memory, locking, and maybe some less frequent things.

> I know that with this information you can only figure out so much, but
> under normal conditions: is this per-connection performance degradation
> normal as connections are added? Or should I expect 180 on the first
> connection and something similar on the second, maybe 170?
>
> The server is a dual quad-core Xeon with 16 GB of RAM and very fast
> storage. The OS is Windows 2008 R2 x64.

It might be, but we need more details about how the system works. On Linux I'd ask for the output of 'iostat -x 1' and 'vmstat 1', but you're on Windows, so there are probably other tools.

What version of PostgreSQL is this? What are the basic config values (shared_buffers, work_mem, effective_cache_size, ...)? Have you done any tuning? There's a wiki page about this:

http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server

Have you tried logging slow queries? Maybe there's one query that makes the whole workload slow. See:

http://wiki.postgresql.org/wiki/Logging_Difficult_Queries

Tomas
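In practice, the slow-query logging that page describes starts with a single setting in postgresql.conf; the 200 ms threshold below is only an example value:

    # Log every statement that runs longer than 200 ms
    log_min_duration_statement = 200

Reload the configuration afterwards (e.g. with pg_ctl reload) and watch the server log for the statements that cross the threshold.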
The version is Postgres 9.0. Yes, I set up postgresql.conf according to the instructions in
http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server

Cool, I will check
http://wiki.postgresql.org/wiki/Logging_Difficult_Queries
That looks like a great starting point for finding the bottleneck.

But still: under excellent conditions, is it possible for two connections to double the number of transactions per second?

Thanks!

Anibal
On 06/10/2011 07:29 PM, Anibal David Acosta wrote:
> I know that with this information you can only figure out so much, but
> under normal conditions: is this per-connection performance degradation
> normal as connections are added?

With most loads, you will find that the per-worker throughput decreases as you add workers. The overall throughput will usually increase with the number of workers until you reach a certain "sweet spot", then decrease as you add more workers beyond that.

Where that sweet spot is depends on how much your queries rely on CPU vs disk vs memory, your Pg version, how many disks you have, how fast they are and what configuration they're in, what and how many CPUs you have, how much RAM you have, how fast your RAM is, etc. There's no simple formula because it's so workload-dependent.

The usual *very* rough rule of thumb given here is that your sweet spot should be *vaguely* the number of CPU cores + the number of hard drives. That's *incredibly* rough; if you care, you should benchmark it using your real workload.

If you need lots and lots of clients, it may be beneficial to use a connection pool like PgBouncer or pgpool-II, so you don't have many more connections trying to do work at once than your hardware can cope with. Having fewer connections doing work in the database at the same time can improve overall performance.

--
Craig Ringer
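As a sketch of what the PgBouncer setup just mentioned looks like - the database name, auth file, and pool size below are placeholders, not recommendations:

    ; pgbouncer.ini
    [databases]
    mydb = host=127.0.0.1 port=5432 dbname=mydb

    [pgbouncer]
    listen_addr = 127.0.0.1
    listen_port = 6432
    auth_type = md5
    auth_file = userlist.txt
    ; transaction pooling: a server connection is held only for
    ; the duration of a transaction, then returned to the pool
    pool_mode = transaction
    ; cap how many connections actually work in the database at once
    default_pool_size = 10

Clients then connect to port 6432 instead of 5432, and the pool limits how many of them hit the database simultaneously.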
On 06/10/2011 08:56 PM, Anibal David Acosta wrote:
> The version is Postgres 9.0. Yes, I set up postgresql.conf according
> to the instructions in
> http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server
>
> Cool, I will check
> http://wiki.postgresql.org/wiki/Logging_Difficult_Queries
>
> That looks like a great starting point for finding the bottleneck.
>
> But still: under excellent conditions, is it possible for two
> connections to double the number of transactions per second?

For two connections, if you have most of the data cached in RAM or you have lots of fast disks, then sure. For that matter, if they're synchronized scans of the same table, then the second transaction might perform even faster than the first one!

There are increasing overheads with transaction synchronization, etc., as the number of connections grows, and the connections will usually end up contending for system resources like RAM (for disk cache, work_mem, etc.), disk I/O, and CPU time. So you won't generally get linear scaling with the number of connections.

Greg Smith has done some excellent and detailed work on this. I highly recommend reading his writing, and you should consider buying his recent book, "PostgreSQL 9.0 High Performance".

See also:

http://wiki.postgresql.org/wiki/Performance_Optimization

There have been lots of PostgreSQL scaling benchmarks done over time, too. You'll find a lot of information if you look around the wiki and Google.

--
Craig Ringer
Excellent. Thanks, I'll buy and read that book :)

Anibal
> With 1 client connected, Postgres does 180 executions per second.

This is suspiciously close to 10,000 executions per minute: 180/s is about 10,800/min, and a 10,000 RPM drive can commit at most roughly once per revolution, i.e. about 166 times per second. Do you have 10k RPM disks? How is your I/O system set up?

Try setting synchronous_commit to off in postgresql.conf and see if that changes the results. That will give useful information.
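Note that synchronous_commit is a per-session setting, so this can be tried from the benchmark client without touching postgresql.conf at all:

    -- In the benchmarking session only; other sessions keep safe commits
    SET synchronous_commit = off;
    -- ... rerun the test ...
    SET synchronous_commit = on;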
On 06/10/2011 07:29 AM, Anibal David Acosta wrote:
> With 1 client connected, Postgres does 180 executions per second.
> With 2 clients connected, Postgres does 110 executions per second each.
> With 3 clients connected, Postgres does 90 executions per second each.
> Finally, with 6 clients connected, Postgres does 60 executions per
> second each (360 executions per second in total).
>
> While testing I monitored disk, memory and CPU, and found no overload
> anywhere.
>
> I know that with this information you can only figure out so much, but
> under normal conditions: is this per-connection performance degradation
> normal as connections are added? Or should I expect 180 on the first
> connection and something similar on the second, maybe 170?

Let's reformat this the way most people present it, as total throughput:

clients  tps
      1  180
      2  220
      3  270
      6  360

It's common for a single connection doing INSERT statements to hit a bottleneck based on how fast the drives used can spin. That's anywhere from 100 to 200 inserts/second, approximately, unless you have a battery-backed write cache. See http://wiki.postgresql.org/wiki/Reliable_Writes for more information.

However, multiple clients can commit at once when a backlog occurs. So what you'll normally see in this situation is that the rate goes up faster than this as clients are added. Here's a real sample, from a server that's only physically capable of doing 120 commits/second on its 7200 RPM drive:

clients  tps
      1  107
      2  109
      3  163
      4  216
      5  271
      6  325
      8  432
     10  530
     15  695

This is how it's supposed to scale even on basic hardware. You didn't explore far enough to really know how well your scaling is working here, though. Since commit rates are limited by disk rotation in this situation, the behavior with 1 to 5 clients is not really representative of how a large number of clients will end up working. As already mentioned, turning off synchronous_commit should give you an interesting alternate set of numbers.

It's also possible there may be something wrong with whatever client logic you are using here. Something about the way you've written it may be acquiring a lock that blocks other clients from executing efficiently, for example. I'd suggest turning on log_lock_waits and setting deadlock_timeout to a small number, which should show you some extra logging in situations where people are waiting for locks. Running some queries to look at the lock data, such as the examples at http://wiki.postgresql.org/wiki/Lock_Monitoring, might be helpful too.

--
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
"PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books
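For concreteness, the lock-wait logging suggested above is two lines in postgresql.conf; 100 ms is just one example of "a small number":

    log_lock_waits = on
    deadlock_timeout = 100ms    # report waits after 100 ms instead of the 1 s default

And a simplified sketch of the kind of query that wiki page shows (the real examples there are more thorough); it lists backends waiting on a lock next to the backends holding it:

    SELECT waiting.pid  AS waiting_pid,
           holder.pid   AS holding_pid,
           waiting.locktype
    FROM pg_locks waiting
    JOIN pg_locks holder
      ON  holder.locktype = waiting.locktype
      AND holder.database IS NOT DISTINCT FROM waiting.database
      AND holder.relation IS NOT DISTINCT FROM waiting.relation
      AND holder.granted
    WHERE NOT waiting.granted;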
OK, I think I found the likely bottleneck.

The parts of the function that do the SELECTs run really fast, more than 1,000 executions per second. But the whole thing slows down when the UPDATE of one record in a very, very small table happens. I tested with an INSERT instead of the UPDATE, and the same behavior occurs.

So the only way to go up seems to be turning off synchronous_commit, but that can be dangerous.

Anyway, thanks a lot for your time.

Anibal
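One middle ground worth knowing about: synchronous_commit can be turned off for just the transactions that call this function, using SET LOCAL, so only those commits can be lost if the server crashes (asynchronous commit can drop the last few commits, but it never corrupts the database). The function name below is the hypothetical one from the sketch near the top of the thread:

    BEGIN;
    SET LOCAL synchronous_commit = off;  -- reverts automatically at COMMIT/ROLLBACK
    SELECT record_event(42);
    COMMIT;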
Greg's book is highly recommended, and in my opinion a "must" for anyone doing serious work with Postgres.