Thread: Rather large LA
Hi,

I have a database server that's part of a web stack and is experiencing prolonged load average spikes of up to 400+ when the db is restarted and first accessed by the other parts of the stack, and has generally poor performance on even simple select queries.

There are 30 DBs in total on the server coming in at 226GB. The one that's used the most is 67GB and there are another 29 that come to 159GB.

I'd really appreciate it if you could review my configurations below and make any suggestions that might help alleviate the performance issues. I've been looking more into the shared buffers to the point of installing the contrib module to check what they're doing, possibly installing more RAM as the most used db @ 67GB might appreciate it, or moving the most used DB onto another set of disks, possibly SSD.

PostgreSQL 9.0.4
Pgbouncer 1.4.1

Linux 2.6.18-238.9.1.el5 #1 SMP Tue Apr 12 18:10:13 EDT 2011 x86_64 x86_64 x86_64 GNU/Linux
CentOS release 5.6 (Final)

4 x Intel Xeon-NehalemEX E7540-HexCore [2GHz] ( 24 physical cores )
32GB DDR3 RAM
1 x Adaptec 5805Z SATA/SAS RAID with battery backup
4 x Seagate Cheetah ST3300657SS 300GB 15K RPM SAS drives in RAID 10
1 x 500GB 7200RPM SATA disk

Postgres and the OS reside on the same ext3 filesystem, whilst query and archive logging go onto the SATA disk, which is also ext3.

             name               | current_setting
--------------------------------+-------------------------------------------------------------------------
 version                        | PostgreSQL 9.0.4 on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-48), 64-bit
 archive_command                | tar jcf /disk1/db-wal/%f.tar.bz2 %p
 archive_mode                   | on
 autovacuum                     | off
 checkpoint_completion_target   | 0.9
 checkpoint_segments            | 10
 client_min_messages            | notice
 effective_cache_size           | 17192MB
 external_pid_file              | /var/run/postgresql/9-main.pid
 fsync                          | off
 full_page_writes               | on
 lc_collate                     | en_US.UTF-8
 lc_ctype                       | en_US.UTF-8
 listen_addresses               |
 log_checkpoints                | on
 log_destination                | stderr
 log_directory                  | /disk1/pg_log
 log_error_verbosity            | verbose
 log_filename                   | postgresql-%Y-%m-%d_%H%M%S.log
 log_line_prefix                | %m %u %h
 log_min_duration_statement     | 250ms
 log_min_error_statement        | error
 log_min_messages               | notice
 log_rotation_age               | 1d
 logging_collector              | on
 maintenance_work_mem           | 32MB
 max_connections                | 1000
 max_prepared_transactions      | 25
 max_stack_depth                | 4MB
 port                           | 6432
 server_encoding                | UTF8
 shared_buffers                 | 8GB
 superuser_reserved_connections | 3
 synchronous_commit             | on
 temp_buffers                   | 5120
 TimeZone                       | UTC
 unix_socket_directory          | /var/run/postgresql
 wal_buffers                    | 10MB
 wal_level                      | archive
 wal_sync_method                | fsync
 work_mem                       | 16MB

Pgbouncer config

[databases]
* = port=6432
[pgbouncer]
user=postgres
pidfile = /tmp/pgbouncer.pid
listen_addr =
listen_port = 5432
unix_socket_dir = /var/run/postgresql
auth_type = trust
auth_file = /etc/pgbouncer/userlist.txt
admin_users = postgres
stats_users = postgres
pool_mode = session
server_reset_query = DISCARD ALL;
server_check_query = select 1
server_check_delay = 10
server_idle_timeout = 5
server_lifetime = 0
max_client_conn = 4096
default_pool_size = 100
log_connections = 1
log_disconnections = 1
log_pooler_errors = 1
client_idle_timeout = 30
reserve_pool_size = 800

Thanks in advance

Richard
On 5/09/2011 6:28 PM, Richard Shaw wrote:
> max_connections                | 1000

Woah! No wonder you have "stampeding herd" problems after a DB or server restart and are having performance issues.

When you have 1000 clients trying to do work at once, they'll all be fighting over memory, disk I/O bandwidth, and CPU power, which is nowhere near sufficient to allow them all to actually achieve something at once. You'll have a lot of overhead as the OS tries to be fair and allow each to make progress - at the expense of overall throughput.

If most of those connections are idle most of the time - say, they're persistent connections from some webapp that requires one connection per webserver thread - then the situation isn't so bad. They're still costing you backend RAM and various housekeeping overhead (including task switching) related to lock management and shared memory, though.

Consider using a connection pooler like PgPool-II or PgBouncer if your application is suitable. Most apps will be quite happy using pooled connections; only a few things like advisory locking and HOLD cursors work poorly with pooled connections. Using a pool allows you to reduce the number of actively working and busy connections to the real Pg backend to something your hardware can cope with, which should dramatically increase performance and reduce startup load spikes. The general very rough rule of thumb for the number of active connections is "number of CPU cores + number of HDDs", but of course this is only incredibly rough and depends a lot on your workload and DB.

Ideally PostgreSQL would take care of this pooling inside the server, breaking the "one connection = one worker backend" equivalence. Unfortunately the server's process-based design makes that harder than it could be. There's also a lot of debate about whether pooling is even the core DB server's job and, if it is, which of the several possible approaches is the most appropriate. Then there's the issue of whether in-server connection pooling is even appropriate without admission control - which brings up the "admission control is insanely hard" problem. So for now, pooling lives outside the server in projects like PgPool-II and PgBouncer.

--
Craig Ringer
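As a very rough worked example of that rule of thumb against the hardware listed above (an illustration only, not a tuned recommendation): 24 cores + 4 data spindles in the RAID 10 gives 24 + 4 = 28, so something in the region of 25-30 genuinely active backends is a far more realistic ceiling for this box than 1000.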
Hi Craig,

Apologies, I should have made that clearer. I am using PgBouncer 1.4.1 in front of Postgres and included the config at the bottom of my original mail.

Regards

Richard

.........

On 5 Sep 2011, at 11:49, Craig Ringer wrote:

> On 5/09/2011 6:28 PM, Richard Shaw wrote:
>> max_connections                | 1000
>
> Woah! No wonder you have "stampeding herd" problems after a DB or server restart and are having performance issues.
>
> When you have 1000 clients trying to do work at once, they'll all be fighting over memory, disk I/O bandwidth, and CPU power, which is nowhere near sufficient to allow them all to actually achieve something at once. You'll have a lot of overhead as the OS tries to be fair and allow each to make progress - at the expense of overall throughput.
>
> If most of those connections are idle most of the time - say, they're persistent connections from some webapp that requires one connection per webserver thread - then the situation isn't so bad. They're still costing you backend RAM and various housekeeping overhead (including task switching) related to lock management and shared memory, though.
>
> Consider using a connection pooler like PgPool-II or PgBouncer if your application is suitable. Most apps will be quite happy using pooled connections; only a few things like advisory locking and HOLD cursors work poorly with pooled connections. Using a pool allows you to reduce the number of actively working and busy connections to the real Pg backend to something your hardware can cope with, which should dramatically increase performance and reduce startup load spikes. The general very rough rule of thumb for the number of active connections is "number of CPU cores + number of HDDs", but of course this is only incredibly rough and depends a lot on your workload and DB.
>
> Ideally PostgreSQL would take care of this pooling inside the server, breaking the "one connection = one worker backend" equivalence. Unfortunately the server's process-based design makes that harder than it could be. There's also a lot of debate about whether pooling is even the core DB server's job and, if it is, which of the several possible approaches is the most appropriate. Then there's the issue of whether in-server connection pooling is even appropriate without admission control - which brings up the "admission control is insanely hard" problem. So for now, pooling lives outside the server in projects like PgPool-II and PgBouncer.
>
> --
> Craig Ringer
On 5/09/2011 6:55 PM, Richard Shaw wrote:
> Hi Craig,
>
> Apologies, I should have made that clearer, I am using PgBouncer 1.4.1 in front of Postgres and included the config at the bottom of my original mail

Ah, I see. The point still stands: your hardware can *not* efficiently do work for 1000 concurrent backend workers. Reduce the maximum number of workers by setting a lower cap on the pool size and a lower max_connections.

This won't work (you'll run out of pooler connections) unless you also set PgBouncer to transaction pooling mode instead of the default session pooling mode, which you're currently using. It is *important* to read the documentation on this before doing it, as there are implications for apps that use extra-transactional features like HOLD cursors and advisory locks. See:

http://pgbouncer.projects.postgresql.org/doc/usage.html

It may also be necessary to set PgBouncer to block (wait) rather than report an error when there is no pooled connection available to start a new transaction on. I'm not sure what PgBouncer's default behavior for that is and didn't see anything immediately clear in the pgbouncer(5) ini file man page.

--
Craig Ringer
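To make the suggestion concrete, here is a sketch of the kind of change meant above. The numbers are only illustrative starting points to experiment with, not recommendations, and note that default_pool_size applies per database/user pair:

; pgbouncer.ini (sketch)
[pgbouncer]
pool_mode = transaction      ; instead of session - read the usage docs first
default_pool_size = 25       ; roughly cores + spindles, per the rule of thumb above
reserve_pool_size = 5        ; the current 800 largely defeats the point of pooling
max_client_conn = 4096       ; client-side connections can stay high

# postgresql.conf (sketch)
# max_connections = 150      ; headroom over the sum of the pools plus superuser connections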
On 09/05/2011 05:28 AM, Richard Shaw wrote:
>
> Hi,
>
> I have a database server that's part of a web stack and is experiencing prolonged load average spikes of up to 400+ when the db is restarted and first accessed by the other parts of the stack, and has generally poor performance on even simple select queries.
>

Is the slowness new? Or has it always been a bit slow? Have you checked for bloat on your tables/indexes?

When you start up, does it peg a cpu or sit around doing IO?

Have you reviewed the server logs?

autovacuum                     | off

Why? I assume that's a problem.

fsync                          | off

Seriously?

-Andy
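For the bloat question, a rough first pass (a sketch only; 'maindb' is a placeholder for the 67GB database, and this uses the stats collector's counters rather than an exact bloat estimate) is to compare dead-tuple counts against table sizes:

psql -d maindb -c "
  SELECT relname, n_live_tup, n_dead_tup,
         pg_size_pretty(pg_total_relation_size(relid)) AS total_size
  FROM pg_stat_user_tables
  ORDER BY n_dead_tup DESC
  LIMIT 20;"

With autovacuum off, large n_dead_tup figures between the cron-driven vacuums would be consistent with bloat.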
Hi Andy,

It's not a new issue, no. It's a legacy system that is in no way ideal but is also not in a position to be overhauled. Indexes are correct; tables are up to 25 million rows.

On startup, it hits CPU more than IO. I'll provide some additional stats after I restart it tonight.

Server logs have been reviewed and, where possible, slow queries have been fixed.

Autovacuum has been disabled and set to run manually via cron during a quiet period, and fsync has recently been turned off to gauge any real-world performance increase; there is battery backup on the raid card providing some level of resilience.

Thanks

Richard

On 5 Sep 2011, at 14:39, Andy Colson wrote:

> On 09/05/2011 05:28 AM, Richard Shaw wrote:
>>
>> Hi,
>>
>> I have a database server that's part of a web stack and is experiencing prolonged load average spikes of up to 400+ when the db is restarted and first accessed by the other parts of the stack, and has generally poor performance on even simple select queries.
>>
>
> Is the slowness new? Or has it always been a bit slow? Have you checked for bloat on your tables/indexes?
>
> When you start up, does it peg a cpu or sit around doing IO?
>
> Have you reviewed the server logs?
>
> autovacuum                     | off
>
> Why? I assume that's a problem.
>
> fsync                          | off
>
> Seriously?
>
> -Andy
I think that checkpoint_segments is too low; try 30.

2011/9/5, Andy Colson <andy@squeakycode.net>:
> Is the slowness new? Or has it always been a bit slow? Have you checked
> for bloat on your tables/indexes?
>
> When you start up, does it peg a cpu or sit around doing IO?
>
> Have you reviewed the server logs?
>
> autovacuum                     | off
>
> Why? I assume that's a problem.
>
> fsync                          | off
>
> Seriously?
>
> -Andy
--
------------
pasman
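For reference, that corresponds to a one-line postgresql.conf change (30 is just the value suggested above, not a tuned number):

# postgresql.conf
checkpoint_segments = 30    # was 10; fewer forced checkpoints, at the cost of more WAL kept on disk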
On 09/05/2011 08:57 AM, Richard Shaw wrote:
>
> Hi Andy,
>
> It's not a new issue, no. It's a legacy system that is in no way ideal but is also not in a position to be overhauled. Indexes are correct; tables are up to 25 million rows.
>
> On startup, it hits CPU more than IO. I'll provide some additional stats after I restart it tonight.
>
> Server logs have been reviewed and, where possible, slow queries have been fixed.
>
> Autovacuum has been disabled and set to run manually via cron during a quiet period, and fsync has recently been turned off to gauge any real-world performance increase; there is battery backup on the raid card providing some level of resilience.
>
> Thanks
>
> Richard
>

So I'm guessing that setting fsync off did not help your performance problems. And you say CPU is high, so I think we can rule out disk IO problems.

> possibly installing more RAM as the most used db @ 67GB might appreciate it

That would only help if every row of that 67 gig is being used. If it's history stuff that never gets looked up, then adding more RAM won't help because none of that data is being loaded anyway. Out of that 67 gig, what is the working size? (Not really a number you can look up; I'm looking for more of an empirical little/some/lots/most.)

pgbouncer:

max_client_conn = 4096
reserve_pool_size = 800

I've not used pgbouncer, but these seem really high. Does that mean pgbouncer will create 4K connections to the backend? Or does it mean it'll allow 4K connections to pgbouncer but only 800 connections to the backend?

I wonder... When you start up, if you watch vmstat for a bit, do you have tons of context switches? If it's not IO, and you don't say "OMG, CPU is pegged!" so I assume it's not CPU bound, I wonder if there are so many processes fighting for resources that they are stepping on each other.

When you get up and running (and it's slow), what does this display:

ps ax | grep postgr | wc --lines

That and a minute of 'vmstat 2' would be neat to see as well.

-Andy
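Since you mentioned installing the contrib module for shared buffers, one rough way to gauge the working set is to look at which relations actually occupy shared_buffers. A minimal sketch, assuming the module is pg_buffercache, that it is installed in the big database (placeholder name 'maindb'), and the default 8kB block size:

psql -d maindb -c "
  SELECT c.relname,
         pg_size_pretty(count(*) * 8192) AS buffered   -- 8192 = default block size
  FROM pg_buffercache b
  JOIN pg_class c ON b.relfilenode = pg_relation_filenode(c.oid)
  WHERE b.reldatabase = (SELECT oid FROM pg_database
                         WHERE datname = current_database())
  GROUP BY c.relname
  ORDER BY count(*) DESC
  LIMIT 20;"

If only a small slice of the 67GB shows up as hot, more RAM probably buys little; if the top relations alone overflow the 8GB of shared_buffers, that would support the more-RAM or SSD route.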
On Monday, September 05, 2011 14:57:43 Richard Shaw wrote:
> Autovacuum has been disabled and set to run manually via cron during a quiet
> period and fsync has recently been turned off to gauge any real world
> performance increase, there is battery backup on the raid card providing
> some level of resilience.

That doesn't help you against a failure due to fsync() off, as the BBU can only protect data that actually has been written to disk. Without fsync=on no guarantee about that exists.

Andres
On September 5, 2011, Richard Shaw <richard@aggress.net> wrote:
> Hi Andy,
>
> It's not a new issue no, It's a legacy system that is in no way ideal but
> is also not in a position to be overhauled. Indexes are correct, tables
> are up to 25 million rows.
>
> On startup, it hits CPU more than IO, I'll provide some additional stats
> after I restart it tonight.
I bet it's I/O bound until a good chunk of the active data gets cached. Run a vmstat 1 while it's that busy, I bet most of the cpu time is really in io_wait.
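For what it's worth, a simple way to capture that window around tonight's restart (plain vmstat/iostat runs in the background; the durations are arbitrary):

# start just before restarting PostgreSQL
vmstat 1 180   > /tmp/vmstat-restart.log 2>&1 &
iostat -x 2 90 > /tmp/iostat-restart.log 2>&1 &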
On Mon, Sep 5, 2011 at 11:24 AM, Andres Freund <andres@anarazel.de> wrote:
> On Monday, September 05, 2011 14:57:43 Richard Shaw wrote:
>> Autovacuum has been disabled and set to run manually via cron during a quiet
>> period and fsync has recently been turned off to gauge any real world
>> performance increase, there is battery backup on the raid card providing
>> some level of resilience.
> That doesn't help you against a failure due to fsync() off as the BBU can
> only protect data that actually has been written to disk. Without fsync=on
> no guarantee about that exists.

Further, if you've got a BBU cache on the RAID card, the gains from fsync=off will be low / nonexistent.
vmstat 1 and iostat -x output

Normal

procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd     free   buff    cache   si   so    bi    bo    in    cs us sy id wa st
 3  0   2332   442428  73904 31287344    0    0    89    42     0     0  7  5 85  3  0
 4  1   2332   428428  73904 31288288    0    0  1440     0  6553 29066  5  2 91  1  0
 4  1   2332   422688  73904 31288688    0    0   856     0  4480 18860  3  1 95  1  0
 0  0   2332   476072  73920 31289444    0    0   544  1452  4478 19103  3  1 95  0  0
 3  0   2332   422288  73920 31290572    0    0  1268   496  5565 23410  5  3 91  1  0

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           5.11    0.01    2.58    2.56    0.00   89.74

Device:  rrqm/s  wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda        1.00  143.00  523.50  108.00  8364.00  2008.00    16.42     2.78    4.41   1.56  98.35
sda1       0.00    0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sda2       1.00  143.00  523.50  108.00  8364.00  2008.00    16.42     2.78    4.41   1.56  98.35
sda3       0.00    0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdb        0.00    0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdb1       0.00    0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           4.89    0.00    2.94    3.14    0.00   89.04

Device:  rrqm/s  wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda        1.00    0.00  285.00    0.00  4808.00     0.00    16.87     2.46    8.29   3.02  86.20
sda1       0.00    0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sda2       1.00    0.00  285.00    0.00  4808.00     0.00    16.87     2.46    8.29   3.02  86.20
sda3       0.00    0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdb        0.00  161.50    0.00    6.50     0.00  1344.00   206.77     0.00    0.69   0.15   0.10
sdb1       0.00  161.50    0.00    6.50     0.00  1344.00   206.77     0.00    0.69   0.15   0.10

After Restart

procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd     free   buff    cache   si   so    bi    bo    in    cs us sy id wa st
 2 34   2332  5819012  75632 25855368    0    0    89    42     0     0  7  5 85  3  0
 4 39   2332  5813344  75628 25833588    0    0  5104   324  5480 27047  3  1 84 11  0
 2 47   2332  5815212  75336 25812064    0    0  4356  1664  5627 28695  3  1 84 12  0
 2 40   2332  5852452  75340 25817496    0    0  5632   828  5817 28832  3  1 84 11  0
 1 45   2332  5835704  75348 25817072    0    0  4960  1004  5111 25782  2  1 88  9  0
 2 42   2332  5840320  75356 25811632    0    0  3884   492  5405 27858  3  1 88  8  0
 0 47   2332  5826648  75348 25805296    0    0  4432  1268  5888 29556  3  1 83 13  0

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           3.26    0.00    1.69   25.21    0.00   69.84

Device:  rrqm/s  wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda        0.50   45.00  520.00    2.50  8316.00   380.00    16.64    71.70  118.28   1.92 100.10
sda1       0.00    0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sda2       0.50   45.00  520.00    2.50  8316.00   380.00    16.64    71.70  118.28   1.92 100.10
sda3       0.00    0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdb        0.00  196.50    0.00   10.50     0.00  1656.00   157.71     0.01    0.67   0.52   0.55
sdb1       0.00  196.50    0.00   10.50     0.00  1656.00   157.71     0.01    0.67   0.52   0.55

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           3.97    0.00    1.71   20.88    0.00   73.44

Device:  rrqm/s  wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda        1.00    0.00  532.00    0.00  8568.00     0.00    16.11    73.73  148.44   1.88 100.05
sda1       0.00    0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sda2       1.00    0.00  532.00    0.00  8568.00     0.00    16.11    73.73  148.44   1.88 100.05
sda3       0.00    0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdb        0.00  106.50    0.00   11.50     0.00   944.00    82.09     0.00    0.39   0.30   0.35
sdb1       0.00  106.50    0.00   11.50     0.00   944.00    82.09     0.00    0.39   0.30   0.35

Regards

Richard

.........
On 5 Sep 2011, at 21:05, Alan Hodgson wrote:

> On September 5, 2011, Richard Shaw <richard@aggress.net> wrote:
> > Hi Andy,
> >
> > It's not a new issue no, It's a legacy system that is in no way ideal but
> > is also not in a position to be overhauled. Indexes are correct, tables
> > are up to 25 million rows.
> >
> > On startup, it hits CPU more than IO, I'll provide some additional stats
> > after I restart it tonight.
>
> I bet it's I/O bound until a good chunk of the active data gets cached. Run a vmstat 1 while it's that busy, I bet most of the cpu time is really in io_wait.
On Mon, Sep 5, 2011 at 4:36 PM, Richard Shaw <richard@aggress.net> wrote:
> Device:  rrqm/s  wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
> sda        1.00  143.00  523.50  108.00  8364.00  2008.00    16.42     2.78    4.41   1.56  98.35
> sda1       0.00    0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
> sda2       1.00  143.00  523.50  108.00  8364.00  2008.00    16.42     2.78    4.41   1.56  98.35

So what is /dev/sda2 mounted as?
/ - the OS and Postgres are on the same mount point.

On 6 Sep 2011, at 00:31, Scott Marlowe wrote:

> On Mon, Sep 5, 2011 at 4:36 PM, Richard Shaw <richard@aggress.net> wrote:
>> Device:  rrqm/s  wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
>> sda        1.00  143.00  523.50  108.00  8364.00  2008.00    16.42     2.78    4.41   1.56  98.35
>> sda1       0.00    0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
>> sda2       1.00  143.00  523.50  108.00  8364.00  2008.00    16.42     2.78    4.41   1.56  98.35
>
> So what is /dev/sda2 mounted as?
On Monday 05 Sep 2011 22:23:32 Scott Marlowe wrote:
> On Mon, Sep 5, 2011 at 11:24 AM, Andres Freund <andres@anarazel.de> wrote:
> > On Monday, September 05, 2011 14:57:43 Richard Shaw wrote:
> >> Autovacuum has been disabled and set to run manually via cron during a
> >> quiet period and fsync has recently been turned off to gauge any real
> >> world performance increase, there is battery backup on the raid card
> >> providing some level of resilience.
> >
> > That doesn't help you against a failure due to fsync() off as the BBU can
> > only protect data that actually has been written to disk. Without
> > fsync=on no guarantee about that exists.
>
> Further, if you've got a bbu cache on the RAID card the gains from
> fsync=off will be low / nonexistent.

That's not necessarily true. If you have a mixed load of many small writes and some parallel huge writes (especially in combination with big indexes), fsync=off can still give you quite big performance increases, even in the presence of synchronous_commit=off.

Andres
On September 5, 2011 03:36:09 PM you wrote:
> After Restart
>
> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>            3.26    0.00    1.69   25.21    0.00   69.84
>
> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>            3.97    0.00    1.71   20.88    0.00   73.44

Yeah, 20% I/O wait I imagine feels pretty slow. 8 cores?
24 :)

4 x Intel Xeon-NehalemEX E7540-HexCore [2GHz]

On 6 Sep 2011, at 20:07, Alan Hodgson wrote:

> On September 5, 2011 03:36:09 PM you wrote:
>> After Restart
>>
>> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>>            3.97    0.00    1.71   20.88    0.00   73.44
>
> Yeah, 20% I/O wait I imagine feels pretty slow. 8 cores?
On September 6, 2011 12:11:10 PM Richard Shaw wrote:
> 24 :)
>
> 4 x Intel Xeon-NehalemEX E7540-HexCore [2GHz]

Nice box.

Still I/O-bound, though. SSDs would help a lot, I would think.
Thanks for the advice, it's one under consideration at the moment. What are your thoughts on increasing RAM and shared_buffers?

On 6 Sep 2011, at 20:21, Alan Hodgson wrote:

> On September 6, 2011 12:11:10 PM Richard Shaw wrote:
>> 24 :)
>>
>> 4 x Intel Xeon-NehalemEX E7540-HexCore [2GHz]
>
> Nice box.
>
> Still I/O-bound, though. SSDs would help a lot, I would think.
On September 6, 2011 12:35:35 PM Richard Shaw wrote:
> Thanks for the advice, it's one under consideration at the moment. What
> are your thoughts on increasing RAM and shared_buffers?

If it's running OK after the startup rush, and it seems to be, I would leave them alone. More RAM is always good, but I don't see it helping with this particular issue.
If you are not doing so already, another approach to preventing the slam at startup would be to implement some form of caching either in memcache or an http accelerator such as varnish (https://www.varnish-cache.org/). Depending on your application and the usage patterns, you might be able to fairly easily insert varnish into your web stack.
Damon
On Tue, Sep 6, 2011 at 12:47 PM, Alan Hodgson <ahodgson@simkin.ca> wrote:
> On September 6, 2011 12:35:35 PM Richard Shaw wrote:
> > Thanks for the advice, it's one under consideration at the moment. What
> > are your thoughts on increasing RAM and shared_buffers?
>
> If it's running OK after the startup rush, and it seems to be, I would leave
> them alone. More RAM is always good, but I don't see it helping with this
> particular issue.
Hello.

As it turned out to be iowait, I'd recommend trying to load at least some hot relations into the FS cache with dd on startup. With a lot of RAM on FreeBSD I even sometimes use this for long queries that require a lot of index scans. This converts random IO into sequential IO, which is much, much faster. You can try it even while your DB is starting - if it works you will see iowait drop and user time rise.

What I do on FreeBSD (as I don't have enough RAM to load all of the DB into RAM) is:

1) ktrace the backend process[es]. Linux seems to have a similar tool.
2) Find the files that take a lot of long reads.
3) dd those files to /dev/null.

In this way you can find the hot files. As soon as you have them (or if you can afford to load everything), you can put the dd into startup scripts. Or I can imagine an automatic script that will do such things for some time after startup.

Best regards, Vitalii Tymchyshyn
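A minimal sketch of the pre-warming idea on Linux (the data directory path and the database name are assumptions; adjust both to your layout):

# warm the OS cache with the largest files of the busiest database, run after startup
PGDATA=/var/lib/pgsql/data                                                    # assumption - use your real data directory
DBOID=$(psql -Atc "SELECT oid FROM pg_database WHERE datname = 'maindb'")    # 'maindb' is a placeholder
ls -S "$PGDATA/base/$DBOID" | head -n 40 | while read f; do
    dd if="$PGDATA/base/$DBOID/$f" of=/dev/null bs=8M 2>/dev/null
done

Sorting by size is only a crude proxy for "hot"; the ktrace/strace approach above, or the pg_buffercache output mentioned earlier in the thread, gives a better-targeted file list.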