Thread: Postgres slave not catching up (on 9.2)

From:
Ruben Domingo Gaspar Aparicio
Date:

Hello,

 

I have built up a hot-standby between a master running 9.2.4 and a slave running 9.2.9. I did the initial copy using "pg_basebackup" . My recovery.conf looks like:

 

standby_mode = 'on'

primary_conninfo = 'host=pXXXXXXXX port=XXXX user=replicator'

trigger_file = 'failover.now'

restore_command = 'test -f /ORA/dbs02/PUPDBTST/archive/%f  && ln -fns /ORA/dbs02/PUPDBTST/archive/%f %p'

archive_cleanup_command = '/usr/local/pgsql/postgresql-9.2.9/bin/pg_archivecleanup /ORA/dbs03/PUPDBTST/data %r'

 

The slave (I don't have control on the master) is using 2 NFS file systems, one for WALs and another one for the data, on Netapp controllers:

 

dbnasg401-12a:/vol/dodpupdbtst02 on /ORA/dbs02/PUPDBTST type nfs (rw,remount,hard,nointr,rsize=32768,wsize=32768,tcp,actimeo=0,timeo=600)

dbnasg403-12a:/vol/dodpupdbtst03 on /ORA/dbs03/PUPDBTST type nfs (rw,remount,hard,nointr,rsize=32768,wsize=32768,tcp,actimeo=0,timeo=600)

 

The master produces quite a lot of WALs. This is what I get on the slave (number of WAL files, date-hour, Total size in MB), so per day is more than 400GB:

 

   1582 20140930-17 25312MB

   1328 20140930-18 21248MB

   1960 20140930-19 31360MB

   1201 20140930-20 19216MB

   1467 20140930-21 23472MB

   1298 20140930-22 20768MB

   1579 20140930-23 25264MB

   1646 20141001-00 26336MB

   1274 20141001-01 20384MB

   1652 20141001-02 26432MB

   1756 20141001-03 28096MB

   1628 20141001-04 26048MB

   1015 20141001-05 16240MB

   1624 20141001-06 25984MB

   1652 20141001-07 26432MB

   1286 20141001-08 20576MB

   1485 20141001-09 23760MB

   1987 20141001-10 31792MB

   1432 20141001-11 22912MB

   1235 20141001-12 19760MB

   1690 20141001-13 27040MB

   1442 20141001-14 23072MB

   1500 20141001-15 24000MB

   1306 20141001-16 20896MB

   1491 20141001-17 23856MB

   1535 20141001-18 24560MB

   1548 20141001-19 24768MB

   1068 20141001-20 17088MB

   1519 20141001-21 24304MB

   1608 20141001-22 25728MB

   1019 20141001-23 16304MB

   1568 20141002-00 25088MB

   1411 20141002-01 22576MB

   1781 20141002-02 28496MB

   1280 20141002-03 20480MB

   1556 20141002-04 24896MB

   1114 20141002-05 17824MB

   1906 20141002-06 30496MB

   1316 20141002-07 21056MB

   1483 20141002-08 23728MB

   1245 20141002-09 19920MB

 

The slave is running on a server with 48GB, and 8 cores (Intel(R) Xeon(R) CPU E5630  @ 2.53GHz) running red hat 5.10, herewith the postgresql.conf:

postgres=# show all;

              name               |                 setting                  |                                                          description

---------------------------------+------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------

allow_system_table_mods         | off                                      | Allows modifications of the structure of system tables.

application_name                | psql                                     | Sets the application name to be reported in statistics and logs.

archive_command                 | (disabled)                               | Sets the shell command that will be called to archive a WAL file.

archive_mode                    | off                                      | Allows archiving of WAL files using archive_command.

archive_timeout                 | 0                                        | Forces a switch to the next xlog file if a new file has not been started within N seconds.

array_nulls                     | on                                       | Enable input of NULL elements in arrays.

authentication_timeout          | 1min                                     | Sets the maximum allowed time to complete client authentication.

autovacuum                      | on                                       | Starts the autovacuum subprocess.

autovacuum_analyze_scale_factor | 0.1                                      | Number of tuple inserts, updates, or deletes prior to analyze as a fraction of reltuples.

autovacuum_analyze_threshold    | 50                                       | Minimum number of tuple inserts, updates, or deletes prior to analyze.

autovacuum_freeze_max_age       | 200000000                                | Age at which to autovacuum a table to prevent transaction ID wraparound.

autovacuum_max_workers          | 3                                        | Sets the maximum number of simultaneously running autovacuum worker processes.

autovacuum_naptime              | 20s                                      | Time to sleep between autovacuum runs.

autovacuum_vacuum_cost_delay    | 5ms                                      | Vacuum cost delay in milliseconds, for autovacuum.

autovacuum_vacuum_cost_limit    | 1000                                     | Vacuum cost amount available before napping, for autovacuum.

autovacuum_vacuum_scale_factor  | 0.1                                      | Number of tuple updates or deletes prior to vacuum as a fraction of reltuples.

autovacuum_vacuum_threshold     | 50                                       | Minimum number of tuple updates or deletes prior to vacuum.

backslash_quote                 | safe_encoding                            | Sets whether "\'" is allowed in string literals.

bgwriter_delay                  | 200ms                                    | Background writer sleep time between rounds.

bgwriter_lru_maxpages           | 100                                      | Background writer maximum number of LRU pages to flush per round.

bgwriter_lru_multiplier         | 2                                        | Multiple of the average buffer usage to free per round.

block_size                      | 8192                                     | Shows the size of a disk block.

bonjour                         | off                                      | Enables advertising the server via Bonjour.

bonjour_name                    |                                          | Sets the Bonjour service name.

bytea_output                    | hex                                      | Sets the output format for bytea.

check_function_bodies           | on                                       | Check function bodies during CREATE FUNCTION.

checkpoint_completion_target    | 0.9                                      | Time spent flushing dirty buffers during checkpoint, as fraction of checkpoint interval.

checkpoint_segments             | 64                                       | Sets the maximum distance in log segments between automatic WAL checkpoints.

checkpoint_timeout              | 10min                                    | Sets the maximum time between automatic WAL checkpoints.

checkpoint_warning              | 1min                                     | Enables warnings if checkpoint segments are filled more frequently than this.

client_encoding                 | UTF8                                     | Sets the client's character set encoding.

client_min_messages             | notice                                   | Sets the message levels that are sent to the client.

commit_delay                    | 0                                        | Sets the delay in microseconds between transaction commit and flushing WAL to disk.

commit_siblings                 | 5                                        | Sets the minimum concurrent open transactions before performing commit_delay.

config_file                     | /ORA/dbs03/PUPDBTST/data/postgresql.conf | Sets the server's main configuration file.

constraint_exclusion            | on                                       | Enables the planner to use constraints to optimize queries.

cpu_index_tuple_cost            | 0.005                                    | Sets the planner's estimate of the cost of processing each index entry during an index scan.

cpu_operator_cost               | 0.0025                                   | Sets the planner's estimate of the cost of processing each operator or function call.

cpu_tuple_cost                  | 0.01                                     | Sets the planner's estimate of the cost of processing each tuple (row).

cursor_tuple_fraction           | 0.1                                      | Sets the planner's estimate of the fraction of a cursor's rows that will be retrieved.

data_directory                  | /ORA/dbs03/PUPDBTST/data                 | Sets the server's data directory.

DateStyle                       | ISO, MDY                                 | Sets the display format for date and time values.

db_user_namespace               | off                                      | Enables per-database user names.

deadlock_timeout                | 1s                                       | Sets the time to wait on a lock before checking for deadlock.

debug_assertions                | off                                      | Turns on various assertion checks.

debug_pretty_print              | on                                       | Indents parse and plan tree displays.

debug_print_parse               | off                                      | Logs each query's parse tree.

debug_print_plan                | off                                      | Logs each query's execution plan.

debug_print_rewritten           | off                                      | Logs each query's rewritten parse tree.

default_statistics_target       | 50                                       | Sets the default statistics target.

default_tablespace              |                                          | Sets the default tablespace to create tables and indexes in.

default_text_search_config      | pg_catalog.english                       | Sets default text search configuration.

default_transaction_deferrable  | off                                      | Sets the default deferrable status of new transactions.

default_transaction_isolation   | read committed                           | Sets the transaction isolation level of each new transaction.

default_transaction_read_only   | off                                      | Sets the default read-only status of new transactions.

default_with_oids               | off                                      | Create new tables with OIDs by default.

dynamic_library_path            | $libdir                                  | Sets the path for dynamically loadable modules.

effective_cache_size            | 36GB                                     | Sets the planner's assumption about the size of the disk cache.

effective_io_concurrency        | 1                                        | Number of simultaneous requests that can be handled efficiently by the disk subsystem.

enable_bitmapscan               | on                                       | Enables the planner's use of bitmap-scan plans.

enable_hashagg                  | on                                       | Enables the planner's use of hashed aggregation plans.

enable_hashjoin                 | on                                       | Enables the planner's use of hash join plans.

enable_indexonlyscan            | on                                       | Enables the planner's use of index-only-scan plans.

enable_indexscan                | on                                       | Enables the planner's use of index-scan plans.

enable_material                 | on                                       | Enables the planner's use of materialization.

enable_mergejoin                | on                                       | Enables the planner's use of merge join plans.

enable_nestloop                 | on                                       | Enables the planner's use of nested-loop join plans.

enable_seqscan                  | on                                       | Enables the planner's use of sequential-scan plans.

enable_sort                     | on                                       | Enables the planner's use of explicit sort steps.

enable_tidscan                  | on                                       | Enables the planner's use of TID scan plans.

escape_string_warning           | on                                       | Warn about backslash escapes in ordinary string literals.

event_source                    | PostgreSQL                               | Sets the application name used to identify PostgreSQL messages in the event log.

exit_on_error                   | off                                      | Terminate session on any error.

external_pid_file               |                                          | Writes the postmaster PID to the specified file.

extra_float_digits              | 0                                        | Sets the number of digits displayed for floating-point values.

from_collapse_limit             | 8                                        | Sets the FROM-list size beyond which subqueries are not collapsed.

fsync                           | off                                      | Forces synchronization of updates to disk.

full_page_writes                | on                                       | Writes full pages to WAL when first modified after a checkpoint.

geqo                            | on                                       | Enables genetic query optimization.

geqo_effort                     | 5                                        | GEQO: effort is used to set the default for other GEQO parameters.

geqo_generations                | 0                                        | GEQO: number of iterations of the algorithm.

geqo_pool_size                  | 0                                        | GEQO: number of individuals in the population.

geqo_seed                       | 0                                        | GEQO: seed for random path selection.

geqo_selection_bias             | 2                                        | GEQO: selective pressure within the population.

geqo_threshold                  | 12                                       | Sets the threshold of FROM items beyond which GEQO is used.

gin_fuzzy_search_limit          | 0                                        | Sets the maximum allowed result for exact search by GIN.

hba_file                        | /ORA/dbs03/PUPDBTST/data/pg_hba.conf     | Sets the server's "hba" configuration file.

hot_standby                     | on                                       | Allows connections and queries during recovery.

hot_standby_feedback            | off                                      | Allows feedback from a hot standby to the primary that will avoid query conflicts.

ident_file                      | /ORA/dbs03/PUPDBTST/data/pg_ident.conf   | Sets the server's "ident" configuration file.

ignore_system_indexes           | off                                      | Disables reading from system indexes.

integer_datetimes               | on                                       | Datetimes are integer based.

IntervalStyle                   | postgres                                 | Sets the display format for interval values.

join_collapse_limit             | 8                                        | Sets the FROM-list size beyond which JOIN constructs are not flattened.

krb_caseins_users               | off                                      | Sets whether Kerberos and GSSAPI user names should be treated as case-insensitive.

krb_server_keyfile              | FILE:/etc/postgresql/krb5.keytab         | Sets the location of the Kerberos server key file.

krb_srvname                     | postgres                                 | Sets the name of the Kerberos service.

lc_collate                      | en_US.UTF-8                              | Shows the collation order locale.

lc_ctype                        | en_US.UTF-8                              | Shows the character classification and case conversion locale.

lc_messages                     | en_US.UTF-8                              | Sets the language in which messages are displayed.

lc_monetary                     | en_US.UTF-8                              | Sets the locale for formatting monetary amounts.

lc_numeric                      | en_US.UTF-8                              | Sets the locale for formatting numbers.

lc_time                         | en_US.UTF-8                              | Sets the locale for formatting date and time values.

listen_addresses                | *                                        | Sets the host name or IP address(es) to listen to.

lo_compat_privileges            | off                                      | Enables backward compatibility mode for privilege checks on large objects.

local_preload_libraries         |                                          | Lists shared libraries to preload into each backend.

log_autovacuum_min_duration     | 0                                        | Sets the minimum execution time above which autovacuum actions will be logged.

log_checkpoints                 | off                                      | Logs each checkpoint.

log_connections                 | off                                      | Logs each successful connection.

log_destination                 | stderr                                   | Sets the destination for server log output.

log_directory                   | pg_log                                   | Sets the destination directory for log files.

log_disconnections              | off                                      | Logs end of a session, including duration.

log_duration                    | off                                      | Logs the duration of each completed SQL statement.

log_error_verbosity             | default                                  | Sets the verbosity of logged messages.

log_executor_stats              | off                                      | Writes executor performance statistics to the server log.

log_file_mode                   | 0600                                     | Sets the file permissions for log files.

log_filename                    | postgresql-%Y-%m-%d_%H%M%S.log           | Sets the file name pattern for log files.

log_hostname                    | off                                      | Logs the host name in the connection logs.

log_line_prefix                 | <%m>                                     | Controls information prefixed to each log line.

log_lock_waits                  | off                                      | Logs long lock waits.

log_min_duration_statement      | -1                                       | Sets the minimum execution time above which statements will be logged.

log_min_error_statement         | error                                    | Causes all statements generating error at or above this level to be logged.

log_min_messages                | warning                                  | Sets the message levels that are logged.

log_parser_stats                | off                                      | Writes parser performance statistics to the server log.

log_planner_stats               | off                                      | Writes planner performance statistics to the server log.

log_rotation_age                | 1d                                       | Automatic log file rotation will occur after N minutes.

log_rotation_size               | 10MB                                     | Automatic log file rotation will occur after N kilobytes.

log_statement                   | none                                     | Sets the type of statements logged.

log_statement_stats             | off                                      | Writes cumulative performance statistics to the server log.

log_temp_files                  | -1                                       | Log the use of temporary files larger than this number of kilobytes.

log_timezone                    | Europe/Paris                             | Sets the time zone to use in log messages.

log_truncate_on_rotation        | off                                      | Truncate existing log files of same name during log rotation.

logging_collector               | off                                      | Start a subprocess to capture stderr output and/or csvlogs into log files.

maintenance_work_mem            | 1GB                                      | Sets the maximum memory to be used for maintenance operations.

max_connections                 | 80                                       | Sets the maximum number of concurrent connections.

max_files_per_process           | 1000                                     | Sets the maximum number of simultaneously open files for each server process.

max_function_args               | 100                                      | Shows the maximum number of function arguments.

max_identifier_length           | 63                                       | Shows the maximum identifier length.

max_index_keys                  | 32                                       | Shows the maximum number of index keys.

max_locks_per_transaction       | 64                                       | Sets the maximum number of locks per transaction.

max_pred_locks_per_transaction  | 64                                       | Sets the maximum number of predicate locks per transaction.

max_prepared_transactions       | 0                                        | Sets the maximum number of simultaneously prepared transactions.

max_stack_depth                 | 2MB                                      | Sets the maximum stack depth, in kilobytes.

max_standby_archive_delay       | 30s                                      | Sets the maximum delay before canceling queries when a hot standby server is processing archived WAL data.

max_standby_streaming_delay     | 30s                                      | Sets the maximum delay before canceling queries when a hot standby server is processing streamed WAL data.

max_wal_senders                 | 0                                        | Sets the maximum number of simultaneously running WAL sender processes.

password_encryption             | on                                       | Encrypt passwords.

port                            | 6600                                     | Sets the TCP port the server listens on.

post_auth_delay                 | 0                                        | Waits N seconds on connection startup after authentication.

pre_auth_delay                  | 0                                        | Waits N seconds on connection startup before authentication.

quote_all_identifiers           | off                                      | When generating SQL fragments, quote all identifiers.

random_page_cost                | 4                                        | Sets the planner's estimate of the cost of a nonsequentially fetched disk page.

replication_timeout             | 1min                                     | Sets the maximum time to wait for WAL replication.

restart_after_crash             | on                                       | Reinitialize server after backend crash.

search_path                     | "$user",public                           | Sets the schema search order for names that are not schema-qualified.

segment_size                    | 1GB                                      | Shows the number of pages per disk file.

seq_page_cost                   | 1                                        | Sets the planner's estimate of the cost of a sequentially fetched disk page.

server_encoding                 | UTF8                                     | Sets the server (database) character set encoding.

server_version                  | 9.2.9                                    | Shows the server version.

server_version_num              | 90209                                    | Shows the server version as an integer.

session_replication_role        | origin                                   | Sets the session's behavior for triggers and rewrite rules.

shared_buffers                  | 12GB                                     | Sets the number of shared memory buffers used by the server.

shared_preload_libraries        |                                          | Lists shared libraries to preload into server.

sql_inheritance                 | on                                       | Causes subtables to be included by default in various commands.

ssl                             | off                                      | Enables SSL connections.

ssl_ca_file                     |                                          | Location of the SSL certificate authority file.

ssl_cert_file                   | server.crt                               | Location of the SSL server certificate file.

ssl_ciphers                     | ALL:!ADH:!LOW:!EXP:!MD5:@STRENGTH        | Sets the list of allowed SSL ciphers.

ssl_crl_file                    |                                          | Location of the SSL certificate revocation list file.

ssl_key_file                    | server.key                               | Location of the SSL server private key file.

ssl_renegotiation_limit         | 512MB                                    | Set the amount of traffic to send and receive before renegotiating the encryption keys.

standard_conforming_strings     | on                                       | Causes '...' strings to treat backslashes literally.

statement_timeout               | 0                                        | Sets the maximum allowed duration of any statement.

stats_temp_directory            | pg_stat_tmp                              | Writes temporary statistics files to the specified directory.

superuser_reserved_connections  | 3                                        | Sets the number of connection slots reserved for superusers.

synchronize_seqscans            | on                                       | Enable synchronized sequential scans.

synchronous_commit              | off                                      | Sets the current transaction's synchronization level.

synchronous_standby_names       |                                          | List of names of potential synchronous standbys.

syslog_facility                 | local0                                   | Sets the syslog "facility" to be used when syslog enabled.

syslog_ident                    | postgres                                 | Sets the program name used to identify PostgreSQL messages in syslog.

tcp_keepalives_count            | 0                                        | Maximum number of TCP keepalive retransmits.

tcp_keepalives_idle             | 0                                        | Time between issuing TCP keepalives.

tcp_keepalives_interval         | 0                                        | Time between TCP keepalive retransmits.

temp_buffers                    | 8MB                                      | Sets the maximum number of temporary buffers used by each session.

temp_file_limit                 | -1                                       | Limits the total size of all temporary files used by each session.

temp_tablespaces                |                                          | Sets the tablespace(s) to use for temporary tables and sort files.

TimeZone                        | Europe/Paris                             | Sets the time zone for displaying and interpreting time stamps.

timezone_abbreviations          | Default                                  | Selects a file of time zone abbreviations.

trace_notify                    | off                                      | Generates debugging output for LISTEN and NOTIFY.

trace_recovery_messages         | log                                      | Enables logging of recovery-related debugging information.

trace_sort                      | off                                      | Emit information about resource usage in sorting.

track_activities                | on                                       | Collects information about executing commands.

track_activity_query_size       | 1024                                     | Sets the size reserved for pg_stat_activity.query, in bytes.

track_counts                    | on                                       | Collects statistics on database activity.

track_functions                 | none                                     | Collects function-level statistics on database activity.

track_io_timing                 | off                                      | Collects timing statistics for database I/O activity.

transaction_deferrable          | off                                      | Whether to defer a read-only serializable transaction until it can be executed with no possible serialization failures.

transaction_isolation           | read committed                           | Sets the current transaction's isolation level.

transaction_read_only           | on                                       | Sets the current transaction's read-only status.

transform_null_equals           | off                                      | Treats "expr=NULL" as "expr IS NULL".

unix_socket_directory           | /var/lib/pgsql                           | Sets the directory where the Unix-domain socket will be created.

unix_socket_group               |                                          | Sets the owning group of the Unix-domain socket.

unix_socket_permissions         | 0777                                     | Sets the access permissions of the Unix-domain socket.

update_process_title            | on                                       | Updates the process title to show the active SQL command.

vacuum_cost_delay               | 0                                        | Vacuum cost delay in milliseconds.

vacuum_cost_limit               | 200                                      | Vacuum cost amount available before napping.

vacuum_cost_page_dirty          | 20                                       | Vacuum cost for a page dirtied by vacuum.

vacuum_cost_page_hit            | 1                                        | Vacuum cost for a page found in the buffer cache.

vacuum_cost_page_miss           | 10                                       | Vacuum cost for a page not found in the buffer cache.

vacuum_defer_cleanup_age        | 0                                        | Number of transactions by which VACUUM and HOT cleanup should be deferred, if any.

vacuum_freeze_min_age           | 50000000                                 | Minimum age at which VACUUM should freeze a table row.

vacuum_freeze_table_age         | 150000000                                | Age at which VACUUM should scan whole table to freeze tuples.

wal_block_size                  | 8192                                     | Shows the block size in the write ahead log.

wal_buffers                     | 8MB                                      | Sets the number of disk-page buffers in shared memory for WAL.

wal_keep_segments               | 0                                        | Sets the number of WAL files held for standby servers.

wal_level                       | hot_standby                              | Set the level of information written to the WAL.

wal_receiver_status_interval    | 10s                                      | Sets the maximum interval between WAL receiver status reports to the primary.

wal_segment_size                | 16MB                                     | Shows the number of pages per write ahead log segment.

wal_sync_method                 | fdatasync                                | Selects the method used for forcing WAL updates to disk.

wal_writer_delay                | 200ms                                    | WAL writer sleep time between WAL flushes.

work_mem                        | 288MB                                    | Sets the maximum memory to be used for query workspaces.

xmlbinary                       | base64                                   | Sets how binary values are to be encoded in XML.

xmloption                       | content                                  | Sets whether XML data in implicit parsing and serialization operations is to be considered as documents or content fragments.

zero_damaged_pages              | off                                      | Continues processing past damaged page headers.

 

The streaming is working perfectly:

 

/usr/local/pgsql/postgresql-9.2.9/bin/psql -U admin -h XXXXX -p XXXXX -d puppetdb -c "select pid,usesysid, usename, application_name, client_addr, state, sent_location,write_location,replay_location,sync_state from pg_stat_replication;"

  pid   | usesysid |  usename   | application_name | client_addr |   state   | sent_location | write_location | replay_location | sync_state

--------+----------+------------+------------------+-------------+-----------+---------------+----------------+-----------------+------------

117659 |    16384 | replicator | walreceiver      | 10.16.7.137 | streaming | AA74/DD630978 | AA74/DD630978  | A977/F84F0BE0   | async

(1 row)

 

But the lag is increasing constantly, it looks the replay can not cope with:

 

/usr/local/pgsql/postgresql-9.2.9/bin/psql -U postgres -h /var/lib/pgsql/ -p 6600 -d puppetdb -c "SELECT now(), now() - pg_last_xact_replay_timestamp()  AS time_lag" | perl -ne 'if (/\|\s+(\d{2}):(\d{2}):(\d{2})\.\d+/) {$hour=$1;$min=$2;$sec=$3; print $_;}'

 

2014-09-22 18:40:02.004482+02 | 00:00:03.166557

2014-09-22 18:50:02.001836+02 | 00:00:00.765323

2014-09-22 19:00:01.229943+02 | 00:00:00.600354

2014-09-22 19:10:01.655343+02 | 00:00:07.85969

2014-09-22 19:20:01.872653+02 | 00:07:23.727307

2014-09-22 19:30:01.458236+02 | 00:14:16.244349

2014-09-22 19:40:01.715044+02 | 00:20:31.616665

2014-09-22 19:50:01.949646+02 | 00:28:20.129136

2014-09-22 20:00:01.216571+02 | 00:31:11.289404

...

2014-09-30 14:00:01.67156+02 | 23:20:11.229815

2014-09-30 14:10:01.884168+02 | 23:23:59.162476

2014-09-30 14:20:02.022649+02 | 23:26:39.082807

2014-09-30 14:30:01.263667+02 | 23:34:02.517874

2014-09-30 14:40:01.421162+02 | 23:41:30.496393

2014-09-30 14:50:01.650239+02 | 23:49:11.102335

2014-09-30 15:00:02.055655+02 | 23:56:07.012862

 

 

I tried to play with how the IO is handled, making it less strict setting synchronous_commit and fsync to off with not much success.

I have also done a second test increasing shared_buffers from 12GB to 24GB (we are running on a 48GB, 8 cores server).

 

Please let me know if you can see something obvious I am missing.

Thanks for your help,

Ruben

From:
Robert Klemme
Date:

On Sat, Nov 8, 2014 at 2:11 PM, Ruben Domingo Gaspar Aparicio wrote:

> The slave (I don't have control on the master) is using 2 NFS file systems,
> one for WALs and another one for the data, on Netapp controllers:
>
> dbnasg401-12a:/vol/dodpupdbtst02 on /ORA/dbs02/PUPDBTST type nfs
> (rw,remount,hard,nointr,rsize=32768,wsize=32768,tcp,actimeo=0,timeo=600)
>
> dbnasg403-12a:/vol/dodpupdbtst03 on /ORA/dbs03/PUPDBTST type nfs
> (rw,remount,hard,nointr,rsize=32768,wsize=32768,tcp,actimeo=0,timeo=600)

You should use noatime to avoid unnecessary IO.

> The master produces quite a lot of WALs. This is what I get on the slave
> (number of WAL files, date-hour, Total size in MB), so per day is more than
> 400GB:

> I tried to play with how the IO is handled, making it less strict setting
> synchronous_commit and fsync to off with not much success.
>
> I have also done a second test increasing shared_buffers from 12GB to 24GB
> (we are running on a 48GB, 8 cores server).

> Please let me know if you can see something obvious I am missing.

Your IO system needs to be able to deliver sustained IO bandwith at
least as large as you need to read and write all the changes. What raw
IO bandwidth do those NFS file systems deliver _long term_? I am not
talking about spikes because there are buffers. I am talking about the
minimum of network throughput on one hand and raw disk IO those boxes
can do on the other hand. Then, how much of it is available to your
slave? Did you do the math to ensure that the IO bandwidth you have
available on the slave is at least as high as what is needed? Note
that it's not simply the WAL size that needs to be written and read
but also data pages.

Kind regards

robert

--
[guy, jim].each {|him| remember.him do |as, often| as.you_can - without end}
http://blog.rubybestpractices.com/


From:
Andres Freund
Date:

Hi,

On 2014-11-08 13:11:25 +0000, Ruben Domingo Gaspar Aparicio wrote:
> Hello,
>
> I have built up a hot-standby between a master running 9.2.4 and a slave running 9.2.9. I did the initial copy using
"pg_basebackup". My recovery.conf looks like: 
>
> standby_mode = 'on'
> primary_conninfo = 'host=pXXXXXXXX port=XXXX user=replicator'
> trigger_file = 'failover.now'
> restore_command = 'test -f /ORA/dbs02/PUPDBTST/archive/%f  && ln -fns /ORA/dbs02/PUPDBTST/archive/%f %p'
> archive_cleanup_command = '/usr/local/pgsql/postgresql-9.2.9/bin/pg_archivecleanup /ORA/dbs03/PUPDBTST/data %r'
>
> The slave (I don't have control on the master) is using 2 NFS file systems, one for WALs and another one for the
data,on Netapp controllers: 
>
> dbnasg401-12a:/vol/dodpupdbtst02 on /ORA/dbs02/PUPDBTST type nfs
(rw,remount,hard,nointr,rsize=32768,wsize=32768,tcp,actimeo=0,timeo=600)
> dbnasg403-12a:/vol/dodpupdbtst03 on /ORA/dbs03/PUPDBTST type nfs
(rw,remount,hard,nointr,rsize=32768,wsize=32768,tcp,actimeo=0,timeo=600)
>
> The streaming is working perfectly:
>
> /usr/local/pgsql/postgresql-9.2.9/bin/psql -U admin -h XXXXX -p XXXXX -d puppetdb -c "select pid,usesysid, usename,
application_name,client_addr, state, sent_location,write_location,replay_location,sync_state from pg_stat_replication;" 
>   pid   | usesysid |  usename   | application_name | client_addr |   state   | sent_location | write_location |
replay_location| sync_state 
>
--------+----------+------------+------------------+-------------+-----------+---------------+----------------+-----------------+------------
> 117659 |    16384 | replicator | walreceiver      | 10.16.7.137 | streaming | AA74/DD630978 | AA74/DD630978  |
A977/F84F0BE0  | async 
> (1 row)
>
> But the lag is increasing constantly, it looks the replay can not cope with:

I have a couple of questions:
1) Is the standby actually used for querying? Is it possible that replay
   frequently conflicts with active queries? As you don't have
   hot_standby_feedback enabled that seems quite possible.
2) Is the startup process on the standby CPU or IO bound?
3) Does the workload involve loads of temporary tables or generally
   transactions locking lots of tables exclusively in one transaction?

Greetings,

Andres Freund

--
 Andres Freund                       http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


From:
Ruben Domingo Gaspar Aparicio
Date:

Indeed I could save some IO with noatime. I must say I haven’t found any recommendation about mount options for
postgresql,likely because this is not encourage. The ones you see are taking from a Oracle cluster configuration where
severalnodes see the same files. It's not the case on this setup.
 

The IO is not an issue. The storage is not at all saturated. Slave gets streaming perfectly but the apply is quite
slow,looks like always working with pages of 8k at a time:
 

--datafiles
[root@ ~]# /ORA/dbs01/syscontrol/projects/dfm/bin/smetrics -i 5 -n 100 -o vol dbnasg403-12a:/vol/dodpupdbtst03 Instance
total_opsread_ops write_ops read_data write_data avg_latency read_latency write_latenc
 
                /s       /s        /s       b/s        b/s          us           us           us
dodpupdbtst03      6466        0       162         0    2350619       31.53            0       764.70
dodpupdbtst03      6762        0       843         0    8751169       48.32            0       263.10
dodpupdbtst03      7023        0      1547         0   14914498      112.88            0       303.16
dodpupdbtst03      5373        0       321      6501    3809930       58.44     11287.75       467.21
dodpupdbtst03      5618        0       183         0    1661200       20.91            0       265.61
dodpupdbtst03      5538        0       214         0    3471380       29.24            0       374.27
dodpupdbtst03      5753        0       425         0    4973131       45.36            0       351.08
dodpupdbtst03      6110        0       142         0    2331695       20.96            0       378.95

--WALs
Bye Bye[root@ ~]# /ORA/dbs01/syscontrol/projects/dfm/bin/smetrics -i 5 -n 100 -o vol dbnasg401-12a:/vol/dodpupdbtst02
Instancetotal_ops read_ops write_ops read_data write_data avg_latency read_latency write_latenc
 
                /s       /s        /s       b/s        b/s          us           us           us
dodpupdbtst02      1017      202        93   5915899    2637111     2033.22     10116.09       172.61
dodpupdbtst02      1284      242       141   7368712    4309409     1235.11      6306.37       172.89
dodpupdbtst02      1357      231       268   6869816    8489466      957.55      5104.09       192.26
dodpupdbtst02      1566      264       288   8142965    9008529      747.96      4069.78       180.00
dodpupdbtst02      1333      235       153   7601051    4755791      993.81      5394.99       176.99
dodpupdbtst02      1261      199       287   6124821    9075170      896.32      5150.28       203.81
dodpupdbtst02       963      161       192   4955996    6066333     1757.66     10035.06       213.12
dodpupdbtst02       924      159       157   4782617    4807262     1092.61      5804.85       236.91
dodpupdbtst02       591       97       137   2899085    4275046     1218.24      6980.66       190.20

Writes are usually fast (us as they use the NVRAM )and reads are about 5 ms which is quite ok considering SATA disks
(theyhave a flash cache of 512GB, this is why we get this average).
 

Thank you,
Ruben

From:
Ruben Domingo Gaspar Aparicio
Date:

Hi Andres,

Sorry for my delay to reply. Here below my replies:

> I have a couple of questions:
> 1) Is the standby actually used for querying? Is it possible that replay
>    frequently conflicts with active queries? As you don't have
>    hot_standby_feedback enabled that seems quite possible.

Nowadays we don't manage to have a decent lag so the standby is not use at all. No clients connect to it.

> 2) Is the startup process on the standby CPU or IO bound?

The servers is almost idle.  I don't see any bottle neck either on CPU or IO.

> 3) Does the workload involve loads of temporary tables or generally
>    transactions locking lots of tables exclusively in one transaction?

We have monitored the master for a couple of days we haven't detected any "create temp table"  statement.
For the locks I see it's also not the case in my opinion, at a given point in time I don't see many tables lock in
exclusivemode: 

puppetdb=# SELECT locktype, relation::regclass, mode, transactionid AS tid, datname,
virtualtransaction AS vtid, pid, granted
 FROM pg_catalog.pg_locks l LEFT JOIN pg_catalog.pg_database db
 ON db.oid = l.database WHERE  NOT pid = pg_backend_pid();
   locktype    |                   relation                   |       mode       |    tid    | datname  |    vtid    |
pid  | granted 

---------------+----------------------------------------------+------------------+-----------+----------+------------+--------+---------
 relation      | resource_params_cache_pkey                   | AccessShareLock  |           | puppetdb | 5/3099716  |
54422| t 
 relation      | resource_params_cache                        | AccessShareLock  |           | puppetdb | 5/3099716  |
54422| t 
 relation      | catalog_resources_pkey                       | AccessShareLock  |           | puppetdb | 5/3099716  |
54422| t 
 relation      | idx_catalog_resources_exported_true          | AccessShareLock  |           | puppetdb | 5/3099716  |
54422| t 
 relation      | idx_catalog_resources_resource               | AccessShareLock  |           | puppetdb | 5/3099716  |
54422| t 
 relation      | idx_catalog_resources_type                   | AccessShareLock  |           | puppetdb | 5/3099716  |
54422| t 
 relation      | idx_catalog_resources_type_title             | AccessShareLock  |           | puppetdb | 5/3099716  |
54422| t 
 relation      | catalog_resources                            | AccessShareLock  |           | puppetdb | 5/3099716  |
54422| t 
 relation      | idx_catalogs_transaction_uuid                | AccessShareLock  |           | puppetdb | 5/3099716  |
54422| t 
 relation      | idx_catalogs_transaction_uuid                | RowExclusiveLock |           | puppetdb | 5/3099716  |
54422| t 
 relation      | catalogs_certname_key                        | AccessShareLock  |           | puppetdb | 5/3099716  |
54422| t 
 relation      | catalogs_certname_key                        | RowExclusiveLock |           | puppetdb | 5/3099716  |
54422| t 
 relation      | catalogs_hash_key                            | AccessShareLock  |           | puppetdb | 5/3099716  |
54422| t 
 relation      | catalogs_hash_key                            | RowExclusiveLock |           | puppetdb | 5/3099716  |
54422| t 
 relation      | catalogs_pkey                                | AccessShareLock  |           | puppetdb | 5/3099716  |
54422| t 
 relation      | catalogs_pkey                                | RowExclusiveLock |           | puppetdb | 5/3099716  |
54422| t 
 relation      | catalogs                                     | AccessShareLock  |           | puppetdb | 5/3099716  |
54422| t 
 relation      | catalogs                                     | RowExclusiveLock |           | puppetdb | 5/3099716  |
54422| t 
 relation      | certnames_pkey                               | AccessShareLock  |           | puppetdb | 5/3099716  |
54422| t 
 relation      | certnames_pkey                               | RowExclusiveLock |           | puppetdb | 5/3099716  |
54422| t 
 relation      | certnames                                    | AccessShareLock  |           | puppetdb | 5/3099716  |
54422| t 
 relation      | certnames                                    | RowExclusiveLock |           | puppetdb | 5/3099716  |
54422| t 
 virtualxid    |                                              | ExclusiveLock    |           |          | 5/3099716  |
54422| t 
 relation      | resource_params_cache_pkey                   | AccessShareLock  |           | puppetdb | 20/2901642 |
100098| t 
 relation      | resource_params_cache                        | AccessShareLock  |           | puppetdb | 20/2901642 |
100098| t 
 relation      | catalog_resources_pkey                       | AccessShareLock  |           | puppetdb | 20/2901642 |
100098| t 
 relation      | idx_catalog_resources_exported_true          | AccessShareLock  |           | puppetdb | 20/2901642 |
100098| t 
 relation      | idx_catalog_resources_resource               | AccessShareLock  |           | puppetdb | 20/2901642 |
100098| t 
 relation      | idx_catalog_resources_type                   | AccessShareLock  |           | puppetdb | 20/2901642 |
100098| t 
 relation      | idx_catalog_resources_type_title             | AccessShareLock  |           | puppetdb | 20/2901642 |
100098| t 
 relation      | catalog_resources                            | AccessShareLock  |           | puppetdb | 20/2901642 |
100098| t 
 relation      | idx_catalogs_transaction_uuid                | AccessShareLock  |           | puppetdb | 20/2901642 |
100098| t 
 relation      | idx_catalogs_transaction_uuid                | RowExclusiveLock |           | puppetdb | 20/2901642 |
100098| t 
 relation      | catalogs_certname_key                        | AccessShareLock  |           | puppetdb | 20/2901642 |
100098| t 
 relation      | catalogs_certname_key                        | RowExclusiveLock |           | puppetdb | 20/2901642 |
100098| t 
 relation      | catalogs_hash_key                            | AccessShareLock  |           | puppetdb | 20/2901642 |
100098| t 
 relation      | catalogs_hash_key                            | RowExclusiveLock |           | puppetdb | 20/2901642 |
100098| t 
 relation      | catalogs_pkey                                | AccessShareLock  |           | puppetdb | 20/2901642 |
100098| t 
 relation      | catalogs_pkey                                | RowExclusiveLock |           | puppetdb | 20/2901642 |
100098| t 
 relation      | catalogs                                     | AccessShareLock  |           | puppetdb | 20/2901642 |
100098| t 
 relation      | catalogs                                     | RowExclusiveLock |           | puppetdb | 20/2901642 |
100098| t 
 relation      | certnames_pkey                               | AccessShareLock  |           | puppetdb | 20/2901642 |
100098| t 
 relation      | certnames_pkey                               | RowExclusiveLock |           | puppetdb | 20/2901642 |
100098| t 
 relation      | certnames                                    | AccessShareLock  |           | puppetdb | 20/2901642 |
100098| t 
 relation      | certnames                                    | RowExclusiveLock |           | puppetdb | 20/2901642 |
100098| t 
 virtualxid    |                                              | ExclusiveLock    |           |          | 20/2901642 |
100098| t 
 relation      | idx_catalogs_transaction_uuid                | AccessShareLock  |           | puppetdb | 21/2767248 |
13044| t 
 relation      | catalogs_certname_key                        | AccessShareLock  |           | puppetdb | 21/2767248 |
13044| t 
 relation      | catalogs_hash_key                            | AccessShareLock  |           | puppetdb | 21/2767248 |
13044| t 
 relation      | catalogs_pkey                                | AccessShareLock  |           | puppetdb | 21/2767248 |
13044| t 
 relation      | catalogs                                     | AccessShareLock  |           | puppetdb | 21/2767248 |
13044| t 
 relation      | certnames_pkey                               | AccessShareLock  |           | puppetdb | 21/2767248 |
13044| t 
 relation      | certnames_pkey                               | RowExclusiveLock |           | puppetdb | 21/2767248 |
13044| t 
 relation      | certnames                                    | AccessShareLock  |           | puppetdb | 21/2767248 |
13044| t 
 relation      | certnames                                    | RowExclusiveLock |           | puppetdb | 21/2767248 |
13044| t 
 virtualxid    |                                              | ExclusiveLock    |           |          | 21/2767248 |
13044| t 
 relation      | edges                                        | AccessShareLock  |           | puppetdb | 27/1597400 |
77873| t 
 relation      | resource_params_cache_pkey                   | AccessShareLock  |           | puppetdb | 27/1597400 |
77873| t 
 relation      | resource_params_cache                        | AccessShareLock  |           | puppetdb | 27/1597400 |
77873| t 
 relation      | resource_params_cache                        | RowShareLock     |           | puppetdb | 27/1597400 |
77873| t 
 relation      | catalog_resources_pkey                       | AccessShareLock  |           | puppetdb | 27/1597400 |
77873| t 
 relation      | catalog_resources_pkey                       | RowExclusiveLock |           | puppetdb | 27/1597400 |
77873| t 
 relation      | idx_catalog_resources_exported_true          | AccessShareLock  |           | puppetdb | 27/1597400 |
77873| t 
 relation      | idx_catalog_resources_exported_true          | RowExclusiveLock |           | puppetdb | 27/1597400 |
77873| t 
 relation      | idx_catalog_resources_resource               | AccessShareLock  |           | puppetdb | 27/1597400 |
77873| t 
 relation      | idx_catalog_resources_resource               | RowExclusiveLock |           | puppetdb | 27/1597400 |
77873| t 
 relation      | idx_catalog_resources_type                   | AccessShareLock  |           | puppetdb | 27/1597400 |
77873| t 
 relation      | idx_catalog_resources_type                   | RowExclusiveLock |           | puppetdb | 27/1597400 |
77873| t 
 relation      | idx_catalog_resources_type_title             | AccessShareLock  |           | puppetdb | 27/1597400 |
77873| t 
 relation      | idx_catalog_resources_type_title             | RowExclusiveLock |           | puppetdb | 27/1597400 |
77873| t 
 relation      | catalog_resources                            | AccessShareLock  |           | puppetdb | 27/1597400 |
77873| t 
 relation      | catalog_resources                            | RowExclusiveLock |           | puppetdb | 27/1597400 |
77873| t 
 relation      | idx_catalogs_transaction_uuid                | AccessShareLock  |           | puppetdb | 27/1597400 |
77873| t 
 relation      | idx_catalogs_transaction_uuid                | RowExclusiveLock |           | puppetdb | 27/1597400 |
77873| t 
 relation      | catalogs_certname_key                        | AccessShareLock  |           | puppetdb | 27/1597400 |
77873| t 
 relation      | catalogs_certname_key                        | RowExclusiveLock |           | puppetdb | 27/1597400 |
77873| t 
 relation      | catalogs_hash_key                            | AccessShareLock  |           | puppetdb | 27/1597400 |
77873| t 
 relation      | catalogs_hash_key                            | RowExclusiveLock |           | puppetdb | 27/1597400 |
77873| t 
 relation      | catalogs_pkey                                | AccessShareLock  |           | puppetdb | 27/1597400 |
77873| t 
 relation      | catalogs_pkey                                | RowExclusiveLock |           | puppetdb | 27/1597400 |
77873| t 
 relation      | catalogs                                     | AccessShareLock  |           | puppetdb | 27/1597400 |
77873| t 
 relation      | catalogs                                     | RowExclusiveLock |           | puppetdb | 27/1597400 |
77873| t 
 relation      | certnames_pkey                               | AccessShareLock  |           | puppetdb | 27/1597400 |
77873| t 
 relation      | certnames_pkey                               | RowExclusiveLock |           | puppetdb | 27/1597400 |
77873| t 
 relation      | certnames                                    | AccessShareLock  |           | puppetdb | 27/1597400 |
77873| t 
 relation      | certnames                                    | RowExclusiveLock |           | puppetdb | 27/1597400 |
77873| t 
 virtualxid    |                                              | ExclusiveLock    |           |          | 27/1597400 |
77873| t 
 relation      | certnames                                    | RowShareLock     |           | puppetdb | 27/1597400 |
77873| t 
 relation      | edges                                        | RowExclusiveLock |           | puppetdb | 27/1597400 |
77873| t 
 transactionid |                                              | ExclusiveLock    | 191755866 |          | 27/1597400 |
77873| t 
 transactionid |                                              | ExclusiveLock    | 191755880 |          | 20/2901642 |
100098| t 
 relation      | edges_certname_source_target_type_unique_key | AccessShareLock  |           | puppetdb | 27/1597400 |
77873| t 
 relation      | edges_certname_source_target_type_unique_key | RowExclusiveLock |           | puppetdb | 27/1597400 |
77873| t 
 transactionid |                                              | ExclusiveLock    | 191755874 |          | 5/3099716  |
54422| t 
 transactionid |                                              | ExclusiveLock    | 191755883 |          | 21/2767248 |
13044| t 
(95 rows)

Jus to comment that we are running a DBaaS, we don't know much about the apps running on our servers.
Thank you,
Ruben


From:
Ruben Domingo Gaspar Aparicio
Date:

> > The slave (I don't have control on the master) is using 2 NFS file
> > systems, one for WALs and another one for the data, on Netapp controllers:
> >
> > dbnasg401-12a:/vol/dodpupdbtst02 on /ORA/dbs02/PUPDBTST type nfs
> >
> (rw,remount,hard,nointr,rsize=32768,wsize=32768,tcp,actimeo=0,timeo=60
> > 0)
> >
> > dbnasg403-12a:/vol/dodpupdbtst03 on /ORA/dbs03/PUPDBTST type nfs
> >
> (rw,remount,hard,nointr,rsize=32768,wsize=32768,tcp,actimeo=0,timeo=60
> > 0)
> 
> You should use noatime to avoid unnecessary IO.
> 

Just to mention that changing the mount points from:

dbnasg403-12a:/vol/dodpupdbtst03 on /ORA/dbs03/PUPDBTST type nfs (rw,
actimeo=0,hard,nointr,rsize=65536,wsize=65536,tcp,timeo=600)
dbnasg401-12a:/vol/dodpupdbtst02 on /ORA/dbs02/PUPDBTST type nfs (rw,
actimeo=0,hard,nointr,rsize=65536,wsize=65536,tcp,timeo=600)

to

dbnasg403-12a:/vol/dodpupdbtst03 on /ORA/dbs03/PUPDBTST type nfs
(rw,noatime,hard,nointr,rsize=65536,wsize=65536,tcp,timeo=600)
dbnasg401-12a:/vol/dodpupdbtst02 on /ORA/dbs02/PUPDBTST type nfs
(rw,noatime,hard,nointr,rsize=65536,wsize=65536,tcp,timeo=600)

it did have a big impact. The profile of the recovery process on terms of calls changed quite a lot:

From: 

[postgres@itrac1202 tmp]$  strace -p 9596 -c
Process 9596 attached - interrupt to quit
Process 9596 detached
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
78.73    0.217824           0    456855    381376 read
17.87    0.049453           0    515320           lseek
  2.89    0.007989          12       669       669 poll
  0.33    0.000912          25        36           open
  0.07    0.000206           0       994       994 stat
  0.05    0.000151           0       995       787 rt_sigreturn
  0.05    0.000133           0       673           write
  0.00    0.000000           0        36           close
  0.00    0.000000           0        52           kill
------ ----------- ----------- --------- --------- ----------------
100.00    0.276668                975630    383826 total


To:

[postgres@itrac1202 tmp]$  strace -p 9596 -c
Process 9596 attached - interrupt to quit
Process 9596 detached
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
78.73    0.217824           0    456855    381376 read
17.87    0.049453           0    515320           lseek
  2.89    0.007989          12       669       669 poll
  0.33    0.000912          25        36           open
  0.07    0.000206           0       994       994 stat
  0.05    0.000151           0       995       787 rt_sigreturn
  0.05    0.000133           0       673           write
  0.00    0.000000           0        36           close
  0.00    0.000000           0        52           kill
------ ----------- ----------- --------- --------- ----------------
100.00    0.276668                975630    383826 total

We did also increased the shared_buffers from 12 to 24GB.

The lag has decreased most of the time:

*/10 * * * *     /usr/local/pgsql/postgresql-9.2.9/bin/psql -U postgres -h /var/lib/pgsql/ -p 6600 -d puppetdb -c
"SELECTnow(), now() - pg_last_xact_replay_timestamp()  AS time_lag" | perl -ne 'if
(/\|\s+(\d{2}):(\d{2}):(\d{2})\.\d+/){$hour=$1;$min=$2;$sec=$3; print $_;}' >> /tmp/lag929morememmount.log
 


...
2014-12-14 14:10:01.688947+01 | 00:00:00.096524
 2014-12-14 14:20:01.798223+01 | 00:00:00.024083
 2014-12-14 14:30:01.884448+01 | 00:00:00.420791
 2014-12-14 14:40:01.960623+01 | 00:00:00.168318
 2014-12-14 14:50:01.191487+01 | 00:00:00.163832
 2014-12-14 15:00:02.146436+01 | 00:00:00.026934
 2014-12-14 15:10:01.277963+01 | 00:00:00.332185
 2014-12-14 15:20:01.353979+01 | 00:00:00.020616
 2014-12-14 15:30:01.417092+01 | 00:00:00.584768
 2014-12-14 15:40:01.575347+01 | 00:00:00.151685
 2014-12-14 15:50:01.205507+01 | 00:00:00.102073
 2014-12-14 16:00:01.321511+01 | 00:00:00.590677
 2014-12-14 16:10:01.570474+01 | 00:00:00.182683
 2014-12-14 16:20:01.640095+01 | 00:00:00.420185
 2014-12-14 16:30:01.767033+01 | 00:00:00.015989
 2014-12-14 16:40:01.849532+01 | 00:00:00.106296
 2014-12-14 16:50:01.920876+01 | 00:00:00.258851
 2014-12-14 17:00:02.000278+01 | 00:00:00.119841
 2014-12-14 17:10:01.894227+01 | 00:00:00.091599
 2014-12-14 17:20:01.61729+01 | 00:00:00.367367
 2014-12-14 17:30:01.683326+01 | 00:00:00.103884
 2014-12-14 17:40:01.755904+01 | 00:00:00.051262
 2014-12-14 17:50:01.833825+01 | 00:00:00.06901
 2014-12-14 18:00:01.901236+01 | 00:00:00.17467
 2014-12-14 18:10:01.186283+01 | 00:00:00.214941
 2014-12-14 18:20:01.145413+01 | 00:00:00.03517
 2014-12-14 18:30:01.241746+01 | 00:00:00.207842
 2014-12-14 18:40:01.299413+01 | 00:00:00.147878
 2014-12-14 18:50:01.368541+01 | 00:00:00.393893
 2014-12-14 19:00:01.430736+01 | 00:00:00.031226
 2014-12-14 19:10:01.672117+01 | 00:05:03.512832
 2014-12-14 19:20:01.9195+01 | 00:06:39.08761
 2014-12-14 19:30:02.184486+01 | 00:00:00.307668
 2014-12-14 19:40:01.227278+01 | 00:00:00.054831
 2014-12-14 19:50:01.305485+01 | 00:00:00.425595
 2014-12-14 20:00:01.410501+01 | 00:00:00.394526
 2014-12-14 20:10:01.984196+01 | 00:00:00.388844
 2014-12-14 20:20:01.031042+01 | 00:00:00.503092
 2014-12-14 20:30:01.225871+01 | 00:00:00.241493
 2014-12-14 20:40:01.305696+01 | 00:00:00.280656
 2014-12-14 20:50:01.379617+01 | 00:00:00.151103
 2014-12-14 21:00:01.468849+01 | 00:00:00.014412
 2014-12-14 21:10:01.724514+01 | 00:00:00.147476
 2014-12-14 21:20:01.799292+01 | 00:00:00.08696
 2014-12-14 21:30:01.866336+01 | 00:00:00.035226
 2014-12-14 21:40:01.942882+01 | 00:00:00.111701
 2014-12-14 21:50:02.010419+01 | 00:00:00.215121
 2014-12-14 22:00:01.110033+01 | 00:00:16.460612
 2014-12-14 22:10:01.568286+01 | 00:00:00.077897
 2014-12-14 22:20:01.682714+01 | 00:00:00.104112
 2014-12-14 22:30:01.758958+01 | 00:00:00.061474
 2014-12-14 22:40:01.970545+01 | 00:00:00.108613
 2014-12-14 22:50:01.038908+01 | 00:00:00.039637
 2014-12-14 23:00:01.120295+01 | 00:00:00.338731
 2014-12-14 23:10:01.365371+01 | 00:00:00.680065
 2014-12-14 23:20:01.423365+01 | 00:00:00.154614
 2014-12-14 23:30:01.48998+01 | 00:00:00.014643
 2014-12-14 23:40:01.569452+01 | 00:00:00.126961
 2014-12-14 23:50:01.63047+01 | 00:00:00.303156
 2014-12-15 00:00:01.278047+01 | 00:00:00.351391
 2014-12-15 00:10:01.382566+01 | 00:00:00.012265
 2014-12-15 00:20:01.444746+01 | 00:07:39.002651
 2014-12-15 00:30:01.510413+01 | 00:16:13.476753
 2014-12-15 00:40:01.97735+01 | 00:00:00.105011
 2014-12-15 00:50:01.082528+01 | 00:01:10.313796
 2014-12-15 01:00:01.124843+01 | 00:00:01.508016
 2014-12-15 01:10:01.818415+01 | 00:00:00.082441
 2014-12-15 01:20:01.961064+01 | 00:00:00.048221
 2014-12-15 01:30:01.355472+01 | 00:00:00.37941
 2014-12-15 01:40:01.42728+01 | 00:00:00.013836
 2014-12-15 01:50:01.486446+01 | 00:00:00.110321
 2014-12-15 02:00:01.566731+01 | 00:00:00.290281
 2014-12-15 02:10:01.236574+01 | 00:01:15.954532
 2014-12-15 02:20:01.440259+01 | 00:00:00.471677
 2014-12-15 02:30:01.5733+01 | 00:00:00.208574
 2014-12-15 02:40:01.662425+01 | 00:00:00.591091
 2014-12-15 02:50:01.263385+01 | 00:00:00.050648
 2014-12-15 03:00:01.340777+01 | 00:00:00.289115
 2014-12-15 03:10:01.993079+01 | 00:00:00.790201
 2014-12-15 03:20:01.061826+01 | 00:00:00.043176
 2014-12-15 03:30:01.125639+01 | 00:00:00.172924
 2014-12-15 03:40:01.252033+01 | 00:03:05.113579
 2014-12-15 03:50:01.362396+01 | 00:00:00.254974
 2014-12-15 04:00:01.370922+01 | 00:00:00.208254
 2014-12-15 04:10:01.472816+01 | 00:00:00.077214
 2014-12-15 04:20:01.553443+01 | 00:00:00.135887
 2014-12-15 04:30:01.63607+01 | 00:00:00.027272
 2014-12-15 04:40:01.696442+01 | 00:00:00.130954
 2014-12-15 04:50:01.786961+01 | 00:00:00.572573
 2014-12-15 05:00:01.790753+01 | 00:00:00.491799
 2014-12-15 05:10:01.078332+01 | 00:07:58.438202  **** likely autovacuum
 2014-12-15 05:20:01.139541+01 | 00:00:00.057486
 2014-12-15 05:30:01.251079+01 | 00:00:00.053462
 2014-12-15 05:40:01.322349+01 | 00:00:00.084701
 2014-12-15 05:50:01.607937+01 | 00:00:00.205241
 2014-12-15 06:00:01.699406+01 | 00:00:00.121415
 2014-12-15 06:10:01.756047+01 | 00:00:00.20769
 2014-12-15 06:20:01.854222+01 | 00:00:00.03397
 2014-12-15 06:30:02.041054+01 | 00:03:07.271295
 2014-12-15 06:40:01.891882+01 | 00:00:00.263748
 2014-12-15 06:50:01.987809+01 | 00:00:00.155619
 2014-12-15 07:00:01.068556+01 | 00:00:00.119866
 2014-12-15 07:10:01.318478+01 | 00:00:00.092856
 2014-12-15 07:20:01.704899+01 | 00:00:00.106533
 2014-12-15 07:30:01.773268+01 | 00:00:00.135743
 2014-12-15 07:40:01.730152+01 | 00:00:00.06358
 2014-12-15 07:50:01.798179+01 | 00:00:00.529685
 2014-12-15 08:00:01.868205+01 | 00:00:00.194482
 2014-12-15 08:10:01.219339+01 | 00:00:00.063553
 2014-12-15 08:20:01.309426+01 | 00:00:00.056698
 2014-12-15 08:30:01.120431+01 | 00:00:00.425596
 2014-12-15 08:40:01.201882+01 | 00:00:00.00909
 2014-12-15 08:50:01.272526+01 | 00:00:00.019492
 2014-12-15 09:00:01.361022+01 | 00:00:00.423997
 2014-12-15 09:10:01.603702+01 | 00:00:00.066705
 2014-12-15 09:20:01.682277+01 | 00:00:09.251202
 2014-12-15 09:30:01.934477+01 | 00:00:00.311553
 2014-12-15 09:40:02.03221+01 | 00:00:00.125678
 2014-12-15 09:50:01.105372+01 | 00:00:00.294006
 2014-12-15 10:00:01.201109+01 | 00:00:00.014641
 2014-12-15 10:10:01.164478+01 | 00:01:51.375378
 2014-12-15 10:20:01.264589+01 | 00:09:54.476361  **** likely autovacuum
 2014-12-15 10:30:01.351103+01 | 00:00:00.213636
 2014-12-15 10:40:01.623903+01 | 00:00:00.488103
 2014-12-15 10:50:01.768132+01 | 00:00:00.080799
 2014-12-15 11:00:01.880247+01 | 00:00:20.401738
 2014-12-15 11:10:01.215509+01 | 00:00:00.036288
 2014-12-15 11:20:01.265607+01 | 00:00:00.057142
 2014-12-15 11:30:01.343731+01 | 00:00:00.036609
 2014-12-15 11:40:01.41248+01 | 00:00:00.218139
 2014-12-15 11:50:01.48113+01 | 00:00:00.242754
 2014-12-15 12:00:01.685114+01 | 00:00:00.82528
 2014-12-15 12:10:01.995243+01 | 00:02:29.971448
 2014-12-15 12:20:01.962833+01 | 00:00:00.118112
 2014-12-15 12:30:01.100587+01 | 00:00:00.214437
 2014-12-15 12:40:01.226111+01 | 00:00:00.052599
 2014-12-15 12:50:01.300061+01 | 00:00:00.162205
 2014-12-15 13:00:01.4007+01 | 00:00:00.707891
 2014-12-15 13:10:02.005526+01 | 00:00:00.162238
 2014-12-15 13:20:01.072375+01 | 00:00:00.214978
 2014-12-15 13:30:01.446005+01 | 00:00:00.121816
 2014-12-15 13:40:01.483524+01 | 00:00:00.650178
 2014-12-15 13:50:01.796143+01 | 00:00:00.065482
 2014-12-15 14:00:01.886071+01 | 00:00:00.237577
 2014-12-15 14:10:01.134148+01 | 00:00:00.193941
 2014-12-15 14:20:01.199047+01 | 00:00:00.068058
 2014-12-15 14:30:01.27777+01 | 00:00:00.022991
 2014-12-15 14:40:01.361959+01 | 00:00:00.439753
 2014-12-15 14:50:01.421515+01 | 00:00:00.037749
 2014-12-15 15:00:01.500559+01 | 00:00:00.174448
 2014-12-15 15:10:01.811804+01 | 00:06:09.196648  **** likely autovacuum
..



It goes up till a maximum of 25 minutes (for the last two weeks), it looks correlated with an autovacuum at the master
inone of the big tables of the schema. It happens at about 5hours interval. Is there a way to avoid this ? Should I ask
tothe master db dba to try to have a more active autovacuum policy?
 


Thank you,
Ruben

From:
Ruben Domingo Gaspar Aparicio
Date:

> dbnasg403-12a:/vol/dodpupdbtst03 on /ORA/dbs03/PUPDBTST type nfs (rw,
> actimeo=0,hard,nointr,rsize=65536,wsize=65536,tcp,timeo=600)
> dbnasg401-12a:/vol/dodpupdbtst02 on /ORA/dbs02/PUPDBTST type nfs (rw,
> actimeo=0,hard,nointr,rsize=65536,wsize=65536,tcp,timeo=600)
> 
> to
> 
> dbnasg403-12a:/vol/dodpupdbtst03 on /ORA/dbs03/PUPDBTST type nfs
> (rw,noatime,hard,nointr,rsize=65536,wsize=65536,tcp,timeo=600)
> dbnasg401-12a:/vol/dodpupdbtst02 on /ORA/dbs02/PUPDBTST type nfs
> (rw,noatime,hard,nointr,rsize=65536,wsize=65536,tcp,timeo=600)
> 
> it did have a big impact. The profile of the recovery process on terms of calls
> changed quite a lot:
> 
> From:
> 
> [postgres@itrac1202 tmp]$  strace -p 9596 -c Process 9596 attached -
> interrupt to quit Process 9596 detached
> % time     seconds  usecs/call     calls    errors syscall
> ------ ----------- ----------- --------- --------- ----------------
> 78.73    0.217824           0    456855    381376 read
> 17.87    0.049453           0    515320           lseek
>   2.89    0.007989          12       669       669 poll
>   0.33    0.000912          25        36           open
>   0.07    0.000206           0       994       994 stat
>   0.05    0.000151           0       995       787 rt_sigreturn
>   0.05    0.000133           0       673           write
>   0.00    0.000000           0        36           close
>   0.00    0.000000           0        52           kill
> ------ ----------- ----------- --------- --------- ----------------
> 100.00    0.276668                975630    383826 total
> 

This one should read:

[root@itrac1202 ~]# strace -c -p 28073
Process 28073 attached - interrupt to quit

Process 28073 detached
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
59.16   10.756007           5   2201974   1202832 read
40.69    7.398247           3   2367885           lseek
  0.14    0.025970         154       169           open
  0.00    0.000057           0       169           close
  0.00    0.000038           0       169           kill
  0.00    0.000033           1        29           write
  0.00    0.000000           0         1           semop
------ ----------- ----------- --------- --------- ----------------
100.00   18.180352               4570396   1202832 total


Apologies for the confusion.