Thread: 9.1.3 Standby catchup mode

9.1.3 Standby catchup mode

From
"hans wulf"
Date:
I am wondering how the catchup mode of a hot synchronous slave server works on 9.1.3 if there is no WAL archive.

Can the slave only request WALs that are still in the xlog directory of the master server? Or does the master
regenerate some kind of fake log for the catchup mode? E.g. in case of a slave failure, could I use a weekly backup and
let the catchup mode do the rest? Or does that only work if you use a WAL archive?

The Doc says the following:

"When a standby first attaches to the primary, it will not yet be properly synchronized. This is described as catchup
mode.Once the lag between standby and primary reaches zero for the first time we move to real-time streaming state. The
catch-upduration may be long immediately after the standby has been created." 

It sounds as if the catchup mode has magic powers, but I don't know if I'm reading the bible correctly.



Re: 9.1.3 Standby catchup mode

From
Adrian Klaver
Date:
On 04/05/2012 09:35 AM, hans wulf wrote:
> I am wondering how the catchup mode of a hot synchronous slave server works on 9.1.3 if there is no WAL archive.


http://www.postgresql.org/docs/9.1/interactive/warm-standby.html#STREAMING-REPLICATION

"Streaming replication allows a standby server to stay more up-to-date than is possible
with file-based log shipping. The standby connects to the primary, which streams WAL
records to the standby as they're generated, without waiting for the WAL file to be filled."
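For reference, on 9.1 the standby makes that connection based on recovery.conf; a minimal sketch, with hostname and
user as placeholders:

    # recovery.conf on the standby (9.1); host and user are placeholders
    standby_mode = 'on'
    primary_conninfo = 'host=primary.example.com port=5432 user=replicator'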




--
Adrian Klaver
adrian.klaver@gmail.com

Re: 9.1.3 Standby catchup mode

From
Michael Nolan
Date:


On Thu, Apr 5, 2012 at 12:35 PM, hans wulf <lotu1@gmx.net> wrote:
I am wondering how the catchup mode of a hot synchronous slave server works on 9.1.3 if there is no WAL archive.

Why would you not want to maintain a WAL archive?  Are you depending on the slave server(s) as your only form of backup?

It isn't clear what you want from synchronous streaming replication, or if you understand the difference between synchronous streaming replication and asynchronous streaming replication.
--
Mike Nolan

Re: 9.1.3 Standby catchup mode

From
"hans wulf"
Date:
> Why would you not want to maintain a WAL archive?  Are you depending on
> the
> slave server(s) as your only form of backup?

If the slave acts as a perfect backup, why would I need an additional third entity for WAL backups?

I know what the difference between sync and async is, but I don't see the need for a WAL archive in sync mode. Can you
please explain that? Thanks



Re: 9.1.3 Standby catchup mode

From
Adrian Klaver
Date:
On 04/08/2012 03:51 AM, hans wulf wrote:
>> Why would you not want to maintain a WAL archive?  Are you depending on
>> the
>> slave server(s) as your only form of backup?
>
> If the slave acts as a perfect backup, why would I need an additional third entity for WAL backups?

Belt and suspenders mode:). Assuming you archive the WAL files to a
third machine and there is an independent connection from that machine to
the standby, the standby will pick up the data from those WAL files if
it loses its streaming connection to the primary and the primary is
still up and generating WAL files. This depends on you setting up
file archiving from the primary to the third machine and archive
retrieval from the third machine to the standby. What you get is a
history of WAL files that you can replay should the streaming link go
down. Basically a second copy of the primary.
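A minimal sketch of that setup; the archive host and paths are placeholders:

    # postgresql.conf on the primary
    archive_mode = on        # requires restart
    # simplified; a production archive_command should fail rather than overwrite
    archive_command = 'scp %p archive-host:/wal_archive/%f'

    # recovery.conf on the standby
    restore_command = 'scp archive-host:/wal_archive/%f %p'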

> I know what the difference between sync and async is, but I don't see the
> need for a WAL archive in sync mode. Can you please explain that? Thanks



--
Adrian Klaver
adrian.klaver@gmail.com

[streaming replication] 9.1.3 streaming replication bug ?

From
乔志强
Date:
I use postgresql-9.1.3-1-windows-x64.exe on Windows 2008 R2 x64.

1 master and 1 standby. The standby is a synchronous standby using streaming replication (synchronous_standby_names =
'*', archive_mode = off). The master outputs:

       standby "walreceiver" is now the synchronous standby with priority 1
the standby outputs:
       LOG:  streaming replication successfully connected to primary

Then I run a test program that writes and commits large blobs (10 to 1000 MB, random size) to the master server using
40 threads (40 sessions) in a loop.

The master and standby run on the same machine, and the client runs on another machine over a 100 Mbps network.
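One iteration of the load is roughly equivalent to the following SQL (a sketch; the table name and the fixed 100 MB
size are made up here, and the real test uses 40 concurrent sessions with random sizes):

    -- hypothetical schema for the load test
    CREATE TABLE blobs (id serial PRIMARY KEY, data bytea);

    -- each session loops over inserts like this one (~100 MB of bytea)
    INSERT INTO blobs (data)
    VALUES (convert_to(repeat('x', 100 * 1024 * 1024), 'UTF8'));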


But after some minutes the master outputs:
       requested WAL segment XXX has already been removed
and the standby outputs:
       FATAL:  could not receive data from WAL stream: FATAL:  requested WAL segment XXX
            has already been removed


Question:
Why does the master delete the WAL segment before sending it to the standby in synchronous mode? Is it a streaming
replication bug?


I see that if no standby is connected to the master when synchronous_standby_names = '*',
all commits are delayed until a standby connects to the master. That is good.

Use a bigger wal_keep_segments? But I think the master should keep all WAL segments not yet sent to an online standby
(sync or async).

wal_keep_segments should only be needed for an offline standby.

If synchronous_standby_names is used for a sync standby and there is no online standby, all commits are delayed until
a standby connects to the master,

so wal_keep_segments actually only matters for an offline async standby.



////////////////////////////////////////

master server output:
LOG:  database system was interrupted; last known up at 2012-03-30 15:37:03 HKT
LOG:  database system was not properly shut down; automatic recovery in progress

LOG:  redo starts at 0/136077B0
LOG:  record with zero length at 0/17DF1E10
LOG:  redo done at 0/17DF1D98
LOG:  last completed transaction was at log time 2012-03-30 15:37:03.148+08
FATAL:  the database system is starting up
LOG:  database system is ready to accept connections
LOG:  autovacuum launcher started
   ///////////////////// the standby is a synchronous standby
     LOG:  standby "walreceiver" is now the synchronous standby with priority 1
   /////////////////////
LOG:  checkpoints are occurring too frequently (16 seconds apart)
HINT:  Consider increasing the configuration parameter "checkpoint_segments".
LOG:  checkpoints are occurring too frequently (23 seconds apart)
HINT:  Consider increasing the configuration parameter "checkpoint_segments".
LOG:  checkpoints are occurring too frequently (24 seconds apart)
HINT:  Consider increasing the configuration parameter "checkpoint_segments".
LOG:  checkpoints are occurring too frequently (20 seconds apart)
HINT:  Consider increasing the configuration parameter "checkpoint_segments".
LOG:  checkpoints are occurring too frequently (22 seconds apart)
HINT:  Consider increasing the configuration parameter "checkpoint_segments".
FATAL:  requested WAL segment 000000010000000000000032 has already been removed
FATAL:  requested WAL segment 000000010000000000000032 has already been removed
FATAL:  requested WAL segment 000000010000000000000032 has already been removed
LOG:  checkpoints are occurring too frequently (8 seconds apart)
HINT:  Consider increasing the configuration parameter "checkpoint_segments".
FATAL:  requested WAL segment 000000010000000000000032 has already been removed 



////////////////////////
standby server output:
LOG:  database system was interrupted while in recovery at log time 2012-03-30 14:44:31 HKT
HINT:  If this has occurred more than once some data might be corrupted and you
might need to choose an earlier recovery target.
LOG:  entering standby mode
LOG:  redo starts at 0/16E4760
LOG:  consistent recovery state reached at 0/12D984D8
LOG:  database system is ready to accept read only connections
LOG:  record with zero length at 0/17DF1E68
LOG:  invalid magic number 0000 in log file 0, segment 50, offset 6946816
LOG:  streaming replication successfully connected to primary
FATAL:  could not receive data from WAL stream: FATAL:  requested WAL segment 000000010000000000000032 has already been removed
 



Re: [streaming replication] 9.1.3 streaming replication bug ?

From
Condor
Date:
On 09.04.2012 13:33, 乔志强 wrote:
> I use postgresql-9.1.3-1-windows-x64.exe on Windows 2008 R2 x64.
>
> 1 master and 1 standby. The standby is a synchronous standby using streaming
> replication (synchronous_standby_names = '*', archive_mode = off).
>
> But after some minutes the master outputs:
>        requested WAL segment XXX has already been removed
>
> Question:
> Why does the master delete the WAL segment before sending it to the standby
> in synchronous mode? Is it a streaming replication bug?


Well,
that is not a bug; just activate archive_mode = on on the master server
and also set wal_keep_segments = 1000, for example, to avoid that
situation. I had the same situation; after digging through search
engines those were the recommended settings. I forget the real reason
why, maybe sending/receiving data between master and slave was too
slow, but this fixed the problem.
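In postgresql.conf on the master that would look something like this (a sketch; the archive directory is a
placeholder, and the copy-based command follows the Windows example in the docs):

    archive_mode = on                                    # requires restart
    archive_command = 'copy "%p" "C:\\wal_archive\\%f"'  # destination is a placeholder
    wal_keep_segments = 1000                             # retains up to ~16 GB of extra WAL in pg_xlog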


Regards,
Condor

Re: [streaming replication] 9.1.3 streaming replication bug ?

From
Adrian Klaver
Date:
On 04/09/2012 03:33 AM, 乔志强 wrote:
>
> But after some minutes the master outputs:
>         requested WAL segment XXX has already been removed
> and the standby outputs:
>         FATAL:  could not receive data from WAL stream: FATAL:  requested WAL segment XXX
>              has already been removed
>
>
> Question:
> Why does the master delete the WAL segment before sending it to the standby in synchronous mode? Is it a streaming
> replication bug?

>
>
> ////////////////////////////////////////
>
> master server output:
> LOG:  database system was interrupted; last known up at 2012-03-30 15:37:03 HKT
> LOG:  database system was not properly shut down; automatic recovery in progress


My question would be, what happened above? In other words, what does the
log prior to this one show just before the database shutdown?




--
Adrian Klaver
adrian.klaver@gmail.com

> I see that if no standby is connected to the master when synchronous_standby_names =
> '*', all commits are delayed until a standby connects to the master. That is good.


So I think the commit is synced between master and standby.


But why does the master delete the WAL segment before the standby commits, when the standby is connected?



Condor wrote:
> that is not a bug; just activate archive_mode = on on the master server
> and also set wal_keep_segments = 1000, for example, to avoid that situation.

Re: [streaming replication] 9.1.3 streaming replication bug ?

From
Fujii Masao
Date:
On Mon, Apr 9, 2012 at 7:33 PM, 乔志强 <qiaozhiqiang@leadcoretech.com> wrote:
> Question:
> Why the master deletes the WAL segment before send to standby in synchronous mode?

Otherwise the master might fill up with lots of unsent WAL files,
which might cause a PANIC error in the master when there is no standby.
IOW, the master tries to avoid a PANIC error rather than termination of
replication.

> Is it a streaming replication bug?

No. It's intentional.

> If synchronous_standby_names is used for a sync standby and there is no online standby, all commits are delayed
> until a standby connects to the master,
> so wal_keep_segments actually only matters for an offline async standby.

What if synchronous_commit is set to local or async?
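On 9.1, synchronous_commit can be changed per session or per transaction, so individual commits can opt out of
waiting for the sync standby; a sketch:

    -- this session's commits flush locally and no longer wait for the standby
    SET synchronous_commit = local;

    -- or make only one transaction asynchronous
    BEGIN;
    SET LOCAL synchronous_commit = off;
    -- ... work ...
    COMMIT;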

Regards,

--
Fujii Masao

Re: [streaming replication] 9.1.3 streaming replication bug ?

From
乔志强
Date:
Thank you for this good feature and your reply.


synchronous_commit is not set; the default is "on"?
#synchronous_commit = on        # synchronization level; on, off, or local

> Otherwise the master might fill up with lots of unsent WAL files, which might cause a PANIC error in the master
> when there is no standby.
> IOW, the master tries to avoid a PANIC error rather than termination of replication.

Can we have a config option to keep unsent WAL files for replication?


What can I do when I need a backup standby server and
    wal_keep_segments = 3 to save master disk usage (the master will now delete WAL before sending it to the standby
    under heavy load; do I need to modify some config?)

    sync commit to master and standby (this is supported now)





My config file of master server:

# -----------------------------
# PostgreSQL configuration file
# -----------------------------
#
# This file consists of lines of the form:
#
#   name = value
#
# (The "=" is optional.)  Whitespace may be used.  Comments are introduced with
# "#" anywhere on a line.  The complete list of parameter names and allowed
# values can be found in the PostgreSQL documentation.
#
# The commented-out settings shown in this file represent the default values.
# Re-commenting a setting is NOT sufficient to revert it to the default value;
# you need to reload the server.
#
# This file is read on server startup and when the server receives a SIGHUP
# signal.  If you edit the file on a running system, you have to SIGHUP the
# server for the changes to take effect, or use "pg_ctl reload".  Some
# parameters, which are marked below, require a server shutdown and restart to
# take effect.
#
# Any parameter can also be given as a command-line option to the server, e.g.,
# "postgres -c log_connections=on".  Some parameters can be changed at run time
# with the "SET" SQL command.
#
# Memory units:  kB = kilobytes        Time units:  ms  = milliseconds
#                MB = megabytes                     s   = seconds
#                GB = gigabytes                     min = minutes
#                                                   h   = hours
#                                                   d   = days


#------------------------------------------------------------------------------
# FILE LOCATIONS
#------------------------------------------------------------------------------

# The default values of these variables are driven from the -D command-line
# option or PGDATA environment variable, represented here as ConfigDir.

#data_directory = 'ConfigDir'        # use data in another directory
                    # (change requires restart)
#hba_file = 'ConfigDir/pg_hba.conf'    # host-based authentication file
                    # (change requires restart)
#ident_file = 'ConfigDir/pg_ident.conf'    # ident configuration file
                    # (change requires restart)

# If external_pid_file is not explicitly set, no extra PID file is written.
#external_pid_file = '(none)'        # write an extra PID file
                    # (change requires restart)


#------------------------------------------------------------------------------
# CONNECTIONS AND AUTHENTICATION
#------------------------------------------------------------------------------

# - Connection Settings -

listen_addresses = '*'        # what IP address(es) to listen on;
                    # comma-separated list of addresses;
                    # defaults to 'localhost', '*' = all
                    # (change requires restart)
port = 5432                # (change requires restart)
max_connections = 100            # (change requires restart)
# Note:  Increasing max_connections costs ~400 bytes of shared memory per
# connection slot, plus lock space (see max_locks_per_transaction).
#superuser_reserved_connections = 3    # (change requires restart)
#unix_socket_directory = ''        # (change requires restart)
#unix_socket_group = ''            # (change requires restart)
#unix_socket_permissions = 0777        # begin with 0 to use octal notation
                    # (change requires restart)
#bonjour = off                # advertise server via Bonjour
                    # (change requires restart)
#bonjour_name = ''            # defaults to the computer name
                    # (change requires restart)

# - Security and Authentication -

#authentication_timeout = 1min        # 1s-600s
#ssl = off                # (change requires restart)
#ssl_ciphers = 'ALL:!ADH:!LOW:!EXP:!MD5:@STRENGTH'    # allowed SSL ciphers
                    # (change requires restart)
#ssl_renegotiation_limit = 512MB    # amount of data between renegotiations
#password_encryption = on
#db_user_namespace = off

# Kerberos and GSSAPI
#krb_server_keyfile = ''
#krb_srvname = 'postgres'        # (Kerberos only)
#krb_caseins_users = off

# - TCP Keepalives -
# see "man 7 tcp" for details

tcp_keepalives_idle = 20        # TCP_KEEPIDLE, in seconds;
                # 0 selects the system default
tcp_keepalives_interval = 2        # TCP_KEEPINTVL, in seconds;
                    # 0 selects the system default
tcp_keepalives_count = 0        # TCP_KEEPCNT;
                    # 0 selects the system default


#------------------------------------------------------------------------------
# RESOURCE USAGE (except WAL)
#------------------------------------------------------------------------------

# - Memory -

shared_buffers = 320MB            # min 128kB
                    # (change requires restart)
temp_buffers = 80MB            # min 800kB
#max_prepared_transactions = 0        # zero disables the feature
                    # (change requires restart)
# Note:  Increasing max_prepared_transactions costs ~600 bytes of shared memory
# per transaction slot, plus lock space (see max_locks_per_transaction).
# It is not advisable to set max_prepared_transactions nonzero unless you
# actively intend to use prepared transactions.
#work_mem = 1MB                # min 64kB
#maintenance_work_mem = 16MB        # min 1MB
#max_stack_depth = 2MB            # min 100kB

# - Kernel Resource Usage -

#max_files_per_process = 1000        # min 25
                    # (change requires restart)
#shared_preload_libraries = ''        # (change requires restart)

# - Cost-Based Vacuum Delay -

#vacuum_cost_delay = 0ms        # 0-100 milliseconds
#vacuum_cost_page_hit = 1        # 0-10000 credits
#vacuum_cost_page_miss = 10        # 0-10000 credits
#vacuum_cost_page_dirty = 20        # 0-10000 credits
#vacuum_cost_limit = 200        # 1-10000 credits

# - Background Writer -

#bgwriter_delay = 200ms            # 10-10000ms between rounds
#bgwriter_lru_maxpages = 100        # 0-1000 max buffers written/round
#bgwriter_lru_multiplier = 2.0        # 0-10.0 multipler on buffers scanned/round

# - Asynchronous Behavior -

#effective_io_concurrency = 1        # 1-1000. 0 disables prefetching


#------------------------------------------------------------------------------
# WRITE AHEAD LOG
#------------------------------------------------------------------------------

# - Settings -

wal_level = hot_standby            # minimal, archive, or hot_standby
                    # (change requires restart)
#fsync = on                # turns forced synchronization on or off
#synchronous_commit = on        # synchronization level; on, off, or local
#wal_sync_method = fsync        # the default is the first option
                    # supported by the operating system:
                    #   open_datasync
                    #   fdatasync (default on Linux)
                    #   fsync
                    #   fsync_writethrough
                    #   open_sync
#full_page_writes = on            # recover from partial page writes
#wal_buffers = -1            # min 32kB, -1 sets based on shared_buffers
                    # (change requires restart)
#wal_writer_delay = 200ms        # 1-10000 milliseconds

#commit_delay = 0            # range 0-100000, in microseconds
#commit_siblings = 5            # range 1-1000

# - Checkpoints -

#checkpoint_segments = 3        # in logfile segments, min 1, 16MB each
#checkpoint_timeout = 5min        # range 30s-1h
#checkpoint_completion_target = 0.5    # checkpoint target duration, 0.0 - 1.0
#checkpoint_warning = 30s        # 0 disables

# - Archiving -

#archive_mode = off        # allows archiving to be done
                # (change requires restart)
#archive_command = ''        # command to use to archive a logfile segment
#archive_timeout = 0        # force a logfile segment switch after this
                # number of seconds; 0 disables


#------------------------------------------------------------------------------
# REPLICATION
#------------------------------------------------------------------------------

# - Master Server -

# These settings are ignored on a standby server

max_wal_senders = 1        # max number of walsender processes
                # (change requires restart)
wal_sender_delay = 1s        # walsender cycle time, 1-10000 milliseconds
wal_keep_segments = 3        # in logfile segments, 16MB each; 0 disables
#vacuum_defer_cleanup_age = 0    # number of xacts by which cleanup is delayed
replication_timeout = 60s         # in milliseconds; 0 disables
synchronous_standby_names = '*'    # standby servers that provide sync rep
                # comma-separated list of application_name
                # from standby(s); '*' = all

# - Standby Servers -

# These settings are ignored on a master server

hot_standby = on            # "on" allows queries during recovery
                    # (change requires restart)
#max_standby_archive_delay = 30s    # max delay before canceling queries
                    # when reading WAL from archive;
                    # -1 allows indefinite delay
#max_standby_streaming_delay = 30s    # max delay before canceling queries
                    # when reading streaming WAL;
                    # -1 allows indefinite delay
wal_receiver_status_interval = 10s    # send replies at least this often
                    # 0 disables
#hot_standby_feedback = off        # send info from standby to prevent
                    # query conflicts


#------------------------------------------------------------------------------
# QUERY TUNING
#------------------------------------------------------------------------------

# - Planner Method Configuration -

#enable_bitmapscan = on
#enable_hashagg = on
#enable_hashjoin = on
#enable_indexscan = on
#enable_material = on
#enable_mergejoin = on
#enable_nestloop = on
#enable_seqscan = on
#enable_sort = on
#enable_tidscan = on

# - Planner Cost Constants -

#seq_page_cost = 1.0            # measured on an arbitrary scale
#random_page_cost = 4.0            # same scale as above
#cpu_tuple_cost = 0.01            # same scale as above
#cpu_index_tuple_cost = 0.005        # same scale as above
#cpu_operator_cost = 0.0025        # same scale as above
#effective_cache_size = 128MB

# - Genetic Query Optimizer -

#geqo = on
#geqo_threshold = 12
#geqo_effort = 5            # range 1-10
#geqo_pool_size = 0            # selects default based on effort
#geqo_generations = 0            # selects default based on effort
#geqo_selection_bias = 2.0        # range 1.5-2.0
#geqo_seed = 0.0            # range 0.0-1.0

# - Other Planner Options -

#default_statistics_target = 100    # range 1-10000
#constraint_exclusion = partition    # on, off, or partition
#cursor_tuple_fraction = 0.1        # range 0.0-1.0
#from_collapse_limit = 8
#join_collapse_limit = 8        # 1 disables collapsing of explicit
                    # JOIN clauses


#------------------------------------------------------------------------------
# ERROR REPORTING AND LOGGING
#------------------------------------------------------------------------------

# - Where to Log -

#log_destination = 'stderr'        # Valid values are combinations of
                    # stderr, csvlog, syslog, and eventlog,
                    # depending on platform.  csvlog
                    # requires logging_collector to be on.

# This is used when logging to stderr:
#logging_collector = off        # Enable capturing of stderr and csvlog
                    # into log files. Required to be on for
                    # csvlogs.
                    # (change requires restart)

# These are only used if logging_collector is on:
#log_directory = 'pg_log'        # directory where log files are written,
                    # can be absolute or relative to PGDATA
#log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log'    # log file name pattern,
                    # can include strftime() escapes
#log_file_mode = 0600            # creation mode for log files,
                    # begin with 0 to use octal notation
#log_truncate_on_rotation = off        # If on, an existing log file with the
                    # same name as the new log file will be
                    # truncated rather than appended to.
                    # But such truncation only occurs on
                    # time-driven rotation, not on restarts
                    # or size-driven rotation.  Default is
                    # off, meaning append to existing files
                    # in all cases.
#log_rotation_age = 1d            # Automatic rotation of logfiles will
                    # happen after that time.  0 disables.
#log_rotation_size = 10MB        # Automatic rotation of logfiles will
                    # happen after that much log output.
                    # 0 disables.

# These are relevant when logging to syslog:
#syslog_facility = 'LOCAL0'
#syslog_ident = 'postgres'

#silent_mode = off            # Run server silently.
                    # DO NOT USE without syslog or
                    # logging_collector
                    # (change requires restart)


# - When to Log -

#client_min_messages = notice        # values in order of decreasing detail:
                    #   debug5
                    #   debug4
                    #   debug3
                    #   debug2
                    #   debug1
                    #   log
                    #   notice
                    #   warning
                    #   error

#log_min_messages = warning        # values in order of decreasing detail:
                    #   debug5
                    #   debug4
                    #   debug3
                    #   debug2
                    #   debug1
                    #   info
                    #   notice
                    #   warning
                    #   error
                    #   log
                    #   fatal
                    #   panic

#log_min_error_statement = error    # values in order of decreasing detail:
                     #   debug5
                    #   debug4
                    #   debug3
                    #   debug2
                    #   debug1
                     #   info
                    #   notice
                    #   warning
                    #   error
                    #   log
                    #   fatal
                    #   panic (effectively off)

#log_min_duration_statement = -1    # -1 is disabled, 0 logs all statements
                    # and their durations, > 0 logs only
                    # statements running at least this number
                    # of milliseconds


# - What to Log -

#debug_print_parse = off
#debug_print_rewritten = off
#debug_print_plan = off
#debug_pretty_print = on
#log_checkpoints = off
#log_connections = off
#log_disconnections = off
#log_duration = off
#log_error_verbosity = default        # terse, default, or verbose messages
#log_hostname = off
#log_line_prefix = ''            # special values:
                    #   %a = application name
                    #   %u = user name
                    #   %d = database name
                    #   %r = remote host and port
                    #   %h = remote host
                    #   %p = process ID
                    #   %t = timestamp without milliseconds
                    #   %m = timestamp with milliseconds
                    #   %i = command tag
                    #   %e = SQL state
                    #   %c = session ID
                    #   %l = session line number
                    #   %s = session start timestamp
                    #   %v = virtual transaction ID
                    #   %x = transaction ID (0 if none)
                    #   %q = stop here in non-session
                    #        processes
                    #   %% = '%'
                    # e.g. '<%u%%%d> '
#log_lock_waits = off            # log lock waits >= deadlock_timeout
#log_statement = 'none'            # none, ddl, mod, all
#log_temp_files = -1            # log temporary files equal or larger
                    # than the specified size in kilobytes;
                    # -1 disables, 0 logs all temp files
#log_timezone = '(defaults to server environment setting)'


#------------------------------------------------------------------------------
# RUNTIME STATISTICS
#------------------------------------------------------------------------------

# - Query/Index Statistics Collector -

#track_activities = on
#track_counts = on
#track_functions = none            # none, pl, all
#track_activity_query_size = 1024     # (change requires restart)
#update_process_title = on
#stats_temp_directory = 'pg_stat_tmp'


# - Statistics Monitoring -

#log_parser_stats = off
#log_planner_stats = off
#log_executor_stats = off
#log_statement_stats = off


#------------------------------------------------------------------------------
# AUTOVACUUM PARAMETERS
#------------------------------------------------------------------------------

#autovacuum = on            # Enable autovacuum subprocess?  'on'
                    # requires track_counts to also be on.
#log_autovacuum_min_duration = -1    # -1 disables, 0 logs all actions and
                    # their durations, > 0 logs only
                    # actions running at least this number
                    # of milliseconds.
#autovacuum_max_workers = 3        # max number of autovacuum subprocesses
                    # (change requires restart)
#autovacuum_naptime = 1min        # time between autovacuum runs
#autovacuum_vacuum_threshold = 50    # min number of row updates before
                    # vacuum
#autovacuum_analyze_threshold = 50    # min number of row updates before
                    # analyze
#autovacuum_vacuum_scale_factor = 0.2    # fraction of table size before vacuum
#autovacuum_analyze_scale_factor = 0.1    # fraction of table size before analyze
#autovacuum_freeze_max_age = 200000000    # maximum XID age before forced vacuum
                    # (change requires restart)
#autovacuum_vacuum_cost_delay = 20ms    # default vacuum cost delay for
                    # autovacuum, in milliseconds;
                    # -1 means use vacuum_cost_delay
#autovacuum_vacuum_cost_limit = -1    # default vacuum cost limit for
                    # autovacuum, -1 means use
                    # vacuum_cost_limit


#------------------------------------------------------------------------------
# CLIENT CONNECTION DEFAULTS
#------------------------------------------------------------------------------

# - Statement Behavior -

#search_path = '"$user",public'        # schema names
#default_tablespace = ''        # a tablespace name, '' uses the default
#temp_tablespaces = ''            # a list of tablespace names, '' uses
                    # only default tablespace
#check_function_bodies = on
#default_transaction_isolation = 'read committed'
#default_transaction_read_only = off

#default_transaction_deferrable = off
#session_replication_role = 'origin'
#statement_timeout = 0            # in milliseconds, 0 is disabled
#vacuum_freeze_min_age = 50000000
#vacuum_freeze_table_age = 150000000
#bytea_output = 'hex'            # hex, escape
#xmlbinary = 'base64'
#xmloption = 'content'

# - Locale and Formatting -

datestyle = 'iso, ymd'
#intervalstyle = 'postgres'
#timezone = '(defaults to server environment setting)'
#timezone_abbreviations = 'Default'     # Select the set of available time zone
                    # abbreviations.  Currently, there are
                    #   Default
                    #   Australia
                    #   India
                    # You can create your own file in
                    # share/timezonesets/.
#extra_float_digits = 0            # min -15, max 3
#client_encoding = sql_ascii        # actually, defaults to database
                    # encoding

# These settings are initialized by initdb, but they can be changed.
lc_messages = 'C'            # locale for system error message
                    # strings
lc_monetary = 'C'            # locale for monetary formatting
lc_numeric = 'C'            # locale for number formatting
lc_time = 'C'                # locale for time formatting

# default configuration for text search
default_text_search_config = 'pg_catalog.english'

# - Other Defaults -

#dynamic_library_path = '$libdir'
#local_preload_libraries = ''


#------------------------------------------------------------------------------
# LOCK MANAGEMENT
#------------------------------------------------------------------------------

#deadlock_timeout = 1s
#max_locks_per_transaction = 64        # min 10
                    # (change requires restart)
# Note:  Each lock table slot uses ~270 bytes of shared memory, and there are
# max_locks_per_transaction * (max_connections + max_prepared_transactions)
# lock table slots.
#max_pred_locks_per_transaction = 64    # min 10
                    # (change requires restart)

#------------------------------------------------------------------------------
# VERSION/PLATFORM COMPATIBILITY
#------------------------------------------------------------------------------

# - Previous PostgreSQL Versions -

#array_nulls = on
#backslash_quote = safe_encoding    # on, off, or safe_encoding
#default_with_oids = off
#escape_string_warning = on
#lo_compat_privileges = off
#quote_all_identifiers = off
#sql_inheritance = on
#standard_conforming_strings = on
#synchronize_seqscans = on

# - Other Platforms and Clients -

#transform_null_equals = off


#------------------------------------------------------------------------------
# ERROR HANDLING
#------------------------------------------------------------------------------

#exit_on_error = off                # terminate session on any error?
#restart_after_crash = on            # reinitialize after backend crash?


#------------------------------------------------------------------------------
# CUSTOMIZED OPTIONS
#------------------------------------------------------------------------------

#custom_variable_classes = ''        # list of custom variable class names



Re: [streaming replication] 9.1.3 streaming replication bug ?

From
Fujii Masao
Date:
On Wed, Apr 11, 2012 at 10:06 AM, 乔志强 <qiaozhiqiang@leadcoretech.com> wrote:
> synchronous_commit is not set; the default is "on"?
> #synchronous_commit = on                # synchronization level; on, off, or local

Yes.

>> Otherwise the master might fill up with lots of unsent WAL files, which might cause a PANIC error in the master
>> when there is no standby.
>> IOW, the master tries to avoid a PANIC error rather than termination of replication.
>
> Can we have a config option to keep unsent WAL files for replication?

No. We discussed such a feature before, but it failed to be committed.
I think it's useful, so I hope it'll be available in a future release.

> What can I do when I need a backup standby server and
>    wal_keep_segments = 3 to save master disk usage (the master will now delete WAL before sending it to the standby
> under heavy load; do I need to modify some config?)

Yes, increase wal_keep_segments. Even if you set wal_keep_segments to 64,
the amount of disk space for WAL files is only 1GB, so there is no need to worry
so much, I think. No?

> #checkpoint_segments = 3                # in logfile segments, min 1, 16MB each

Increase checkpoint_segments. With this setting, I guess checkpoints run too
frequently under heavy load, and WAL files are removed too aggressively.
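In postgresql.conf terms, something like this (illustrative values taken from the numbers above, not tuned for this
workload):

    wal_keep_segments = 64      # 64 x 16 MB = 1 GB of WAL retained for standbys
    checkpoint_segments = 16    # checkpoint every ~256 MB of WAL instead of ~48 MB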

Regards,

--
Fujii Masao

Fwd: [streaming replication] 9.1.3 streaming replication bug ?

From
Michael Nolan
Date:


---------- Forwarded message ----------
From: Michael Nolan <htfoot@gmail.com>
Date: Tue, Apr 10, 2012 at 9:47 PM
Subject: Re: [GENERAL] [streaming replication] 9.1.3 streaming replication bug ?
To: Fujii Masao <masao.fujii@gmail.com>




On Tue, Apr 10, 2012 at 9:09 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
> On Wed, Apr 11, 2012 at 10:06 AM, 乔志强 wrote:
>> What can I do when I need a backup standby server and
>>    wal_keep_segments = 3 to save master disk usage (the master will now delete WAL before sending it to the standby under heavy load; do I need to modify some config?)
>
> Yes, increase wal_keep_segments. Even if you set wal_keep_segments to 64,
> the amount of disk space for WAL files is only 1GB, so there is no need to worry
> so much, I think. No?


If you're writing records with a 100MB blob object in them, you definitely need to keep more than 3 WAL segments at a time, because at 16MB each that won't hold even one of your largest records.
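At the top of the reported range, a single 1000 MB blob spans at least ceil(1000 / 16) = 63 segments of 16 MB even
before WAL record overhead is counted, so wal_keep_segments = 3 cannot cover even one such row.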

That's the kind of value-added information that the DBA brings to the table and that the database itself won't know, which is why one of the DBA's most important tasks is to properly configure the postgresql.conf file and revise it as the database changes over time.
--
Mike Nolan 

Re: 9.1.3 Standby catchup mode

From
Fujii Masao
Date:
On Fri, Apr 6, 2012 at 1:35 AM, hans wulf <lotu1@gmx.net> wrote:
> I am wondering how the catchup mode of a hot synchronous slave server works on 9.1.3 if there is no WAL archive.
>
> Can the slave only request WALs that are still in the xlog directory of the master server? Or does the master
> regenerate some kind of fake log for the catchup mode?

No. If the WAL file which the standby requests doesn't exist in the
pg_xlog directory of the master, replication just fails. In this case,
you need to take a fresh base backup and start the standby from that
backup.

> E.g. in case of a slave failure, could I use a weekly backup and let the catchup mode do the rest? Or does that only
> work if you use a WAL archive?

Or increase wal_keep_segments to a high value so that all WAL files
which the standby requests are guaranteed to exist in the pg_xlog
directory of the master.
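Re-seeding the standby can be done with pg_basebackup on 9.1; a sketch, with host, user, and data directory as
placeholders:

    # run on the standby host against an empty data directory
    pg_basebackup -h primary.example.com -U replicator -D /srv/pgsql/standby -x -P
    # -x includes the WAL needed to start; then put recovery.conf back and start the standby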

Regards,

--
Fujii Masao

Re: [streaming replication] 9.1.3 streaming replication bug ?

From
乔志强
Date:
> Yes, increase wal_keep_segments. Even if you set wal_keep_segments to 64, the amount of disk space for WAL files is
> only 1GB, so there is no need to worry so much, I think. No?


But when a transaction is larger than 1GB...


If synchronous_standby_names = '*', the master waits at commit for the standby to commit,
so why does the master delete the WAL before it has been sent to the standby? The standby can never commit if the WAL
was deleted.
 

in  http://www.postgresql.org/docs/9.1/static/warm-standby.html
25.2.6. Synchronous Replication     says:
When requesting synchronous replication, each commit of a write transaction will wait until confirmation is received
that the commit has been written to the transaction log on disk of both the primary and standby server.

25.2.6.1. Basic Configuration    says:
This configuration will cause each commit to wait for confirmation that the standby has written the commit record to
durable storage, even if that takes a very long time.

.........
After a commit record has been written to disk on the primary, the WAL record is then sent to the standby. The standby
sends reply messages each time a new batch of WAL data is written to disk, unless wal_receiver_status_interval is set
to zero on the standby.

25.2.6.3. Planning for High Availability    says:
Commits made when synchronous_commit is set to on will wait until the sync standby responds. The response may never
occur if the last, or only, standby should crash.
 



So in sync streaming replication, if the master deletes WAL before it has been sent to the only standby, all
transactions will fail forever.

"The master tries to avoid a PANIC error rather than termination of replication," but in sync replication, termination
of replication is THE bigger PANIC error.



Another question:
  Does the master send WAL to the standby before the transaction commits?





Re: [streaming replication] 9.1.3 streaming replication bug ?

From
Michael Nolan
Date:
On 4/11/12, 乔志强 <qiaozhiqiang@leadcoretech.com> wrote:
>
>> Yes, increase wal_keep_segments. Even if you set wal_keep_segments to 64,
>> the amount of disk space for WAL files is only 1GB, so there is no need to
>> worry so much, I think. No?
>
> But when a transaction is larger than 1GB...

Then you may need WAL space larger than 1GB as well.  For replication to work,
it seems likely that you need sufficient WAL space to handle a row,
possibly the entire transaction.  But since a single statement can update
thousands or millions of rows, do you always need enough WAL space to hold
the entire transaction?

> So in sync streaming replication, if the master deletes WAL before it has been
> sent to the only standby, all transactions will fail forever.
> "The master tries to avoid a PANIC error rather than termination of
> replication," but in sync replication, termination of replication is THE
> bigger PANIC error.

That's somewhat debatable.  Would I rather have a master that PANICked or
a slave that lost replication?  I would choose the latter.  A third option,
which may not even be feasible, would be to have the master fail the
transaction if synchronous replication cannot be achieved, although
that might have negative consequences as well.

> Another question:
>   Does the master send WAL to the standby before the transaction commits?

That's another question for the core team, I suspect.  A related question
is what happens if there is a rollback?
--
Mike Nolan

Re: [HACKERS] [streaming replication] 9.1.3 streaming replication bug ?

From
"Kevin Grittner"
Date:
Michael Nolan <htfoot@gmail.com> wrote:
> On 4/11/12, 乔志强 <qiaozhiqiang@leadcoretech.com> wrote:

>> But when a transaction is larger than 1GB...
>
> Then you may need WAL space larger than 1GB as well.  For
> replication to work, it seems likely that you need sufficient
> WAL space to handle a row, possibly the entire transaction.  But
> since a single statement can update thousands or millions of rows,
> do you always need enough WAL space to hold the entire transaction?

No.

>>   Does the master send WAL to the standby before the transaction commits?

Yes.

> A related question is what happens if there is a rollback?

PostgreSQL doesn't use a rollback log; WAL files can be reclaimed as
soon as the work they represent has been persisted to the database
by a CHECKPOINT, even if it is not committed.  Because there can be
multiple versions of each row in the base table, each with its own
xmin (telling which transaction committed it) and xmax (telling
which transaction expired it), visibility checking can handle the
commits and rollbacks correctly.  It also uses a commit log (CLOG),
hint bits, and other structures to help resolve visibility.  It is a
complex topic, but it does work.
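Those row versions can be inspected directly, since xmin and xmax are hidden system columns on every table
("accounts" below is just an example name):

    SELECT xmin, xmax, * FROM accounts LIMIT 5;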

-Kevin

Re: [HACKERS] [streaming replication] 9.1.3 streaming replication bug ?

From
Michael Nolan
Date:
On 4/11/12, Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote:
> Michael Nolan <htfoot@gmail.com> wrote:
>> On 4/11/12, 乔志强 <qiaozhiqiang@leadcoretech.com> wrote:
>
>>> But when a transaction is larger than 1GB...
>>
>> Then you may need WAL space larger than 1GB as well.  For
>> replication to work, it seems likely that you need sufficient
>> WAL space to handle a row, possibly the entire transaction.  But
>> since a single statement can update thousands or millions of rows,
>> do you always need enough WAL space to hold the entire transaction?
>
> No.
>
>>>   Does the master send WAL to the standby before the transaction commits?
>
> Yes.
>
>> A related question is what happens if there is a rollback?
>
> PostgreSQL doesn't use a rollback log; WAL files can be reclaimed as
> soon as the work they represent has been persisted to the database
> by a CHECKPOINT, even if it is not committed.  Because there can be
> multiple versions of each row in the base table, each with its own
> xmin (telling which transaction committed it) and xmax (telling
> which transaction expired it), visibility checking can handle the
> commits and rollbacks correctly.  It also uses a commit log (CLOG),
> hint bits, and other structures to help resolve visibility.  It is a
> complex topic, but it does work.

Thanks, Kevin.  That does lead to a question about the problem that
started this thread, though.  How does one determine how big the WAL
space needs to be so that streaming replication doesn't fail?  Or
maybe this is a bug after all?
--
Mike Nolan

Re: [streaming replication] 9.1.3 streaming replication bug ?

From
Fujii Masao
Date:
On Wed, Apr 11, 2012 at 3:31 PM, 乔志强 <qiaozhiqiang@leadcoretech.com> wrote:
> So in sync streaming replication, if the master deletes WAL before it has been sent to the only standby, all
> transactions will fail forever.
> "The master tries to avoid a PANIC error rather than termination of replication," but in sync replication,
> termination of replication is THE bigger PANIC error.

I see your point. When there are backends waiting for replication, the WAL files
which the standby might not have received yet must not be removed. If they are
removed, replication keeps failing forever because the required WAL files don't
exist in the master, and then the waiting backends will never be released unless
the replication mode is changed to async. This should be avoided.

To fix this issue, we should prevent the master from deleting WAL files
containing the minimum waiting LSN or anything newer. I'll think about it more
and implement a patch.

Regards,

--
Fujii Masao

Re: [HACKERS] [streaming replication] 9.1.3 streaming replication bug ?

From
Michael Nolan
Date:
On 4/11/12, Fujii Masao <masao.fujii@gmail.com> wrote:
> On Wed, Apr 11, 2012 at 3:31 PM, 乔志强 <qiaozhiqiang@leadcoretech.com> wrote:
>> So in sync streaming replication, if the master deletes WAL before it has been
>> sent to the only standby, all transactions will fail forever.
>> "The master tries to avoid a PANIC error rather than termination of
>> replication," but in sync replication, termination of replication is THE
>> bigger PANIC error.
>
> I see your point. When there are backends waiting for replication, the WAL
> files which the standby might not have received yet must not be removed. If
> they are removed, replication keeps failing forever because the required WAL
> files don't exist in the master, and then the waiting backends will never be
> released unless the replication mode is changed to async. This should be
> avoided.
>
> To fix this issue, we should prevent the master from deleting WAL files
> containing the minimum waiting LSN or anything newer. I'll think about it
> more and implement a patch.

With asynchronous replication, does the master even know if a slave
fails because of a WAL problem?  And does/should it care?

Isn't there a separate issue with synchronous replication?  If it
fails, what's the appropriate action to take on the master?  PANICking
it seems to be a bad idea, but having transactions never complete
because they never hear back from the synchronous slave (for whatever
reason) seems bad too.
--
Mike Nolan

Re: [streaming replication] 9.1.3 streaming replication bug ?

From
Fujii Masao
Date:
On Thu, Apr 12, 2012 at 12:56 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
> On Wed, Apr 11, 2012 at 3:31 PM, 乔志强 <qiaozhiqiang@leadcoretech.com> wrote:
>> So in sync streaming replication, if the master deletes WAL before it has been sent to the only standby, all
>> transactions will fail forever.
>> "The master tries to avoid a PANIC error rather than termination of replication," but in sync replication,
>> termination of replication is THE bigger PANIC error.
>
> I see your point. When there are backends waiting for replication, the WAL files
> which the standby might not have received yet must not be removed. If they are
> removed, replication keeps failing forever because the required WAL files don't
> exist in the master, and then the waiting backends will never be released unless
> the replication mode is changed to async. This should be avoided.

On second thought, we can avoid the issue by just increasing
wal_keep_segments enough. Even if the issue happens and some backends
get stuck waiting for replication, we can release them by taking a fresh backup
and restarting the standby from that backup. This is the basic procedure for
restarting replication after it has been terminated because required WAL files
were removed from the master. So this issue might not be worth implementing
the patch for now (though I'm not against improving things in the future);
it seems to be just a tuning problem of wal_keep_segments.
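Whether wal_keep_segments is high enough can be watched on the master via pg_stat_replication (available in 9.1);
a sketch:

    -- on the master: how far the standby lags behind the current WAL position
    SELECT application_name, state, sent_location, replay_location,
           pg_current_xlog_location() AS current_location
      FROM pg_stat_replication;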

Regards,

--
Fujii Masao

Fwd: [HACKERS] [streaming replication] 9.1.3 streaming replication bug ?

From
Michael Nolan
Date:
---------- Forwarded message ----------
From: Michael Nolan <htfoot@gmail.com>
Date: Wed, 11 Apr 2012 14:48:18 -0400
Subject: Re: [HACKERS] [GENERAL] [streaming replication] 9.1.3
streaming replication bug ?
To: Robert Haas <robertmhaas@gmail.com>

On Wed, Apr 11, 2012 at 2:14 PM, Robert Haas <robertmhaas@gmail.com> wrote:

> We've talked about teaching the master to keep track of how far back
> all of its known standbys are, and retaining WAL back to that specific
> point, rather than the shotgun approach that is wal_keep_segments.
> It's not exactly clear what the interface to that should look like,
> though.

Moreover, how does the database decide when to drop a known standby from
the queue because it has failed, or how does the DBA notify the database
that a particular standby should no longer be included?

Re: [HACKERS] [streaming replication] 9.1.3 streaming replication bug ?

From
Fujii Masao
Date:
On Thu, Apr 12, 2012 at 4:09 AM, Michael Nolan <htfoot@gmail.com> wrote:
> On Wed, Apr 11, 2012 at 2:14 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>
>> We've talked about teaching the master to keep track of how far back
>> all of its known standbys are, and retaining WAL back to that specific
>> point, rather than the shotgun approach that is wal_keep_segments.
>> It's not exactly clear what the interface to that should look like,
>> though.
>
> Moreover, how does the database decide when to drop a known standby from
> the queue because it has failed, or how does the DBA notify the database
> that a particular standby should no longer be included?

Probably the latter. So, as Robert pointed out, we need a neat API to register
and drop standbys, though I have no good idea about this..

BTW, I have another idea about the wal_keep_segments problem:
http://archives.postgresql.org/message-id/AANLkTinN=xsPOoaXzVFSp1OkfMDAB1f_d-F91xjEZDV8@mail.gmail.com

Regards,

--
Fujii Masao