Re: - PostgreSQL Replication Types - Mailing list pgsql-general

From Will McCormick
Subject Re: - PostgreSQL Replication Types
Date
Msg-id CA+jgkY7xpk8vTaeM4b_rW023gBNg4qyNV=n_UtJH5OOHW1suKA@mail.gmail.com
Whole thread Raw
In response to Re: - PostgreSQL Replication Types  (Adrian Klaver <adrian.klaver@aklaver.com>)
List pgsql-general
Thanks for the great assistance

On Thu, Dec 17, 2015 at 11:27 AM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
On 12/17/2015 07:56 AM, Will McCormick wrote:
Thanks a ton for the prompt response.

I've read most of this but some it was not clear until we discussed.

Updated with WLM:

On 12/17/2015 07:17 AM, Will McCormick wrote:

    I inherited a 9.1 replication environment

    Few basic questions that I can't find clear answers / clarifications for
    if possible:

    3 types of replication in 9.1 I've read about from the offical docs:

    1) warm standby
    2) hot standby
    3) streaming replication

    I'm using streaming replication I believe,  the only indication I have
    is that  there is the primary_conninfo on the standby. Is this the only
    indication?


WLM: I also see streaming replication in the logs.

On standby:

http://www.postgresql.org/docs/9.1/interactive/functions-admin.html
"
pg_last_xlog_receive_location()

Get last transaction log location received and synced to disk by
streaming replication. While streaming replication is in progress this
will increase monotonically. If recovery has completed this will remain
static at the value of the last WAL record received and synced to disk
during recovery. If streaming replication is disabled, or if it has not
yet started, the function returns NULL."

WLM: When I do this on the standby I get an error:

   ERROR:  recovery is in progress

   HINT:  WAL control functions cannot be ...

What is the full hint message?
The functions are supposed to be able to be run while the server is in recovery.





    Is it possible to get if using streaming replication under normal
    operations?

    /cp: cannot stat `/opt/postgres/9.1/archive/000000070000000F00000057':
    No such file or directory/

    /cp: cannot stat `/opt/postgres/9.1/archive/000000070000000F00000057':
    No such file or directory/

    /LOG:  streaming replication successfully connected to primary/

    /FATAL:  could not receive data from WAL stream: FATAL:  requested WAL
    segment 000000070000000F00000057 has already been removed/


Assuming above is from standby log, correct? WLM: yes

The cp lines would seem to indicate a restore_command in the standby
recovery.conf, is that the case?:

http://www.postgresql.org/docs/9.1/static/archive-recovery-settings.html

restore_command (string) WLM: Correct


The FATAL indicates that the WAL file has already been recycled on the
master.

WLM: I had read this what confuses me a bit is:

   /cp: cannot stat `/opt/postgres/9.1/archive/000000070000000F00000057':
   No such file or directory/

    Does Streaming replication automatically use Archived WAL files when
WAL XLOG files don't contain a transaction?

    We did have wal_keep_segments set to 0. I changed this to 50 but
want to better understand this. Especially the correlation between the
Archived WALs and the XLOG WALs. My guess is the difference between
Streaming replication and the others is very simply that Streaming
replication can read the XLOG WALs as well? So if all the Archived WALs
have been shipped and processed to the Standby then the XLOGs are
processed but not shipped? This meaning at a transaction level "kindof"?

Not really see the section below(#STREAMING-REPLICATION) I posted previously. It is either or, if streaming is set up and the standby can reach the master xlog directory then it will stream the WAL files from there. If the standby cannot access the xlog directory and if you have WAL archiving set up on the master and archive restore setup on the standby it will switch to full WAL log shipping from the archived WAL directory, assuming the files it needs are there.



See:

http://www.postgresql.org/docs/9.1/interactive/runtime-config-replication.html#RUNTIME-CONFIG-REPLICATION-MASTER

"wal_keep_segments (integer)

     Specifies the minimum number of past log file segments kept in the
pg_xlog directory, in case a standby server needs to fetch them for
streaming replication. Each segment is normally 16 megabytes. If a
standby server connected to the primary falls behind by more than
wal_keep_segments segments, the primary might remove a WAL segment still
needed by the standby, in which case the replication connection will be
terminated. (However, the standby server can recover by fetching the
segment from archive, if WAL archiving is in use.)

     This sets only the minimum number of segments retained in pg_xlog;
the system might need to retain more segments for WAL archival or to
recover from a checkpoint. If wal_keep_segments is zero (the default),
the system doesn't keep any extra segments for standby purposes, so the
number of old WAL segments available to standby servers is a function of
the location of the previous checkpoint and status of WAL archiving.
This parameter has no effect on restartpoints. This parameter can only
be set in the postgresql.conf file or on the server command line.
"

http://www.postgresql.org/docs/9.1/interactive/warm-standby.html#STREAMING-REPLICATION

"If you use streaming replication without file-based continuous
archiving, you have to set wal_keep_segments in the master to a value
high enough to ensure that old WAL segments are not recycled too early,
while the standby might still need them to catch up. If the standby
falls behind too much, it needs to be reinitialized from a new base
backup. If you set up a WAL archive that's accessible from the standby,
wal_keep_segments is not required as the standby can always use the
archive to catch up."


    /
    /

    My understanding is that warm standby and hot standby do log shipping
    and there is a greater window for transactions not to be send to the
    standby because WAL XLOG must be filled.


Hot versus warm standby refer to whether it is possible to run read only
queries on the standby in the first case or not in the second case.



    Whereas Streaming replication basically sends at the transaction level?


The difference you are looking for is log shipping versus streaming,
where log shipping moves complete WAL files and streaming streams the
same files.

WLM: I still am having trouble with distinction. By the same files do
you mean XLOG WALs?

Yes. The WALs created in the xlog directory are the basis for replication. Leaving out logical replication, not available in 9.1, the only way to do replication using the Postgres core tools is to move those files from the master to the standby. You can either ship them whole which is log shipping or you can stream them, which is streaming. It also possible, which is what you are seeing, to do the belt and suspenders approach. That is to use streaming for the responsiveness, but have a parallel process that ships the whole logs to a archive directory where they can be accessed also. Having an archive directory also allows for PITR(Point In Time Recovery), but that is another topic:

http://www.postgresql.org/docs/9.1/static/continuous-archiving.html


See here for more detail: WLM: Reading now :)

http://www.postgresql.org/docs/9.1/interactive/high-availability.html

On Thu, Dec 17, 2015 at 10:37 AM, Adrian Klaver
<adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>> wrote:

    On 12/17/2015 07:17 AM, Will McCormick wrote:

        I inherited a 9.1 replication environment

        Few basic questions that I can't find clear answers /
        clarifications for
        if possible:

        3 types of replication in 9.1 I've read about from the offical docs:

        1) warm standby
        2) hot standby
        3) streaming replication

        I'm using streaming replication I believe,  the only indication
        I have
        is that  there is the primary_conninfo on the standby. Is this
        the only
        indication?


    On standby:

    http://www.postgresql.org/docs/9.1/interactive/functions-admin.html
    "
    pg_last_xlog_receive_location()

    Get last transaction log location received and synced to disk by
    streaming replication. While streaming replication is in progress
    this will increase monotonically. If recovery has completed this
    will remain static at the value of the last WAL record received and
    synced to disk during recovery. If streaming replication is
    disabled, or if it has not yet started, the function returns NULL."


        Is it possible to get if using streaming replication under normal
        operations?

        /cp: cannot stat
        `/opt/postgres/9.1/archive/000000070000000F00000057':
        No such file or directory/

        /cp: cannot stat
        `/opt/postgres/9.1/archive/000000070000000F00000057':
        No such file or directory/

        /LOG:  streaming replication successfully connected to primary/

        /FATAL:  could not receive data from WAL stream: FATAL:
        requested WAL
        segment 000000070000000F00000057 has already been removed/


    Assuming above is from standby log, correct?

    The cp lines would seem to indicate a restore_command in the standby
    recovery.conf, is that the case?:

    http://www.postgresql.org/docs/9.1/static/archive-recovery-settings.html

    restore_command (string)


    The FATAL indicates that the WAL file has already been recycled on
    the master.

    See:

    http://www.postgresql.org/docs/9.1/interactive/runtime-config-replication.html#RUNTIME-CONFIG-REPLICATION-MASTER

    "wal_keep_segments (integer)

         Specifies the minimum number of past log file segments kept in
    the pg_xlog directory, in case a standby server needs to fetch them
    for streaming replication. Each segment is normally 16 megabytes. If
    a standby server connected to the primary falls behind by more than
    wal_keep_segments segments, the primary might remove a WAL segment
    still needed by the standby, in which case the replication
    connection will be terminated. (However, the standby server can
    recover by fetching the segment from archive, if WAL archiving is in
    use.)

         This sets only the minimum number of segments retained in
    pg_xlog; the system might need to retain more segments for WAL
    archival or to recover from a checkpoint. If wal_keep_segments is
    zero (the default), the system doesn't keep any extra segments for
    standby purposes, so the number of old WAL segments available to
    standby servers is a function of the location of the previous
    checkpoint and status of WAL archiving. This parameter has no effect
    on restartpoints. This parameter can only be set in the
    postgresql.conf file or on the server command line.
    "

    http://www.postgresql.org/docs/9.1/interactive/warm-standby.html#STREAMING-REPLICATION

    "If you use streaming replication without file-based continuous
    archiving, you have to set wal_keep_segments in the master to a
    value high enough to ensure that old WAL segments are not recycled
    too early, while the standby might still need them to catch up. If
    the standby falls behind too much, it needs to be reinitialized from
    a new base backup. If you set up a WAL archive that's accessible
    from the standby, wal_keep_segments is not required as the standby
    can always use the archive to catch up."


        /
        /

        My understanding is that warm standby and hot standby do log
        shipping
        and there is a greater window for transactions not to be send to the
        standby because WAL XLOG must be filled.


    Hot versus warm standby refer to whether it is possible to run read
    only queries on the standby in the first case or not in the second case.



        Whereas Streaming replication basically sends at the transaction
        level?


    The difference you are looking for is log shipping versus streaming,
    where log shipping moves complete WAL files and streaming streams
    the same files.

    See here for more detail:

    http://www.postgresql.org/docs/9.1/interactive/high-availability.html




        I'm sure this is somewhat misinformed!


        Thanks,


        Will



    --
    Adrian Klaver
    adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>




--
Adrian Klaver
adrian.klaver@aklaver.com

pgsql-general by date:

Previous
From: Adrian Klaver
Date:
Subject: Re: - PostgreSQL Replication Types
Next
From: Will McCormick
Date:
Subject: Re: - PostgreSQL Replication Types