Re: better architecture? - Mailing list pgsql-general

From Adrian Klaver
Subject Re: better architecture?
Date
Msg-id 54709BCF.4030704@aklaver.com
Whole thread Raw
In response to Re: better architecture?  (zach cruise <zachc1980@gmail.com>)
Responses Re: better architecture?  (zach cruise <zachc1980@gmail.com>)
List pgsql-general
On 11/21/2014 07:38 PM, zach cruise wrote:
> On 11/20/14, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
>> On 11/20/2014 04:57 PM, zach cruise wrote:
>>> On 11/20/14, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
>>>> On 11/20/2014 12:30 PM, zach cruise wrote:
>>>>>>
>>>>>> For more info see:
>>>>>>
>>>>>> http://www.postgresql.org/docs/9.3/interactive/continuous-archiving.html
>>>>> to be clear- i change my 2 VMs setup {"1. master (dev) - 2. slave
>>>>> (prod) setup"} to 3 VMs {"1. master (dev) - 2. slave (prod) setup - 3.
>>>>> archive (wal)"}.
>>>>>
>>>>> but what do i gain?
>>>>
>>>> Extra protection against failure, maybe.
>>>>
>>>> So:
>>>>
>>>>              --->  WAL Archive ---
>>>>              |                    |
>>>>              |    Streaming       |
>>>> master ---  --------------------> slave
>>>>
>>>> If the direct link between the master and slave goes down, the slave can
>>>> still get WALs from the archive. If the archive machine goes down you
>>>> still have the direct link. If you take the slave down the master can
>>>> still push WALs to the archive. This assumes the 'machines' are actually
>>>> separated and connecting through different networks. You say you are
>>>> using VMs, but not where they are running. If they are all running on
>>>> the same machine running through the same network link then you really
>>>> do not have protection against network issues. The same if the host
>>>> machine goes down. This is one of those pen and paper times, when you
>>>> sketch out the arrangement and start doing what ifs.
>>
>> First of all, the below is really in need of whiteboard/paper diagram to
>> keep track of the moving parts. That being said here it goes:
>>
>>>
>>> master, slave and archive can be 3 separate VMs on 1 host, with their
>>> clones on 2nd and 3rd hosts.
>>
>> I can see the above being a potential nightmare. I am not sure how you
>> ensure that the cloning process results in clones that exactly mirror
>> the state of the originals at a particular point in time. Failing in
>> that would seem to me to lead to no end of issues in the replication
>> process.
>>
>>>
>>> a follow-up question on WAL recycling: ("When WAL archiving is being
>>> done, the log segments must be archived before being recycled or
>>> removed" from
>>> http://www.postgresql.org/docs/9.3/static/wal-configuration.html)
>>>
>>> say streaming is off-
>>> * if both master and archive are down, slave is still up and running.
>>> yes?
>>
>> Yes.
>>
>>> * if master writes data when archive is down, it will copy over to
>>> slave when archive is back up. yes?
>>
>> If streaming is off and you are doing archiving then it will copy over
>> to the archive.
>>
>>> * but if WAL is recycled before archive is back up, it will not copy
>>> over to slave. yes?
>>
>> The issue here as pointed out previously is that the WALs will stack up
>> on the master because it will not be able to archive them. So then you
>> run into a potential of of space issue on the master. From here:
>>
>> http://www.postgresql.org/docs/9.3/interactive/continuous-archiving.html
>>
>> "It is important that the archive command return zero exit status if and
>> only if it succeeds. Upon getting a zero result, PostgreSQL will assume
>> that the file has been successfully archived, and will remove or recycle
>> it. However, a nonzero status tells PostgreSQL that the file was not
>> archived; it will try again periodically until it succeeds."
>>
>> ...
>>
>> "While designing your archiving setup, consider what will happen if the
>> archive command fails repeatedly because some aspect requires operator
>> intervention or the archive runs out of space. For example, this could
>> occur if you write to tape without an autochanger; when the tape fills,
>> nothing further can be archived until the tape is swapped. You should
>> ensure that any error condition or request to a human operator is
>> reported appropriately so that the situation can be resolved reasonably
>> quickly. The pg_xlog/ directory will continue to fill with WAL segment
>> files until the situation is resolved. (If the file system containing
>> pg_xlog/ fills up, PostgreSQL will do a PANIC shutdown. No committed
>> transactions will be lost, but the database will remain offline until
>> you free some space.)"
>>
>>
>>> see my concern with a separate archive is if archive is down and
>>> master gets stuck retrying to push the same segment again and again,
>>> there may be a problem in recovery when archive is back up. no?
>>
>> See above.
>
> also streaming is working (checked with wireshark) and WAL is being
> written to by master, but if i comment out streaming in pg_hba.conf,
> WAL is not copying over to slave?!

Do you have a restore_command set on the slave?:

http://www.postgresql.org/docs/9.3/static/archive-recovery-settings.html#RESTORE-COMMAND

per the docs here:

http://www.postgresql.org/docs/9.3/static/standby-settings.html

standby_mode (boolean)

     Specifies whether to start the PostgreSQL server as a standby. If
this parameter is on, the server will not stop recovery when the end of
archived WAL is reached, but will keep trying to continue recovery by
fetching new WAL segments using restore_command and/or by connecting to
the primary server as specified by the primary_conninfo setting.


>
> how can i verify that WAL archiving is working for slave?

Look in the logs or use the following functions:

http://www.postgresql.org/docs/9.3/interactive/functions-admin.html#FUNCTIONS-RECOVERY-CONTROL

>
>


--
Adrian Klaver
adrian.klaver@aklaver.com


pgsql-general by date:

Previous
From: Anil Menon
Date:
Subject: Re: Performance question
Next
From: zach cruise
Date:
Subject: Re: better architecture?