Odp: Re: Re: [ADMIN] [ADMIN] Replication mode Master-Slave - maintenance question. - Mailing list pgsql-admin

From czezz
Subject Odp: Re: Re: [ADMIN] [ADMIN] Replication mode Master-Slave - maintenance question.
Date
Msg-id d7d9880c8d564356b6a0f8e7d75fdda7@gwp
Whole thread Raw
In response to Re: [ADMIN] Replication mode Master-Slave - maintenance question.  (John Scalia <jayknowsunix@gmail.com>)
List pgsql-admin
Hi,
thanks for answers.
That's the result of: select * from pg_stat_replication;

postgres=# select * from pg_stat_replication;
pid | usesysid | usename | application_name | client_addr | client_hostname | client_port |         backend_start         |   state   | sent_location | write_location | flush_location | replay_location | sync_priority | sync_state
-----+----------+---------+------------------+-------------+-----------------+-------------+-------------------------------+-----------+---------------+----------------+----------------+-----------------+---------------+------------
135 |    24576 | rep     | walreceiver      | 172.17.0.3  |                 |       56154 | 2017-03-29 12:12:22.800513+00 | streaming | 0/96000000    | 0/95FFFFF0     | 0/95FFFFF0     | 0/95FFFFF0      |             0 | async
(1 row)

Best regsrds,
czezz

Dnia 28 marca 2017 17:52 John Scalia <jayknowsunix@gmail.com> napisał(a):

Per your questions,

1: Yes, AFAIK.
2: Yes, again, but you need to look in the primary database's pg_stat_replication table and the last column indicates the type of replication.
3: Yes, and potentially, depending on how long the database is unavailable, those WAL segments may have been deleted.

For question 3, that's the best reason to move to a new version of PostgreSQL, one that supports replication slots.

On Tue, Mar 28, 2017 at 11:12 AM, czezz <czezz@o2.pl> wrote:
Hi John,
thanks again for your reply. I took some time to read a bit of documentation.

My version of PostgreSQL: 9.2.20.
Master-Slave setup is asynchronous/streaming replication/warm standby configuration - and for the moment it's OK for my needs like that.


Still, some things are not quite clear to me.
This is my recovery.conf:
   $ vi /var/lib/pgsql/9.2/data/recovery.conf
   restore_command = 'gunzip -c /datastore/WAL/%f > %p'
   standby_mode = 'on'
   primary_conninfo = 'host=MASTER_IP_ port=5432 user=rep password=password'
   trigger_file = '/var/lib/pgsql/MASTER.trigger'

NOTE! NOTE! NOTE!
/datastore/WAL/ - is NFS share on MASTER machine.

1. restore_command - it restores DB on SLAVE form archive logs AND this one is executed only once per startup. Correct?
2. primary_conninfo (this parameter means also: streaming replication, correct?) - once restore_command is done, it starts to read MASTER's pg_xlogs and replicate it AND this is happening constantly/nearly in real time. Correct?
3. In order to stop and later start again SLAVE, access to MASTER's archive logs is essential (keeping in the mind they are not recycled too early). Correct?

Dnia 24 marca 2017 13:03 John Scalia <jayknowsunix@gmail.com> napisał(a):

You may need to study up then PostgreSQL replication. Synchronous attaches the slave to the master through a replication slot, if you're using 9.4 or newer, you can check that on the master by querying pg_stat_replication, but the slave's recovery.conf file has a setting for which slot to use. Prior to 9.4 the named slot was not present but it still used a replication channel, but that would appear in the same table on the master. The other way to do replication is simply with WAL file shipping, but you can and should use WAL file shipping even if a replication channel or slot is in use. Think of it as insurance. With a WAL replica, all the slave does is read and process WAL files as they arrive. The master has no concept that a slave is attached. Thus it is possible over time for a master to expire WAL segments that the slave never received (if it was unavailable). The move to replication slots in 9.4 eliminated this possibility as master knows of the slave(s) and will not expire WAL segments until all of them have received the files. That's the quick and dirty explanation, but like I began, you maybe ought to study how to do replication the way that suits your firm best.

On Fri, Mar 24, 2017 at 6:05 AM, czezz <czezz@o2.pl> wrote:
Hi John,
thanks for this reply.

Honestly I do not know is it synchronous or asynchronous. I didnt set any parameter like this.
Is there a quick way to verify that?

Test case scenario I tested so far was: stop SLAVE for some time (2-3 hours) and do a lot of INSERTS on MASTER.
Afterwards,  when SLAVE is up again it synced delta from MASTER's WAL files.
Do it worked as I expected.

Why are you saying that "extended period of time" will be an issue? Is there any time limit for SLAVE to be down or you are just saying about worse case scenario? 
I assume that I will never lose any WAL files on MASTER.

Best regards,
czezz

Dnia 23 marca 2017 19:06 John Scalia <jayknowsunix@gmail.com> napisał(a):

You did not indicate whether you're Master-Slave setup is synchronous or asynchronous, but yes, what you described should work. The slave should have a recovery.conf file and the presence of that will cause that slave to wind through all WAL files present before it resumes further processing. At least this is what happens in synchronous streaming mode. Your problem might be if the slave is off for an extended period of time or if one of the WAL files disappears. At that point the slave would not have all the transactions for consistency.
Again,to your question near the end. It depends on how you're set up. Generally though slaves do make the request to their master.

On Thu, Mar 23, 2017 at 1:47 PM, czezz <czezz@o2.pl> wrote:
Hi,
I have set my Postgres into replication mode Master-Slave and it works good.
Though, I have some question about maintaining it.

Let me start from the beginning. How I set it up (briefly):
1. Stand Alone postgres (future MASTER) has been set to archive data to: /datastore/WAL (archive_command = 'cp %p /datastore/WAL/%f && gzip /datastore/WAL/%f ')
2. I have shared /datastore/WAL over NFS
3. I prepared SLAVE machine and created initial database out of MASTER's WAL-archival (over NFS) by preparing following file:

   $ vi /var/lib/pgsql/9.2/data/recovery.conf
   restore_command = 'gunzip -c /datastore/WAL/%f > %p'
   standby_mode = 'on'
   primary_conninfo = 'host=MASTER_IP_ port=5432 user=rep password=password'
   trigger_file = '/var/lib/pgsql/MASTER.trigger'

4. One last thing - I keep archive_mode = off on the SLAVE.

The above works perfect.

Now, here I need your help to understand what is happening in case following scenario.

SCENARIO:
I turn off SLAVE for some time (and MASTER receives a lot of new data) and later I turn SLAVE on.
Is it true/correct:
- SLAVE restores missing data/delta over NFS (from MASTER's archival). This is due to existing /var/lib/pgsql/9.2/data/recovery.conf and entry: restore_command = 'gunzip -c /datastore/WAL/%f > %p'
- Once restore is done, SLAVE will keep replicate new data from MASTER (btw. it is SLAVE who request replication, correct?)

Can anyone please confirm that my above assumption is correct and if not correct it?

Best regards,
czezz



pgsql-admin by date:

Previous
From: Tom Lane
Date:
Subject: Re: ERROR: XX000: cache lookup failed for index 2399344195
Next
From: Douglas Brown
Date:
Subject: Re: Integrating Postgresql with Active Directory