Thread: Replication with 9.4

Replication with 9.4

From

Madovsky

Date:

03 October 2015, 14:11:05

Hi,

I would like to fix a issue I'm facing of with the version 9.4 streaming
replication.
is it possible to set on the fly the synchronous commit on the master
(or standby?)
which only sync commit the hot standby node used by the client who has a
read only sql session on?
example:
node1 node2 node3 are hot standby with replication slots (physical) on
master 1
node1 has the priority 1 since it's first on the names list
now a client open a ready only session on node3 and start a transaction
update but
within this transaction a sql select is done on the hot standby from the
updated table

thanks

Franck

Re: Replication with 9.4

From

Michael Paquier

Date:

03 October 2015, 14:48:27

On Sat, Oct 3, 2015 at 8:09 PM, Madovsky <infos@madovsky.org> wrote:
> I would like to fix a issue I'm facing of with the version 9.4 streaming
> replication.
> is it possible to set on the fly the synchronous commit on the master (or
> standby?)
> which only sync commit the hot standby node used by the client who has a
> read only sql session on?

By referring to the docs:
http://www.postgresql.org/docs/devel/static/warm-standby.html#SYNCHRONOUS-REPLICATION
Synchronous replication gives the insurance that a transaction has
been flushed to the disk of the standby which is in sync, aka the one
with the lowest priority depending on the nodes currently connected.
This does not ensure that the transaction has been *replayed* on the
standby. You are sure that the transaction data is available. Hence if
you wish to know that a transaction in a standby is running a
transaction with enough data replayed, you should make the WAL
position of the master necessary for the transaction of the standby
something that your application is aware of.
--
Michael

Re: Replication with 9.4

From

Edson Richter

Date:

03 October 2015, 15:46:54

---- Madovsky escreveu ----

> Hi,
>
> I would like to fix a issue I'm facing of with the version 9.4 streaming
> replication.
> is it possible to set on the fly the synchronous commit on the master
> (or standby?)
> which only sync commit the hot standby node used by the client who has a
> read only sql session on?
> example:
> node1 node2 node3 are hot standby with replication slots (physical) on
> master 1
> node1 has the priority 1 since it's first on the names list
> now a client open a ready only session on node3 and start a transaction
> update

Sorry for my ignorance, but can you start a "transaction update" over a read only connection?

Edson

but
> within this transaction a sql select is done on the hot standby from the
> updated table
>
> thanks
>
> Franck
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general

Re: Replication with 9.4

From

Madovsky

Date:

03 October 2015, 16:15:29

On 10/3/2015 5:46 AM, Edson Richter wrote:

---- Madovsky escreveu ----
> Hi,
>
> I would like to fix a issue I'm facing of with the version 9.4 streaming
> replication.
> is it possible to set on the fly the synchronous commit on the master
> (or standby?)
> which only sync commit the hot standby node used by the client who has a
> read only sql session on?
> example:
> node1 node2 node3 are hot standby with replication slots (physical) on
> master 1
> node1 has the priority 1 since it's first on the names list
> now a client open a ready only session on node3 and start a transaction
> update
Sorry for my ignorance, but can you start a "transaction update" over a read only connection?
Edson
but
> within this transaction a sql select is done on the hot standby from the
> updated table
>
> thanks
>
> Franck
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general

Obviously not, my script open 2 sessions

Re: Replication with 9.4

From

Madovsky

Date:

03 October 2015, 16:22:03

On 10/3/2015 4:48 AM, Michael Paquier wrote:
> On Sat, Oct 3, 2015 at 8:09 PM, Madovsky <infos@madovsky.org> wrote:
>> I would like to fix a issue I'm facing of with the version 9.4 streaming
>> replication.
>> is it possible to set on the fly the synchronous commit on the master (or
>> standby?)
>> which only sync commit the hot standby node used by the client who has a
>> read only sql session on?
> By referring to the docs:
> http://www.postgresql.org/docs/devel/static/warm-standby.html#SYNCHRONOUS-REPLICATION
> Synchronous replication gives the insurance that a transaction has
> been flushed to the disk of the standby which is in sync, aka the one
> with the lowest priority depending on the nodes currently connected.
> This does not ensure that the transaction has been *replayed* on the
> standby. You are sure that the transaction data is available. Hence if
> you wish to know that a transaction in a standby is running a
> transaction with enough data replayed, you should make the WAL
> position of the master necessary for the transaction of the standby
> something that your application is aware of.

I really well understood Michael thanks,
the docs doesn't cover if the sync priorities can be changed
so one node can be considered fully sync and the other only async
thus to minimize sync request overhead...
usually a client connect to a node would like to see the results
on the node where  he has a session on.
I just wanted to avoid a SELECT request to the master and
stay on the HOT STANDBY for all read requests.
my script open 2 session, on on the master and one on the hot standby
in case of block transactions.

Re: Replication with 9.4

From

Michael Paquier

Date:

03 October 2015, 16:55:10

On Sat, Oct 3, 2015 at 10:20 PM, Madovsky <infos@madovsky.org> wrote:
>
>
> On 10/3/2015 4:48 AM, Michael Paquier wrote:
>>
>> On Sat, Oct 3, 2015 at 8:09 PM, Madovsky <infos@madovsky.org> wrote:
>>>
>>> I would like to fix a issue I'm facing of with the version 9.4 streaming
>>> replication.
>>> is it possible to set on the fly the synchronous commit on the master (or
>>> standby?)
>>> which only sync commit the hot standby node used by the client who has a
>>> read only sql session on?
>>
>> By referring to the docs:
>>
>> http://www.postgresql.org/docs/devel/static/warm-standby.html#SYNCHRONOUS-REPLICATION
>> Synchronous replication gives the insurance that a transaction has
>> been flushed to the disk of the standby which is in sync, aka the one
>> with the lowest priority depending on the nodes currently connected.
>> This does not ensure that the transaction has been *replayed* on the
>> standby. You are sure that the transaction data is available. Hence if
>> you wish to know that a transaction in a standby is running a
>> transaction with enough data replayed, you should make the WAL
>> position of the master necessary for the transaction of the standby
>> something that your application is aware of.
>
>
> I really well understood Michael thanks,
> the docs doesn't cover if the sync priorities can be changed
> so one node can be considered fully sync and the other only async
> thus to minimize sync request overhead...

The amount of overhead of a node is something that needs to be
evaluated externally of the Postgres backend, then you could always
adjust synchronous_standby_names to change the priorities as you wish.
You can for example do so with libpq or psql using ALTER SYSTEM
combined with "SELECT pg_reload_conf();". The configuration will be be
reloaded at the next query loop in a backup once it catches the
changes of the parameter via SIGHUP.

> usually a client connect to a node would like to see the results
> on the node where  he has a session on.
> I just wanted to avoid a SELECT request to the master and
> stay on the HOT STANDBY for all read requests.
> my script open 2 session, on on the master and one on the hot standby
> in case of block transactions.

Requesting the master would be necessary, still I don't really get why
you don't want to query the master for read queries... You could for
example plug on top of the master pgbouncer if you have many
connections, but well at this stage I have no idea of what is your use
case.
--
Michael

Re: Replication with 9.4

From

Madovsky

Date:

04 October 2015, 00:39:56


On 10/3/2015 6:55 AM, Michael Paquier wrote:
> On Sat, Oct 3, 2015 at 10:20 PM, Madovsky <infos@madovsky.org> wrote:
>>
>> On 10/3/2015 4:48 AM, Michael Paquier wrote:
>>> On Sat, Oct 3, 2015 at 8:09 PM, Madovsky <infos@madovsky.org> wrote:
>>>> I would like to fix a issue I'm facing of with the version 9.4 streaming
>>>> replication.
>>>> is it possible to set on the fly the synchronous commit on the master (or
>>>> standby?)
>>>> which only sync commit the hot standby node used by the client who has a
>>>> read only sql session on?
>>> By referring to the docs:
>>>
>>> http://www.postgresql.org/docs/devel/static/warm-standby.html#SYNCHRONOUS-REPLICATION
>>> Synchronous replication gives the insurance that a transaction has
>>> been flushed to the disk of the standby which is in sync, aka the one
>>> with the lowest priority depending on the nodes currently connected.
>>> This does not ensure that the transaction has been *replayed* on the
>>> standby. You are sure that the transaction data is available. Hence if
>>> you wish to know that a transaction in a standby is running a
>>> transaction with enough data replayed, you should make the WAL
>>> position of the master necessary for the transaction of the standby
>>> something that your application is aware of.
>>
>> I really well understood Michael thanks,
>> the docs doesn't cover if the sync priorities can be changed
>> so one node can be considered fully sync and the other only async
>> thus to minimize sync request overhead...
> The amount of overhead of a node is something that needs to be
> evaluated externally of the Postgres backend, then you could always
> adjust synchronous_standby_names to change the priorities as you wish.
> You can for example do so with libpq or psql using ALTER SYSTEM
> combined with "SELECT pg_reload_conf();". The configuration will be be
> reloaded at the next query loop in a backup once it catches the
> changes of the parameter via SIGHUP.
>
>> usually a client connect to a node would like to see the results
>> on the node where  he has a session on.
>> I just wanted to avoid a SELECT request to the master and
>> stay on the HOT STANDBY for all read requests.
>> my script open 2 session, on on the master and one on the hot standby
>> in case of block transactions.
> Requesting the master would be necessary, still I don't really get why
> you don't want to query the master for read queries... You could for
> example plug on top of the master pgbouncer if you have many
> connections, but well at this stage I have no idea of what is your use
> case.

Your idea is interesting, but unfortunately not dynamic and not for a
per user basis.
like we can change synchronous_commit on the fly and per block
transactions so
why not the same for standby priority?
I'm trying to use the master for write only.

Re: Replication with 9.4

From

Michael Paquier

Date:

04 October 2015, 01:30:57

On Sun, Oct 4, 2015 at 6:38 AM, Madovsky wrote:
> On 10/3/2015 6:55 AM, Michael Paquier wrote:
>> On Sat, Oct 3, 2015 at 10:20 PM, Madovsky wrote:
>> Requesting the master would be necessary, still I don't really get why
>> you don't want to query the master for read queries... You could for
>> example plug on top of the master pgbouncer if you have many
>> connections, but well at this stage I have no idea of what is your use
>> case.
>
>
> Your idea is interesting, but unfortunately not dynamic and not for a per
> user basis.
> like we can change synchronous_commit on the fly and per block transactions
> so
> why not the same for standby priority?
> I'm trying to use the master for write only.

There can be only one sync standby at a time, and please note again
the difference between WAL flush and WAL replay. Synchonous
replication ensures that the former has been done, not the latter. As
far as this thread goes, it seems to me that you are taking a more
bug-prone approach on your application-side which could be solved by
just querying the master... Another idea would be to add some
meta-data in the schema to check the validity of the things replayed,
like a state-of-data-per-user or a data-version-number-per-user in a
dedicated table, this would avoid having to make the application
WAL-aware, still you would need to make the application aware of this
meta-data in some way, which would surely require to query the master
or update some cache or a file on client side. Still I don't see why
you would need to use this approach, and no reason is given to justify
*why* this would be needed in your case. But well I guess I cannot
stop you to do so if you wish to do it :)
--
Michael

Re: Replication with 9.4

From

Michael Paquier

Date:

04 October 2015, 13:47:13

(Seems like you forgot to push the Reply-all button)

On Sun, Oct 4, 2015 at 7:01 PM, Madovsky wrote:
> On 10/3/2015 3:30 PM, Michael Paquier wrote:
>>  and no reason is given to justify *why* this would be needed in your case
> reason for a choice can be often an issue for other :D
>
> I thought that postgresql 9.4  user could change on the fly with
> synchronous_commit from local to on for ex
> which hotstandby would become in sync and which in async to avoid a big
> latency in case of let's say 100 hot standby.
> it was an idea, a concept to let the master write and update the nodes, like
> a queen bee ;)
> but I'm afraid it's not possible, so maybe future version of pg will do it,
> for now  read from the master is my only solution.

Well, Thomas Munro (adding him in CC) has sent for integration with
9.6 a patch that would cover your need, by adding to
synchronous_commit a mode called 'apply', in which case a master would
wait for the transaction to be applied on standby before committing
locally:
http://www.postgresql.org/message-id/CAEepm=1fqkivL4V-OTPHwSgw4aF9HcoGiMrCW-yBtjipX9gsag@mail.gmail.com
Perhaps you could help with the review of the patch, this has stalled
a bit lately.
Regards,
--
Michael

Re: Replication with 9.4

From

Madovsky

Date:

04 October 2015, 14:04:57


On 10/4/2015 3:47 AM, Michael Paquier wrote:
> (Seems like you forgot to push the Reply-all button)
>
> On Sun, Oct 4, 2015 at 7:01 PM, Madovsky wrote:
>> On 10/3/2015 3:30 PM, Michael Paquier wrote:
>>>   and no reason is given to justify *why* this would be needed in your case
>> reason for a choice can be often an issue for other :D
>>
>> I thought that postgresql 9.4  user could change on the fly with
>> synchronous_commit from local to on for ex
>> which hotstandby would become in sync and which in async to avoid a big
>> latency in case of let's say 100 hot standby.
>> it was an idea, a concept to let the master write and update the nodes, like
>> a queen bee ;)
>> but I'm afraid it's not possible, so maybe future version of pg will do it,
>> for now  read from the master is my only solution.
> Well, Thomas Munro (adding him in CC) has sent for integration with
> 9.6 a patch that would cover your need, by adding to
> synchronous_commit a mode called 'apply', in which case a master would
> wait for the transaction to be applied on standby before committing
> locally:
> http://www.postgresql.org/message-id/CAEepm=1fqkivL4V-OTPHwSgw4aF9HcoGiMrCW-yBtjipX9gsag@mail.gmail.com
> Perhaps you could help with the review of the patch, this has stalled
> a bit lately.
> Regards,
Brilliant, however I'm not to caught everything in this thread.
I would love to contribute to this patch, but I have absolutely no
C/C++ experience (webdev languages only).
to tell simple, when SET LOCAL synchronous_commit TO ON is used on the fly
for the current transaction it would be great to have an other option to
choose
if this setting is for all standby or a specific one, leaving the other
async... gotcha? :)

Re: Replication with 9.4

From

Thomas Munro

Date:

06 October 2015, 02:27:55

On Sun, Oct 4, 2015 at 11:47 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> (Seems like you forgot to push the Reply-all button)
>
> On Sun, Oct 4, 2015 at 7:01 PM, Madovsky wrote:
>> On 10/3/2015 3:30 PM, Michael Paquier wrote:
>>>  and no reason is given to justify *why* this would be needed in your case
>> reason for a choice can be often an issue for other :D
>>
>> I thought that postgresql 9.4  user could change on the fly with
>> synchronous_commit from local to on for ex
>> which hotstandby would become in sync and which in async to avoid a big
>> latency in case of let's say 100 hot standby.
>> it was an idea, a concept to let the master write and update the nodes, like
>> a queen bee ;)
>> but I'm afraid it's not possible, so maybe future version of pg will do it,
>> for now  read from the master is my only solution.
>
> Well, Thomas Munro (adding him in CC) has sent for integration with
> 9.6 a patch that would cover your need, by adding to
> synchronous_commit a mode called 'apply', in which case a master would
> wait for the transaction to be applied on standby before committing
> locally:
> http://www.postgresql.org/message-id/CAEepm=1fqkivL4V-OTPHwSgw4aF9HcoGiMrCW-yBtjipX9gsag@mail.gmail.com
> Perhaps you could help with the review of the patch, this has stalled
> a bit lately.

That patch (or something more sophisticated long those lines) is a
small piece of a bigger puzzle, though it might be enough if you only
have one standby, are prepared to block until manual intervention if
that standby fails, and don't mind potentially lumpy apply
performance.  See also the work being done to separate wal writing
from wal applying for smoother performance[1], and handle multiple
synchronous standbys[2].  But there is another piece of the puzzle
IMHO: how to know reliably that the standby that you are talking to
guarantees causal consistency, while also allowing standbys to
fail/drop out gracefully, and I'm currently working on an idea for
that.

Of course you can make your own causal consistency today if you are
prepared to have your clients explicitly wait for WAL to be applied.
You can call pg_current_xlog_location() on the master after
committing, and then wait until pg_last_xlog_replay_location() reports
that that LSN has been applied on any standby you talk to at the start
of any transaction that wants causal reads.  You could wrap the
waiting up in a user defined function
wait_for_xlog_replay_location(lsn, timeout) which could do a naive
poll/sleep loop (or do something more efficient with latches in core
code).  For example, imagine a client that inserts some new accounts
and then causes a large number of workers to regenerate some reports
that must include the new accounts against a pool of standbys.  It
just needs to give them the LSN they should wait for first.  And if
you don't want to pass LSNs around but don't mind introducing some
extra conservative lag, those workers could call
pg_current_xlog_location() on the master themselves to get some
arbitrary recent LSN and then wait for that to be applied before they
start their work on the standbys.

The explicit wait-for-LSN approach pushes the waiting over to readers
who want causal reads, instead of writer (the master), which still
might be interesting for some cases even if we do finish up with a
good optional master-waits system.  But a master-waits system will
allow naive clients to see up to date data no matter where they run
their queries (or be kicked off by standbys that can't guarantee that)
without having to think about LSNs and replication machinery, and I
think that would be a very useful feature.

[1] http://www.postgresql.org/message-id/flat/CA+U5nMJifauXvVbx=v3UbYbHO3Jw2rdT4haL6CCooEDM5=4ASQ@mail.gmail.com
[2] http://www.postgresql.org/message-id/flat/CAOG9ApHYCPmTypAAwfD3_V7sVOkbnECFivmRc1AxhB40ZBSwNQ@mail.gmail.com

--
Thomas Munro
http://www.enterprisedb.com

Re: Replication with 9.4

From

Thomas Munro

Date:

02 December 2015, 03:57:38

On Tue, Oct 6, 2015 at 12:27 PM, Thomas Munro <thomas.munro@enterprisedb.com> wrote:

On Sun, Oct 4, 2015 at 11:47 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> (Seems like you forgot to push the Reply-all button)
>
> On Sun, Oct 4, 2015 at 7:01 PM, Madovsky wrote:
>> On 10/3/2015 3:30 PM, Michael Paquier wrote:
>>> and no reason is given to justify *why* this would be needed in your case
>> reason for a choice can be often an issue for other :D
>>
>> I thought that postgresql 9.4 user could change on the fly with
>> synchronous_commit from local to on for ex
>> which hotstandby would become in sync and which in async to avoid a big
>> latency in case of let's say 100 hot standby.
>> it was an idea, a concept to let the master write and update the nodes, like
>> a queen bee ;)
>> but I'm afraid it's not possible, so maybe future version of pg will do it,
>> for now read from the master is my only solution.
>
> Well, Thomas Munro (adding him in CC) has sent for integration with
> 9.6 a patch that would cover your need, by adding to
> synchronous_commit a mode called 'apply', in which case a master would
> wait for the transaction to be applied on standby before committing
> locally:
> http://www.postgresql.org/message-id/CAEepm=1fqkivL4V-OTPHwSgw4aF9HcoGiMrCW-yBtjipX9gsag@mail.gmail.com
> Perhaps you could help with the review of the patch, this has stalled
> a bit lately.

That patch (or something more sophisticated long those lines) is a
small piece of a bigger puzzle, though it might be enough if you only
have one standby, are prepared to block until manual intervention if
that standby fails, and don't mind potentially lumpy apply
performance. See also the work being done to separate wal writing
from wal applying for smoother performance[1], and handle multiple
synchronous standbys[2]. But there is another piece of the puzzle
IMHO: how to know reliably that the standby that you are talking to
guarantees causal consistency, while also allowing standbys to
fail/drop out gracefully, and I'm currently working on an idea for
that.

FYI I posted the resulting proposal and patch over on the -hackers list. Feedback, ideas, flames welcome as always.

http://www.postgresql.org/message-id/flat/CAEepm=0n_OxB2_pNntXND6aD85v5PvADeUY8eZjv9CBLk=zNXA@mail.gmail.com

Thomas Munro
http://www.enterprisedb.com