Thread: Time-Delayed Standbys
Hi all,
The attached patch is a continuation of Robert's work [1].
I made some changes:
- compute recoveryUntilDelayTime in XLOG_XACT_COMMIT and XLOG_XACT_COMMIT_COMPACT checks
- don't care about clockdrift because it's an admin problem.
Regards,
[1] http://www.postgresql.org/message-id/BANLkTi==TTzHDqWzwJDjmOf__8YuA7L1jw@mail.gmail.com
--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog sobre TI: http://fabriziomello.blogspot.com
>> Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
Attachment
Hi Royes,

I'm sorry for my late review... I feel potential of your patch in PG replication function, and it might be something useful for all people. I checked your patch and have some comments for improvement. I haven't executed details of unexpected situations yet. But I think that the following problem 2 is an important functionality problem. So I ask you to solve that problem first.

* Regress test
No problem.

* Problem1
Your patch does not code recovery.conf.sample about recovery_time_delay.
Please add it.

* Problem2
When I set time-delayed standby and start the standby server, I cannot access the standby server by psql. It is because PG is in first starting recovery, which cannot be accessed by psql. I think that time-delayed standby should only delay the recovery position; it must not affect other functionality.

I didn't test recovery in a master server with recovery_time_delay. If you have detailed test results of these cases, please send them to me.

My first easy review of your patch is that all.

Regards,
--
Mitsumasa KONDO
NTT Open Source Software Center
On Fri, Nov 29, 2013 at 5:49 AM, KONDO Mitsumasa <kondo.mitsumasa@lab.ntt.co.jp> wrote:
> Hi Royes,
>
> I'm sorry for my late review...
>
No problem...
> I feel potential of your patch in PG replication function, and it might be something useful for all people. I check your patch and have some comment for improvement. I haven't executed detail of unexpected sutuation yet. But I think that under following problem2 is important functionality problem. So I ask you to solve the problem in first.
>
> * Regress test
> No problem.
>
> * Problem1
> Your patch does not code recovery.conf.sample about recovery_time_delay.
> Please add it.
>
Fixed.
> * Problem2
> When I set time-delayed standby and start standby server, I cannot access stanby server by psql. It is because PG is in first starting recovery which cannot access by psql. I think that time-delayed standby is only delayed recovery position, it must not affect other functionality.
>
> I didn't test recoevery in master server with recovery_time_delay. If you have detail test result of these cases, please send me.
>
Well, I could not reproduce the problem that you described.
I ran the following test:
1) Clusters
- build master
- build slave, attach it to the master using SR, and set recovery_time_delay to 1min
2) Stop the slave
3) Run some transactions on the master using pgbench to generate a lot of archives
4) Start the slave, connect to it using psql, and in another session watch the archive recovery log
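For anyone who wants to repeat this, the standby's recovery.conf would look roughly like the sketch below. The connection string and delay value are made up; the parameter appears to take milliseconds, judging by the values quoted elsewhere in this thread (30000 ≈ 30s, 3000000 = 50min):

```
# hypothetical recovery.conf for a time-delayed standby (patch applied)
standby_mode = 'on'
primary_conninfo = 'host=master.example.com port=5432 user=replicator'
recovery_time_delay = 60000    # 1 minute, in milliseconds
```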
> My first easy review of your patch is that all.
>
Thanks.
Regards,
--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog sobre TI: http://fabriziomello.blogspot.com
>> Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
Attachment
Hi Fabrizio,

looks good to me. I did some testing on 9.2.4, 9.2.5 and HEAD. It applies and compiles w/o errors or warnings. I set up a master and two hot standbys replicating from the master, one with 5 minutes delay and one without delay. After that I created a new database and generated some test data:

CREATE TABLE test (val INTEGER);
INSERT INTO test (val) (SELECT * FROM generate_series(0, 1000000));

The non-delayed standby nearly instantly had the data replicated, the delayed standby was replicated after exactly 5 minutes. I did not notice any problems, errors or warnings.

Greetings,
CK

--
Christian Kruse   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Tue, Dec 3, 2013 at 2:33 PM, Christian Kruse <christian@2ndquadrant.com> wrote:
Hi Fabrizio,
looks good to me. I did some testing on 9.2.4, 9.2.5 and HEAD. It
applies and compiles w/o errors or warnings. I set up a master and two
hot standbys replicating from the master, one with 5 minutes delay and
one without delay. After that I created a new database and generated
some test data:
CREATE TABLE test (val INTEGER);
INSERT INTO test (val) (SELECT * FROM generate_series(0, 1000000));
The non-delayed standby nearly instantly had the data replicated, the
delayed standby was replicated after exactly 5 minutes. I did not
notice any problems, errors or warnings.
Thanks for your review Christian...
Regards,
--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog sobre TI: http://fabriziomello.blogspot.com
>> Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
On Tue, Dec 3, 2013 at 12:36 PM, Fabrízio de Royes Mello <fabriziomello@gmail.com> wrote:
> On Tue, Dec 3, 2013 at 2:33 PM, Christian Kruse <christian@2ndquadrant.com>
> wrote:
>> The non-delayed standby nearly instantly had the data replicated, the
>> delayed standby was replicated after exactly 5 minutes. I did not
>> notice any problems, errors or warnings.
>
> Thanks for your review Christian...

So, I proposed this patch previously and I still think it's a good idea, but it got voted down on the grounds that it didn't deal with clock drift. I view that as insufficient reason to reject the feature, but others disagreed. Unless some of those people have changed their minds, I don't think this patch has much future here.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 12/03/2013 10:46 AM, Robert Haas wrote:
> So, I proposed this patch previously and I still think it's a good
> idea, but it got voted down on the grounds that it didn't deal with
> clock drift. I view that as insufficient reason to reject the
> feature, but others disagreed. Unless some of those people have
> changed their minds, I don't think this patch has much future here.

I would agree that it is a good idea.

Joshua D. Drake

--
Command Prompt, Inc. - http://www.commandprompt.com/  509-416-6579
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, Postgres-XC, @cmdpromptinc
For my dreams of your image that blossoms
a rose in the deeps of my heart. - W.B. Yeats
On 2013-12-03 13:46:28 -0500, Robert Haas wrote:
> So, I proposed this patch previously and I still think it's a good
> idea, but it got voted down on the grounds that it didn't deal with
> clock drift. I view that as insufficient reason to reject the
> feature, but others disagreed.

I really fail to see why clock drift should be this patch's responsibility. It's not like the world would go under^W data corruption would ensue if the clocks drift. Your standby would get delayed imprecisely. Big deal. From what I know of potential users of this feature, they would set it to at the very least 30min - that's WAY above the range for acceptable clock-drift on servers.

Greetings,

Andres Freund

--
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 18 October 2013 19:03, Fabrízio de Royes Mello <fabriziomello@gmail.com> wrote:

> The attached patch is a continuation of Robert's work [1].

Reviewing v2...

> I made some changes:
> - use of Latches instead of pg_usleep, so we don't have to wakeup regularly.

OK

> - call HandleStartupProcInterrupts() before CheckForStandbyTrigger() because
> might change the trigger file's location

OK

> - compute recoveryUntilDelayTime in XLOG_XACT_COMMIT and
> XLOG_XACT_COMMIT_COMPACT checks

Why just those? Why not aborts and restore points also?

> - don't care about clockdrift because it's an admin problem.

Few minor points on things:

* The code with comment "Clear any previous recovery delay time" is in the wrong place; move it down or remove it completely. Setting the delay to zero doesn't prevent calling recoveryDelay(), so that logic looks wrong anyway.

* The loop exit in recoveryDelay() is inelegant; it should break if <= 0.

* There's a spelling mistake in the sample.

* The patch has trailing whitespace in one place.

and one important point...

* The delay loop happens AFTER we check for a pause. Which means if the user notices a problem on a commit, then hits the pause button on the standby, the pause will have no effect and the next commit will be applied anyway. Maybe just one commit, but it's an off-by-one error that removes the benefit of the patch. So I think we need to test this again after we finish delaying:

    if (xlogctl->recoveryPause)
        recoveryPausesHere();

We need to explain in the docs that this is intended only for use in a live streaming deployment. It will have little or no meaning in a PITR.

I think recovery_time_delay should be called <something>_apply_delay to highlight the point that it is the apply of records that is delayed, not the receipt. And hence the need to document that sync rep is NOT slowed down by setting this value. And to make the name consistent with other parameters, I suggest min_standby_apply_delay.

We also need to document caveats about the patch: it only delays on timestamped WAL records, and other records may be applied sooner than the delay in some circumstances, so it is not a way to avoid all cancellations. We also need to document that the behaviour of the patch is to apply all data received as quickly as possible once triggered, so the specified delay does not slow down promoting the server to a master. That might also be seen as a negative behaviour, since promoting the master effectively sets recovery_time_delay to zero.

I will handle the additional documentation, if you can update the patch with the main review comments. Thanks.

--
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
(2013/11/30 5:34), Fabrízio de Royes Mello wrote:
> On Fri, Nov 29, 2013 at 5:49 AM, KONDO Mitsumasa <kondo.mitsumasa@lab.ntt.co.jp> wrote:
> > * Problem1
> > Your patch does not code recovery.conf.sample about recovery_time_delay.
> > Please add it.
>
> Fixed.

OK. It seems no problem.

> > * Problem2
> > When I set time-delayed standby and start standby server, I cannot access
> > standby server by psql. It is because PG is in first starting recovery which
> > cannot access by psql. I think that time-delayed standby is only delayed
> > recovery position, it must not affect other functionality.
> >
> > I didn't test recovery in master server with recovery_time_delay. If you have
> > detail test result of these cases, please send me.
>
> Well, I could not reproduce the problem that you described.
>
> I ran the following test:
>
> 1) Clusters
> - build master
> - build slave and attach to the master using SR and config recovery_time_delay to 1min.
>
> 2) Stop the slave
>
> 3) Run some transactions on the master using pgbench to generate a lot of archives
>
> 4) Start the slave and connect to it using psql and in another session I can see
> all archive recovery log

Hmm... I had thought it was my mistake in reading your email, but it reproduces again. When I set a small recovery_time_delay (=30000), it might work correctly. However, when I set a long recovery_time_delay (=3000000), it didn't work. My reproduced operation log is the following:

> [mitsu-ko@localhost postgresql]$ bin/pgbench -T 30 -c 8 -j4 -p5432
> starting vacuum...end.
> transaction type: TPC-B (sort of)
> scaling factor: 10
> query mode: simple
> number of clients: 8
> number of threads: 4
> duration: 30 s
> number of transactions actually processed: 68704
> latency average: 3.493 ms
> tps = 2289.196747 (including connections establishing)
> tps = 2290.175129 (excluding connections establishing)
> [mitsu-ko@localhost postgresql]$ vim slave/recovery.conf
> [mitsu-ko@localhost postgresql]$ bin/pg_ctl -D slave start
> server starting
> [mitsu-ko@localhost postgresql]$ LOG: database system was shut down in recovery at 2013-12-03 10:26:41 JST
> LOG: entering standby mode
> LOG: consistent recovery state reached at 0/5C4D8668
> LOG: redo starts at 0/5C4000D8
> [mitsu-ko@localhost postgresql]$ FATAL: the database system is starting up
> FATAL: the database system is starting up
> FATAL: the database system is starting up
> FATAL: the database system is starting up
> FATAL: the database system is starting up
> [mitsu-ko@localhost postgresql]$ bin/psql -p6543
> psql: FATAL: the database system is starting up
> [mitsu-ko@localhost postgresql]$ bin/psql -p6543
> psql: FATAL: the database system is starting up

I attached my postgresql.conf and recovery.conf. It will be reproduced.

I think that your patch needs recovery flags such as ArchiveRecoveryRequested and InArchiveRecovery etc., because time-delayed standby works only in a replication situation. And I hope that it doesn't harm standby server startup and archive recovery. Is it wrong with your image?

I think this patch has a lot of potential, however I think that standby functionality is more important than this feature. And we might need to discuss which behavior is best in this patch.

Regards,
--
Mitsumasa KONDO
NTT Open Source Software Center
Attachment
(2013/12/04 4:00), Andres Freund wrote:
> On 2013-12-03 13:46:28 -0500, Robert Haas wrote:
>> So, I proposed this patch previously and I still think it's a good
>> idea, but it got voted down on the grounds that it didn't deal with
>> clock drift. I view that as insufficient reason to reject the
>> feature, but others disagreed.
>
> I really fail to see why clock drift should be this patch's
> responsibility. It's not like the world would go under^W data corruption
> would ensue if the clocks drift. Your standby would get delayed
> imprecisely. Big deal. From what I know of potential users of this
> feature, they would set it to at the very least 30min - that's WAY above
> the range for acceptable clock-drift on servers.

Yes. I think that the purpose of this patch is a long time delay in the standby server, not a precisely timed short delay.

Regards,
--
Mitsumasa KONDO
NTT Open Source Software Center
On 2013-12-04 11:13:58 +0900, KONDO Mitsumasa wrote:
> > 4) Start the slave and connect to it using psql and in another session I can see
> > all archive recovery log
>
> Hmm... I had thought it was my mistake in reading your email, but it reproduces again.
> When I set a small recovery_time_delay (=30000), it might work correctly.
> However, when I set a long recovery_time_delay (=3000000), it didn't work.
> [...]
> I attached my postgresql.conf and recovery.conf. It will be reproduced.

So, you brought up a standby and it took more time to become consistent because it waited on commits? That's the problem? If so, I don't think that's a bug?

Greetings,

Andres Freund

--
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Hi,

On 04/12/13 11:13, KONDO Mitsumasa wrote:
> > 1) Clusters
> > - build master
> > - build slave and attach to the master using SR and config recovery_time_delay to 1min.
> >
> > 2) Stop the slave
> >
> > 3) Run some transactions on the master using pgbench to generate a lot of archives
> >
> > 4) Start the slave and connect to it using psql and in another session I can see
> > all archive recovery log
>
> Hmm... I had thought it was my mistake in reading your email, but it reproduces again.
> When I set a small recovery_time_delay (=30000), it might work correctly.
> However, when I set a long recovery_time_delay (=3000000), it didn't work.
> […]

I'm not sure if I understand your problem correctly. I'll try to summarize; please correct me if I'm wrong: You created a master node and a hot standby with a 3000000ms delay. Then you stopped the standby, ran pgbench and started the hot standby again. It did not get in line with the master. Is this correct?

I don't see a problem here… the standby should not be in sync with the master, it should be delayed. I did step by step what you did and after 50 minutes (3000000ms) the standby was at the same level the master was.

Did I misunderstand you?

Regards,
Christian Kruse

--
Christian Kruse   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
2013/12/4 Andres Freund <andres@2ndquadrant.com>
> On 2013-12-04 11:13:58 +0900, KONDO Mitsumasa wrote:
> > Hmm... I had thought it was my mistake in reading your email, but it reproduces again.
> > When I set a small recovery_time_delay (=30000), it might work correctly.
> > However, when I set a long recovery_time_delay (=3000000), it didn't work.
> > [...]
> > I attached my postgresql.conf and recovery.conf. It will be reproduced.
>
> So, you brought up a standby and it took more time to become consistent
> because it waited on commits? That's the problem? If so, I don't think
> that's a bug?

When it happened, psql cannot connect to the standby server at all. I think this behavior is not good. It should only delay the recovery position, and old delayed table data should be visible. Not being able to connect to the server is not the hoped-for behavior.

If you think this behavior is the best, I will set it ready for committer, and the committer will fix it better.

Regards,
--
Mitsumasa KONDO
NTT Open Source Software Center
On 2013-12-04 22:47:47 +0900, Mitsumasa KONDO wrote:
> When it happened, psql cannot connect to the standby server at all. I think
> this behavior is not good.
> It should only delay the recovery position, and old delayed table data
> should be visible.

That doesn't sound like a good plan - even if the clients cannot connect yet, you can still promote the server. Just not taking delay into consideration at that point seems like it would possibly surprise users rather badly in situations where they really cannot use such surprises.

Greetings,

Andres Freund

--
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Hi,

On 2013-12-03 19:33:16 +0000, Simon Riggs wrote:
> > - compute recoveryUntilDelayTime in XLOG_XACT_COMMIT and
> > XLOG_XACT_COMMIT_COMPACT checks
>
> Why just those? Why not aborts and restore points also?

What would the advantage of waiting on anything but commits be? If it's not a commit, the action won't change the state of the database (yesyes, there are exceptions, but those don't have a timestamp)...

Greetings,

Andres Freund

--
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
2013/12/4 Christian Kruse <christian@2ndquadrant.com>
> You created a master node and a hot standby with a 3000000ms delay. Then
> you stopped the standby, ran pgbench and started the hot standby
> again. It did not get in line with the master. Is this correct?
No. First, I started the master and executed pgbench. Second, I started the standby with a 3000000ms (50min) delay. Then I could not connect to the standby server by psql at all. I'm not sure why the standby did not start. It might be because the delay feature disturbs the REDO loop during the standby's first start-up.
> I don't see a problem here… the standby should not be in sync with the
> master, it should be delayed. I did step by step what you did and
> after 50 minutes (3000000ms) the standby was at the same level the
> master was.
I think we should be able to connect to the standby server at any time, regardless of the delay option.
> Did I misunderstand you?
I'm not sure... You might be right, or another better way might exist.
Regards,
--
Mitsumasa KONDO
NTT Open Source Software Center
2013/12/4 Andres Freund <andres@2ndquadrant.com>
> On 2013-12-04 22:47:47 +0900, Mitsumasa KONDO wrote:
> > When it happened, psql cannot connect standby server at all. I think this
> > behavior is not good.
> > It should only delay recovery position and can seen old delay table data.
>
> That doesn't sound like a good plan - even if the clients cannot connect
> yet, you can still promote the server.

I'm not sure I follow your argument, but doesn't that stray from the purpose of this patch?

> Just not taking delay into
> consideration at that point seems like it would possibly surprise users
> rather badly in situations they really cannot use such surprises.

Hmm... I think users will be surprised... I think it is easy to fix the behavior using a recovery flag. So we had better wait for other comments.
Regards,
Mitsumasa KONDO
NTT Open Source Software Center
Robert Haas <robertmhaas@gmail.com> wrote:
> So, I proposed this patch previously and I still think it's a
> good idea, but it got voted down on the grounds that it didn't
> deal with clock drift. I view that as insufficient reason to
> reject the feature, but others disagreed. Unless some of those
> people have changed their minds, I don't think this patch has
> much future here.

There are many things that a system admin can get wrong. Failing to supply this feature because the sysadmin might not be running ntpd (or equivalent) correctly seems to me to be like not having the software do fsync because the sysadmin might not have turned off write-back buffering on drives without persistent storage. Either way, poor system management can defeat the feature. Either way, I see no reason to withhold the feature from those who manage their systems in a sane fashion.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Hi,

On 04/12/13 07:22, Kevin Grittner wrote:
> There are many things that a system admin can get wrong. Failing
> to supply this feature because the sysadmin might not be running
> ntpd (or equivalent) correctly seems to me to be like not having
> the software do fsync because the sysadmin might not have turned
> off write-back buffering on drives without persistent storage.
> Either way, poor system management can defeat the feature. Either
> way, I see no reason to withhold the feature from those who manage
> their systems in a sane fashion.

I agree. But maybe we should add a warning in the documentation about time syncing?

Greetings,
CK

--
Christian Kruse   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
src/backend/access/transam/xlog.c:5889: trailing whitespace.
On 3 December 2013 18:46, Robert Haas <robertmhaas@gmail.com> wrote:
> So, I proposed this patch previously and I still think it's a good
> idea, but it got voted down on the grounds that it didn't deal with
> clock drift. I view that as insufficient reason to reject the
> feature, but others disagreed. Unless some of those people have
> changed their minds, I don't think this patch has much future here.

I had that objection and others. Since then many people have requested this feature and have persuaded me that this is worth having and that my objections are minor points. I now agree with the need for the feature, almost as written.

--
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Thu, Dec 5, 2013 at 1:45 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On 3 December 2013 18:46, Robert Haas <robertmhaas@gmail.com> wrote:
> > So, I proposed this patch previously and I still think it's a good
> > idea, but it got voted down on the grounds that it didn't deal with
> > clock drift. I view that as insufficient reason to reject the
> > feature, but others disagreed. Unless some of those people have
> > changed their minds, I don't think this patch has much future here.
>
> I had that objection and others. Since then many people have requested
> this feature and have persuaded me that this is worth having and that
> my objections are minor points. I now agree with the need for the
> feature, almost as written.
Not recalling the older thread, but as for the "breaks on clock drift"
objection, I think we can fairly easily make that situation "good
enough". Just have IDENTIFY_SYSTEM return the current timestamp on the
master, and refuse to start if the time difference is too great. Yes,
that doesn't catch the case when the machines are in perfect sync when
they start up and drift *later*, but it will catch the most common cases
I bet. I think that's good enough that we can accept the feature, given
that *most* people will have ntp, and that it's a very useful feature
for those people. And we could help people who run into it because of a
simple config error.
Or maybe the suggested patch already does this, in which case ignore that part :)
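Magnus's check might be sketched like this. This is a hypothetical helper, not anything in the patch, and IDENTIFY_SYSTEM does not report such a timestamp today; the idea is simply that the standby compares a master-reported timestamp against its own clock at connect time and bails out if the skew exceeds a threshold:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical sketch of Magnus's idea (not part of the patch): the
 * master reports its current timestamp at connect time, and the standby
 * refuses to start if the difference against its own clock exceeds a
 * threshold.  Timestamps are microseconds since a shared epoch. */
static bool
clock_skew_acceptable(int64_t master_ts, int64_t local_ts, int64_t max_skew_us)
{
    int64_t skew = master_ts - local_ts;

    if (skew < 0)
        skew = -skew;           /* drift can go either way */
    return skew <= max_skew_us;
}
```

As Magnus notes, this only catches skew present at connect time, not drift that develops afterwards.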
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
On 5 December 2013 08:51, Magnus Hagander <magnus@hagander.net> wrote:
> Not recalling the older thread, but it seems the "breaks on clock drift", I
> think we can fairly easily make that situation "good enough". Just have
> IDENTIFY_SYSTEM return the current timestamp on the master, and refuse to
> start if the time difference is too great. Yes, that doesn't catch the case
> when the machines are in perfect sync when they start up and drift *later*,
> but it will catch the most common cases I bet. But I think that's good
> enough that we can accept the feature, given that *most* people will have
> ntp, and that it's a very useful feature for those people. But we could help
> people who run into it because of a simple config error..
>
> Or maybe the suggested patch already does this, in which case ignore that
> part :)

I think the very nature of *this* feature is that it doesn't *require*
the clocks to be exactly in sync, even though that is the basis for
measurement.

The setting of this parameter for sane usage would be minimum 5 mins,
but more likely 30 mins, 1 hour or more.

In that case, a few seconds drift either way makes no real difference
to this feature.

So IMHO, without prejudice to other features that may be more
critically reliant on time synchronisation, we are OK to proceed with
this specific feature.

--
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Thu, Dec 5, 2013 at 7:57 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
On 5 December 2013 08:51, Magnus Hagander <magnus@hagander.net> wrote:
> Not recalling the older thread, but it seems the "breaks on clock drift", I
> think we can fairly easily make that situation "good enough". Just have
> IDENTIFY_SYSTEM return the current timestamp on the master, and refuse to
> start if the time difference is too great. Yes, that doesn't catch the case
> when the machines are in perfect sync when they start up and drift *later*,
> but it will catch the most common cases I bet. But I think that's good
> enough that we can accept the feature, given that *most* people will have
> ntp, and that it's a very useful feature for those people. But we could help
> people who run into it because of a simple config error..
>
> Or maybe the suggested patch already does this, in which case ignore that
> part :)

I think the very nature of *this* feature is that it doesn't *require*
the clocks to be exactly in sync, even though that is the basis for
measurement.
The setting of this parameter for sane usage would be minimum 5 mins,
but more likely 30 mins, 1 hour or more.
In that case, a few seconds drift either way makes no real difference
to this feature.
So IMHO, without prejudice to other features that may be more
critically reliant on time synchronisation, we are OK to proceed with
this specific feature.
Hi all,
I saw all of your comments. I'm a bit busy with some customer issues (it has been a crazy week), but I'll reply and/or apply your suggestions later.
Thanks for all review and sorry to delay in reply.
Regards,
--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog sobre TI: http://fabriziomello.blogspot.com
>> Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
On Tue, Dec 3, 2013 at 5:33 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>
> > - compute recoveryUntilDelayTime in XLOG_XACT_COMMIT and
> > XLOG_XACT_COMMIT_COMPACT checks
>
> Why just those? Why not aborts and restore points also?
>
I think it makes no sense to execute the delay after aborts and/or restore points, because they don't change data on a standby server.
> > - don't care about clockdrift because it's an admin problem.
>
> Few minor points on things
>
> * The code with comment "Clear any previous recovery delay time" is in
> wrong place, move down or remove completely. Setting the delay to zero
> doesn't prevent calling recoveryDelay(), so that logic looks wrong
> anyway.
>
Fixed.
> * The loop exit in recoveryDelay() is inelegant, should break if <= 0
>
Fixed.
> * There's a spelling mistake in sample
>
Fixed.
> * The patch has whitespace in one place
>
Fixed.
> and one important point...
>
> * The delay loop happens AFTER we check for a pause. Which means if
> the user notices a problem on a commit, then hits pause button on the
> standby, the pause will have no effect and the next commit will be
> applied anyway. Maybe just one commit, but its an off by one error
> that removes the benefit of the patch. So I think we need to test this
> again after we finish delaying
>
> if (xlogctl->recoveryPause)
> recoveryPausesHere();
>
Fixed.
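Taken together, Simon's two points about the loop (break once the remaining delay reaches zero, and re-check the pause request only after the delay has elapsed) might be sketched like this; the function and variable names are illustrative, not the patch's actual code:

```c
#include <stdint.h>

/* Sketch of the apply-delay computation discussed above (illustrative
 * names, not the actual patch).  All times are in milliseconds.  The
 * record may be applied once the result drops to zero or below. */
static int64_t
remaining_apply_delay(int64_t commit_ts_ms, int64_t delay_ms, int64_t now_ms)
{
    return (commit_ts_ms + delay_ms) - now_ms;
}

/*
 * The startup process would then loop roughly like this:
 *
 *     for (;;)
 *     {
 *         int64_t msecs = remaining_apply_delay(commit_ts, delay, now());
 *
 *         if (msecs <= 0)
 *             break;              // Simon's point: break if <= 0
 *         wait_on_latch(msecs);   // sleep, but stay interruptible
 *     }
 *     if (xlogctl->recoveryPause) // re-check pause *after* delaying
 *         recoveryPausesHere();
 */
```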
>
> We need to explain in the docs that this is intended only for use in a
> live streaming deployment. It will have little or no meaning in a
> PITR.
>
Fixed.
> I think recovery_time_delay should be called
> <something>_apply_delay
> to highlight the point that it is the apply of records that is
> delayed, not the receipt. And hence the need to document that sync rep
> is NOT slowed down by setting this value.
>
Fixed.
> And to make the name consistent with other parameters, I suggest
> min_standby_apply_delay
>
I agree. Fixed!
> We also need to document caveats about the patch, in that it only
> delays on timestamped WAL records and other records may be applied
> sooner than the delay in some circumstances, so it is not a way to
> avoid all cancellations.
>
> We also need to document the behaviour of the patch is to apply all
> data received as quickly as possible once triggered, so the specified
> delay does not slow down promoting the server to a master. That might
> also be seen as a negative behaviour, since promoting the master
> effectively sets recovery_time_delay to zero.
>
> I will handle the additional documentation, if you can update the
> patch with the main review comments. Thanks.
>
Thanks, your help is welcome.
Att,
--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog sobre TI: http://fabriziomello.blogspot.com
>> Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
Attachment
On Thu, Dec 5, 2013 at 11:07 PM, Fabrízio de Royes Mello
<fabriziomello@gmail.com> wrote:
> On Tue, Dec 3, 2013 at 5:33 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>>
>> > - compute recoveryUntilDelayTime in XLOG_XACT_COMMIT and
>> > XLOG_XACT_COMMIT_COMPACT checks
>>
>> Why just those? Why not aborts and restore points also?
>>
>
> I think it makes no sense to execute the delay after aborts and/or restore
> points, because they don't change data on a standby server.

I see no reason to pause for aborts. Aside from the fact that it
wouldn't be reliable in corner cases, as Fabrízio says, there's no
user-visible effect, just as there's no user-visible effect from
replaying a transaction up until just prior to the point where it
commits (which we also do).

Waiting for restore points seems like it potentially makes sense. If
the standby is delayed by an hour, and you create a restore point and
wait 55 minutes, you might expect that you can still kill the standby
and recover it to that restore point.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Fri, Dec 6, 2013 at 1:36 PM, Robert Haas <robertmhaas@gmail.com> wrote:
On Thu, Dec 5, 2013 at 11:07 PM, Fabrízio de Royes Mello
<fabriziomello@gmail.com> wrote:
> On Tue, Dec 3, 2013 at 5:33 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>>
>> > - compute recoveryUntilDelayTime in XLOG_XACT_COMMIT and
>> > XLOG_XACT_COMMIT_COMPACT checks
>>
>> Why just those? Why not aborts and restore points also?
>>
>
> I think it makes no sense to execute the delay after aborts and/or restore
> points, because they don't change data on a standby server.

I see no reason to pause for aborts. Aside from the fact that it
wouldn't be reliable in corner cases, as Fabrízio says, there's no
user-visible effect, just as there's no user-visible effect from
replaying a transaction up until just prior to the point where it
commits (which we also do).
Waiting for restore points seems like it potentially makes sense. If
the standby is delayed by an hour, and you create a restore point and
wait 55 minutes, you might expect that you can still kill the
standby and recover it to that restore point.
Makes sense. Fixed.
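The behaviour settled on in this exchange (delay on commits and restore points, never on aborts) can be sketched as a simple predicate. The enum values here merely stand in for the real WAL record constants in PostgreSQL's xact.h and xlog.h; this is an illustration, not the patch's code:

```c
#include <stdbool.h>

/* Illustrative stand-ins for the WAL record types named in this thread
 * (the real constants live in PostgreSQL's xact.h and xlog.h). */
enum wal_rec
{
    REC_XACT_COMMIT,
    REC_XACT_COMMIT_COMPACT,
    REC_XACT_ABORT,
    REC_RESTORE_POINT
};

/* Delay on commits and restore points (both are user-visible recovery
 * targets carrying a timestamp), never on aborts. */
static bool
record_triggers_delay(enum wal_rec rec)
{
    switch (rec)
    {
        case REC_XACT_COMMIT:
        case REC_XACT_COMMIT_COMPACT:
        case REC_RESTORE_POINT:
            return true;
        default:
            return false;
    }
}
```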
Regards,
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog sobre TI: http://fabriziomello.blogspot.com
>> Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
>> Blog sobre TI: http://fabriziomello.blogspot.com
>> Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
Attachment
Hi Fabrízio,

I test your v4 patch, and send your review comments.

* Fix typo
> 49 -# commited transactions from the master, specify a recovery time delay.
> 49 +# committed transactions from the master, specify a recovery time delay.

* Fix white space
> 177 - if (secs <= 0 && microsecs <=0)
> 177 + if (secs <= 0 && microsecs <=0 )

* Add functionality (I propose)
We can set a negative number at min_standby_apply_delay. I think that this
feature is for a worldwide replication situation. For example, the master
server is in Japan and the slave server is in San Francisco. Japan time is
ahead of San Francisco time. And if we want to delay in this situation, we
may need a negative number in min_standby_apply_delay. So I propose that
the time delay conditional branch change as follows.
> - if (min_standby_apply_delay > 0)
> + if (min_standby_apply_delay != 0)
What do you think? It might also work correctly.

* Problem 1
I read your written document. There is "PITR is not affected".
However, when I run PITR with min_standby_apply_delay=3000000, it cannot
start the server. The log is as follows.
> [mitsu-ko@localhost postgresql]$ bin/pg_ctl -D data2 start
> server starting
> [mitsu-ko@localhost postgresql]$ LOG: database system was interrupted; last known up at 2013-12-08 18:57:00 JST
> LOG: creating missing WAL directory "pg_xlog/archive_status"
> cp: cannot stat `../arc/00000002.history':
> LOG: starting archive recovery
> LOG: restored log file "000000010000000000000041" from archive
> LOG: redo starts at 0/41000028
> LOG: consistent recovery state reached at 0/410000F0
> LOG: database system is ready to accept read only connections
> LOG: restored log file "000000010000000000000042" from archive
> FATAL: cannot wait on a latch owned by another process
> LOG: startup process (PID 30501) exited with exit code 1
> LOG: terminating any other active server processes
We need a recovery flag for controlling the PITR situation.

That's all for now.
If you are busy, please fix it at your own pace. I'm busy too, and I don't
mind waiting. :-)

Regards,
--
Mitsumasa KONDO
NTT Open Source Software Center
2013/12/9 KONDO Mitsumasa <kondo.mitsumasa@lab.ntt.co.jp>
Hi Fabrízio,
I test your v4 patch, and send your review comments.
* Fix typo
> 49 -# commited transactions from the master, specify a recovery time delay.
> 49 +# committed transactions from the master, specify a recovery time delay.
* Fix white space
> 177 - if (secs <= 0 && microsecs <=0)
> 177 + if (secs <= 0 && microsecs <=0 )
* Add functionality (I propose)
We can set a negative number at min_standby_apply_delay. I think that this feature
is for a worldwide replication situation. For example, the master server is in Japan and the slave server is in San Francisco. Japan time is ahead of San Francisco time. And if we want to delay in this situation, we may need a negative number in min_standby_apply_delay. So I propose that the time delay conditional branch change as follows.
> - if (min_standby_apply_delay > 0)
> + if (min_standby_apply_delay != 0)
What do you think? It might also work correctly.
what about using an interval instead of absolute time?
Regards
Pavel
* Problem 1
I read your written document. There is "PITR is not affected".
However, when I run PITR with min_standby_apply_delay=3000000, it cannot start the server. The log is as follows.
[mitsu-ko@localhost postgresql]$ bin/pg_ctl -D data2 start
server starting
[mitsu-ko@localhost postgresql]$ LOG: database system was interrupted; last known up at 2013-12-08 18:57:00 JST
LOG: creating missing WAL directory "pg_xlog/archive_status"
cp: cannot stat `../arc/00000002.history':
LOG: starting archive recovery
LOG: restored log file "000000010000000000000041" from archive
LOG: redo starts at 0/41000028
LOG: consistent recovery state reached at 0/410000F0
LOG: database system is ready to accept read only connections
LOG: restored log file "000000010000000000000042" from archive
FATAL: cannot wait on a latch owned by another process
LOG: startup process (PID 30501) exited with exit code 1
LOG: terminating any other active server processes
We need a recovery flag for controlling the PITR situation.
That's all for now.
If you are busy, please fix it at your own pace. I'm busy too, and I don't mind waiting. :-)
Regards,
--
Mitsumasa KONDO
NTT Open Source Software Center
(2013/12/09 19:36), KONDO Mitsumasa wrote:
> * Problem 1
> I read your written document. There is "PITR is not affected".
> However, when I run PITR with min_standby_apply_delay=3000000, it cannot start
> the server. The log is as follows.
>> [mitsu-ko@localhost postgresql]$ bin/pg_ctl -D data2 start
>> server starting
>> [mitsu-ko@localhost postgresql]$ LOG: database system was interrupted; last
>> known up at 2013-12-08 18:57:00 JST
>> LOG: creating missing WAL directory "pg_xlog/archive_status"
>> cp: cannot stat `../arc/00000002.history':
>> LOG: starting archive recovery
>> LOG: restored log file "000000010000000000000041" from archive
>> LOG: redo starts at 0/41000028
>> LOG: consistent recovery state reached at 0/410000F0
>> LOG: database system is ready to accept read only connections
>> LOG: restored log file "000000010000000000000042" from archive
>> FATAL: cannot wait on a latch owned by another process
>> LOG: startup process (PID 30501) exited with exit code 1
>> LOG: terminating any other active server processes
> We need a recovery flag for controlling the PITR situation.

Add my comment. We have to consider three situations.

1. PITR
2. replication standby
3. replication standby with restore_command

I think this patch cannot delay in situation 1. So I think you should add
only a StandbyModeRequested flag in the conditional branch.

Regards,
--
Mitsumasa KONDO
NTT Open Source Software Center
(2013/12/09 19:35), Pavel Stehule wrote:
> 2013/12/9 KONDO Mitsumasa <kondo.mitsumasa@lab.ntt.co.jp
> <mailto:kondo.mitsumasa@lab.ntt.co.jp>>
>
>     Hi Fabrízio,
>
>     I test your v4 patch, and send your review comments.
>
>     * Fix typo
>     > 49 -# commited transactions from the master, specify a recovery time delay.
>     > 49 +# committed transactions from the master, specify a recovery time delay.
>
>     * Fix white space
>     > 177 - if (secs <= 0 && microsecs <=0)
>     > 177 + if (secs <= 0 && microsecs <=0 )
>
>     * Add functionality (I propose)
>     We can set a negative number at min_standby_apply_delay. I think that
>     this feature is for a worldwide replication situation. For example, the
>     master server is in Japan and the slave server is in San Francisco.
>     Japan time is ahead of San Francisco time. And if we want to delay in
>     this situation, we may need a negative number in
>     min_standby_apply_delay. So I propose that the time delay conditional
>     branch change as follows.
>     > - if (min_standby_apply_delay > 0)
>     > + if (min_standby_apply_delay != 0)
>     What do you think? It might also work correctly.
>
> what about using an interval instead of absolute time?

This is because local time is recorded in the XLOG, and calculating a
global time has a big cost.

Regards,
--
Mitsumasa KONDO
NTT Open Source Software Center
On 2013-12-09 19:51:01 +0900, KONDO Mitsumasa wrote:
> Add my comment. We have to consider three situations.
>
> 1. PITR
> 2. replication standby
> 3. replication standby with restore_command
>
> I think this patch cannot delay in situation 1.

Why?

Greetings,

Andres Freund

--
Andres Freund                 http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 12/04/2013 02:46 AM, Robert Haas wrote:
>> Thanks for your review Christian...
>
> So, I proposed this patch previously and I still think it's a good
> idea, but it got voted down on the grounds that it didn't deal with
> clock drift. I view that as insufficient reason to reject the
> feature, but others disagreed. Unless some of those people have
> changed their minds, I don't think this patch has much future here.

Surely that's the operating system / VM host / sysadmin / whatever's
problem?

The only way to "deal with" clock drift that isn't fragile in the face
of variable latency, etc, is to basically re-implement (S)NTP in order
to find out what the clock difference with the remote is. If we're
going to do that, why not just let the OS deal with it?

It might well be worth complaining about obvious aberrations like
timestamps in the local future - preferably by complaining and not
actually dying.

It does need to be able to cope with a *skewing* clock, but I'd be
surprised if it had any issues there in the first place.

--
Craig Ringer                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
<p dir="ltr"><br /> On 9 Dec 2013 12:16, "Craig Ringer" <<a href="mailto:craig@2ndquadrant.com">craig@2ndquadrant.com</a>>wrote:<p dir="ltr">> The only way to "deal with" clockdrift that isn't fragile in the face<br /> > of variable latency, etc, is to basically re-implement (S)NTP in order<br/> > to find out what the clock difference with the remote is.<br /><p dir="ltr">There's actually an entirelydifferent way to "deal" with clock drift: test "master time" and "slave time" as two different incomparable spaces.Similar to how you would treat measurements in different units.<p dir="ltr">If you do that then you can measure andmanage the delay in the slave between receiving and applying a record and also measure the amount of master server timewhich can be pending. These measurements don't depend at all on time sync between servers.<p dir="ltr">The specifiedfeature depends explicitly on the conversion between master and slave time spaces so it's inevitable that sync wouldbe an issue. It might be nice to print a warning on connection if the time is far out of sync or periodically check.But I don't think reimplementing NTP is a good idea.
(2013/12/09 20:29), Andres Freund wrote:
> On 2013-12-09 19:51:01 +0900, KONDO Mitsumasa wrote:
>> Add my comment. We have to consider three situations.
>>
>> 1. PITR
>> 2. replication standby
>> 3. replication standby with restore_command
>>
>> I think this patch cannot delay in situation 1.
>
> Why?

I have three reasons.

1. It is written in the document. Can we remove it?
2. The name of this feature is "Time-delayed *standbys*", not
   "Time-delayed *recovery*". Can we change it?
3. I think it is unnecessary in master PITR. And if it can delay in
   master PITR, it will become a master at an unexpected timing instead
   of continuing recovery. It is meaningless.

I'd like to ask you what you expect from this feature and how you would
use it.

Regards,
--
Mitsumasa KONDO
NTT Open Source Software Center
On 2013-12-10 13:26:27 +0900, KONDO Mitsumasa wrote:
> (2013/12/09 20:29), Andres Freund wrote:
>> On 2013-12-09 19:51:01 +0900, KONDO Mitsumasa wrote:
>>> Add my comment. We have to consider three situations.
>>>
>>> 1. PITR
>>> 2. replication standby
>>> 3. replication standby with restore_command
>>>
>>> I think this patch cannot delay in situation 1.
>>
>> Why?
>
> I have three reasons.

None of these reasons seem to be of technical nature, right?

> 1. It is written in the document. Can we remove it?
> 2. The name of this feature is "Time-delayed *standbys*", not
>    "Time-delayed *recovery*". Can we change it?

I don't think that'd be a win in clarity. But perhaps somebody else has
a better suggestion?

> 3. I think it is unnecessary in master PITR. And if it can delay in
>    master PITR, it will become a master at an unexpected timing instead
>    of continuing recovery. It is meaningless.

"master PITR"? What's that? All PITR is based on recovery.conf and thus
not really a "master"?

Why should we prohibit using this feature in PITR? I don't see any
advantage in doing so. If somebody doesn't want the delay, they
shouldn't set it in the configuration file. End of story.

There's not really that meaningful a distinction between PITR and
replication using archive_command, especially when using *pause_after.
I think this feature will be used in a lot of scenarios in which PITR
is currently used.

Greetings,

Andres Freund

--
Andres Freund                 http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
(2013/12/10 18:38), Andres Freund wrote:
> "master PITR"? What's that? All PITR is based on recovery.conf and thus
> not really a "master"?

"master PITR" is PITR with "standby_mode = off". It's just recovery from
a base backup. The difference between "master PITR" and "standby" is
that the former gets an independent timeline ID, while the latter keeps
the same timeline ID, following the master server. In the first place,
the purposes are different.

> Why should we prohibit using this feature in PITR? I don't see any
> advantage in doing so. If somebody doesn't want the delay, they
> shouldn't set it in the configuration file. End of story.

Unfortunately, there are a lot of careless users in the world... I think
you have these clients, too.

> There's not really that meaningful a distinction between PITR and
> replication using archive_command, especially when using
> *pause_after.

It is meaningless in "master PITR". It will become a master which has a
new timeline ID at an unexpected timing.

> I think this feature will be used in a lot of scenarios in
> which PITR is currently used.

We have to judge which is better: gaining some potential, or protecting
against mistakes. And we had better wait for the author's comment...

Regards,
--
Mitsumasa KONDO
NTT Open Source Software Center
On 11 December 2013 06:36, KONDO Mitsumasa
<kondo.mitsumasa@lab.ntt.co.jp> wrote:
>> I think this feature will be used in a lot of scenarios in
>> which PITR is currently used.
>
> We have to judge which is better, we get something potential or to protect
> stupid.
> And we had better to wait author's comment...

I'd say just document that it wouldn't make sense to use it for PITR.

There may be some use case we can't see yet, so specifically
prohibiting a use case that is not dangerous seems too much at this
point. I will no doubt be reminded of these words in the future...

--
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
<div dir="ltr"><div class="gmail_extra"><br />On Wed, Dec 11, 2013 at 6:27 AM, Simon Riggs <<a href="mailto:simon@2ndquadrant.com">simon@2ndquadrant.com</a>>wrote:<br />><br />> On 11 December 2013 06:36, KONDOMitsumasa<br /> > <<a href="mailto:kondo.mitsumasa@lab.ntt.co.jp">kondo.mitsumasa@lab.ntt.co.jp</a>> wrote:<br/>><br />> >> I think this feature will be used in a lot of scenarios in<br />> >> which PITRis currently used.<br /> > ><br />> > We have to judge which is better, we get something potential or toprotect<br />> > stupid.<br />> > And we had better to wait author's comment...<br />><br />> I'd sayjust document that it wouldn't make sense to use it for PITR.<br /> ><br />> There may be some use case we can'tsee yet, so specifically<br />> prohibiting a use case that is not dangerous seems too much at this<br />> point.I will no doubt be reminded of these words in the future...<br /> ><br /><br /></div><div class="gmail_extra">Hiall,<br /><br />I tend to agree with Simon, but I confess that I don't liked to delay a server withstandby_mode = 'off'.<br /><br /></div><div class="gmail_extra">The main goal of this patch is delay the Streaming Replication,so if the slave server isn't a hot-standby I think makes no sense to delay it.<br /></div><div class="gmail_extra"><br/></div><div class="gmail_extra">Mitsumasa suggested to add "StandbyModeRequested" in conditionalbranch to skip this situation. I agree with him!<br /></div><div class="gmail_extra"><br /></div><div class="gmail_extra">AndI'll change 'recoveryDelay' (functions, variables) to 'standbyDelay'. 
<br /><br /></div><div class="gmail_extra">Regards,<br/><br /></div><div class="gmail_extra">--<br /> Fabrízio de Royes Mello<br />Consultoria/CoachingPostgreSQL<br />>> Timbira: <a href="http://www.timbira.com.br">http://www.timbira.com.br</a><br/>>> Blog sobre TI: <a href="http://fabriziomello.blogspot.com">http://fabriziomello.blogspot.com</a><br/> >> Perfil Linkedin: <a href="http://br.linkedin.com/in/fabriziomello">http://br.linkedin.com/in/fabriziomello</a><br/>>> Twitter: <a href="http://twitter.com/fabriziomello">http://twitter.com/fabriziomello</a></div></div>
On 2013-12-11 16:37:54 -0200, Fabrízio de Royes Mello wrote:
> On Wed, Dec 11, 2013 at 6:27 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> > >> I think this feature will be used in a lot of scenarios in
> > >> which PITR is currently used.
> > >
> > > We have to judge which is better, we get something potential or to protect
> > > stupid.
> > > And we had better to wait author's comment...
> >
> > I'd say just document that it wouldn't make sense to use it for PITR.
> >
> > There may be some use case we can't see yet, so specifically
> > prohibiting a use case that is not dangerous seems too much at this
> > point. I will no doubt be reminded of these words in the future...

> I tend to agree with Simon, but I confess that I didn't like delaying a
> server with standby_mode = 'off'.

> The main goal of this patch is to delay streaming replication, so if the
> slave server isn't a hot standby I think it makes no sense to delay it.

> Mitsumasa suggested adding "StandbyModeRequested" to the conditional branch
> to skip this situation. I agree with him!

I don't think that position has any merit, sorry: Think about the way
this stuff gets set up. The user creates a new basebackup
(pg_basebackup, manual pg_start/stop_backup, shutdown primary). Then he
creates a recovery.conf by either starting from scratch, using
--write-recovery-conf, or by copying recovery.conf.sample. In none of
these cases will delay be configured.

So, with that in mind, the only way it could have been configured is by
the user *explicitly* writing it into recovery.conf. And now you want to
react to this explicit step by just *silently* ignoring the setting
based on some random criteria (arguments have been made about
hot_standby=on/off and standby_mode=on/off, which aren't directly
related). Why on earth would that be a usability improvement?

Also, you seem to assume there's no point in configuring it for any of
hot_standby=off, standby_mode=off, recovery_target=*. Why?
There's usecases for all of them:

* hot_standby=off: Makes delay useable with wal_level=archive (and thus
  a lower WAL volume)
* standby_mode=off: Configurations that use tools like pg_standby and
  similar simply don't need standby_mode=on. If you want to trigger
  failover from within the restore_command you *cannot* set it.
* recovery_target_*: It can still make sense if you use
  pause_at_recovery_target.

In which scenarios does your restriction actually improve anything?

Greetings,

Andres Freund

--
Andres Freund                 http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Wed, Dec 11, 2013 at 7:47 PM, Andres Freund <andres@2ndquadrant.com> wrote:
>
> I don't think that position has any merit, sorry: Think about the way
> this stuff gets setup. The user creates a new basebackup (pg_basebackup,
> manual pg_start/stop_backup, shutdown primary). Then he creates a
> recovery conf by either starting from scratch, using
> --write-recovery-conf or by copying recovery.conf.sample. In none of
> these cases delay will be configured.
>
> So, with that in mind, the only way it could have been configured is by
> the user *explicitly* writing it into recovery.conf. And now you want to
> to react to this explicit step by just *silently* ignoring the setting
> based on some random criteria (arguments have been made about
> hot_standby=on/off, standby_mode=on/off which aren't directly
> related). Why on earth would that by a usability improvement?
>
> Also, you seem to assume there's no point in configuring it for any of
> hot_standby=off, standby_mode=off, recovery_target=*. Why? There's
> usecases for all of them:
> * hot_standby=off: Makes delay useable with wal_level=archive (and thus
> a lower WAL volume)
> * standby_mode=off: Configurations that use tools like pg_standby and
> similar simply don't need standby_mode=on. If you want to trigger
> failover from within the restore_command you *cannot* set it.
> * recovery_target_*: It can still make sense if you use
> pause_at_recovery_target.
>
> In which scenarios does your restriction actually improve anything?
>
Given your arguments I'm forced to revise my understanding of the problem. You are absolutely right in your assertions. I was not seeing the scenario from this perspective.
Anyway we need to improve docs, any suggestions?
Regards,
--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog sobre TI: http://fabriziomello.blogspot.com
>> Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
(2013/12/12 7:23), Fabrízio de Royes Mello wrote:
> On Wed, Dec 11, 2013 at 7:47 PM, Andres Freund <andres@2ndquadrant.com
> > * hot_standby=off: Makes delay useable with wal_level=archive (and thus
> > a lower WAL volume)
> > * standby_mode=off: Configurations that use tools like pg_standby and
> > similar simply don't need standby_mode=on. If you want to trigger
> > failover from within the restore_command you *cannot* set it.
> > * recovery_target_*: It can still make sense if you use
> > pause_at_recovery_target.

I don't think all of his arguments are right... We can just set
standby_mode=on when we use "min_standby_apply_delay" with pg_standby and
similarly simple tools. However, I tend to agree that we don't need to
prohibit the setting except for standby_mode. So I'd like to propose
changing the parameter name from "min_standby_apply_delay" to
"min_recovery_apply_delay", which is more natural for this feature.

Regards,
--
Mitsumasa KONDO
NTT Open Source Software Center
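For reference, under the proposed rename the setting would sit in recovery.conf roughly like this (a hypothetical fragment; the value, units, and connection details shown are purely illustrative, not taken from the patch):

```
# Delay applying WAL on the standby until at least this much time has
# elapsed since the transaction committed on the master.
# (Name per the rename proposed above; value is illustrative.)
min_recovery_apply_delay = '5min'

standby_mode = 'on'
primary_conninfo = 'host=master.example.com port=5432'
```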
On 12 December 2013 08:19, KONDO Mitsumasa <kondo.mitsumasa@lab.ntt.co.jp> wrote:
> (2013/12/12 7:23), Fabrízio de Royes Mello wrote:
>>
>> On Wed, Dec 11, 2013 at 7:47 PM, Andres Freund <andres@2ndquadrant.com
>> > * hot_standby=off: Makes delay useable with wal_level=archive (and thus
>> > a lower WAL volume)
>> > * standby_mode=off: Configurations that use tools like pg_standby and
>> > similar simply don't need standby_mode=on. If you want to trigger
>> > failover from within the restore_command you *cannot* set it.
>> > * recovery_target_*: It can still make sense if you use
>> > pause_at_recovery_target.
>
> I don't think part of his arguments are right very much... We can just set
> stanby_mode=on when we use "min_standby_apply_delay" with pg_standby and
> similar simply tools. However, I tend to agree with not to need to prohibit
> except for standby_mode. So I'd like to propose that changing parameter name
> of "min_standby_apply_delay" to "min_recovery_apply_delay". It is natural
> for this feature.

OK

--
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 9 December 2013 10:54, KONDO Mitsumasa <kondo.mitsumasa@lab.ntt.co.jp> wrote:
> (2013/12/09 19:35), Pavel Stehule wrote:
>>
>> 2013/12/9 KONDO Mitsumasa <kondo.mitsumasa@lab.ntt.co.jp
>> <mailto:kondo.mitsumasa@lab.ntt.co.jp>>
>>
>>     Hi Fabrízio,
>>
>>     I test your v4 patch, and send your review comments.
>>
>>     * Fix typo
>>     > 49 -# commited transactions from the master, specify a recovery time delay.
>>     > 49 +# committed transactions from the master, specify a recovery time delay.
>>
>>     * Fix white space
>>     > 177 - if (secs <= 0 && microsecs <=0)
>>     > 177 + if (secs <= 0 && microsecs <=0 )
>>
>>     * Add functionality (I propose)
>>     We can set negative number at min_standby_apply_delay. I think that
>>     this feature is for world wide replication situation. For example,
>>     master server is in Japan and slave server is in San Francisco.
>>     Japan time fowards than San Francisco time. And if we want to delay
>>     in this situation, it can need negative number in
>>     min_standby_apply_delay. So I propose that time delay conditional
>>     branch change under following.
>>     > - if (min_standby_apply_delay > 0)
>>     > + if (min_standby_apply_delay != 0)
>>     What do you think? It might also be working collectry.
>>
>> what using interval instead absolute time?
>
> This is because local time is recorded in XLOG. And it has big cost for
> calculating global time.

I agree with your request here, but I don't think negative values are
the right way to implement that, at least it would not be very usable.

My suggestion would be to add the TZ to the checkpoint record. This
way all users of WAL can see the TZ of the master and act accordingly.
I'll do a separate patch for that.

--
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
(2013/12/12 18:09), Simon Riggs wrote:
> On 9 December 2013 10:54, KONDO Mitsumasa <kondo.mitsumasa@lab.ntt.co.jp> wrote:
>> This is because local time is recorded in XLOG. And it has big cost for
>> calculating global time.
>
> I agree with your request here, but I don't think negative values are
> the right way to implement that, at least it would not be very usable.

I think that my proposal is the easiest and simplest way to solve this
problem. And I believe that someone who cannot calculate the difference
between time zones doesn't set up a replication cluster across continents.

> My suggestion would be to add the TZ to the checkpoint record. This
> way all users of WAL can see the TZ of the master and act accordingly.
> I'll do a separate patch for that.

It is something useful for other situations as well. However, it might
lead to a long and complicated discussion... I think that our hope is
to commit this patch in this commit-fest or the next, final commit-fest.

Regards,
--
Mitsumasa KONDO
NTT Open Source Software Center
On 12 December 2013 10:42, KONDO Mitsumasa <kondo.mitsumasa@lab.ntt.co.jp> wrote:
>> I agree with your request here, but I don't think negative values are
>> the right way to implement that, at least it would not be very usable.
>
> I think that my proposal is the easiest and simplist way to solve this
> problem. And I believe that the man who cannot calculate the difference in
> time-zone doesn't set replication cluster across continents.
>
>> My suggestion would be to add the TZ to the checkpoint record. This
>> way all users of WAL can see the TZ of the master and act accordingly.
>> I'll do a separate patch for that.
>
> It is something useful for also other situations. However, it might be
> going to happen long and complicated discussions... I think that our hope is
> to commit this patch in this commit-fest or next final commit-fest.

Agreed on no delay for the delay patch, as shown by my commit. Still
think we need better TZ handling.

--
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 2013-12-12 09:09:21 +0000, Simon Riggs wrote:
> >> * Add functionality (I propose)
> >> We can set negative number at min_standby_apply_delay. I think that
> >> this feature is for world wide replication situation. For example,
> >> master server is in Japan and slave server is in San Francisco. Japan
> >> time fowards than San Francisco time. And if we want to delay in this
> >> situation, it can need negative
> > This is because local time is recorded in XLOG. And it has big cost for
> > calculating global time.

Uhm? Isn't the timestamp in commit records actually a TimestampTz? And
thus essentially stored as UTC? I don't think this problem actually
exists?

> My suggestion would be to add the TZ to the checkpoint record. This
> way all users of WAL can see the TZ of the master and act accordingly.
> I'll do a separate patch for that.

Intuitively I'd say that might be useful - but I am not really sure what
for. And we don't exactly have a great interface for looking at a
checkpoint's data. Maybe add it to the control file instead?

Greetings,

Andres Freund

--
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
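Andres's point can be illustrated with a toy model (a sketch, not PostgreSQL source; `TimestampTz` is represented here simply as microseconds since an epoch, in UTC, and `apply_delay_until` is a hypothetical helper, not a real xlog.c function). Because commit timestamps are already zone-independent, the apply target is a plain addition and no timezone arithmetic is needed:

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of TimestampTz: microseconds since an epoch, in UTC. */
typedef int64_t TimestampTz;

/*
 * Compute the earliest time a commit record may be applied, given a
 * configured delay in milliseconds.  Because both operands are in UTC,
 * the result is the same no matter what timezone either server runs in.
 */
static TimestampTz
apply_delay_until(TimestampTz commit_time, int64_t delay_ms)
{
    return commit_time + delay_ms * INT64_C(1000); /* ms -> microseconds */
}
```

A standby in Tokyo and one in San Francisco that receive the same commit record would compute the same wake-up instant, which is why negative delay values to compensate for zone offsets would be unnecessary under this model.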
On 12 December 2013 11:05, Andres Freund <andres@2ndquadrant.com> wrote:
>> My suggestion would be to add the TZ to the checkpoint record. This
>> way all users of WAL can see the TZ of the master and act accordingly.
>> I'll do a separate patch for that.
>
> Intuitively I'd say that might be useful - but I am not reall sure what
> for. And we don't exactly have a great interface for looking at a
> checkpoint's data. Maybe add it to the control file instead?

That's actually what I had in mind, I just phrased it badly in
mid-thought.

--
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
2013/12/12 Simon Riggs <simon@2ndquadrant.com>
On 12 December 2013 10:42, KONDO Mitsumasa <kondo.mitsumasa@lab.ntt.co.jp> wrote:
>> I agree with your request here, but I don't think negative values are
>> the right way to implement that, at least it would not be very usable.
>
> I think that my proposal is the easiest and simplist way to solve this
> problem. And I believe that the man who cannot calculate the difference in
> time-zone doesn't set replication cluster across continents.
>
>> My suggestion would be to add the TZ to the checkpoint record. This
>> way all users of WAL can see the TZ of the master and act accordingly.
>> I'll do a separate patch for that.
>
> It is something useful for also other situations. However, it might be
> going to happen long and complicated discussions... I think that our hope is
> to commit this patch in this commit-fest or next final commit-fest.

Agreed on no delay for the delay patch, as shown by my commit.
Our forecast was very accurate...
Nice commit, Thanks!
Regards,
--
Mitsumasa KONDO
NTT Open Source Software Center
Simon Riggs <simon@2ndQuadrant.com> writes:
> On 12 December 2013 11:05, Andres Freund <andres@2ndquadrant.com> wrote:
>>> My suggestion would be to add the TZ to the checkpoint record. This
>>> way all users of WAL can see the TZ of the master and act accordingly.
>>> I'll do a separate patch for that.

>> Intuitively I'd say that might be useful - but I am not reall sure what
>> for. And we don't exactly have a great interface for looking at a
>> checkpoint's data. Maybe add it to the control file instead?

> That's actually what I had in mind, I just phrased it badly in mid-thought.

I don't think you realize what a can of worms that would be. There's
no compact representation of "a timezone", unless you are only proposing
to store the UTC offset; and frankly I'm not particularly seeing the point
of that.

			regards, tom lane
On Thu, Dec 12, 2013 at 9:52 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Simon Riggs <simon@2ndQuadrant.com> writes:
>> On 12 December 2013 11:05, Andres Freund <andres@2ndquadrant.com> wrote:
>>>> My suggestion would be to add the TZ to the checkpoint record. This
>>>> way all users of WAL can see the TZ of the master and act accordingly.
>>>> I'll do a separate patch for that.
>
>>> Intuitively I'd say that might be useful - but I am not reall sure what
>>> for. And we don't exactly have a great interface for looking at a
>>> checkpoint's data. Maybe add it to the control file instead?
>
>> That's actually what I had in mind, I just phrased it badly in mid-thought.
>
> I don't think you realize what a can of worms that would be. There's
> no compact representation of "a timezone", unless you are only proposing
> to store the UTC offset; and frankly I'm not particularly seeing the point
> of that.

+1. I can see the point of storing a timestamp in each checkpoint
record, if we don't already, but time zones should be completely
irrelevant to this feature. Everything should be reckoned in seconds
since the epoch.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 12 December 2013 15:03, Robert Haas <robertmhaas@gmail.com> wrote:
> On Thu, Dec 12, 2013 at 9:52 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I don't think you realize what a can of worms that would be. There's
>> no compact representation of "a timezone", unless you are only proposing
>> to store the UTC offset; and frankly I'm not particularly seeing the point
>> of that.
>
> +1. I can see the point of storing a timestamp in each checkpoint
> record, if we don't already, but time zones should be completely
> irrelevant to this feature. Everything should be reckoned in seconds
> since the epoch.

Don't panic guys! I meant UTC offset only. And yes, it may not be
needed, will check.

--
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 12 December 2013 15:19, Simon Riggs <simon@2ndquadrant.com> wrote:
> Don't panic guys! I meant UTC offset only. And yes, it may not be
> needed, will check.

Checked, all non-UTC TZ offsets work without further effort here.

--
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Thu, Dec 12, 2013 at 3:39 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>
> On 12 December 2013 15:19, Simon Riggs <simon@2ndquadrant.com> wrote:
> >
> > Don't panic guys! I meant UTC offset only. And yes, it may not be
> > needed, will check.
>
> Checked, all non-UTC TZ offsets work without further effort here.
>

Thanks!

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog sobre TI: http://fabriziomello.blogspot.com
>> Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
On Thu, Dec 12, 2013 at 3:42 PM, Fabrízio de Royes Mello <fabriziomello@gmail.com> wrote:
>
> On Thu, Dec 12, 2013 at 3:39 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> >
> > On 12 December 2013 15:19, Simon Riggs <simon@2ndquadrant.com> wrote:
> >
> > > Don't panic guys! I meant UTC offset only. And yes, it may not be
> > > needed, will check.
> >
> > Checked, all non-UTC TZ offsets work without further effort here.
> >
>
> Thanks!
>
Reviewing the committed patch I noted that the "CheckForStandbyTrigger()" check after the delay was removed.

If we promote the standby during the delay and don't check the trigger immediately after the delay, then we will replay undesired WAL records.

The attached patch adds this check.
Regards,
--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog sobre TI: http://fabriziomello.blogspot.com
>> Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
Attachment
On 12 December 2013 21:58, Fabrízio de Royes Mello <fabriziomello@gmail.com> wrote:
> Reviewing the committed patch I noted that the "CheckForStandbyTrigger()"
> after the delay was removed.
>
> If we promote the standby during the delay and don't check the trigger
> immediately after the delay, then we will replay undesired WALs records.
>
> The attached patch add this check.

I removed it because it was after the pause. I'll replace it, but
before the pause.

--
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 2013-12-13 11:56:47 +0000, Simon Riggs wrote:
> On 12 December 2013 21:58, Fabrízio de Royes Mello
> <fabriziomello@gmail.com> wrote:
> > Reviewing the committed patch I noted that the "CheckForStandbyTrigger()"
> > after the delay was removed.
> >
> > If we promote the standby during the delay and don't check the trigger
> > immediately after the delay, then we will replay undesired WALs records.
> >
> > The attached patch add this check.
>
> I removed it because it was after the pause. I'll replace it, but
> before the pause.

Doesn't after the pause make more sense? If somebody promoted while we
were waiting, we want to recognize that before rolling forward? The wait
can take a long while after all?

Greetings,

Andres Freund

--
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 13 December 2013 11:58, Andres Freund <andres@2ndquadrant.com> wrote:
> Doesn't after the pause make more sense? If somebody promoted while we
> were waiting, we want to recognize that before rolling forward? The wait
> can take a long while after all?

That would change the way pause currently works, which is OOS for that
patch. I'm happy to discuss such a change, but if agreed, it would need
to apply in all cases, not just this one.

--
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 2013-12-13 13:09:13 +0000, Simon Riggs wrote:
> That would change the way pause currently works, which is OOS for that patch.

But this feature isn't pause itself - it's imo something independent.
Note that we currently
a) check pause again after recoveryApplyDelay(),
b) do check for promotion if the sleep in recoveryApplyDelay() is
   interrupted.
So not checking after the final sleep seems confusing.

Greetings,

Andres Freund

--
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 13 December 2013 13:22, Andres Freund <andres@2ndquadrant.com> wrote:
> But this feature isn't pause itself - it's imo something independent.
> Note that we currently
> a) check pause again after recoveryApplyDelay(),
> b) do check for promotion if the sleep in recoveryApplyDelay() is
>    interrupted. So not checking after the final sleep seems confusing.

I'm proposing the attached patch.

This patch implements a consistent view of recovery pause, which is
that when paused, we don't check for promotion, during or immediately
after. That is user noticeable behaviour and shouldn't be changed
without thought and discussion on a separate thread with a clear
descriptive title. (I might argue in favour of it myself, I'm not yet
decided.)

--
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Attachment
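The ordering debated above can be sketched with a small simulation (assumptions flagged: `delay_sees_promotion` and the `trigger_requested` flag are a toy model of the control flow, not the real recoveryApplyDelay()/CheckForStandbyTrigger() from xlog.c): the trigger is re-checked before each sleep chunk and once more after the final one, so a promotion request arriving during the wait cannot be missed.

```c
#include <assert.h>
#include <stdbool.h>

/* Simulated promotion trigger; in the real server this roughly
 * corresponds to the trigger file checked by CheckForStandbyTrigger(). */
static bool trigger_requested = false;

static bool
check_for_standby_trigger(void)
{
    return trigger_requested;
}

/*
 * Model of the delay loop: "sleep" in 100ms chunks, re-checking for a
 * promotion request before each chunk and once more after the last one.
 * Returns true if promotion was requested, false if the whole delay
 * elapsed undisturbed.
 */
static bool
delay_sees_promotion(int delay_ms)
{
    int waited_ms = 0;

    while (waited_ms < delay_ms)
    {
        if (check_for_standby_trigger())
            return true;        /* stop delaying, promote now */
        waited_ms += 100;       /* stands in for a 100ms sleep */
    }

    /* The final check discussed upthread: without it, a trigger that
     * arrived during the last chunk would go unnoticed and further
     * WAL records could be replayed. */
    return check_for_standby_trigger();
}
```

Where exactly that final check sits relative to the pause check is the behavioural question Simon and Andres are debating; the sketch only shows why skipping it entirely would be surprising.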
On Fri, Dec 13, 2013 at 11:44 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>
> I'm proposing the attached patch.
>
> This patch implements a consistent view of recovery pause, which is
> that when paused, we don't check for promotion, during or immediately
> after. That is user noticeable behaviour and shouldn't be changed
> without thought and discussion on a separate thread with a clear
> descriptive title. (I might argue in favour of it myself, I'm not yet
> decided).
>

In my previous message [1] I attached a patch equal to yours ;-)

Regards,

[1] http://www.postgresql.org/message-id/CAFcNs+qD0AJ=qzhsHD9+v_Mhz0RTBJ=cJPCT_T=uT_JvvnC1FQ@mail.gmail.com

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog sobre TI: http://fabriziomello.blogspot.com
>> Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
On 2013-12-13 13:44:30 +0000, Simon Riggs wrote:
> I'm proposing the attached patch.

Looks good, although I'd move it down below the comment ;)

> This patch implements a consistent view of recovery pause, which is
> that when paused, we don't check for promotion, during or immediately
> after. That is user noticeable behaviour and shouldn't be changed
> without thought and discussion on a separate thread with a clear
> descriptive title. (I might argue in favour of it myself, I'm not yet
> decided).

Some more improvements in that area would certainly be good...

Greetings,

Andres Freund

--
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services