Thread: Time-Delayed Standbys
Hi all,
The attached patch is a continuation of Robert's work [1].
I made some changes:
- compute recoveryUntilDelayTime in XLOG_XACT_COMMIT and XLOG_XACT_COMMIT_COMPACT checks
- don't care about clockdrift because it's an admin problem.
Regards,
[1] http://www.postgresql.org/message-id/BANLkTi==TTzHDqWzwJDjmOf__8YuA7L1jw@mail.gmail.com
--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog sobre TI: http://fabriziomello.blogspot.com
>> Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
Attachment
Hi Royes,

I'm sorry for my late review... I feel potential of your patch in PG replication function, and it might be something useful for all people. I checked your patch and have some comments for improvement. I haven't executed details of unexpected situations yet. But I think that the following problem 2 is an important functionality problem. So I ask you to solve that problem first.

* Regress test
No problem.

* Problem1
Your patch does not code recovery.conf.sample about recovery_time_delay.
Please add it.

* Problem2
When I set time-delayed standby and start the standby server, I cannot access the standby server by psql. It is because PG is in first starting recovery, which cannot be accessed by psql. I think that time-delayed standby should only delay the recovery position; it must not affect other functionality.

I didn't test recovery in a master server with recovery_time_delay. If you have detailed test results of these cases, please send them to me.

My first easy review of your patch is that all.

Regards,
--
Mitsumasa KONDO
NTT Open Source Software Center
On Fri, Nov 29, 2013 at 5:49 AM, KONDO Mitsumasa <kondo.mitsumasa@lab.ntt.co.jp> wrote:
> Hi Royes,
>
> I'm sorry for my late review...
>
No problem...
> I feel potential of your patch in PG replication function, and it might be something useful for all people. I check your patch and have some comment for improvement. I haven't executed detail of unexpected sutuation yet. But I think that under following problem2 is important functionality problem. So I ask you to solve the problem in first.
>
> * Regress test
> No problem.
>
> * Problem1
> Your patch does not code recovery.conf.sample about recovery_time_delay.
> Please add it.
>
Fixed.
> * Problem2
> When I set time-delayed standby and start standby server, I cannot access stanby server by psql. It is because PG is in first starting recovery which cannot access by psql. I think that time-delayed standby is only delayed recovery position, it must not affect other functionality.
>
> I didn't test recoevery in master server with recovery_time_delay. If you have detail test result of these cases, please send me.
>
Well, I could not reproduce the problem that you described.
I ran the following test:
1) Clusters
- build master
- build slave, attach it to the master using SR, and set recovery_time_delay to 1min
2) Stop the slave
3) Run some transactions on the master using pgbench to generate a lot of archives
4) Start the slave, connect to it using psql, and in another session watch the archive recovery log
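For anyone who wants to repeat this, the standby's recovery.conf would look roughly like the sketch below. The connection string and delay value are made up; the parameter appears to take milliseconds, judging by the values quoted elsewhere in this thread (30000 ≈ 30s, 3000000 = 50min):

```
# hypothetical recovery.conf for a time-delayed standby (patch applied)
standby_mode = 'on'
primary_conninfo = 'host=master.example.com port=5432 user=replicator'
recovery_time_delay = 60000    # 1 minute, in milliseconds
```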
> My first easy review of your patch is that all.
>
Thanks.
Regards,
--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog sobre TI: http://fabriziomello.blogspot.com
>> Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
Attachment
Hi Fabrizio,

looks good to me. I did some testing on 9.2.4, 9.2.5 and HEAD. It applies and compiles w/o errors or warnings. I set up a master and two hot standbys replicating from the master, one with 5 minutes delay and one without delay. After that I created a new database and generated some test data:

CREATE TABLE test (val INTEGER);
INSERT INTO test (val) (SELECT * FROM generate_series(0, 1000000));

The non-delayed standby nearly instantly had the data replicated, the delayed standby was replicated after exactly 5 minutes. I did not notice any problems, errors or warnings.

Greetings,
CK

--
Christian Kruse   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Tue, Dec 3, 2013 at 2:33 PM, Christian Kruse <christian@2ndquadrant.com> wrote:
Hi Fabrizio,
looks good to me. I did some testing on 9.2.4, 9.2.5 and HEAD. It
applies and compiles w/o errors or warnings. I set up a master and two
hot standbys replicating from the master, one with 5 minutes delay and
one without delay. After that I created a new database and generated
some test data:
CREATE TABLE test (val INTEGER);
INSERT INTO test (val) (SELECT * FROM generate_series(0, 1000000));
The non-delayed standby nearly instantly had the data replicated, the
delayed standby was replicated after exactly 5 minutes. I did not
notice any problems, errors or warnings.
Thanks for your review Christian...
Regards,
--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog sobre TI: http://fabriziomello.blogspot.com
>> Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
On Tue, Dec 3, 2013 at 12:36 PM, Fabrízio de Royes Mello <fabriziomello@gmail.com> wrote:
> On Tue, Dec 3, 2013 at 2:33 PM, Christian Kruse <christian@2ndquadrant.com>
> wrote:
>> The non-delayed standby nearly instantly had the data replicated, the
>> delayed standby was replicated after exactly 5 minutes. I did not
>> notice any problems, errors or warnings.
>
> Thanks for your review Christian...

So, I proposed this patch previously and I still think it's a good idea, but it got voted down on the grounds that it didn't deal with clock drift. I view that as insufficient reason to reject the feature, but others disagreed. Unless some of those people have changed their minds, I don't think this patch has much future here.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 12/03/2013 10:46 AM, Robert Haas wrote:
> So, I proposed this patch previously and I still think it's a good
> idea, but it got voted down on the grounds that it didn't deal with
> clock drift. I view that as insufficient reason to reject the
> feature, but others disagreed. Unless some of those people have
> changed their minds, I don't think this patch has much future here.

I would agree that it is a good idea.

Joshua D. Drake

--
Command Prompt, Inc. - http://www.commandprompt.com/  509-416-6579
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, Postgres-XC, @cmdpromptinc
For my dreams of your image that blossoms
a rose in the deeps of my heart. - W.B. Yeats
On 2013-12-03 13:46:28 -0500, Robert Haas wrote:
> So, I proposed this patch previously and I still think it's a good
> idea, but it got voted down on the grounds that it didn't deal with
> clock drift. I view that as insufficient reason to reject the
> feature, but others disagreed.

I really fail to see why clock drift should be this patch's responsibility. It's not like the world would go under^W data corruption would ensue if the clocks drift. Your standby would get delayed imprecisely. Big deal. From what I know of potential users of this feature, they would set it to at the very least 30min - that's WAY above the range for acceptable clock-drift on servers.

Greetings,

Andres Freund

--
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 18 October 2013 19:03, Fabrízio de Royes Mello <fabriziomello@gmail.com> wrote:

> The attached patch is a continuation of Robert's work [1].

Reviewing v2...

> I made some changes:
> - use of Latches instead of pg_usleep, so we don't have to wakeup regularly.

OK

> - call HandleStartupProcInterrupts() before CheckForStandbyTrigger() because
> might change the trigger file's location

OK

> - compute recoveryUntilDelayTime in XLOG_XACT_COMMIT and
> XLOG_XACT_COMMIT_COMPACT checks

Why just those? Why not aborts and restore points also?

> - don't care about clockdrift because it's an admin problem.

Few minor points on things:

* The code with comment "Clear any previous recovery delay time" is in the wrong place; move it down or remove it completely. Setting the delay to zero doesn't prevent calling recoveryDelay(), so that logic looks wrong anyway.

* The loop exit in recoveryDelay() is inelegant; it should break if <= 0.

* There's a spelling mistake in the sample.

* The patch has trailing whitespace in one place.

and one important point...

* The delay loop happens AFTER we check for a pause. Which means if the user notices a problem on a commit, then hits the pause button on the standby, the pause will have no effect and the next commit will be applied anyway. Maybe just one commit, but it's an off-by-one error that removes the benefit of the patch. So I think we need to test this again after we finish delaying:

    if (xlogctl->recoveryPause)
        recoveryPausesHere();

We need to explain in the docs that this is intended only for use in a live streaming deployment. It will have little or no meaning in a PITR.

I think recovery_time_delay should be called <something>_apply_delay to highlight the point that it is the apply of records that is delayed, not the receipt. And hence the need to document that sync rep is NOT slowed down by setting this value. And to make the name consistent with other parameters, I suggest min_standby_apply_delay.

We also need to document caveats about the patch: it only delays on timestamped WAL records, and other records may be applied sooner than the delay in some circumstances, so it is not a way to avoid all cancellations. We also need to document that the behaviour of the patch is to apply all data received as quickly as possible once triggered, so the specified delay does not slow down promoting the server to a master. That might also be seen as a negative behaviour, since promoting the master effectively sets recovery_time_delay to zero.

I will handle the additional documentation, if you can update the patch with the main review comments. Thanks.

--
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
(2013/11/30 5:34), Fabrízio de Royes Mello wrote:
> On Fri, Nov 29, 2013 at 5:49 AM, KONDO Mitsumasa <kondo.mitsumasa@lab.ntt.co.jp> wrote:
> > * Problem1
> > Your patch does not code recovery.conf.sample about recovery_time_delay.
> > Please add it.
>
> Fixed.

OK. It seems no problem.

> > * Problem2
> > When I set time-delayed standby and start standby server, I cannot access
> > standby server by psql. It is because PG is in first starting recovery which
> > cannot access by psql. I think that time-delayed standby is only delayed
> > recovery position, it must not affect other functionality.
> >
> > I didn't test recovery in master server with recovery_time_delay. If you have
> > detail test result of these cases, please send me.
>
> Well, I could not reproduce the problem that you described.
>
> I ran the following test:
>
> 1) Clusters
> - build master
> - build slave and attach to the master using SR and config recovery_time_delay to 1min.
>
> 2) Stop the slave
>
> 3) Run some transactions on the master using pgbench to generate a lot of archives
>
> 4) Start the slave and connect to it using psql and in another session I can see
> all archive recovery log

Hmm... I had thought it was my mistake in reading your email, but it reproduces again. When I set a small recovery_time_delay (=30000), it might work correctly. However, when I set a long recovery_time_delay (=3000000), it didn't work. My reproduced operation log is the following:

> [mitsu-ko@localhost postgresql]$ bin/pgbench -T 30 -c 8 -j4 -p5432
> starting vacuum...end.
> transaction type: TPC-B (sort of)
> scaling factor: 10
> query mode: simple
> number of clients: 8
> number of threads: 4
> duration: 30 s
> number of transactions actually processed: 68704
> latency average: 3.493 ms
> tps = 2289.196747 (including connections establishing)
> tps = 2290.175129 (excluding connections establishing)
> [mitsu-ko@localhost postgresql]$ vim slave/recovery.conf
> [mitsu-ko@localhost postgresql]$ bin/pg_ctl -D slave start
> server starting
> [mitsu-ko@localhost postgresql]$ LOG: database system was shut down in recovery at 2013-12-03 10:26:41 JST
> LOG: entering standby mode
> LOG: consistent recovery state reached at 0/5C4D8668
> LOG: redo starts at 0/5C4000D8
> [mitsu-ko@localhost postgresql]$ FATAL: the database system is starting up
> FATAL: the database system is starting up
> FATAL: the database system is starting up
> FATAL: the database system is starting up
> FATAL: the database system is starting up
> [mitsu-ko@localhost postgresql]$ bin/psql -p6543
> psql: FATAL: the database system is starting up
> [mitsu-ko@localhost postgresql]$ bin/psql -p6543
> psql: FATAL: the database system is starting up

I attached my postgresql.conf and recovery.conf. It will be reproduced.

I think that your patch needs recovery flags such as ArchiveRecoveryRequested and InArchiveRecovery etc., because time-delayed standby works only in a replication situation. And I hope that it doesn't harm standby server startup and archive recovery. Is it wrong with your image?

I think this patch has a lot of potential, however I think that standby functionality is more important than this feature. And we might need to discuss which behavior is best in this patch.

Regards,
--
Mitsumasa KONDO
NTT Open Source Software Center
Attachment
(2013/12/04 4:00), Andres Freund wrote:
> On 2013-12-03 13:46:28 -0500, Robert Haas wrote:
>> So, I proposed this patch previously and I still think it's a good
>> idea, but it got voted down on the grounds that it didn't deal with
>> clock drift. I view that as insufficient reason to reject the
>> feature, but others disagreed.
>
> I really fail to see why clock drift should be this patch's
> responsibility. It's not like the world would go under^W data corruption
> would ensue if the clocks drift. Your standby would get delayed
> imprecisely. Big deal. From what I know of potential users of this
> feature, they would set it to at the very least 30min - that's WAY above
> the range for acceptable clock-drift on servers.

Yes. I think that the purpose of this patch is a long time delay in the standby server, not a precisely timed short delay.

Regards,
--
Mitsumasa KONDO
NTT Open Source Software Center
On 2013-12-04 11:13:58 +0900, KONDO Mitsumasa wrote:
> > 4) Start the slave and connect to it using psql and in another session I can see
> > all archive recovery log
>
> Hmm... I had thought it was my mistake in reading your email, but it reproduces again.
> When I set a small recovery_time_delay (=30000), it might work correctly.
> However, when I set a long recovery_time_delay (=3000000), it didn't work.
> [...]
> I attached my postgresql.conf and recovery.conf. It will be reproduced.

So, you brought up a standby and it took more time to become consistent because it waited on commits? That's the problem? If so, I don't think that's a bug?

Greetings,

Andres Freund

--
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Hi,

On 04/12/13 11:13, KONDO Mitsumasa wrote:
> > 1) Clusters
> > - build master
> > - build slave and attach to the master using SR and config recovery_time_delay to 1min.
> >
> > 2) Stop the slave
> >
> > 3) Run some transactions on the master using pgbench to generate a lot of archives
> >
> > 4) Start the slave and connect to it using psql and in another session I can see
> > all archive recovery log
>
> Hmm... I had thought it was my mistake in reading your email, but it reproduces again.
> When I set a small recovery_time_delay (=30000), it might work correctly.
> However, when I set a long recovery_time_delay (=3000000), it didn't work.
> […]

I'm not sure if I understand your problem correctly. I'll try to summarize; please correct me if I'm wrong: You created a master node and a hot standby with a 3000000ms delay. Then you stopped the standby, ran pgbench and started the hot standby again. It did not get in line with the master. Is this correct?

I don't see a problem here… the standby should not be in sync with the master, it should be delayed. I did step by step what you did and after 50 minutes (3000000ms) the standby was at the same level the master was.

Did I misunderstand you?

Regards,
Christian Kruse

--
Christian Kruse   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
2013/12/4 Andres Freund <andres@2ndquadrant.com>
> On 2013-12-04 11:13:58 +0900, KONDO Mitsumasa wrote:
> > Hmm... I had thought it was my mistake in reading your email, but it reproduces again.
> > When I set a small recovery_time_delay (=30000), it might work correctly.
> > However, when I set a long recovery_time_delay (=3000000), it didn't work.
> > [...]
> > I attached my postgresql.conf and recovery.conf. It will be reproduced.
>
> So, you brought up a standby and it took more time to become consistent
> because it waited on commits? That's the problem? If so, I don't think
> that's a bug?

When it happened, psql cannot connect to the standby server at all. I think this behavior is not good. It should only delay the recovery position, and old delayed table data should be visible. Not being able to connect to the server is not the hoped-for behavior.

If you think this behavior is the best, I will set it ready for committer, and the committer will fix it better.

Regards,
--
Mitsumasa KONDO
NTT Open Source Software Center
On 2013-12-04 22:47:47 +0900, Mitsumasa KONDO wrote:
> When it happened, psql cannot connect to the standby server at all. I think
> this behavior is not good.
> It should only delay the recovery position, and old delayed table data
> should be visible.

That doesn't sound like a good plan - even if the clients cannot connect yet, you can still promote the server. Just not taking delay into consideration at that point seems like it would possibly surprise users rather badly in situations where they really cannot use such surprises.

Greetings,

Andres Freund

--
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Hi,

On 2013-12-03 19:33:16 +0000, Simon Riggs wrote:
> > - compute recoveryUntilDelayTime in XLOG_XACT_COMMIT and
> > XLOG_XACT_COMMIT_COMPACT checks
>
> Why just those? Why not aborts and restore points also?

What would the advantage of waiting on anything but commits be? If it's not a commit, the action won't change the state of the database (yesyes, there are exceptions, but those don't have a timestamp)...

Greetings,

Andres Freund

--
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
2013/12/4 Christian Kruse <christian@2ndquadrant.com>
> You created a master node and a hot standby with a 3000000ms delay. Then
> you stopped the standby, ran pgbench and started the hot standby
> again. It did not get in line with the master. Is this correct?
No. First, I started the master and executed pgbench. Second, I started the standby with a 3000000ms (50min) delay. Then I could not connect to the standby server by psql at all. I'm not sure why the standby did not start. It might be because the delay feature disturbs the REDO loop during the standby's first start-up.
> I don't see a problem here… the standby should not be in sync with the
> master, it should be delayed. I did step by step what you did and
> after 50 minutes (3000000ms) the standby was at the same level the
> master was.
I think we should be able to connect to the standby server at any time, regardless of the delay option.
> Did I misunderstand you?
I'm not sure... You might be right, or another better way might exist.
Regards,
--
Mitsumasa KONDO
NTT Open Source Software Center
2013/12/4 Andres Freund <andres@2ndquadrant.com>
> On 2013-12-04 22:47:47 +0900, Mitsumasa KONDO wrote:
> > When it happened, psql cannot connect standby server at all. I think this
> > behavior is not good.
> > It should only delay recovery position and can seen old delay table data.
>
> That doesn't sound like a good plan - even if the clients cannot connect
> yet, you can still promote the server.

I'm not sure I follow your argument, but doesn't that stray from the purpose of this patch?

> Just not taking delay into
> consideration at that point seems like it would possibly surprise users
> rather badly in situations they really cannot use such surprises.

Hmm... I think users will be surprised... I think it is easy to fix the behavior using a recovery flag. So we had better wait for other comments.
Regards,
Mitsumasa KONDO
NTT Open Source Software Center
Robert Haas <robertmhaas@gmail.com> wrote:
> So, I proposed this patch previously and I still think it's a
> good idea, but it got voted down on the grounds that it didn't
> deal with clock drift. I view that as insufficient reason to
> reject the feature, but others disagreed. Unless some of those
> people have changed their minds, I don't think this patch has
> much future here.

There are many things that a system admin can get wrong. Failing to supply this feature because the sysadmin might not be running ntpd (or equivalent) correctly seems to me to be like not having the software do fsync because the sysadmin might not have turned off write-back buffering on drives without persistent storage. Either way, poor system management can defeat the feature. Either way, I see no reason to withhold the feature from those who manage their systems in a sane fashion.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Hi,

On 04/12/13 07:22, Kevin Grittner wrote:
> There are many things that a system admin can get wrong. Failing
> to supply this feature because the sysadmin might not be running
> ntpd (or equivalent) correctly seems to me to be like not having
> the software do fsync because the sysadmin might not have turned
> off write-back buffering on drives without persistent storage.
> Either way, poor system management can defeat the feature. Either
> way, I see no reason to withhold the feature from those who manage
> their systems in a sane fashion.

I agree. But maybe we should add a warning in the documentation about time syncing?

Greetings,
CK

--
Christian Kruse   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
src/backend/access/transam/xlog.c:5889: trailing whitespace.
On 3 December 2013 18:46, Robert Haas <robertmhaas@gmail.com> wrote:
> So, I proposed this patch previously and I still think it's a good
> idea, but it got voted down on the grounds that it didn't deal with
> clock drift. I view that as insufficient reason to reject the
> feature, but others disagreed. Unless some of those people have
> changed their minds, I don't think this patch has much future here.

I had that objection and others. Since then many people have requested this feature and have persuaded me that this is worth having and that my objections are minor points. I now agree with the need for the feature, almost as written.

--
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Thu, Dec 5, 2013 at 1:45 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On 3 December 2013 18:46, Robert Haas <robertmhaas@gmail.com> wrote:
> > So, I proposed this patch previously and I still think it's a good
> > idea, but it got voted down on the grounds that it didn't deal with
> > clock drift. I view that as insufficient reason to reject the
> > feature, but others disagreed. Unless some of those people have
> > changed their minds, I don't think this patch has much future here.
>
> I had that objection and others. Since then many people have requested
> this feature and have persuaded me that this is worth having and that
> my objections are minor points. I now agree with the need for the
> feature, almost as written.
Not recalling the older thread, but as for the "breaks on clock drift"
objection, I think we can fairly easily make that situation "good
enough". Just have IDENTIFY_SYSTEM return the current timestamp on the
master, and refuse to start if the time difference is too great. Yes,
that doesn't catch the case when the machines are in perfect sync when
they start up and drift *later*, but it will catch the most common cases
I bet. I think that's good enough that we can accept the feature, given
that *most* people will have ntp, and that it's a very useful feature
for those people. And we could help people who run into it because of a
simple config error.
Or maybe the suggested patch already does this, in which case ignore that part :)
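Magnus's check might be sketched like this. This is a hypothetical helper, not anything in the patch, and IDENTIFY_SYSTEM does not report such a timestamp today; the idea is simply that the standby compares a master-reported timestamp against its own clock at connect time and bails out if the skew exceeds a threshold:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical sketch of Magnus's idea (not part of the patch): the
 * master reports its current timestamp at connect time, and the standby
 * refuses to start if the difference against its own clock exceeds a
 * threshold.  Timestamps are microseconds since a shared epoch. */
static bool
clock_skew_acceptable(int64_t master_ts, int64_t local_ts, int64_t max_skew_us)
{
    int64_t skew = master_ts - local_ts;

    if (skew < 0)
        skew = -skew;           /* drift can go either way */
    return skew <= max_skew_us;
}
```

As Magnus notes, this only catches skew present at connect time, not drift that develops afterwards.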
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
On 5 December 2013 08:51, Magnus Hagander <magnus@hagander.net> wrote:
> Not recalling the older thread, but it seems the "breaks on clock drift", I
> think we can fairly easily make that situation "good enough". Just have
> IDENTIFY_SYSTEM return the current timestamp on the master, and refuse to
> start if the time difference is too great. Yes, that doesn't catch the case
> when the machines are in perfect sync when they start up and drift *later*,
> but it will catch the most common cases I bet. But I think that's good
> enough that we can accept the feature, given that *most* people will have
> ntp, and that it's a very useful feature for those people. But we could help
> people who run into it because of a simple config error..
>
> Or maybe the suggested patch already does this, in which case ignore that
> part :)

I think the very nature of *this* feature is that it doesn't *require*
the clocks to be exactly in sync, even though that is the basis for
measurement.

The setting of this parameter for sane usage would be minimum 5 mins,
but more likely 30 mins, 1 hour or more.

In that case, a few seconds drift either way makes no real difference
to this feature.

So IMHO, without prejudice to other features that may be more
critically reliant on time synchronisation, we are OK to proceed with
this specific feature.

--
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Thu, Dec 5, 2013 at 7:57 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
On 5 December 2013 08:51, Magnus Hagander <magnus@hagander.net> wrote:
> Not recalling the older thread, but it seems the "breaks on clock drift", I
> think we can fairly easily make that situation "good enough". Just have
> IDENTIFY_SYSTEM return the current timestamp on the master, and refuse to
> start if the time difference is too great. Yes, that doesn't catch the case
> when the machines are in perfect sync when they start up and drift *later*,
> but it will catch the most common cases I bet. But I think that's good
> enough that we can accept the feature, given that *most* people will have
> ntp, and that it's a very useful feature for those people. But we could help
> people who run into it because of a simple config error..
>
> Or maybe the suggested patch already does this, in which case ignore that
> part :)

I think the very nature of *this* feature is that it doesn't *require*
the clocks to be exactly in sync, even though that is the basis for
measurement.
The setting of this parameter for sane usage would be minimum 5 mins,
but more likely 30 mins, 1 hour or more.
In that case, a few seconds drift either way makes no real difference
to this feature.
So IMHO, without prejudice to other features that may be more
critically reliant on time synchronisation, we are OK to proceed with
this specific feature.
Hi all,
I saw all of your comments. I'm a bit busy with some customer issues (it has been a crazy week), but I'll reply and/or apply your suggestions later.
Thanks for all review and sorry to delay in reply.
Regards,
--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog sobre TI: http://fabriziomello.blogspot.com
>> Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
On Tue, Dec 3, 2013 at 5:33 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>
> > - compute recoveryUntilDelayTime in XLOG_XACT_COMMIT and
> > XLOG_XACT_COMMIT_COMPACT checks
>
> Why just those? Why not aborts and restore points also?
>
I think it makes no sense to execute the delay after aborts and/or restore points, because they don't change data on a standby server.
> > - don't care about clockdrift because it's an admin problem.
>
> Few minor points on things
>
> * The code with comment "Clear any previous recovery delay time" is in
> wrong place, move down or remove completely. Setting the delay to zero
> doesn't prevent calling recoveryDelay(), so that logic looks wrong
> anyway.
>
Fixed.
> * The loop exit in recoveryDelay() is inelegant, should break if <= 0
>
Fixed.
> * There's a spelling mistake in sample
>
Fixed.
> * The patch has whitespace in one place
>
Fixed.
> and one important point...
>
> * The delay loop happens AFTER we check for a pause. Which means if
> the user notices a problem on a commit, then hits pause button on the
> standby, the pause will have no effect and the next commit will be
> applied anyway. Maybe just one commit, but its an off by one error
> that removes the benefit of the patch. So I think we need to test this
> again after we finish delaying
>
> if (xlogctl->recoveryPause)
> recoveryPausesHere();
>
Fixed.
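Taken together, Simon's two points about the loop (break once the remaining delay reaches zero, and re-check the pause request only after the delay has elapsed) might be sketched like this; the function and variable names are illustrative, not the patch's actual code:

```c
#include <stdint.h>

/* Sketch of the apply-delay computation discussed above (illustrative
 * names, not the actual patch).  All times are in milliseconds.  The
 * record may be applied once the result drops to zero or below. */
static int64_t
remaining_apply_delay(int64_t commit_ts_ms, int64_t delay_ms, int64_t now_ms)
{
    return (commit_ts_ms + delay_ms) - now_ms;
}

/*
 * The startup process would then loop roughly like this:
 *
 *     for (;;)
 *     {
 *         int64_t msecs = remaining_apply_delay(commit_ts, delay, now());
 *
 *         if (msecs <= 0)
 *             break;              // Simon's point: break if <= 0
 *         wait_on_latch(msecs);   // sleep, but stay interruptible
 *     }
 *     if (xlogctl->recoveryPause) // re-check pause *after* delaying
 *         recoveryPausesHere();
 */
```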
>
> We need to explain in the docs that this is intended only for use in a
> live streaming deployment. It will have little or no meaning in a
> PITR.
>
Fixed.
> I think recovery_time_delay should be called
> <something>_apply_delay
> to highlight the point that it is the apply of records that is
> delayed, not the receipt. And hence the need to document that sync rep
> is NOT slowed down by setting this value.
>
Fixed.
> And to make the name consistent with other parameters, I suggest
> min_standby_apply_delay
>
I agree. Fixed!
> We also need to document caveats about the patch, in that it only
> delays on timestamped WAL records and other records may be applied
> sooner than the delay in some circumstances, so it is not a way to
> avoid all cancellations.
>
> We also need to document the behaviour of the patch is to apply all
> data received as quickly as possible once triggered, so the specified
> delay does not slow down promoting the server to a master. That might
> also be seen as a negative behaviour, since promoting the master
> effectively sets recovery_time_delay to zero.
>
> I will handle the additional documentation, if you can update the
> patch with the main review comments. Thanks.
>
Thanks, your help is welcome.
Att,
--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog sobre TI: http://fabriziomello.blogspot.com
>> Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
Attachment
On Thu, Dec 5, 2013 at 11:07 PM, Fabrízio de Royes Mello
<fabriziomello@gmail.com> wrote:
> On Tue, Dec 3, 2013 at 5:33 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>>
>> > - compute recoveryUntilDelayTime in XLOG_XACT_COMMIT and
>> > XLOG_XACT_COMMIT_COMPACT checks
>>
>> Why just those? Why not aborts and restore points also?
>>
>
> I think it makes no sense to execute the delay after aborts and/or restore
> points, because they don't change data on a standby server.

I see no reason to pause for aborts. Aside from the fact that it
wouldn't be reliable in corner cases, as Fabrízio says, there's no
user-visible effect, just as there's no user-visible effect from
replaying a transaction up until just prior to the point where it
commits (which we also do).

Waiting for restore points seems like it potentially makes sense. If
the standby is delayed by an hour, and you create a restore point and
wait 55 minutes, you might expect that you can still kill the standby
and recover it to that restore point.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Fri, Dec 6, 2013 at 1:36 PM, Robert Haas <robertmhaas@gmail.com> wrote:
On Thu, Dec 5, 2013 at 11:07 PM, Fabrízio de Royes Mello
<fabriziomello@gmail.com> wrote:
> On Tue, Dec 3, 2013 at 5:33 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>>
>> > - compute recoveryUntilDelayTime in XLOG_XACT_COMMIT and
>> > XLOG_XACT_COMMIT_COMPACT checks
>>
>> Why just those? Why not aborts and restore points also?
>>
>
> I think it makes no sense to execute the delay after aborts and/or restore
> points, because they don't change data on a standby server.

I see no reason to pause for aborts. Aside from the fact that it
wouldn't be reliable in corner cases, as Fabrízio says, there's no
user-visible effect, just as there's no user-visible effect from
replaying a transaction up until just prior to the point where it
commits (which we also do).
Waiting for restore points seems like it potentially makes sense. If
the standby is delayed by an hour, and you create a restore point and
wait 55 minutes, you might expect that you can still kill the
standby and recover it to that restore point.
Makes sense. Fixed.
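The behaviour settled on in this exchange (delay on commits and restore points, never on aborts) can be sketched as a simple predicate. The enum values here merely stand in for the real WAL record constants in PostgreSQL's xact.h and xlog.h; this is an illustration, not the patch's code:

```c
#include <stdbool.h>

/* Illustrative stand-ins for the WAL record types named in this thread
 * (the real constants live in PostgreSQL's xact.h and xlog.h). */
enum wal_rec
{
    REC_XACT_COMMIT,
    REC_XACT_COMMIT_COMPACT,
    REC_XACT_ABORT,
    REC_RESTORE_POINT
};

/* Delay on commits and restore points (both are user-visible recovery
 * targets carrying a timestamp), never on aborts. */
static bool
record_triggers_delay(enum wal_rec rec)
{
    switch (rec)
    {
        case REC_XACT_COMMIT:
        case REC_XACT_COMMIT_COMPACT:
        case REC_RESTORE_POINT:
            return true;
        default:
            return false;
    }
}
```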
Regards,
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog sobre TI: http://fabriziomello.blogspot.com
>> Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
>> Blog sobre TI: http://fabriziomello.blogspot.com
>> Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
Attachment
Hi Fabrízio,

I test your v4 patch, and send your review comments.

* Fix typo
> 49 -# commited transactions from the master, specify a recovery time delay.
> 49 +# committed transactions from the master, specify a recovery time delay.

* Fix white space
> 177 - if (secs <= 0 && microsecs <=0)
> 177 + if (secs <= 0 && microsecs <=0 )

* Add functionality (I propose)
We can set a negative number at min_standby_apply_delay. I think that this
feature is for a worldwide replication situation. For example, the master
server is in Japan and the slave server is in San Francisco. Japan time is
ahead of San Francisco time. And if we want to delay in this situation, we
may need a negative number in min_standby_apply_delay. So I propose that
the time delay conditional branch change as follows.
> - if (min_standby_apply_delay > 0)
> + if (min_standby_apply_delay != 0)
What do you think? It might also work correctly.

* Problem 1
I read your written document. There is "PITR is not affected".
However, when I run PITR with min_standby_apply_delay=3000000, it cannot
start the server. The log is as follows.
> [mitsu-ko@localhost postgresql]$ bin/pg_ctl -D data2 start
> server starting
> [mitsu-ko@localhost postgresql]$ LOG: database system was interrupted; last known up at 2013-12-08 18:57:00 JST
> LOG: creating missing WAL directory "pg_xlog/archive_status"
> cp: cannot stat `../arc/00000002.history':
> LOG: starting archive recovery
> LOG: restored log file "000000010000000000000041" from archive
> LOG: redo starts at 0/41000028
> LOG: consistent recovery state reached at 0/410000F0
> LOG: database system is ready to accept read only connections
> LOG: restored log file "000000010000000000000042" from archive
> FATAL: cannot wait on a latch owned by another process
> LOG: startup process (PID 30501) exited with exit code 1
> LOG: terminating any other active server processes
We need a recovery flag for controlling the PITR situation.

That's all for now.
If you are busy, please fix it at your own pace. I'm busy too, and I don't
mind waiting. :-)

Regards,
--
Mitsumasa KONDO
NTT Open Source Software Center
2013/12/9 KONDO Mitsumasa <kondo.mitsumasa@lab.ntt.co.jp>
Hi Fabrízio,
I test your v4 patch, and send your review comments.
* Fix typo
> 49 -# commited transactions from the master, specify a recovery time delay.
> 49 +# committed transactions from the master, specify a recovery time delay.
* Fix white space
> 177 - if (secs <= 0 && microsecs <=0)
> 177 + if (secs <= 0 && microsecs <=0 )
* Add functionality (I propose)
We can set a negative number at min_standby_apply_delay. I think that this feature
is for a worldwide replication situation. For example, the master server is in Japan and the slave server is in San Francisco. Japan time is ahead of San Francisco time. And if we want to delay in this situation, we may need a negative number in min_standby_apply_delay. So I propose that the time delay conditional branch change as follows.
> - if (min_standby_apply_delay > 0)
> + if (min_standby_apply_delay != 0)
What do you think? It might also work correctly.
what about using an interval instead of absolute time?
Regards
Pavel
* Problem 1
I read your written document. There is "PITR is not affected".
However, when I run PITR with min_standby_apply_delay=3000000, it cannot start the server. The log is as follows.
[mitsu-ko@localhost postgresql]$ bin/pg_ctl -D data2 start
server starting
[mitsu-ko@localhost postgresql]$ LOG: database system was interrupted; last known up at 2013-12-08 18:57:00 JST
LOG: creating missing WAL directory "pg_xlog/archive_status"
cp: cannot stat `../arc/00000002.history':
LOG: starting archive recovery
LOG: restored log file "000000010000000000000041" from archive
LOG: redo starts at 0/41000028
LOG: consistent recovery state reached at 0/410000F0
LOG: database system is ready to accept read only connections
LOG: restored log file "000000010000000000000042" from archive
FATAL: cannot wait on a latch owned by another process
LOG: startup process (PID 30501) exited with exit code 1
LOG: terminating any other active server processes
We need a recovery flag for controlling the PITR situation.
That's all for now.
If you are busy, please fix it at your own pace. I'm busy too, and I don't mind waiting. :-)
Regards,
--
Mitsumasa KONDO
NTT Open Source Software Center
(2013/12/09 19:36), KONDO Mitsumasa wrote:
> * Problem 1
> I read your written document. There is "PITR is not affected".
> However, when I run PITR with min_standby_apply_delay=3000000, it cannot start
> the server. The log is as follows.
>> [mitsu-ko@localhost postgresql]$ bin/pg_ctl -D data2 start
>> server starting
>> [mitsu-ko@localhost postgresql]$ LOG: database system was interrupted; last
>> known up at 2013-12-08 18:57:00 JST
>> LOG: creating missing WAL directory "pg_xlog/archive_status"
>> cp: cannot stat `../arc/00000002.history':
>> LOG: starting archive recovery
>> LOG: restored log file "000000010000000000000041" from archive
>> LOG: redo starts at 0/41000028
>> LOG: consistent recovery state reached at 0/410000F0
>> LOG: database system is ready to accept read only connections
>> LOG: restored log file "000000010000000000000042" from archive
>> FATAL: cannot wait on a latch owned by another process
>> LOG: startup process (PID 30501) exited with exit code 1
>> LOG: terminating any other active server processes
> We need a recovery flag for controlling the PITR situation.

Add my comment. We have to consider three situations.

1. PITR
2. replication standby
3. replication standby with restore_command

I think this patch cannot delay in situation 1. So I think you should add
only a StandbyModeRequested flag in the conditional branch.

Regards,
--
Mitsumasa KONDO
NTT Open Source Software Center
(2013/12/09 19:35), Pavel Stehule wrote:
> 2013/12/9 KONDO Mitsumasa <kondo.mitsumasa@lab.ntt.co.jp
> <mailto:kondo.mitsumasa@lab.ntt.co.jp>>
>
>     Hi Fabrízio,
>
>     I test your v4 patch, and send your review comments.
>
>     * Fix typo
>     > 49 -# commited transactions from the master, specify a recovery time delay.
>     > 49 +# committed transactions from the master, specify a recovery time delay.
>
>     * Fix white space
>     > 177 - if (secs <= 0 && microsecs <=0)
>     > 177 + if (secs <= 0 && microsecs <=0 )
>
>     * Add functionality (I propose)
>     We can set a negative number at min_standby_apply_delay. I think that
>     this feature is for a worldwide replication situation. For example, the
>     master server is in Japan and the slave server is in San Francisco.
>     Japan time is ahead of San Francisco time. And if we want to delay in
>     this situation, we may need a negative number in
>     min_standby_apply_delay. So I propose that the time delay conditional
>     branch change as follows.
>     > - if (min_standby_apply_delay > 0)
>     > + if (min_standby_apply_delay != 0)
>     What do you think? It might also work correctly.
>
> what about using an interval instead of absolute time?

This is because local time is recorded in the XLOG, and calculating a
global time has a big cost.

Regards,
--
Mitsumasa KONDO
NTT Open Source Software Center
On 2013-12-09 19:51:01 +0900, KONDO Mitsumasa wrote:
> Add my comment. We have to consider three situations.
>
> 1. PITR
> 2. replication standby
> 3. replication standby with restore_command
>
> I think this patch cannot delay in situation 1.

Why?

Greetings,

Andres Freund

--
Andres Freund                 http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 12/04/2013 02:46 AM, Robert Haas wrote:
>> Thanks for your review Christian...
>
> So, I proposed this patch previously and I still think it's a good
> idea, but it got voted down on the grounds that it didn't deal with
> clock drift. I view that as insufficient reason to reject the
> feature, but others disagreed. Unless some of those people have
> changed their minds, I don't think this patch has much future here.

Surely that's the operating system / VM host / sysadmin / whatever's
problem?

The only way to "deal with" clock drift that isn't fragile in the face
of variable latency, etc, is to basically re-implement (S)NTP in order
to find out what the clock difference with the remote is. If we're
going to do that, why not just let the OS deal with it?

It might well be worth complaining about obvious aberrations like
timestamps in the local future - preferably by complaining and not
actually dying.

It does need to be able to cope with a *skewing* clock, but I'd be
surprised if it had any issues there in the first place.

--
Craig Ringer                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
<p dir="ltr"><br /> On 9 Dec 2013 12:16, "Craig Ringer" <<a href="mailto:craig@2ndquadrant.com">craig@2ndquadrant.com</a>>wrote:<p dir="ltr">> The only way to "deal with" clockdrift that isn't fragile in the face<br /> > of variable latency, etc, is to basically re-implement (S)NTP in order<br/> > to find out what the clock difference with the remote is.<br /><p dir="ltr">There's actually an entirelydifferent way to "deal" with clock drift: test "master time" and "slave time" as two different incomparable spaces.Similar to how you would treat measurements in different units.<p dir="ltr">If you do that then you can measure andmanage the delay in the slave between receiving and applying a record and also measure the amount of master server timewhich can be pending. These measurements don't depend at all on time sync between servers.<p dir="ltr">The specifiedfeature depends explicitly on the conversion between master and slave time spaces so it's inevitable that sync wouldbe an issue. It might be nice to print a warning on connection if the time is far out of sync or periodically check.But I don't think reimplementing NTP is a good idea.
(2013/12/09 20:29), Andres Freund wrote:
> On 2013-12-09 19:51:01 +0900, KONDO Mitsumasa wrote:
>> Add my comment. We have to consider three situations.
>>
>> 1. PITR
>> 2. replication standby
>> 3. replication standby with restore_command
>>
>> I think this patch cannot delay in situation 1.
>
> Why?

I have three reasons.

1. It is written in the document. Can we remove it?
2. The name of this feature is "Time-delayed *standbys*", not
   "Time-delayed *recovery*". Can we change it?
3. I think it is unnecessary in master PITR. And if it can delay in
   master PITR, it will become a master at an unexpected timing instead
   of continuing recovery. It is meaningless.

I'd like to ask you what you expect from this feature and how you would
use it.

Regards,
--
Mitsumasa KONDO
NTT Open Source Software Center
On 2013-12-10 13:26:27 +0900, KONDO Mitsumasa wrote:
> (2013/12/09 20:29), Andres Freund wrote:
>> On 2013-12-09 19:51:01 +0900, KONDO Mitsumasa wrote:
>>> Add my comment. We have to consider three situations.
>>>
>>> 1. PITR
>>> 2. replication standby
>>> 3. replication standby with restore_command
>>>
>>> I think this patch cannot delay in situation 1.
>>
>> Why?
>
> I have three reasons.

None of these reasons seem to be of technical nature, right?

> 1. It is written in the document. Can we remove it?
> 2. The name of this feature is "Time-delayed *standbys*", not
>    "Time-delayed *recovery*". Can we change it?

I don't think that'd be a win in clarity. But perhaps somebody else has
a better suggestion?

> 3. I think it is unnecessary in master PITR. And if it can delay in
>    master PITR, it will become a master at an unexpected timing instead
>    of continuing recovery. It is meaningless.

"master PITR"? What's that? All PITR is based on recovery.conf and thus
not really a "master"?

Why should we prohibit using this feature in PITR? I don't see any
advantage in doing so. If somebody doesn't want the delay, they
shouldn't set it in the configuration file. End of story.

There's not really that meaningful a distinction between PITR and
replication using archive_command, especially when using *pause_after.
I think this feature will be used in a lot of scenarios in which PITR
is currently used.

Greetings,

Andres Freund

--
Andres Freund                 http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
(2013/12/10 18:38), Andres Freund wrote:
> "master PITR"? What's that? All PITR is based on recovery.conf and thus
> not really a "master"?

"master PITR" is PITR with "standby_mode = off". It's just recovery from
a base backup. The difference between "master PITR" and "standby" is
that the former gets an independent timeline ID, while the latter keeps
the same timeline ID, following the master server. In the first place,
the purposes are different.

> Why should we prohibit using this feature in PITR? I don't see any
> advantage in doing so. If somebody doesn't want the delay, they
> shouldn't set it in the configuration file. End of story.

Unfortunately, there are a lot of careless users in the world... I think
you have these clients, too.

> There's not really that meaningful a distinction between PITR and
> replication using archive_command, especially when using
> *pause_after.

It is meaningless in "master PITR". It will become a master which has a
new timeline ID at an unexpected timing.

> I think this feature will be used in a lot of scenarios in
> which PITR is currently used.

We have to judge which is better: gaining some potential, or protecting
against mistakes. And we had better wait for the author's comment...

Regards,
--
Mitsumasa KONDO
NTT Open Source Software Center
On 11 December 2013 06:36, KONDO Mitsumasa
<kondo.mitsumasa@lab.ntt.co.jp> wrote:
>> I think this feature will be used in a lot of scenarios in
>> which PITR is currently used.
>
> We have to judge which is better, we get something potential or to protect
> stupid.
> And we had better to wait author's comment...

I'd say just document that it wouldn't make sense to use it for PITR.

There may be some use case we can't see yet, so specifically
prohibiting a use case that is not dangerous seems too much at this
point. I will no doubt be reminded of these words in the future...

--
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
<div dir="ltr"><div class="gmail_extra"><br />On Wed, Dec 11, 2013 at 6:27 AM, Simon Riggs <<a href="mailto:simon@2ndquadrant.com">simon@2ndquadrant.com</a>>wrote:<br />><br />> On 11 December 2013 06:36, KONDOMitsumasa<br /> > <<a href="mailto:kondo.mitsumasa@lab.ntt.co.jp">kondo.mitsumasa@lab.ntt.co.jp</a>> wrote:<br/>><br />> >> I think this feature will be used in a lot of scenarios in<br />> >> which PITRis currently used.<br /> > ><br />> > We have to judge which is better, we get something potential or toprotect<br />> > stupid.<br />> > And we had better to wait author's comment...<br />><br />> I'd sayjust document that it wouldn't make sense to use it for PITR.<br /> ><br />> There may be some use case we can'tsee yet, so specifically<br />> prohibiting a use case that is not dangerous seems too much at this<br />> point.I will no doubt be reminded of these words in the future...<br /> ><br /><br /></div><div class="gmail_extra">Hiall,<br /><br />I tend to agree with Simon, but I confess that I don't liked to delay a server withstandby_mode = 'off'.<br /><br /></div><div class="gmail_extra">The main goal of this patch is delay the Streaming Replication,so if the slave server isn't a hot-standby I think makes no sense to delay it.<br /></div><div class="gmail_extra"><br/></div><div class="gmail_extra">Mitsumasa suggested to add "StandbyModeRequested" in conditionalbranch to skip this situation. I agree with him!<br /></div><div class="gmail_extra"><br /></div><div class="gmail_extra">AndI'll change 'recoveryDelay' (functions, variables) to 'standbyDelay'. 
<br /><br /></div><div class="gmail_extra">Regards,<br/><br /></div><div class="gmail_extra">--<br /> Fabrízio de Royes Mello<br />Consultoria/CoachingPostgreSQL<br />>> Timbira: <a href="http://www.timbira.com.br">http://www.timbira.com.br</a><br/>>> Blog sobre TI: <a href="http://fabriziomello.blogspot.com">http://fabriziomello.blogspot.com</a><br/> >> Perfil Linkedin: <a href="http://br.linkedin.com/in/fabriziomello">http://br.linkedin.com/in/fabriziomello</a><br/>>> Twitter: <a href="http://twitter.com/fabriziomello">http://twitter.com/fabriziomello</a></div></div>
On 2013-12-11 16:37:54 -0200, Fabrízio de Royes Mello wrote:
> On Wed, Dec 11, 2013 at 6:27 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> > >> I think this feature will be used in a lot of scenarios in
> > >> which PITR is currently used.
> > >
> > > We have to judge which is better, we get something potential or to protect
> > > stupid.
> > > And we had better to wait author's comment...
> >
> > I'd say just document that it wouldn't make sense to use it for PITR.
> >
> > There may be some use case we can't see yet, so specifically
> > prohibiting a use case that is not dangerous seems too much at this
> > point. I will no doubt be reminded of these words in the future...

> I tend to agree with Simon, but I confess that I didn't like delaying a
> server with standby_mode = 'off'.

> The main goal of this patch is to delay streaming replication, so if the
> slave server isn't a hot standby I think it makes no sense to delay it.

> Mitsumasa suggested adding "StandbyModeRequested" to the conditional branch
> to skip this situation. I agree with him!

I don't think that position has any merit, sorry: Think about the way
this stuff gets set up. The user creates a new basebackup
(pg_basebackup, manual pg_start/stop_backup, shutdown primary). Then he
creates a recovery.conf by either starting from scratch, using
--write-recovery-conf, or by copying recovery.conf.sample. In none of
these cases will delay be configured.

So, with that in mind, the only way it could have been configured is by
the user *explicitly* writing it into recovery.conf. And now you want to
react to this explicit step by just *silently* ignoring the setting
based on some random criteria (arguments have been made about
hot_standby=on/off and standby_mode=on/off, which aren't directly
related). Why on earth would that be a usability improvement?

Also, you seem to assume there's no point in configuring it for any of
hot_standby=off, standby_mode=off, recovery_target=*. Why?
There's usecases for all of them:

* hot_standby=off: Makes delay useable with wal_level=archive (and thus
  a lower WAL volume)
* standby_mode=off: Configurations that use tools like pg_standby and
  similar simply don't need standby_mode=on. If you want to trigger
  failover from within the restore_command you *cannot* set it.
* recovery_target_*: It can still make sense if you use
  pause_at_recovery_target.

In which scenarios does your restriction actually improve anything?

Greetings,

Andres Freund

--
Andres Freund                 http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Wed, Dec 11, 2013 at 7:47 PM, Andres Freund <andres@2ndquadrant.com> wrote:
>
> I don't think that position has any merit, sorry: Think about the way
> this stuff gets setup. The user creates a new basebackup (pg_basebackup,
> manual pg_start/stop_backup, shutdown primary). Then he creates a
> recovery conf by either starting from scratch, using
> --write-recovery-conf or by copying recovery.conf.sample. In none of
> these cases delay will be configured.
>
> So, with that in mind, the only way it could have been configured is by
> the user *explicitly* writing it into recovery.conf. And now you want to
> to react to this explicit step by just *silently* ignoring the setting
> based on some random criteria (arguments have been made about
> hot_standby=on/off, standby_mode=on/off which aren't directly
> related). Why on earth would that by a usability improvement?
>
> Also, you seem to assume there's no point in configuring it for any of
> hot_standby=off, standby_mode=off, recovery_target=*. Why? There's
> usecases for all of them:
> * hot_standby=off: Makes delay useable with wal_level=archive (and thus
> a lower WAL volume)
> * standby_mode=off: Configurations that use tools like pg_standby and
> similar simply don't need standby_mode=on. If you want to trigger
> failover from within the restore_command you *cannot* set it.
> * recovery_target_*: It can still make sense if you use
> pause_at_recovery_target.
>
> In which scenarios does your restriction actually improve anything?
>
Given your arguments I'm forced to revise my understanding of the problem. You are absolutely right in your assertions. I was not seeing the scenario from this perspective.
Anyway we need to improve docs, any suggestions?
Regards,
--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog sobre TI: http://fabriziomello.blogspot.com
>> Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
(2013/12/12 7:23), Fabrízio de Royes Mello wrote:
> On Wed, Dec 11, 2013 at 7:47 PM, Andres Freund <andres@2ndquadrant.com
> > * hot_standby=off: Makes delay useable with wal_level=archive (and thus
> > a lower WAL volume)
> > * standby_mode=off: Configurations that use tools like pg_standby and
> > similar simply don't need standby_mode=on. If you want to trigger
> > failover from within the restore_command you *cannot* set it.
> > * recovery_target_*: It can still make sense if you use
> > pause_at_recovery_target.

I don't think all of his arguments are right... We can just set
standby_mode=on when we use "min_standby_apply_delay" with pg_standby and
similarly simple tools. However, I tend to agree that we don't need to
prohibit the setting except for standby_mode. So I'd like to propose
changing the parameter name from "min_standby_apply_delay" to
"min_recovery_apply_delay", which is more natural for this feature.

Regards,
--
Mitsumasa KONDO
NTT Open Source Software Center
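For reference, under the proposed rename the setting would sit in recovery.conf roughly like this (a hypothetical fragment; the value, units, and connection details shown are purely illustrative, not taken from the patch):

```
# Delay applying WAL on the standby until at least this much time has
# elapsed since the transaction committed on the master.
# (Name per the rename proposed above; value is illustrative.)
min_recovery_apply_delay = '5min'

standby_mode = 'on'
primary_conninfo = 'host=master.example.com port=5432'
```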
On 12 December 2013 08:19, KONDO Mitsumasa <kondo.mitsumasa@lab.ntt.co.jp> wrote:
> (2013/12/12 7:23), Fabrízio de Royes Mello wrote:
>>
>> On Wed, Dec 11, 2013 at 7:47 PM, Andres Freund <andres@2ndquadrant.com
>> > * hot_standby=off: Makes delay useable with wal_level=archive (and thus
>> > a lower WAL volume)
>> > * standby_mode=off: Configurations that use tools like pg_standby and
>> > similar simply don't need standby_mode=on. If you want to trigger
>> > failover from within the restore_command you *cannot* set it.
>> > * recovery_target_*: It can still make sense if you use
>> > pause_at_recovery_target.
>
> I don't think part of his arguments are right very much... We can just set
> stanby_mode=on when we use "min_standby_apply_delay" with pg_standby and
> similar simply tools. However, I tend to agree with not to need to prohibit
> except for standby_mode. So I'd like to propose that changing parameter name
> of "min_standby_apply_delay" to "min_recovery_apply_delay". It is natural
> for this feature.

OK

--
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 9 December 2013 10:54, KONDO Mitsumasa <kondo.mitsumasa@lab.ntt.co.jp> wrote:
> (2013/12/09 19:35), Pavel Stehule wrote:
>>
>> 2013/12/9 KONDO Mitsumasa <kondo.mitsumasa@lab.ntt.co.jp
>> <mailto:kondo.mitsumasa@lab.ntt.co.jp>>
>>
>>     Hi Fabrízio,
>>
>>     I test your v4 patch, and send your review comments.
>>
>>     * Fix typo
>>     > 49 -# commited transactions from the master, specify a recovery time delay.
>>     > 49 +# committed transactions from the master, specify a recovery time delay.
>>
>>     * Fix white space
>>     > 177 - if (secs <= 0 && microsecs <=0)
>>     > 177 + if (secs <= 0 && microsecs <=0 )
>>
>>     * Add functionality (I propose)
>>     We can set negative number at min_standby_apply_delay. I think that
>>     this feature is for world wide replication situation. For example,
>>     master server is in Japan and slave server is in San Francisco.
>>     Japan time fowards than San Francisco time. And if we want to delay
>>     in this situation, it can need negative number in
>>     min_standby_apply_delay. So I propose that time delay conditional
>>     branch change under following.
>>     > - if (min_standby_apply_delay > 0)
>>     > + if (min_standby_apply_delay != 0)
>>     What do you think? It might also be working collectry.
>>
>> what using interval instead absolute time?
>
> This is because local time is recorded in XLOG. And it has big cost for
> calculating global time.

I agree with your request here, but I don't think negative values are
the right way to implement that, at least it would not be very usable.

My suggestion would be to add the TZ to the checkpoint record. This
way all users of WAL can see the TZ of the master and act accordingly.
I'll do a separate patch for that.

--
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
(2013/12/12 18:09), Simon Riggs wrote:
> On 9 December 2013 10:54, KONDO Mitsumasa <kondo.mitsumasa@lab.ntt.co.jp> wrote:
>> This is because local time is recorded in XLOG. And it has big cost for
>> calculating global time.
>
> I agree with your request here, but I don't think negative values are
> the right way to implement that, at least it would not be very usable.

I think that my proposal is the easiest and simplest way to solve this
problem. And I believe that someone who cannot calculate the difference
between time zones doesn't set up a replication cluster across continents.

> My suggestion would be to add the TZ to the checkpoint record. This
> way all users of WAL can see the TZ of the master and act accordingly.
> I'll do a separate patch for that.

It is something useful for other situations as well. However, it might
lead to a long and complicated discussion... I think that our hope is
to commit this patch in this commit-fest or the next, final commit-fest.

Regards,
--
Mitsumasa KONDO
NTT Open Source Software Center
On 12 December 2013 10:42, KONDO Mitsumasa <kondo.mitsumasa@lab.ntt.co.jp> wrote:
>> I agree with your request here, but I don't think negative values are
>> the right way to implement that, at least it would not be very usable.
>
> I think that my proposal is the easiest and simplist way to solve this
> problem. And I believe that the man who cannot calculate the difference in
> time-zone doesn't set replication cluster across continents.
>
>> My suggestion would be to add the TZ to the checkpoint record. This
>> way all users of WAL can see the TZ of the master and act accordingly.
>> I'll do a separate patch for that.
>
> It is something useful for also other situations. However, it might be
> going to happen long and complicated discussions... I think that our hope is
> to commit this patch in this commit-fest or next final commit-fest.

Agreed on no delay for the delay patch, as shown by my commit. Still
think we need better TZ handling.

--
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 2013-12-12 09:09:21 +0000, Simon Riggs wrote:
> >> * Add functionality (I propose)
> >> We can set negative number at min_standby_apply_delay. I think that
> >> this feature is for world wide replication situation. For example,
> >> master server is in Japan and slave server is in San Francisco. Japan
> >> time fowards than San Francisco time. And if we want to delay in this
> >> situation, it can need negative
> > This is because local time is recorded in XLOG. And it has big cost for
> > calculating global time.

Uhm? Isn't the timestamp in commit records actually a TimestampTz? And
thus essentially stored as UTC? I don't think this problem actually
exists?

> My suggestion would be to add the TZ to the checkpoint record. This
> way all users of WAL can see the TZ of the master and act accordingly.
> I'll do a separate patch for that.

Intuitively I'd say that might be useful - but I am not really sure what
for. And we don't exactly have a great interface for looking at a
checkpoint's data. Maybe add it to the control file instead?

Greetings,

Andres Freund

--
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
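Andres's point can be illustrated with a toy model (a sketch, not PostgreSQL source; `TimestampTz` is represented here simply as microseconds since an epoch, in UTC, and `apply_delay_until` is a hypothetical helper, not a real xlog.c function). Because commit timestamps are already zone-independent, the apply target is a plain addition and no timezone arithmetic is needed:

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of TimestampTz: microseconds since an epoch, in UTC. */
typedef int64_t TimestampTz;

/*
 * Compute the earliest time a commit record may be applied, given a
 * configured delay in milliseconds.  Because both operands are in UTC,
 * the result is the same no matter what timezone either server runs in.
 */
static TimestampTz
apply_delay_until(TimestampTz commit_time, int64_t delay_ms)
{
    return commit_time + delay_ms * INT64_C(1000); /* ms -> microseconds */
}
```

A standby in Tokyo and one in San Francisco that receive the same commit record would compute the same wake-up instant, which is why negative delay values to compensate for zone offsets would be unnecessary under this model.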
On 12 December 2013 11:05, Andres Freund <andres@2ndquadrant.com> wrote:
>> My suggestion would be to add the TZ to the checkpoint record. This
>> way all users of WAL can see the TZ of the master and act accordingly.
>> I'll do a separate patch for that.
>
> Intuitively I'd say that might be useful - but I am not reall sure what
> for. And we don't exactly have a great interface for looking at a
> checkpoint's data. Maybe add it to the control file instead?

That's actually what I had in mind, I just phrased it badly in
mid-thought.

--
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
2013/12/12 Simon Riggs <simon@2ndquadrant.com>
On 12 December 2013 10:42, KONDO Mitsumasa <kondo.mitsumasa@lab.ntt.co.jp> wrote:
>> I agree with your request here, but I don't think negative values are
>> the right way to implement that, at least it would not be very usable.
>
> I think that my proposal is the easiest and simplist way to solve this
> problem. And I believe that the man who cannot calculate the difference in
> time-zone doesn't set replication cluster across continents.
>
>> My suggestion would be to add the TZ to the checkpoint record. This
>> way all users of WAL can see the TZ of the master and act accordingly.
>> I'll do a separate patch for that.
>
> It is something useful for also other situations. However, it might be
> going to happen long and complicated discussions... I think that our hope is
> to commit this patch in this commit-fest or next final commit-fest.

Agreed on no delay for the delay patch, as shown by my commit.
Our forecast was very accurate...
Nice commit, Thanks!
Regards,
--
Mitsumasa KONDO
NTT Open Source Software Center
Simon Riggs <simon@2ndQuadrant.com> writes:
> On 12 December 2013 11:05, Andres Freund <andres@2ndquadrant.com> wrote:
>>> My suggestion would be to add the TZ to the checkpoint record. This
>>> way all users of WAL can see the TZ of the master and act accordingly.
>>> I'll do a separate patch for that.

>> Intuitively I'd say that might be useful - but I am not reall sure what
>> for. And we don't exactly have a great interface for looking at a
>> checkpoint's data. Maybe add it to the control file instead?

> That's actually what I had in mind, I just phrased it badly in mid-thought.

I don't think you realize what a can of worms that would be. There's
no compact representation of "a timezone", unless you are only proposing
to store the UTC offset; and frankly I'm not particularly seeing the point
of that.

			regards, tom lane
On Thu, Dec 12, 2013 at 9:52 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Simon Riggs <simon@2ndQuadrant.com> writes:
>> On 12 December 2013 11:05, Andres Freund <andres@2ndquadrant.com> wrote:
>>>> My suggestion would be to add the TZ to the checkpoint record. This
>>>> way all users of WAL can see the TZ of the master and act accordingly.
>>>> I'll do a separate patch for that.
>
>>> Intuitively I'd say that might be useful - but I am not reall sure what
>>> for. And we don't exactly have a great interface for looking at a
>>> checkpoint's data. Maybe add it to the control file instead?
>
>> That's actually what I had in mind, I just phrased it badly in mid-thought.
>
> I don't think you realize what a can of worms that would be. There's
> no compact representation of "a timezone", unless you are only proposing
> to store the UTC offset; and frankly I'm not particularly seeing the point
> of that.

+1. I can see the point of storing a timestamp in each checkpoint
record, if we don't already, but time zones should be completely
irrelevant to this feature. Everything should be reckoned in seconds
since the epoch.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 12 December 2013 15:03, Robert Haas <robertmhaas@gmail.com> wrote:
> On Thu, Dec 12, 2013 at 9:52 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I don't think you realize what a can of worms that would be. There's
>> no compact representation of "a timezone", unless you are only proposing
>> to store the UTC offset; and frankly I'm not particularly seeing the point
>> of that.
>
> +1. I can see the point of storing a timestamp in each checkpoint
> record, if we don't already, but time zones should be completely
> irrelevant to this feature. Everything should be reckoned in seconds
> since the epoch.

Don't panic guys! I meant UTC offset only. And yes, it may not be
needed, will check.

--
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 12 December 2013 15:19, Simon Riggs <simon@2ndquadrant.com> wrote:
> Don't panic guys! I meant UTC offset only. And yes, it may not be
> needed, will check.

Checked, all non-UTC TZ offsets work without further effort here.

--
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Thu, Dec 12, 2013 at 3:39 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>
> On 12 December 2013 15:19, Simon Riggs <simon@2ndquadrant.com> wrote:
> >
> > Don't panic guys! I meant UTC offset only. And yes, it may not be
> > needed, will check.
>
> Checked, all non-UTC TZ offsets work without further effort here.
>

Thanks!

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog sobre TI: http://fabriziomello.blogspot.com
>> Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
On Thu, Dec 12, 2013 at 3:42 PM, Fabrízio de Royes Mello <fabriziomello@gmail.com> wrote:
>
> On Thu, Dec 12, 2013 at 3:39 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> >
> > On 12 December 2013 15:19, Simon Riggs <simon@2ndquadrant.com> wrote:
> >
> > > Don't panic guys! I meant UTC offset only. And yes, it may not be
> > > needed, will check.
> >
> > Checked, all non-UTC TZ offsets work without further effort here.
> >
>
> Thanks!
>
Reviewing the committed patch I noted that the "CheckForStandbyTrigger()" check after the delay was removed.

If we promote the standby during the delay and don't check the trigger immediately after the delay, then we will replay undesired WAL records.

The attached patch adds this check.
Regards,
--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog sobre TI: http://fabriziomello.blogspot.com
>> Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
Attachment
On 12 December 2013 21:58, Fabrízio de Royes Mello <fabriziomello@gmail.com> wrote:
> Reviewing the committed patch I noted that the "CheckForStandbyTrigger()"
> after the delay was removed.
>
> If we promote the standby during the delay and don't check the trigger
> immediately after the delay, then we will replay undesired WALs records.
>
> The attached patch add this check.

I removed it because it was after the pause. I'll replace it, but
before the pause.

--
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 2013-12-13 11:56:47 +0000, Simon Riggs wrote:
> On 12 December 2013 21:58, Fabrízio de Royes Mello
> <fabriziomello@gmail.com> wrote:
> > Reviewing the committed patch I noted that the "CheckForStandbyTrigger()"
> > after the delay was removed.
> >
> > If we promote the standby during the delay and don't check the trigger
> > immediately after the delay, then we will replay undesired WALs records.
> >
> > The attached patch add this check.
>
> I removed it because it was after the pause. I'll replace it, but
> before the pause.

Doesn't after the pause make more sense? If somebody promoted while we
were waiting, we want to recognize that before rolling forward? The wait
can take a long while after all?

Greetings,

Andres Freund

--
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 13 December 2013 11:58, Andres Freund <andres@2ndquadrant.com> wrote:
> Doesn't after the pause make more sense? If somebody promoted while we
> were waiting, we want to recognize that before rolling forward? The wait
> can take a long while after all?

That would change the way pause currently works, which is OOS for that
patch. I'm happy to discuss such a change, but if agreed, it would need
to apply in all cases, not just this one.

--
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 2013-12-13 13:09:13 +0000, Simon Riggs wrote:
> That would change the way pause currently works, which is OOS for that patch.

But this feature isn't pause itself - it's imo something independent.
Note that we currently
a) check pause again after recoveryApplyDelay(),
b) do check for promotion if the sleep in recoveryApplyDelay() is
   interrupted.
So not checking after the final sleep seems confusing.

Greetings,

Andres Freund

--
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 13 December 2013 13:22, Andres Freund <andres@2ndquadrant.com> wrote:
> But this feature isn't pause itself - it's imo something independent.
> Note that we currently
> a) check pause again after recoveryApplyDelay(),
> b) do check for promotion if the sleep in recoveryApplyDelay() is
>    interrupted. So not checking after the final sleep seems confusing.

I'm proposing the attached patch.

This patch implements a consistent view of recovery pause, which is
that when paused, we don't check for promotion, during or immediately
after. That is user noticeable behaviour and shouldn't be changed
without thought and discussion on a separate thread with a clear
descriptive title. (I might argue in favour of it myself, I'm not yet
decided.)

--
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Attachment
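The ordering debated above can be sketched with a small simulation (assumptions flagged: `delay_sees_promotion` and the `trigger_requested` flag are a toy model of the control flow, not the real recoveryApplyDelay()/CheckForStandbyTrigger() from xlog.c): the trigger is re-checked before each sleep chunk and once more after the final one, so a promotion request arriving during the wait cannot be missed.

```c
#include <assert.h>
#include <stdbool.h>

/* Simulated promotion trigger; in the real server this roughly
 * corresponds to the trigger file checked by CheckForStandbyTrigger(). */
static bool trigger_requested = false;

static bool
check_for_standby_trigger(void)
{
    return trigger_requested;
}

/*
 * Model of the delay loop: "sleep" in 100ms chunks, re-checking for a
 * promotion request before each chunk and once more after the last one.
 * Returns true if promotion was requested, false if the whole delay
 * elapsed undisturbed.
 */
static bool
delay_sees_promotion(int delay_ms)
{
    int waited_ms = 0;

    while (waited_ms < delay_ms)
    {
        if (check_for_standby_trigger())
            return true;        /* stop delaying, promote now */
        waited_ms += 100;       /* stands in for a 100ms sleep */
    }

    /* The final check discussed upthread: without it, a trigger that
     * arrived during the last chunk would go unnoticed and further
     * WAL records could be replayed. */
    return check_for_standby_trigger();
}
```

Where exactly that final check sits relative to the pause check is the behavioural question Simon and Andres are debating; the sketch only shows why skipping it entirely would be surprising.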
On Fri, Dec 13, 2013 at 11:44 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>
> I'm proposing the attached patch.
>
> This patch implements a consistent view of recovery pause, which is
> that when paused, we don't check for promotion, during or immediately
> after. That is user noticeable behaviour and shouldn't be changed
> without thought and discussion on a separate thread with a clear
> descriptive title. (I might argue in favour of it myself, I'm not yet
> decided).
>

In my previous message [1] I attached a patch equal to yours ;-)

Regards,

[1] http://www.postgresql.org/message-id/CAFcNs+qD0AJ=qzhsHD9+v_Mhz0RTBJ=cJPCT_T=uT_JvvnC1FQ@mail.gmail.com

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog sobre TI: http://fabriziomello.blogspot.com
>> Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
On 2013-12-13 13:44:30 +0000, Simon Riggs wrote:
> I'm proposing the attached patch.

Looks good, although I'd move it down below the comment ;)

> This patch implements a consistent view of recovery pause, which is
> that when paused, we don't check for promotion, during or immediately
> after. That is user noticeable behaviour and shouldn't be changed
> without thought and discussion on a separate thread with a clear
> descriptive title. (I might argue in favour of it myself, I'm not yet
> decided).

Some more improvements in that area would certainly be good...

Greetings,

Andres Freund

--
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services