Re: Time-Delayed Standbys - Mailing list pgsql-hackers

From KONDO Mitsumasa
Subject Re: Time-Delayed Standbys
Msg-id 529E8FE6.6040209@lab.ntt.co.jp
In response to Re: Time-Delayed Standbys  (Fabrízio de Royes Mello <fabriziomello@gmail.com>)
Responses Re: Time-Delayed Standbys
Re: Time-Delayed Standbys
List pgsql-hackers
(2013/11/30 5:34), Fabrízio de Royes Mello wrote:
> On Fri, Nov 29, 2013 at 5:49 AM, KONDO Mitsumasa
> <kondo.mitsumasa@lab.ntt.co.jp> wrote:
>  > * Problem1
>  > Your patch does not add recovery_time_delay to recovery.conf.sample.
>  > Please add it.
> Fixed.
OK, that looks fine.

>  > * Problem2
>  > When I configure a time-delayed standby and start the standby server, I
>  > cannot access the standby server with psql. This is because PG is still in
>  > its initial startup recovery, which psql cannot connect to. I think a
>  > time-delayed standby should only delay the recovery position; it must not
>  > affect other functionality.
>  >
>  > I didn't test recovery on the master server with recovery_time_delay. If you
>  > have detailed test results for these cases, please send them to me.
>  >
> Well, I could not reproduce the problem that you described.
>
> I ran the following test:
>
> 1) Clusters
> - build master
> - build slave and attach it to the master using SR, with recovery_time_delay
> set to 1min.
>
> 2) Stop the slave
>
> 3) Run some transactions on the master using pgbench to generate a lot of archives
>
> 4) Start the slave and connect to it using psql; in another session I can see
> all of the archive recovery log
Hmm... I thought I had misread your email, but I reproduced the problem again.
When I set a small recovery_time_delay (=30000), it seemed to work correctly.
However, when I set a long recovery_time_delay (=3000000), it did not work.

The log of my reproduction steps is below.
> [mitsu-ko@localhost postgresql]$ bin/pgbench -T 30 -c 8 -j4  -p5432
> starting vacuum...end.
> transaction type: TPC-B (sort of)
> scaling factor: 10
> query mode: simple
> number of clients: 8
> number of threads: 4
> duration: 30 s
> number of transactions actually processed: 68704
> latency average: 3.493 ms
> tps = 2289.196747 (including connections establishing)
> tps = 2290.175129 (excluding connections establishing)
> [mitsu-ko@localhost postgresql]$ vim slave/recovery.conf
> [mitsu-ko@localhost postgresql]$ bin/pg_ctl -D slave start
> server starting
> [mitsu-ko@localhost postgresql]$ LOG:  database system was shut down in recovery at 2013-12-03 10:26:41 JST
> LOG:  entering standby mode
> LOG:  consistent recovery state reached at 0/5C4D8668
> LOG:  redo starts at 0/5C4000D8
> [mitsu-ko@localhost postgresql]$ FATAL:  the database system is starting up
> FATAL:  the database system is starting up
> FATAL:  the database system is starting up
> FATAL:  the database system is starting up
> FATAL:  the database system is starting up
> [mitsu-ko@localhost postgresql]$ bin/psql -p6543
> psql: FATAL:  the database system is starting up
> [mitsu-ko@localhost postgresql]$ bin/psql -p6543
> psql: FATAL:  the database system is starting up
I have attached my postgresql.conf and recovery.conf. The problem should be
reproducible with them.

I think your patch needs to check recovery flags such as
ArchiveRecoveryRequested and InArchiveRecovery, because a time-delayed standby
should take effect only during replication. I hope it does not interfere with
standby server startup or with archive recovery. Does this differ from what you
had in mind? I think this patch has a lot of potential, but I think that
standby functionality is more important than this feature. We may also need to
discuss what behavior is best for this patch.
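
To make the suggestion above concrete, here is a minimal sketch of the kind of
guard I mean. It is not code from the patch or from PostgreSQL: the
RecoveryState struct, its fields and delay_remaining_ms() are hypothetical
stand-ins for the real recovery flags, and treating recovery_time_delay as
milliseconds is an assumption. The idea is only that the delay is skipped
unless the server is replicating as a standby and has already reached a
consistent state.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the recovery flags; not PostgreSQL's real state. */
typedef struct RecoveryState
{
    bool    standby_mode;         /* running as a replication standby */
    bool    reached_consistency;  /* consistent recovery state reached */
    int64_t delay_ms;             /* recovery_time_delay, assumed to be ms */
} RecoveryState;

/*
 * Return how many milliseconds replay of a commit record should still wait.
 * Returns 0 (apply immediately) when the delay must not take effect:
 * outside standby mode, before consistency is reached, or with no delay set.
 */
int64_t
delay_remaining_ms(const RecoveryState *st, int64_t commit_time_ms,
                   int64_t now_ms)
{
    if (!st->standby_mode || !st->reached_consistency || st->delay_ms <= 0)
        return 0;

    int64_t apply_at = commit_time_ms + st->delay_ms;

    return (apply_at > now_ms) ? apply_at - now_ms : 0;
}
```

With this shape, startup recovery and plain archive recovery would never wait,
and only replay on a consistent standby would be held back.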

Regards,
--
Mitsumasa KONDO
NTT Open Source Software Center
