Hello, thank you for pointing me to the previous discussion. I'll
take it into account from now on.
> > I want the standby to start to serve as soon as possible even by
> > a few seconds on failover in a HA cluster.
>
> Please implement a prototype and measure how many seconds we
> are discussing.
I'm sorry for omitting the measurement data. (It may already have
been shown in the previous discussion.)
Our previous measurement of failover with PostgreSQL 9.1 +
Pacemaker under the workload below showed that the shutdown
checkpoint takes 8 seconds out of 42 seconds of total failover
time (about 20%).
OS       : RHEL6.1-64
DBMS     : PostgreSQL 9.1.1
HA       : pacemaker-1.0.11-1.2.2 x64
Repl     : sync
Workload : master : pgbench / scale factor = 100 (approx. 1.5GB)
           standby: none (warm-standby)

shared_buffers = 2.5GB
wal_buffers = 4MB
checkpoint_segments = 300
checkpoint_timeout = 15min
checkpoint_completion_target = 0.7
archive_mode = on
WAL segment consumption was about 310 segments per 15 minutes
under the conditions above.
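For a rough sense of scale, the consumption figure can be converted into a WAL volume and write rate. This is only a sketch: the 16MB segment size is an assumption (the PostgreSQL default, not stated above), and the variable names are just for illustration.

```python
# Rough conversion of the observed WAL consumption into a data rate.
segments = 310          # WAL segments consumed (from the measurement above)
window_minutes = 15     # observation window (from the measurement above)
segment_mb = 16         # ASSUMPTION: default WAL segment size, not stated above

wal_mb = segments * segment_mb
rate_mb_per_s = wal_mb / (window_minutes * 60)

print(f"WAL volume: {wal_mb} MB per {window_minutes} min")
print(f"WAL rate  : {rate_mb_per_s:.2f} MB/s")
```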
> This proposal is for a performance enhancement. We normally require
> some proof that the enhancement is real and that it doesn't have a
> poor effect on people not using it. Please make measurements.
On the benchmark above, the extra load from more frequent
checkpoints on the standby (at the same frequency as on its
master) is not a problem. On the other hand, failover time is
expected to shrink from 42 seconds to 34 seconds by omitting the
shutdown checkpoint. (But I have not measured that yet.)
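The expectation above is plain arithmetic on the two measured figures, not a new measurement. As a small sketch (the variable names are only illustrative):

```python
# Estimate based on the measured figures quoted in this mail.
total_failover_s = 42        # measured total failover time
shutdown_checkpoint_s = 8    # measured time spent in the shutdown checkpoint

# Share of failover time spent in the shutdown checkpoint.
share = shutdown_checkpoint_s / total_failover_s

# Expected failover time if the shutdown checkpoint is skipped entirely.
# This assumes nothing else in the failover path gets slower as a result.
expected_failover_s = total_failover_s - shutdown_checkpoint_s

print(f"checkpoint share : {share:.1%}")
print(f"expected failover: {expected_failover_s} s")
```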
> Discussion on a patch submitted to me to the January 2012 CommitFest
> to reduce failover time.
Thank you, and I'm sorry for missing it. I have found those
discussions and will read them now.
regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center
== My e-mail address has been changed since Apr. 1, 2012.