Thread: Fwd: Re: New 8.4 hot standby feature

Fwd: Re: New 8.4 hot standby feature

From
Gabi Julien
Date:
On Tuesday 27 January 2009 13:13:32 you wrote:
> On Tue, 2009-01-27 at 12:53 -0500, Gabi Julien wrote:
> > I have merged the last hot standby patch (v9g) to 8.4 devel and I am
> > pleased with the experience. This is promising stuff. Perhaps it is a bit
> > too soon to ask questions here but here it is:
> >
> > 1. Speed of recovery
> >
> > With a archive_timeout of 60 seconds, it can take about 4 minutes before
> > I see the reflected changes in the replica. This is normal since, in
> > addition to the WAL log shipping, it takes more time to do the recovery
> > itself. Still, is there any way besides the archive_timeout config option
> > to speed up the recovery of WAL logs on the hot standby?
>
> Is the recovery itself the bottleneck?

Yes, the logs are shipped every minute but the recevory is 3 or 4 times
longer.

>
> There's a performance improvement submitted here:
> http://archives.postgresql.org/message-id/a778a7260810280033n43f70d36x8c437
>eacf9a5461e@mail.gmail.com
>
> But I haven't been following the development of it closely, so you'll
> have to read the thread to see whether it will meet your needs or not.

Thanks I will take a look at it. Also, I came across the record log shipping
feature too in my research:

http://www.postgresql.org/docs/current/static/warm-standby.html#WARM-STANDBY-RECORD

Could this help? If the logs are smaller then I could potentially afford
shipping then at a higher frequency.

>
> Regards,
>     Jeff Davis



-------------------------------------------------------

Re: Fwd: Re: New 8.4 hot standby feature

From
Jeff Davis
Date:
On Tue, 2009-01-27 at 14:28 -0500, Gabi Julien wrote:
> Could this help? If the logs are smaller then I could potentially afford
> shipping then at a higher frequency.
>

See if there are times during which the recovery process isn't doing
anything (i.e. just waiting for WAL data). If so, something like this
might help. If it's constantly working as hard as it can, then probably
not.

An important question you should ask yourself is whether it can keep up
in the steady state at all. If the primary is producing segments faster
than the standby is recovering them, I don't think there's any way
around that.

Regards,
    Jeff Davis


Re: New 8.4 hot standby feature

From
Fujii Masao
Date:
Hi,

On Wed, Jan 28, 2009 at 4:28 AM, Gabi Julien <gabi.julien@broadsign.com> wrote:
> Yes, the logs are shipped every minute but the recevory is 3 or 4 times
> longer.

Are you disabling full_page_writes? It may slow down recovery several times.

> Thanks I will take a look at it. Also, I came across the record log shipping
> feature too in my research:
>
> http://www.postgresql.org/docs/current/static/warm-standby.html#WARM-STANDBY-RECORD
>
> Could this help? If the logs are smaller then I could potentially afford
> shipping then at a higher frequency.

No. Even if the logs are shipped frequently, they cannot be applied until
the log file fills.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Re: New 8.4 hot standby feature

From
Jason Long
Date:
Fujii Masao wrote:
Hi,

On Wed, Jan 28, 2009 at 4:28 AM, Gabi Julien <gabi.julien@broadsign.com> wrote: 
Yes, the logs are shipped every minute but the recevory is 3 or 4 times
longer.   
Are you disabling full_page_writes? It may slow down recovery several times.
 
Thanks I will take a look at it. Also, I came across the record log shipping
feature too in my research:

http://www.postgresql.org/docs/current/static/warm-standby.html#WARM-STANDBY-RECORD

Could this help? If the logs are smaller then I could potentially afford
shipping then at a higher frequency.   
No. Even if the logs are shipped frequently, they cannot be applied until
the log file fills.

Regards,
 
Is pg_clearxlogtail going to be in contrib or integrated in some other way?

Re: New 8.4 hot standby feature

From
Fujii Masao
Date:
Hi,

On Thu, Jan 29, 2009 at 12:23 AM, Jason Long
<mailing.list@supernovasoftware.com> wrote:
> Is pg_clearxlogtail going to be in contrib or integrated in some other way?

I also hope so. The related topic was discussed before.
http://archives.postgresql.org/pgsql-hackers/2009-01/msg00639.php

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Re: Fwd: Re: New 8.4 hot standby feature

From
Gabi Julien
Date:
On Tuesday 27 January 2009 16:25:44 you wrote:
> On Tue, 2009-01-27 at 14:28 -0500, Gabi Julien wrote:
> > Could this help? If the logs are smaller then I could potentially afford
> > shipping then at a higher frequency.
>
> See if there are times during which the recovery process isn't doing
> anything (i.e. just waiting for WAL data). If so, something like this
> might help. If it's constantly working as hard as it can, then probably
> not.
>
> An important question you should ask yourself is whether it can keep up
> in the steady state at all. If the primary is producing segments faster
> than the standby is recovering them, I don't think there's any way
> around that.

The load on the slave is close to 0 so it does not explain the speed of
recovery. Also the shipping of the 16MB WAL log takes only 1 second on the
LAN. I guess the problem is probably what Fujii Masao explained. The WAL log
shipped are not yet usable or something like that. I won't try to increase
the frequency of log shipping because of that. Also, my setting of 60 seconds
is the lowest frequency suggested by the documentation anyways.

However, I have found the v4 patch about the PITR performance improvement. I
will give it a try and report here.

I might try pg_clearxlogtail too if I have time.

>
> Regards,
>     Jeff Davis


Re: New 8.4 hot standby feature

From
Gabi Julien
Date:
On Tuesday 27 January 2009 21:47:36 you wrote:
> Hi,
>
> On Wed, Jan 28, 2009 at 4:28 AM, Gabi Julien <gabi.julien@broadsign.com>
wrote:
> > Yes, the logs are shipped every minute but the recevory is 3 or 4 times
> > longer.
>
> Are you disabling full_page_writes? It may slow down recovery several
> times.

It looks like you found my problem. Everything I needed to know is described
here:

http://www.postgresql.org/docs/8.3/interactive/wal-configuration.html

Setting checkpoint_timeout to 55 seconds speeds up the recovery to the level I
want. Ironically, it makes the pg_last_recovered_xact_timestamp() function
more reliable too on how up to date the replica is. I am not sure that I can
take this for granted however.

I will disable full_page_writes to make sure this agressive checkpoint_timeout
setting won't slow down my master database too much. Can I be reassured on
the fact that, if the master database crashes and some data is lost, at least
the replica would keep its integrity (even though it is not in sync)?

My settings:
full_page_writes = off
checkpoint_timeout = 55s
checkpoint_completion_target = 0.7
archive_mode = on
archive_command = './archive_command.sh "%p" "%f"'
archive_timeout = 60

Also, would it be possible to recompile postgresql by using a different size
(smaller) then 16M for WAL logs and would that be a smart thing to try?

Thanks a lot to all of you.

>
> > Thanks I will take a look at it. Also, I came across the record log
> > shipping feature too in my research:
> >
> > http://www.postgresql.org/docs/current/static/warm-standby.html#WARM-STAN
> >DBY-RECORD
> >
> > Could this help? If the logs are smaller then I could potentially afford
> > shipping then at a higher frequency.
>
> No. Even if the logs are shipped frequently, they cannot be applied until
> the log file fills.
>
> Regards,



Re: New 8.4 hot standby feature

From
Gabi Julien
Date:
On Wednesday 28 January 2009 18:35:18 Gabi Julien wrote:
> On Tuesday 27 January 2009 21:47:36 you wrote:
> > Hi,
> >
> > On Wed, Jan 28, 2009 at 4:28 AM, Gabi Julien <gabi.julien@broadsign.com>
>
> wrote:
> > > Yes, the logs are shipped every minute but the recevory is 3 or 4 times
> > > longer.
> >
> > Are you disabling full_page_writes? It may slow down recovery several
> > times.
>
> It looks like you found my problem. Everything I needed to know is
> described here:
>
> http://www.postgresql.org/docs/8.3/interactive/wal-configuration.html
>
> Setting checkpoint_timeout to 55 seconds speeds up the recovery to the
> level I want. Ironically, it makes the pg_last_recovered_xact_timestamp()
> function more reliable too on how up to date the replica is. I am not sure
> that I can take this for granted however.

This is a good question actually. If I set the checkpoint_timeout  to
something less then the archive_timeout, can I take this for granted the fact
that pg_last_recovered_xact_timestamp() will always accurately tell me how up
to date the replica is?