Re: Checkpointer on hot standby runs without looking checkpoint_segments - Mailing list pgsql-hackers
| From | Robert Haas |
|---|---|
| Subject | Re: Checkpointer on hot standby runs without looking checkpoint_segments |
| Date | |
| Msg-id | CA+Tgmoa8kTT0JLs1FQ7C43VbkboEFJnOsJRTBbgdm5XRLiFZkA@mail.gmail.com |
| In response to | Re: Checkpointer on hot standby runs without looking checkpoint_segments (Florian Pflug <fgp@phlo.org>) |
| List | pgsql-hackers |
On Fri, Jun 8, 2012 at 1:01 PM, Florian Pflug <fgp@phlo.org> wrote:
> On Jun 8, 2012, at 15:47, Robert Haas wrote:
>> On Fri, Jun 8, 2012 at 5:02 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>>> On 8 June 2012 09:14, Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote:
>>>
>>>> The requirement for this patch is as follows.
>>>>
>>>> - What I want to get is similar behavior between the master and a
>>>> (hot) standby with respect to checkpoint progression. Specifically,
>>>> checkpoints for streaming replication should run at the speed
>>>> governed by checkpoint_segments. The point of this patch is to avoid
>>>> having an unexpectedly large number of WAL segments accumulate on
>>>> the standby side. (Plus, it increases the chance of skipping the
>>>> recovery-end checkpoint, per my other patch.)
>>>
>>> Since we want wal_keep_segments number of WAL files on the master
>>> (and, because of cascading, on the standby also), I don't see any
>>> purpose in triggering more frequent checkpoints just so we can hit a
>>> magic number that is most often set wrong.
>>
>> This is a good point. Right now, if you set checkpoint_segments to a
>> large value, we retain lots of old WAL segments even when the system
>> is idle (cf. XLOGfileslop). I think we could be smarter about that.
>> I'm not sure what the exact algorithm should be, but right now users
>> are forced to choose between setting checkpoint_segments very large
>> to achieve optimum write performance and setting it small to conserve
>> disk space. What would be much better, IMHO, is if the number of
>> retained segments could ratchet down when the system is idle,
>> eventually reaching a state where we keep only one segment beyond the
>> one currently in use.
>
> I'm a bit sceptical about this. It seems to me that you wouldn't actually
> be able to do anything useful with the conserved space, since postgres
> could re-claim it at any time. At which point it'd better be available,
> or your whole cluster comes to a screeching halt...

Well, the issue for me is elasticity. Right now we ship with
checkpoint_segments = 3. That causes terrible performance on many
real-world workloads. But say we ship with checkpoint_segments = 100,
which is a far better setting from a performance point of view. Then
pg_xlog space utilization will eventually grow to more than 3 GB, even
on a low-velocity system where the extra segments don't improve
performance. I'm not sure whether it's useful for the number of
checkpoint segments to vary dramatically on a single system, but I do
think it would be very nice if we could ship with a less conservative
default without eating up so much disk space. Maybe there's a better
way of going about that, but I agree with Simon's point that the
setting is often wrong. Frequently it's too low; sometimes it's too
high; occasionally it's got both problems simultaneously. If you have
another idea on how to improve this, I'm all ears.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
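To make the disk-space side of this tradeoff concrete, here is a rough back-of-the-envelope check of the "more than 3 GB" figure. It is only a sketch, assuming the 9.x-era XLOGfileslop behavior of keeping or recycling up to roughly 2 * checkpoint_segments + 1 segments of 16 MB each; the exact server logic differs in detail.

```c
/*
 * Rough estimate of steady-state pg_xlog usage, assuming the server keeps
 * about 2 * checkpoint_segments + 1 recycled segments (~XLOGfileslop) of
 * 16 MB each.  The two settings compared are the shipped default (3) and
 * the larger value (100) discussed above.
 */
#include <stdio.h>

int
main(void)
{
    const int   segment_size_mb = 16;   /* default WAL segment size */
    const int   settings[] = {3, 100};  /* checkpoint_segments values */

    for (int i = 0; i < 2; i++)
    {
        int         checkpoint_segments = settings[i];
        int         max_segments = 2 * checkpoint_segments + 1;

        printf("checkpoint_segments = %3d -> up to ~%d segments, ~%d MB of pg_xlog\n",
               checkpoint_segments, max_segments,
               max_segments * segment_size_mb);
    }
    return 0;
}
```

Under these assumptions the defaults work out to roughly 112 MB for checkpoint_segments = 3 and about 3.2 GB for checkpoint_segments = 100, which is the disk-space cost of the better-performing setting described above.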