Re: Fix checkpoint skip logic on idle systems by tracking LSN progress - Mailing list pgsql-hackers
From | Kyotaro HORIGUCHI |
---|---|
Subject | Re: Fix checkpoint skip logic on idle systems by tracking LSN progress |
Date | |
Msg-id | 20160930.140015.150178454.horiguchi.kyotaro@lab.ntt.co.jp Whole thread Raw |
In response to | Re: Fix checkpoint skip logic on idle systems by tracking LSN progress (David Steele <david@pgmasters.net>) |
Responses |
Re: Fix checkpoint skip logic on idle systems by
tracking LSN progress
|
List | pgsql-hackers |
Sorry, I might have torn off this thread somehow.. At Thu, 29 Sep 2016 11:26:29 -0400, David Steele <david@pgmasters.net> wrote in <30095aea-3910-dbb7-1790-a579fb93fa5e@pgmasters.net> > On 9/28/16 10:32 PM, Michael Paquier wrote: > > On Thu, Sep 29, 2016 at 7:45 AM, David Steele <david@pgmasters.net> > > wrote: > >> > >> In general I agree with the other comments that this could end up > >> being > >> a problem. On the other hand, since the additional locks are only > >> taken > >> at checkpoint or archive_timeout it may not be that big a deal. > > > > Yes, I did some tests on my laptop a couple of months back, that has 4 > > cores. After reducing NUM_XLOGINSERT_LOCKS from 8 to 4 to increase > > contention and performing a bunch of INSERT using 4 clients on 4 > > different relations I could not catch a difference.. Autovacuum was > > disabled to eliminate any noise. I tried checkpoint_segments at 30s to > > see its effects, as well as larger values to see the impact with the > > standby snapshot taken by the bgwriter. Other thoughts are welcome. > > I don't have any better ideas than that. I don't see no problem in setting progressAt in XLogInsertRecord. But I doubt GetProgressRecPtr is harmful, especially when NUM_XLOGINSERT_LOCKS is *large*. So reducing the number seems rather alleviates the impact. But it actually doesn't seem so harmful up to 8. (Even though I don't like the locking in GetProgressRecPtr..) Currently possiblly harmful calling of GetProgressRecPtr is that in BackgroundWriterMain. It should be called with ther interval BgWriterDelay, and anytime pgwriter recieved SIGUSR1. But I don't see the issuer of SIGUSR1 of bgwriter.. > >> +++ b/src/backend/postmaster/checkpointer.c > >> + /* OK, it's time to switch */ > >> + elog(LOG, "Request XLog Switch"); > >> > >> LOG level seems a bit much here, perhaps DEBUG1? > > > > That's from Horiguchi-san's patch, and those would be definitely > > better as DEBUG1 by looking at it. Now and in order to keep things > > simple I think that we had better discard this patch for now. I was > > planning to come back to this thing anyway once we are done with the > > first problem. > > I still see this: > > +++ b/src/backend/postmaster/checkpointer.c > /* OK, it's time to switch */ > + elog(LOG, "Request XLog Switch"); > > > Well for now attached are two patches, that could just be squashed > > into one. Mmmm. Sorry, this was for my quite private instant debug, spilt outside.. But I don't mind to leave it with DEBUG2 if it seems useful. > Yes, I think that makes sense. > > More importantly, there is a regression. With your new patch the > xlogs are switching on archive_timeout again even with no changes. > The v12 worked fine. As Michael mentioned in this or another thread, it is another issue that he wants to solve separately. I personally doubt that this patch (v11 and v13) can be evaluated alone without it, but we can review this with the excessive switching problem, perhaps? > The differences are all in 0002-hs-checkpoints-v12-2.patch and as far > as I can see the patch does not work correctly without these changes. > Am I missing something? regards, -- Kyotaro Horiguchi NTT Open Source Software Center
pgsql-hackers by date: