Re: Function to know last log write timestamp - Mailing list pgsql-hackers
From | Fujii Masao |
---|---|
Subject | Re: Function to know last log write timestamp |
Date | |
Msg-id | CAHGQGwE17VD5k=afAA+duSOwnH=qqHzrSg2Xf13ydyggSavkzA@mail.gmail.com Whole thread Raw |
In response to | Re: Function to know last log write timestamp (Jim Nasby <jim@nasby.net>) |
Responses |
Re: Function to know last log write timestamp
|
List | pgsql-hackers |
On Thu, Aug 28, 2014 at 2:44 AM, Jim Nasby <jim@nasby.net> wrote: > On 8/27/14, 7:33 AM, Fujii Masao wrote: >> >> On Tue, Aug 19, 2014 at 1:07 AM, Robert Haas <robertmhaas@gmail.com> >> wrote: >>> >>> On Fri, Aug 15, 2014 at 7:17 AM, Fujii Masao <masao.fujii@gmail.com> >>> wrote: >>>> >>>> On Fri, Aug 15, 2014 at 3:40 AM, Andres Freund <andres@2ndquadrant.com> >>>> wrote: >>>>> >>>>> On 2014-08-14 14:37:22 -0400, Robert Haas wrote: >>>>>> >>>>>> On Thu, Aug 14, 2014 at 2:21 PM, Andres Freund >>>>>> <andres@2ndquadrant.com> wrote: >>>>>>> >>>>>>> On 2014-08-14 14:19:13 -0400, Robert Haas wrote: >>>>>>>> >>>>>>>> That's about the idea. However, what you've got there is actually >>>>>>>> unsafe, because shmem->counter++ is not an atomic operation. It >>>>>>>> reads >>>>>>>> the counter (possibly even as two separate 4-byte loads if the >>>>>>>> counter >>>>>>>> is an 8-byte value), increments it inside the CPU, and then writes >>>>>>>> the >>>>>>>> resulting value back to memory. If two backends do this >>>>>>>> concurrently, >>>>>>>> one of the updates might be lost. >>>>>>> >>>>>>> >>>>>>> All these are only written by one backend, so it should be safe. Note >>>>>>> that that coding pattern, just without memory barriers, is all over >>>>>>> pgstat.c >>>>>> >>>>>> >>>>>> Ah, OK. If there's a separate slot for each backend, I agree that >>>>>> it's safe. >>>>>> >>>>>> We should probably add barriers to pgstat.c, too. >>>>> >>>>> >>>>> Yea, definitely. I think this is rather borked on "weaker" >>>>> architectures. It's just that the consequences of an out of date/torn >>>>> value are rather low, so it's unlikely to be noticed. >>>>> >>>>> Imo we should encapsulate the changecount modifications/checks somehow >>>>> instead of repeating the barriers, Asserts, comments et al everywhere. >>>> >>>> >>>> So what about applying the attached patch first, which adds the macros >>>> to load and store the changecount with the memory barries, and changes >>>> pgstat.c use them. Maybe this patch needs to be back-patch to at least >>>> 9.4? >>>> >>>> After applying the patch, I will rebase the >>>> pg_last_xact_insert_timestamp >>>> patch and post it again. >>> >>> >>> That looks OK to me on a relatively-quick read-through. I was >>> initially a bit worried about this part: >>> >>> do >>> { >>> ! pgstat_increment_changecount_before(beentry); >>> } while ((beentry->st_changecount & 1) == 0); >>> >>> pgstat_increment_changecount_before is an increment followed by a >>> write barrier. This seemed like funny coding to me at first because >>> while-test isn't protected by any sort of barrier. But now I think >>> it's correct, because there's only one process that can possibly write >>> to that data, and that's the one that is making the test, and it had >>> certainly better see its own modifications in program order no matter >>> what. >>> >>> I wouldn't object to back-patching this to 9.4 if we were earlier in >>> the beta cycle, but at this point I'm more inclined to just put it in >>> 9.5. If we get an actual bug report about any of this, we can always >>> back-patch the fix at that time. But so far that seems mostly >>> hypothetical, so I think the less-risky course of action is to give >>> this a longer time to bake before it hits an official release. >> >> >> Sounds reasonable. So, barring any objection, I will apply the patch >> only to the master branch. > > > It's probably worth adding a comment explaining why it's safe to do this > without a barrier... s/without/with ? Theoretically it's not safe without a barrier on a machine with weak memory ordering. No? Regards, -- Fujii Masao
pgsql-hackers by date: