On Wed, Aug 26, 2015 at 4:18 PM, Simon Riggs <
simon@2ndquadrant.com> wrote:
>
> On 26 August 2015 at 11:40, Amit Kapila <
amit.kapila16@gmail.com> wrote:
>>
>> On Tue, Aug 25, 2015 at 2:21 PM, Simon Riggs <
simon@2ndquadrant.com> wrote:
>>>
>>> On 22 August 2015 at 15:14, Andres Freund <
andres@anarazel.de> wrote:
>>
>>
>>>>
>>>> TransactionIdSetPageStatus() calls TransactionIdSetStatusBit(), which
>>>> writes an 8 byte variable (the lsn). That's not safe.
>>>
>>>
>>> Agreed, thanks for spotting that.
>>>
>>> I see this as the biggest problem to overcome with this patch.
>>
>>
>> How about using 64bit atomics or spinlock to protect this variable?
>
>
> Spinlock is out IMHO because this happens on every clog lookup. So it must be an atomic read.
>
Agreed, however using atomics is still an option, yet another way could
be before updating group_lsn, check if we already have CLogControlLock
in Exclusive mode then update it, else release the lock, re-acquire in
Exclusive mode and update the variable. The first thing that comes to mind
with this idea is that it will be less performant, yes thats true, but it will be
only done for asynchronous commits (mode which is generally not recommended
for production-use) and that too not on every commit, so may be the impact is
not that high. I have updated the patch (attached with mail) to show you what
I have in mind.
Another point about the latest patch:
+ (write_ok ||
+ shared->page_status[slotno] != SLRU_PAGE_WRITE_IN_PROGRESS))
Do you think that with new locking protocol as proposed in this
patch, it is safe to return if page status is SLRU_PAGE_WRITE_IN_PROGRESS
even if write_ok is true?
I think the case where it can cause problem is during SlruInternalWritePage()
where it performs below actions:
1. first marks the status of page as SLRU_PAGE_WRITE_IN_PROGRESS.
2. then take buffer lock in Exclusive mode.
3. release control lock.
4. perform the write
5. re-acquire the control lock in Exclusive mode.
Now consider another client which has to update the transaction status:
1. Control lock in Shared mode.
2. Get the slot
3. Acquire the buffer lock in Exclusive mode
Now consider client which has to update the transaction status performs
its step-1 after step-3 of writer, if that happens, then that could lead to
deadlock because writer will wait for client to release control lock and
client will wait for writer to release buffer lock.