On Tue, Aug 2, 2016 at 11:19 AM, Noah Misch <noah@leadboat.com> wrote:
> On Sat, Jul 23, 2016 at 01:25:55PM +0530, Amit Kapila wrote:
>> On Mon, Jul 18, 2016 at 2:03 PM, Andres Freund <andres@anarazel.de> wrote:
>> > On 2016-07-18 10:02:52 +0530, Amit Kapila wrote:
>> >> Consider the below scenario.
>> >>
>> >> Vacuum
>> >> a. acquires a cleanup lock for page - 10
>> >> b. busy in checking visibility of tuples
>> >> --assume, here it takes some time and in the meantime Session-1
>> >> performs step (a) and (b) and start waiting in step- (c)
>> >> c. marks the page as all-visible (PageSetAllVisible)
>> >> d. unlockandrelease the buffer
>> >>
>> >> Session-1
>> >> a. In heap_lock_tuple(), readbuffer for page-10
>> >> b. check PageIsAllVisible(), found page is not all-visible, so didn't
>> >> acquire the visbilitymap_pin
>> >> c. LockBuffer in ExlusiveMode - here it will wait for vacuum to
>> >> release the lock
>> >> d. Got the lock, but now the page is marked as all-visible, so ideally
>> >> need to recheck the page and acquire the visibilitymap_pin
>> >
>> > So, I've tried pretty hard to reproduce that. While the theory above is
>> > sound, I believe the relevant code-path is essentially dead for SQL
>> > callable code, because we'll always hold a buffer pin before even
>> > entering heap_update/heap_lock_tuple.
>> >
>>
>> It is possible that we don't hold any buffer pin before entering
>> heap_update() and or heap_lock_tuple(). For heap_update(), it is
>> possible when it enters via simple_heap_update() path. For
>> heap_lock_tuple(), it is possible for ON CONFLICT DO Update statement
>> and may be others as well.
>
> This is currently listed as a 9.6 open item. Is it indeed a regression in
> 9.6, or do released versions have the same defect? If it is a 9.6 regression,
> do you happen to know which commit, or at least which feature, caused it?
>
Commit eca0f1db is the reason for this specific issue.
--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com