On Wed, Jul 25, 2018 at 8:43 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Wed, Jul 25, 2018 at 8:29 AM, Thomas Munro
> <thomas.munro@enterprisedb.com> wrote:
>> On Wed, Jul 25, 2018 at 2:08 PM, Andres Freund <andres@anarazel.de> wrote:
>>> On 2018-07-25 14:04:11 +1200, Thomas Munro wrote:
>>>> Ok, I see it:
>>>>
>>>> /* check for interrupts while we're not holding any buffer lock */
>>>> CHECK_FOR_INTERRUPTS();
>>>> /* step right one page */
>>>> so->currPos.buf = _bt_getbuf(rel, blkno, BT_READ);
>>>> ...
>>>> /* nope, keep going */
>>>> if (scan->parallel_scan != NULL)
>>>> {
>>>> status = _bt_parallel_seize(scan, &blkno);
>>>>
>>>> That leads to a condition variable wait, while we still hold that
>>>> buffer lock. That prevents interrupts. Oops.
>>>
>
> Well spotted. I think here we can release the current page lock
> before calling _bt_parallel_seize as we don't need it to get the next
> page.
>
I have written a patch along the above lines and manually verified (by
reproducing the issue in a debugger) that it fixes the problem. Thomas,
Victor, could you check whether the attached patch also fixes it for
you?
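
For anyone reading along without the patch applied, here is a toy model
of the ordering problem. It is not the patch and does not use any real
PostgreSQL API: holdoff_count, toy_lock_buffer, toy_unlock_buffer and
toy_seize_is_interruptible are made-up stand-ins, based only on the
assumption that holding a buffer lock holds off interrupts and that the
wait inside _bt_parallel_seize can be interrupted only when nothing is
held.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in: number of locks currently holding off interrupts. */
int holdoff_count = 0;

/* Hypothetical stand-ins for taking and releasing a buffer lock. */
void toy_lock_buffer(void)   { holdoff_count++; }
void toy_unlock_buffer(void) { holdoff_count--; }

/*
 * Hypothetical stand-in for the wait inside _bt_parallel_seize: reports
 * whether an interrupt could be serviced if we had to wait right now.
 */
bool toy_seize_is_interruptible(void)
{
    return holdoff_count == 0;
}

/* Problem ordering: seize the next page while still holding the lock. */
bool step_right_holding_lock(void)
{
    bool interruptible;

    toy_lock_buffer();
    interruptible = toy_seize_is_interruptible();   /* waits with lock held */
    toy_unlock_buffer();
    return interruptible;
}

/* Proposed ordering: release the current page first, then seize. */
bool step_right_release_first(void)
{
    toy_lock_buffer();
    /* ... done with the current page ... */
    toy_unlock_buffer();
    return toy_seize_is_interruptible();            /* waits with nothing held */
}

/* Small self-check of the two orderings. */
void toy_self_check(void)
{
    assert(!step_right_holding_lock());
    assert(step_right_release_first());
    assert(holdoff_count == 0);
}
```

Calling toy_self_check() from a main() exercises both orderings; only
the second leaves the wait interruptible, which is the point of
releasing the page lock before seizing the next page.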
--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com