On Wed, Jul 25, 2018 at 8:29 AM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
> On Wed, Jul 25, 2018 at 2:08 PM, Andres Freund <andres@anarazel.de> wrote:
>> On 2018-07-25 14:04:11 +1200, Thomas Munro wrote:
>>> Ok, I see it:
>>>
>>> /* check for interrupts while we're not
>>> holding any buffer lock */
>>> CHECK_FOR_INTERRUPTS();
>>> /* step right one page */
>>> so->currPos.buf = _bt_getbuf(rel, blkno, BT_READ);
>>> ...
>>> /* nope, keep going */
>>> if (scan->parallel_scan != NULL)
>>> {
>>> status = _bt_parallel_seize(scan, &blkno);
>>>
>>> That leads to a condition variable wait, while we still hold that
>>> buffer lock. That prevents interrupts. Oops.
>>
Well spotted. I think we can release the lock on the current page
before calling _bt_parallel_seize, as we don't need it to get the next
page. The backward scan case already does this; in particular, I am
referring to the code below:
_bt_readnextpage()
{
..
    /*
     * For parallel scans, get the last page scanned as it is quite
     * possible that by the time we try to seize the scan, some other
     * worker has already advanced the scan to a different page. We
     * must continue based on the latest page scanned by any worker.
     */
    if (scan->parallel_scan != NULL)
    {
        _bt_relbuf(rel, so->currPos.buf);
        status = _bt_parallel_seize(scan, &blkno);
..
}
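To illustrate the ordering I have in mind for the forward scan, here is
a small standalone toy model, not a patch. Everything prefixed with
toy_ and both step_right_* functions are hypothetical stand-ins that I
made up for this mail; they only model "a buffer lock is held" and "the
seize call may sleep", and do not reflect the real nbtree code.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy state (hypothetical): number of buffer locks currently held. */
static int held_buffer_locks = 0;

/* Set when the toy seize call would sleep while a buffer lock is held. */
static bool seize_waited_with_lock = false;

/* Stand-in for _bt_getbuf(): take a lock on a page. */
static void toy_getbuf(void)
{
    held_buffer_locks++;
}

/* Stand-in for _bt_relbuf(): release the lock on a page. */
static void toy_relbuf(void)
{
    assert(held_buffer_locks > 0);
    held_buffer_locks--;
}

/*
 * Stand-in for _bt_parallel_seize(): it may wait on a condition
 * variable, so record whether a buffer lock was held at that point.
 */
static bool toy_parallel_seize(void)
{
    if (held_buffer_locks > 0)
        seize_waited_with_lock = true;
    return true;
}

/* Ordering as quoted above: seize while still holding the page lock. */
static bool step_right_buggy(void)
{
    seize_waited_with_lock = false;
    toy_getbuf();
    toy_parallel_seize();
    toy_relbuf();
    return seize_waited_with_lock;
}

/* Suggested ordering: release the page lock first, then seize. */
static bool step_right_fixed(void)
{
    seize_waited_with_lock = false;
    toy_getbuf();
    toy_relbuf();
    toy_parallel_seize();
    return seize_waited_with_lock;
}
```

In this model step_right_buggy() reports that the wait happened with a
lock held and step_right_fixed() reports that it did not. Whether the
real forward scan can safely drop the lock at that point is exactly
what still needs to be checked.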
This needs some more analysis. I will continue the analysis and
share my findings.

Thanks, Thomas, for pinging me off-list and including me here.
--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com