Re: Support Parallel Query Execution in Executor - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Support Parallel Query Execution in Executor
Msg-id 14259.1144619696@sss.pgh.pa.us
In response to Re: Support Parallel Query Execution in Executor  ("Luke Lonergan" <llonergan@greenplum.com>)
Responses Re: Support Parallel Query Execution in Executor
List pgsql-hackers
"Luke Lonergan" <llonergan@greenplum.com> writes:
> On 4/9/06 9:27 AM, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:
>> 2. There are some low-level assumptions that no one reads in pages of
>> a relation without having some kind of lock on the relation (consider
>> eg the case where the relation is being dropped).  A bgwriter-like
>> process wouldn't be able to hold lmgr locks, and we wouldn't really want
>> it to be thrashing the lmgr shared data structures for each read anyway.
>> So you'd have to design some interlock to guarantee that no backend
>> abandons a query (and releases its own lmgr locks) while an async read
>> request it made is still pending.  Ugh.

> Does this lead us right back to planning for the appropriate amount of
> readahead at plan time?  We could consider a "page range" lock at that point
> instead of locking each individual page.

No, you're missing my point entirely.  What's bothering me is the
prospect of a "bgreader" process taking actions that are only safe
because of a lock that is held by a different process --- changing the
granularity of that lock doesn't get you out of trouble.

Here's a detailed scenario:
1. Backend X reads page N of a table T, queues a request for N+1.
2. While processing page N, backend X gets an error and aborts
   its transaction, thereby dropping all its lmgr locks.
3. Backend Y executes a DROP or TRUNCATE on T, which it can
   now do because there's no lock held anywhere on T.  There
   are actually two interesting sub-phases of this:
   3a. Kill any shared buffers holding pages to be deleted.
   3b. Physically drop or truncate the OS file.
4. Bgreader tries to execute the pending read request.  Ooops.
 

If step 4 happens after 3b, the bgreader gets an error.  Maybe we could
kluge that to not cause any serious problems, but the nastier case is
where 4 happens between 3a and 3b --- the bgreader sees nothing wrong,
but it's now loaded a shared buffer that must *not* be there.
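
The two orderings of step 4 can be traced with a small toy model. This is only an illustrative sketch: the RaceState struct and the step functions are hypothetical names invented here, not anything in PostgreSQL, and the model deliberately ignores lmgr locks (which is exactly the situation after step 2).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model of the race; none of these names exist in PostgreSQL. */
typedef struct {
    bool file_exists;    /* OS file for table T is still on disk */
    bool buffer_loaded;  /* a shared buffer currently holds page N+1 of T */
    bool read_pending;   /* a queued async read request for page N+1 */
} RaceState;

/* Step 1: backend X queues the read-ahead request. */
static void step1_queue_read(RaceState *s) { s->read_pending = true; }

/* Step 3a: backend Y kills shared buffers holding pages of T. */
static void step3a_kill_buffers(RaceState *s) { s->buffer_loaded = false; }

/* Step 3b: backend Y physically drops or truncates the OS file. */
static void step3b_drop_file(RaceState *s) { s->file_exists = false; }

/* Step 4: the bgreader executes the pending request.
 * Returns false when the read fails because the file is already gone. */
static bool step4_bgreader_read(RaceState *s)
{
    if (!s->read_pending)
        return true;
    s->read_pending = false;
    if (!s->file_exists)
        return false;          /* step 4 after 3b: a visible error */
    s->buffer_loaded = true;   /* step 4 before 3b: read succeeds silently */
    return true;
}

/* A stale buffer is one that outlives the file it was read from. */
static bool has_stale_buffer(const RaceState *s)
{
    return s->buffer_loaded && !s->file_exists;
}
```

Running 1, 3a, 4, 3b leaves has_stale_buffer() true with no error reported anywhere, whereas 1, 3a, 3b, 4 makes step 4 fail visibly and leaves no stale buffer behind.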

Having thought more about this, there may be a solution possible using
tighter integration with the bufmgr.  The bufmgr already has a notion of
a buffer page being "read busy".  Maybe, rather than just pushing a read
request into a separate queue somewhere, the requestor has to assign a
shared buffer for the page it wants and put the buffer into read-busy
state, but then pass the request to perform the physical read off to
someone else.  The advantage of this is that there's state that step 3a
can see telling it that a conflicting read is pending, and it just needs
to wait for the read to finish before killing the buffer.
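
A rough sketch of that idea, under loud assumptions: the ToyBuffer pool, the three-state enum, and the function names below are all hypothetical and are not the real bufmgr data structures or API. The sketch is single-threaded, so "waiting" is modelled by the drop path reporting how many reads it still has to wait for.

```c
#include <assert.h>

/* Hypothetical toy buffer pool; not PostgreSQL's bufmgr. */
enum { NBUFFERS = 4 };

typedef enum { BUF_FREE, BUF_READ_BUSY, BUF_VALID } BufState;

typedef struct {
    BufState state;
    int      rel;   /* relation the page belongs to */
    int      page;  /* page number within the relation */
} ToyBuffer;

static ToyBuffer pool[NBUFFERS];  /* zero-initialized, so all BUF_FREE */

/* Requestor (a backend): assign a buffer and mark it read-busy *before*
 * handing the physical read to someone else.  Returns the buffer index,
 * or -1 if no buffer is free. */
static int request_async_read(int rel, int page)
{
    for (int i = 0; i < NBUFFERS; i++) {
        if (pool[i].state == BUF_FREE) {
            pool[i].state = BUF_READ_BUSY;
            pool[i].rel = rel;
            pool[i].page = page;
            return i;
        }
    }
    return -1;
}

/* Reader process: finish the physical read and mark the buffer valid.
 * The buffer must still be read-busy; the drop path never frees it early. */
static void complete_read(int buf)
{
    assert(pool[buf].state == BUF_READ_BUSY);
    pool[buf].state = BUF_VALID;
}

/* Step 3a: kill the buffers of a relation.  Read-busy buffers are left
 * alone; the return value is how many pending reads the caller must wait
 * for before calling this again and then going on to step 3b. */
static int drop_rel_buffers(int rel)
{
    int pending = 0;
    for (int i = 0; i < NBUFFERS; i++) {
        if (pool[i].state == BUF_FREE || pool[i].rel != rel)
            continue;
        if (pool[i].state == BUF_READ_BUSY)
            pending++;
        else
            pool[i].state = BUF_FREE;
    }
    return pending;
}
```

The point of the design is that the pending read is visible in shared buffer state rather than only in a private request queue, so the drop path has something to synchronize against even after the requesting backend has released its lmgr locks.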

Bottom line seems to be: just as the bgwriter is pretty intimately tied
to bufmgr, bgreaders would have to be as well.
        regards, tom lane

