Re: alternative model for handling locking in parallel groups - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: alternative model for handling locking in parallel groups
Date
Msg-id CAA4eK1JFEE3fLyTNRwKy=tsVU2gBqh3z3Xmd=S76uUmGp9PZLw@mail.gmail.com
Whole thread Raw
In response to Re: alternative model for handling locking in parallel groups  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Wed, Nov 19, 2014 at 8:56 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Tue, Nov 18, 2014 at 8:53 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> >> After thinking about these cases for a bit, I came up with a new
> >> possible approach to this problem.  Suppose that, at the beginning of
> >> parallelism, when we decide to start up workers, we grant all of the
> >> locks already held by the master to each worker (ignoring the normal
> >> rules for lock conflicts).  Thereafter, we do everything the same as
> >> now, with no changes to the deadlock detector.  That allows the lock
> >> conflicts to happen normally in the first two cases above, while
> >> preventing the unwanted lock conflicts in the second two cases.
> >
> > Here I think we have to consider how to pass the information about
> > all the locks held by master to worker backends.  Also I think assuming
> > we have such an information available, still it will be considerable work
> > to grant locks considering the number of locks we acquire [1] (based
> > on Simon's analysis) and the additional memory they require.  Finally
> > I think deadlock detector work might also be increased as there will be
> > now more procs to visit.
> >
> > In general, I think this scheme will work, but I am not sure it is worth
> > at this stage (considering initial goal to make parallel workers will be
> > used for read operations).
>
> I think this scheme might be quite easy to implement - we just need
> the user backend to iterate over the locks it holds and serialize them
> (similar to what pg_background does for GUCs) and store that in the
> DSM; the parallel worker calls some function on that serialized
> representation to put all those locks back in place.  
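For illustration, the serialize-and-restore idea could look roughly like the standalone sketch below.  The struct and function names are hypothetical stand-ins (the real code would walk LockMethodLocalHash and write into a DSM segment rather than taking an array and a plain buffer):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for a LOCKTAG plus the lock mode held. */
typedef struct SerializedLock
{
    uint32_t dbOid;       /* database OID */
    uint32_t relOid;      /* relation OID */
    int      lockmode;    /* e.g. AccessShareLock */
} SerializedLock;

/*
 * Master side: copy the held locks into a flat buffer (a stand-in for
 * the DSM segment).  Returns the number of bytes written.
 */
static size_t
serialize_locks(const SerializedLock *locks, int nlocks, char *buf)
{
    memcpy(buf, &nlocks, sizeof(int));
    memcpy(buf + sizeof(int), locks, nlocks * sizeof(SerializedLock));
    return sizeof(int) + nlocks * sizeof(SerializedLock);
}

/* Worker side: read the count and the entries back out of the buffer. */
static int
deserialize_locks(const char *buf, SerializedLock *out, int maxlocks)
{
    int nlocks;

    memcpy(&nlocks, buf, sizeof(int));
    assert(nlocks <= maxlocks);
    memcpy(out, buf + sizeof(int), nlocks * sizeof(SerializedLock));
    return nlocks;
}
```

This mirrors the flat key/value serialization pg_background uses for GUCs: a count followed by fixed-size records is enough, since lock tags are fixed-length.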

Today, I have thought about this scheme a bit more and it seems that
we can go through the local lock hash table (LockMethodLocalHash) to
get the locks the user backend holds and then, using the local lock tag,
form a similar hash table in the worker.  Apart from that, I think we
need to get the fast-path lock information from PGPROC and restore the
same in the worker (before restoring, we need to ensure that nobody
else has moved that fast-path lock to the shared table; we can do that
by checking the available fast-path information against the user
backend's fast-path lock information).  For non-fast-path locks, I
think we can search the shared hash table (LockMethodLockHash) to
obtain the lock information, assign the same to the local lock, and
finally establish the proclock entry for the parallel worker in
LockMethodProcLockHash.
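The worker-side decision described above can be sketched in standalone form as below.  The names are hypothetical stand-ins (a real lock tag is more than a relation OID, and the fast-path slot count comes from FP_LOCK_SLOTS_PER_BACKEND); the sketch only models which restore path a given lock takes:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FP_LOCK_SLOTS 16   /* cf. FP_LOCK_SLOTS_PER_BACKEND */

/* Stand-in for the fast-path lock state kept in the master's PGPROC. */
typedef struct FastPathInfo
{
    uint32_t relOid[FP_LOCK_SLOTS];
    bool     used[FP_LOCK_SLOTS];
} FastPathInfo;

typedef enum
{
    RESTORED_FASTPATH,     /* copy the master's fast-path slot */
    RESTORED_SHARED        /* look up LockMethodLockHash instead */
} RestorePath;

/*
 * Decide how the worker restores one lock: if the tag is still present
 * in the user backend's fast-path slots (i.e. nobody has transferred it
 * to the shared hash table in the meantime), restore it as a fast-path
 * lock; otherwise fall back to the shared-table lookup and build the
 * local lock and proclock entries from there.
 */
static RestorePath
restore_one_lock(const FastPathInfo *masterFp, uint32_t relOid)
{
    for (int i = 0; i < FP_LOCK_SLOTS; i++)
    {
        if (masterFp->used[i] && masterFp->relOid[i] == relOid)
            return RESTORED_FASTPATH;
    }
    return RESTORED_SHARED;
}
```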

In the above analysis, one point which slightly worries me is that for
non-fast-path locks we need to obtain the partition lock, which can
become a bottleneck: all parallel workers need to obtain the same one,
and it can very well contend with unrelated backends as well.
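The contention point follows from how the partition is chosen: every backend restoring the same lock tag hashes to the same partition, so all workers of a group serialize on one of the (by default 16) partition locks.  A standalone sketch of the mapping, with a toy hash function standing in for LockTagHashCode and the modulo step mirroring LockHashPartition():

```c
#include <assert.h>
#include <stdint.h>

#define LOG2_NUM_LOCK_PARTITIONS 4
#define NUM_LOCK_PARTITIONS (1 << LOG2_NUM_LOCK_PARTITIONS)  /* 16 */

/* Toy hash of a (dbOid, relOid) lock tag; PostgreSQL uses tag_hash. */
static uint32_t
lock_tag_hash(uint32_t dbOid, uint32_t relOid)
{
    return dbOid * 2654435761u ^ relOid * 40503u;
}

/*
 * Partition index for a lock tag's hash code.  Because this depends
 * only on the tag, every worker restoring the same lock lands on the
 * same partition lock, serializing the restore step.
 */
static int
lock_partition(uint32_t hashcode)
{
    return (int) (hashcode % NUM_LOCK_PARTITIONS);
}
```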

> The actual
> deadlock detector should need few or no changes, which seems like a
> major advantage in comparison with the approach discussed on the other
> thread.
>

Okay, but there is one downside to this approach as well: the
additional overhead of acquiring a larger number of locks, which is
more severe because it has to be paid for each worker on every
parallel operation, whereas with the other approach there is no such
overhead.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
