Re: alternative model for handling locking in parallel groups - Mailing list pgsql-hackers

From: Robert Haas
Subject: Re: alternative model for handling locking in parallel groups
Msg-id: CA+TgmoYFu6bhoLiG6zNhhnoxK9B4uEcb02SqPoqaYc7HhFHcyg@mail.gmail.com
In response to: Re: alternative model for handling locking in parallel groups (Andres Freund <andres@2ndquadrant.com>)
List: pgsql-hackers
On Fri, Nov 14, 2014 at 12:03 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> Note that you'd definitely not want to do this naively - currently
> there are assumptions baked into the vacuum code that only one backend
> is doing parts of it.
>
> I think there's

Did something you intended get left out here?

>> 4. Parallel query on a locked relation.  Parallel query should work on
>> a table created in the current transaction, or one explicitly locked
>> by user action.  It's not acceptable for that to just randomly
>> deadlock, and skipping parallelism altogether, while it'd probably be
>> acceptable for a first version, is not going to be a good long-term
>> solution.
>
> FWIW, I think it's perfectly acceptable to refuse to work in parallel in
> that scenario. And not just for now.

I don't agree with that, but my point is that fixing it so it works
is probably no more work than detecting that it isn't going to work,
whether the specific proposal in this email pans out or not.

> The biggest argument against that I can see is parallel index creation.
>
>> It also sounds buggy and fragile for the query planner to
>> try to guess whether the lock requests in the parallel workers will
>> succeed or fail when issued.
>
> I don't know. Checking whether we hold a self exclusive lock on that
> relation doesn't sound very problematic to me.

That seems like a gross oversimplification of what we need to check.
For example, suppose we want to do a parallel scan.  We grab
AccessShareLock.  Now, another backend waits for AccessExclusiveLock.
We start workers, who all try to get AccessShareLock.  They wait
behind the AccessExclusiveLock, while, outside the view of the current
lock manager, we wait for them.  No process in our group acquired more
than AccessShareLock, yet we've got problems.  We need a simple and
elegant way to avoid this kind of situation.
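
To spell out the ordering, here's a minimal C-style sketch.
LockAcquire() and its signature are real, but the
parallel-infrastructure helpers (LaunchParallelWorkers,
WaitForParallelWorkersToFinish) are hypothetical names, since none of
that code exists yet:

    /* Leader: */
    LockAcquire(&reltag, AccessShareLock, false, false); /* granted */
    LaunchParallelWorkers(pcxt);            /* hypothetical */
    WaitForParallelWorkersToFinish(pcxt);   /* leader now waits on the
                                             * workers, invisibly to the
                                             * deadlock detector */

    /* Meanwhile, in some other backend: */
    LockAcquire(&reltag, AccessExclusiveLock, false, false);
                                            /* queues behind the leader */

    /* In each worker: */
    LockAcquire(&reltag, AccessShareLock, false, false);
                                            /* queues behind the exclusive
                                             * waiter; nobody ever wakes up */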

>> After thinking about these cases for a bit, I came up with a new
>> possible approach to this problem.  Suppose that, at the beginning of
>> parallelism, when we decide to start up workers, we grant all of the
>> locks already held by the master to each worker (ignoring the normal
>> rules for lock conflicts).  Thereafter, we do everything the same as
>> now, with no changes to the deadlock detector.  That allows the lock
>> conflicts to happen normally in the first two cases above, while
>> preventing the unwanted lock conflicts in the second two cases.
>
> I don't think that's safe enough. There's e.g. a reason why ANALYZE
> requires SUE lock. It'd definitely not be safe to simply grant the
> worker a SUE lock, just because the master backend already analyzed it
> or something like that. You could end up with the master and worker
> backends ANALYZEing the same relation.

Well, in the first version of this, I expect to prohibit parallel
workers from doing any DML or DDL whatsoever - they will be strictly
read-only.  So you won't have that problem.  Now, eventually, we might
relax that, but I would expect that a prohibition on the workers
starting new utility commands while in parallel mode wouldn't be very
high on anyone's list of restrictions to relax.
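
Just so we're talking about the same thing, here's a rough sketch of
what I have in mind for worker startup.  Every name in it
(AdoptLeaderLocks, GetLeaderHeldLocks, GrantLockUnconditionally,
LeaderLock) is invented for illustration; the point is only the shape
of the logic:

    /*
     * At worker startup, mirror every lock the leader already holds,
     * ignoring the normal conflict rules.  Locking against *other*
     * backends behaves normally from here on.
     */
    static void
    AdoptLeaderLocks(void)
    {
        List       *locks = GetLeaderHeldLocks();   /* hypothetical */
        ListCell   *lc;

        foreach(lc, locks)
        {
            LeaderLock *ll = (LeaderLock *) lfirst(lc);

            GrantLockUnconditionally(&ll->tag, ll->mode);  /* hypothetical */
        }
    }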

> That said, I can definitely see use for an infrastructure where we
> explicitly can grant another backend a lock that'd conflict with one
> we're already holding. But I think it'll need to be more explicit than
> just granting all currently held locks at the "highest" current
> level. And I think it's not necessarily going to be granting them all
> the locks at their current levels.
>
> What I'm thinking of is basically to add a step to execMain.c:InitPlan()
> that goes through the locks and grants the child process all the locks
> that are required for the statement to run - but not any preexisting
> higher-level locks.

This doesn't actually solve the problem, because we can be
incidentally holding locks on other relations - system catalogs, in
particular - that will cause the child processes to fail.
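
To make that concrete, the InitPlan() step you're describing would
presumably push down only the range-table locks, something like this
sketch (GrantLockToWorker and rellockmode are invented here for
illustration):

    /* Push down the locks the statement itself needs... */
    foreach(lc, plannedstmt->rtable)
    {
        RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc);

        if (rte->rtekind == RTE_RELATION)
            GrantLockToWorker(worker, rte->relid, rte->rellockmode);
    }
    /*
     * ...but the leader may also hold locks the plan tree never
     * mentions - on pg_class, pg_attribute, toast tables, and so on,
     * taken earlier in the same transaction.  A worker doing catalog
     * lookups can still block on those.
     */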

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


