Re: group locking: incomplete patch, just for discussion - Mailing list pgsql-hackers

From: Robert Haas
Subject: Re: group locking: incomplete patch, just for discussion
Msg-id: CA+TgmoaxYGnbYHFOd_NCyJE7rr3ZLtopw=oRmi9V7Z+u=36+-w@mail.gmail.com
In response to: Re: group locking: incomplete patch, just for discussion (Andres Freund <andres@2ndquadrant.com>)
List: pgsql-hackers

On Fri, Oct 31, 2014 at 10:38 AM, Andres Freund <andres@2ndquadrant.com> wrote:
> I have serious doubts about the number of cases where it's correct to
> access relations in a second backend that are exclusively locked. Not so
> much when that happens for a user issued LOCK statement of course, but
> when the system has done so. Personally I think to stay sane we'll have
> to forbid access to anything normal other backends can't access either -
> otherwise things will get too hard to reason about.
>
> So just refusing parallelism as soon as anything has taken an access
> exclusive lock doesn't sound too bad to me.

That restriction seems onerous to me; for example, it would prevent a
parallel sort for CLUSTER or a parallel index rebuild for VACUUM FULL.
Those seem like worthy targets for parallelism, and maybe even early
targets, since I expect a lot of the real complexity here to be in the
query planner, and you can avoid most of that complexity when planning
DDL.

I also think it's unnecessary.  It's wrong to think of two parallel
backends that both want AccessExclusiveLock on a given target relation
as conflicting with each other; if the process were not running in
parallel, those lock requests would both have come from the same
process and both requests would have been granted.  Parallelism is
generally not about making different things happen; it's about
spreading the stuff that would happen anyway across multiple backends,
and things that would succeed if done in a single backend shouldn't
fail when spread across 2 or more without some compelling
justification.  The cases where it's actually unsafe for a table to be
accessed even by some other code within the same backend are handled
not by the lock manager, but by CheckTableNotInUse().  That call
should probably fail categorically if issued while parallelism is
active.
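
To make that concrete, here is a minimal, self-contained sketch of the
rule I have in mind (toy types and invented names, not the actual
patch): requests from members of the same lock group never conflict
with one another, while requests from outside the group still go
through the ordinary conflict table.

    #include <stdbool.h>
    #include <stdio.h>

    /* Toy model of group-aware conflict checking; all names invented. */
    typedef enum { ACCESS_SHARE = 0, ACCESS_EXCLUSIVE = 1 } LockMode;

    typedef struct Proc
    {
        int pid;
        int lockGroupLeader;    /* pid of group leader; == pid if leader */
    } Proc;

    /* AccessExclusiveLock conflicts with everything, including itself;
     * AccessShareLock conflicts only with AccessExclusiveLock. */
    static const bool conflicts[2][2] = {
        /* held:     ASL    AEL  */
        /* ASL */  { false, true },
        /* AEL */  { true,  true },
    };

    /* Same-group requests never conflict: had the query not been
     * parallel, both requests would have come from one backend and
     * both would have been granted. */
    static bool
    modes_conflict(Proc *requester, LockMode reqMode,
                   Proc *holder, LockMode heldMode)
    {
        if (requester->lockGroupLeader == holder->lockGroupLeader)
            return false;
        return conflicts[reqMode][heldMode];
    }

    int
    main(void)
    {
        Proc leader = { .pid = 100, .lockGroupLeader = 100 };
        Proc worker = { .pid = 101, .lockGroupLeader = 100 };
        Proc other  = { .pid = 200, .lockGroupLeader = 200 };

        /* A worker's AEL does not conflict with its own leader's AEL... */
        printf("%d\n", modes_conflict(&worker, ACCESS_EXCLUSIVE,
                                      &leader, ACCESS_EXCLUSIVE)); /* 0 */
        /* ...but an unrelated backend's request still does. */
        printf("%d\n", modes_conflict(&other, ACCESS_SHARE,
                                      &leader, ACCESS_EXCLUSIVE)); /* 1 */
        return 0;
    }

The real thing has to live inside the lock manager's shared data
structures, of course, but the property it needs to preserve is the one
shown here.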

>> >> 1. Turing's theorem being what it is, predicting what catalog tables
>> >> the child might lock is not necessarily simple.
>> >
>> > Well, there's really no need to be absolutely general here. We're only
>> > going to support a subset of functionality as parallelizable. And in
>> > that case we don't need anything complicated to acquire all locks.
>>
>> See, that's what I don't believe.  Even very simple things like
>> equality operators do a surprising amount of catalog locking.
>
> So what? I don't see catalog locking as something problematic? Maybe I'm
> missing something, but I don't see in which cases you would expect deadlocks or
> anything like it on catalog relations?

Suppose somebody fires off a parallel sort on a text column, or a
parallel sequential scan involving a qual of the form textcol = 'zog'.
We launch a bunch of workers to do comparisons; they do lookups
against pg_collation.  After some but not all of them have loaded the
collation information they need into their local cache, the DBA types
"cluster pg_collate".  It queues up for an AccessExclusiveLock.  The
remaining parallel workers join the wait queue waiting for their
AccessShareLock. The system is now deadlocked; the only solution is to
move the parallel workers ahead of the AccessExclusiveLock request,
but the deadlock detector won't do that unless it knows about the
relationship among the parallel workers.
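
To see why the detector needs that knowledge, look at the waits-for
graph.  Here's a toy model (hand-built edges and a naive DFS, not
deadlock.c): with only the edges the lock queues expose, there is no
cycle to find; add the edge that group membership implies - a worker
holding locks can't finish the parallel query until the workers stuck
in the queue do - and the cycle appears.

    #include <stdbool.h>
    #include <stdio.h>
    #include <string.h>

    /* Node 0: worker A, holds AccessShareLock on pg_collation.
     * Node 1: worker B, wants AccessShareLock, queued behind the AEL.
     * Node 2: DBA backend, waiting for AccessExclusiveLock. */
    #define N 3
    static bool edge[N][N];     /* edge[i][j]: i waits for j */

    static bool
    reachable(int from, int to, bool seen[N])
    {
        if (from == to)
            return true;
        seen[from] = true;
        for (int next = 0; next < N; next++)
            if (edge[from][next] && !seen[next] &&
                reachable(next, to, seen))
                return true;
        return false;
    }

    static bool
    has_cycle(void)
    {
        bool seen[N];

        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j++)
                if (edge[i][j])
                {
                    memset(seen, 0, sizeof seen);
                    if (reachable(j, i, seen))
                        return true;
                }
        return false;
    }

    int
    main(void)
    {
        /* Edges the lock queues already expose. */
        edge[2][0] = true;      /* DBA waits for worker A's held lock */
        edge[1][2] = true;      /* worker B queued behind the DBA */
        printf("without group edge: %d\n", has_cycle());    /* 0 */

        /* The edge only group-awareness reveals. */
        edge[0][1] = true;      /* worker A can't finish without B */
        printf("with group edge:    %d\n", has_cycle());    /* 1 */
        return 0;
    }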

It's possible to construct more obscure cases as well.  For example,
suppose a user (for reasons known only to himself and God) does LOCK
TABLE pg_opclass and then begins a sort of a column full of enums.
Meanwhile, another user tries to CLUSTER pg_enum.  This will deadlock
if, and only if, some of the enum OIDs are odd.  I mention this
example not because I think it's a particularly useful case but
because it illustrates just how robust the current system is: it will
catch that case, and break the deadlock somehow, and you as a
PostgreSQL backend hacker do not need to think about it.  The guy who
added pg_enum did not need to think about it.  It just works.  If we
decide that parallelism is allowed to break that guarantee, then every
patch that does parallel anything has got to worry about what
undetected deadlocks might result, and then argue about which of them
are obscure enough that we shouldn't care and which are not.  That
doesn't sound like a fun way to spend my time.

But, really, the first case is the one I'm more worried about: a bunch
of parallel backends all grab the same locks, but slightly separated
in time.  A strong locker joins the queue midway through and we're
hosed.  Obviously any system cache lookups whatsoever are going to be
separated in time like this, just because the world isn't synchronous.
The pg_enum case is an example of how the lookups could be more widely
spaced: each backend will scan pg_enum when it first hits an odd OID,
and that could happen much later for some than others.  But even a
small window is enough to render undetected deadlock a possibility,
and you will not convince me that "hope the race condition is narrow
enough in practice that this never happens" is a remotely sane
strategy.

Down the road, it's not hard to imagine the same issues cropping up
with user tables.  For example, the user defines a non-inlinable SQL
or PL/pgSQL function getname() that does SELECT name FROM other_table
WHERE id = $1 and returns the result.  They mark this function
parallel safe, or we infer that automatically.  Then they do SELECT *
FROM some_table WHERE somecol = 242 AND getname(othercol) ~
'cranberry'.  There is nothing whatsoever to make this query
non-parallelizable; indeed, it's probably an excellent candidate for
parallelism.  But if you don't have group locking then it is again
susceptible to the risk of undetected deadlock as soon as an unrelated
process tries to grab AccessExclusiveLock on other_table, because it
may be that some parallel backends already have AccessShareLock on it,
and that others will request it later, queueing up behind the
AccessExclusiveLock request.

Even if you think that all of these particular issues are somehow
avoidable or that they don't matter, I think it would be unwise to bet
on them being the only issues.  The lock manager looks like it's old,
ill-maintained code, but that's in no small part because it solved the
problem it aims at so thoroughly that few people have felt the need to
modify it very much in the last decade.  I think that's part of why
there's so much skepticism on this thread - we don't think about the
problems that the lock manager solves as being important problems not
because they truly aren't important, but because they are so
thoroughly solved.  Then every once in a while we complain about the
performance.

> I'm less worried about the additional branches than about increasing the
> size of the datastructures. They're already pretty huge.

I share that concern.  I aimed for a design which would be
memory-efficient and have low impact on the non-group-locking case
rather than a design which would be ideal for group locking.  The
patch could be made to handle large locking groups more efficiently by
changing the shared memory structure around; e.g. by replacing
PROCLOCK's holdMask and waitMask fields, now both of type LOCKMASK,
with int[MAX_LOCKMODES] arrays so that we can represent the entire group's
locks held and awaited on a single object with a single PROCLOCK.
That representation would be significantly more convenient than the
one I actually chose, but it would also use up a LOT more shared
memory and would penalize LockReleaseAll().  The design embodied by
the proposed patch adds one additional pointer per PROCLOCK, which is
still not great, but I haven't been able to think of a reasonable way
to do better.
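
For a rough sense of the arithmetic, here is a self-contained sketch of
the two layouts (the other PROCLOCK fields are elided, the field names
are partly invented, and the sizes below assume a typical 64-bit
platform):

    #include <stdio.h>

    typedef int LOCKMASK;       /* one bit per lock mode, as in lock.h */
    #define MAX_LOCKMODES 10    /* as in lock.h */

    /* Roughly the proposed representation: one bitmask per member,
     * plus the one extra pointer the patch adds. */
    typedef struct ProcLockCompact
    {
        LOCKMASK holdMask;      /* modes held */
        LOCKMASK waitMask;      /* modes awaited */
        void    *groupLeader;   /* the added per-PROCLOCK pointer */
    } ProcLockCompact;

    /* The representation not chosen: per-mode counters, so a single
     * PROCLOCK can represent every group member's holds and waits. */
    typedef struct ProcLockCounted
    {
        int holdCounts[MAX_LOCKMODES];
        int waitCounts[MAX_LOCKMODES];
    } ProcLockCounted;

    int
    main(void)
    {
        printf("compact: %zu bytes\n", sizeof(ProcLockCompact));  /* 16 */
        printf("counted: %zu bytes\n", sizeof(ProcLockCounted));  /* 80 */
        return 0;
    }

That's roughly five times the size per PROCLOCK, before you even count
the fields elided here.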

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


