Re: group locking: incomplete patch, just for discussion - Mailing list pgsql-hackers

From Robert Haas
Subject Re: group locking: incomplete patch, just for discussion
Date
Msg-id CA+Tgmoavw2eQuEjZGdpHUsMiEJzHP=5ubnCi_rvX3FcvVupP1A@mail.gmail.com
In response to Re: group locking: incomplete patch, just for discussion  (Andres Freund <andres@2ndquadrant.com>)
Responses Re: group locking: incomplete patch, just for discussion  (Andres Freund <andres@2ndquadrant.com>)
List pgsql-hackers
On Sat, Nov 1, 2014 at 1:55 PM, Andres Freund <andres@2ndquadrant.com> wrote:
>> Where will those preexisting cache entries come from, exactly?  The
>> postmaster is forking the parallel worker, not the user backend.
>
> Several things:
> 1) The relcache init files load a fair bit
> 2) There's cache entries made just during startup while we look up user
>    information and such
> 3) There's lots of places you can hook into where it's perfectly legal
>    and expected that you access system caches. I don't think you can
>    reasonably define those away.
> 4) I'm pretty sure that one of the early next steps will be to reuse a
>    bgworker for multiple tasks. So there'll definitely be preexisting
>    cache entries.
>
> I'm pretty sure that it's impossible to define the problem away. We
> *might* be able to say that we'll just do an InvalidateSystemCaches() at
> the start of the parallelized task.

I see.  I haven't thought deeply about those things, and there could
indeed be problems there.

>> > What I have serious doubts about is 'coowning' locks. Especially if two
>> > backends normally wouldn't be able to both get that lock.
>>
>> Perhaps we should discuss that more.  To me it seems pretty obvious
>> that's both safe and important.
>
> I completely fail to see why you'd generally think it's safe for two
> backends to hold the same self-conflicting lock. Yes, within carefully
> restricted sections of code that might be doable. But I don't think you
> can fully enforce that. People *will* mark their own functions as
> parallel safe. And do stupid stuff. That shouldn't cause
> segfaults/exploits/low level data corruption/.. as long as it's done
> from !C functions.

I agree that people will mark their own functions as parallel-safe,
and I also agree that shouldn't cause segfaults, exploits, or
low-level data corruption.  Clearly, we'll need to impose a bunch of
restrictions to achieve that.  But if you put a LOCK TABLE statement
in an SQL function, it won't conflict with other locking requests made
by the same or another function elsewhere in the same transaction;
surely you don't want that behavior to be *different* in parallel
mode.  That would be a blatant, user-visible behavior difference
justified by ... nothing at all.  There's no hazard there.  Where you
start getting into crash/exploit/data corruption territory is when you
are talking about DDL operations that change the physical structure of
the table.  That's why we have stuff like CheckTableNotInUse() to
verify that, for example, there are no old cursors around that are
still expecting the old relfilenode and tuple descriptor to be valid.
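
The non-conflict rule described above can be illustrated with a toy
model.  This is only a sketch under stated assumptions, not PostgreSQL's
lock manager and not the patch under discussion: ToyLockTable,
try_acquire and group_id are hypothetical names, and only two of the
table-level lock modes are modeled.  The one idea it shows is that a
request is checked for conflicts only against locks held by *other*
groups, so a leader and its worker can both hold a self-conflicting
lock while an unrelated backend is still blocked.

```python
# Toy model: lock requests from the same "lock group" never conflict
# with each other. All names here are hypothetical illustrations.

ACCESS_SHARE = "AccessShare"
ACCESS_EXCLUSIVE = "AccessExclusive"

# For each requested mode, the set of held modes it conflicts with.
# ACCESS EXCLUSIVE conflicts with every mode, including itself.
CONFLICTS = {
    ACCESS_SHARE: {ACCESS_EXCLUSIVE},
    ACCESS_EXCLUSIVE: {ACCESS_SHARE, ACCESS_EXCLUSIVE},
}


class ToyLockTable:
    def __init__(self):
        # object name -> list of (group_id, mode) currently held
        self.held = {}

    def try_acquire(self, group_id, obj, mode):
        """Grant the lock unless a *different* group holds a conflicting mode."""
        for holder_group, held_mode in self.held.get(obj, []):
            if holder_group != group_id and held_mode in CONFLICTS[mode]:
                return False
        self.held.setdefault(obj, []).append((group_id, mode))
        return True


locks = ToyLockTable()
# Leader and worker share one group, as a single transaction would.
leader_ok = locks.try_acquire("group-1", "t", ACCESS_EXCLUSIVE)
worker_ok = locks.try_acquire("group-1", "t", ACCESS_EXCLUSIVE)
# An unrelated backend is still blocked by the exclusive lock.
other_ok = locks.try_acquire("group-2", "t", ACCESS_SHARE)
print(leader_ok, worker_ok, other_ok)
```

The sketch deliberately says nothing about the DDL hazard: a check in
the spirit of CheckTableNotInUse() would be a separate safeguard on top
of lock acquisition, and is not modeled here.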

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


