Thread: Rethinking LOCK TABLE's behavior on views
The problems discussed in bug #16703 [1] show that pg_dump needs a version of LOCK TABLE that behaves differently for views than what we have now. Since v11, LOCK TABLE on a view recurses to all tables and views named in the view, and it does so using the view owner's permissions, meaning that a view that would have permissions failures if executed will also have permissions failures when locked. That's probably fine for ordinary usage, but it's disastrous for pg_dump --- even a superuser can't lock such a view. Moreover, pg_dump doesn't really need the recursive behavior. It just needs the view's definition to hold still; and in any case, a typical pg_dump run would have independently acquired locks on all the other relations anyway. The recursion is buying us nothing, except perhaps an increased risk of deadlocks against concurrent DDL operations. (I'm not quite sure if that's significant, given that pg_dump pays no attention to the order in which it locks things. But it sure as heck isn't *decreasing* the risk; and it's a behavior that we could not hope to improve with more smarts about pg_dump's lock ordering.) So we need a way to turn that off. What I proposed in that thread was > (1) Make pg_dump use LOCK TABLE ONLY, not LOCK TABLE. > (2) Make LOCK TABLE ONLY on a view not recurse to the view's dependencies. which would each be quite trivial to do. An objection to this approach is that ONLY typically controls recursion to a table's inheritance or partitioning children, which is a different animal from recursion to a view's dependencies. That argument would lead to wanting some other syntax to control this. I do not find that argument compelling enough to justify making pg_dump deal with two different commands depending on the relation's relkind, but it is a plausible concern. Closely related to this is whether pg_dump ought to be using ONLY for locking regular tables too. I tend to think that it should be, again on the grounds that any child tables we may be interested in will get locked separately, so that we're not doing anything by recursing except expending extra cycles and perhaps increasing the chance of a deadlock. A completely different approach we could consider is to weaken the permissions requirements for LOCK on a view, say "allow it if either the calling user or the view owner has the needed permission". This seems generally pretty messy and so I don't much like it, but we should consider as many solutions as we can think of. Thoughts? regards, tom lane [1] https://www.postgresql.org/message-id/flat/16703-e348f58aab3cf6cc%40postgresql.org
On Sat, Nov 07, 2020 at 11:57:20AM -0500, Tom Lane wrote: > The problems discussed in bug #16703 [1] show that pg_dump needs a > version of LOCK TABLE that behaves differently for views than > what we have now. Since v11, LOCK TABLE on a view recurses to all > tables and views named in the view, and it does so using the view > owner's permissions, meaning that a view that would have permissions > failures if executed will also have permissions failures when locked. > That's probably fine for ordinary usage, but it's disastrous for > pg_dump --- even a superuser can't lock such a view. > > Moreover, pg_dump doesn't really need the recursive behavior. It just > needs the view's definition to hold still; and in any case, a typical > pg_dump run would have independently acquired locks on all the other > relations anyway. The recursion is buying us nothing, except perhaps > an increased risk of deadlocks against concurrent DDL operations. The getTables() locking aims to take the locks that will be taken later. That avoids failing after expensive work. For views, the later lock-taker is pg_get_viewdef(), which locks more than just the view but less than[2] LOCK TABLE. Recursion buys us more than nothing for "pg_dump --table=viewname", so abandoning recursion unconditionally is a step in the wrong direction. I don't expect --table to be as excellent as complete dumps, but a change that makes it worse does lose points. I want to keep the recursion. > (I'm not quite sure if that's significant, given that pg_dump pays > no attention to the order in which it locks things. But it sure as > heck isn't *decreasing* the risk; and it's a behavior that we could > not hope to improve with more smarts about pg_dump's lock ordering.) Reordering to avoid deadlocks would be best-effort, so it's fine not to have full control over the order. > Closely related to this is whether pg_dump ought to be using ONLY for > locking regular tables too. I tend to think that it should be, again > on the grounds that any child tables we may be interested in will get > locked separately, so that we're not doing anything by recursing except > expending extra cycles and perhaps increasing the chance of a deadlock. Agreed. "pg_dump --table=inheritance_parent" never queries inheritance children, so it's nice not to lock them. > A completely different approach we could consider is to weaken the > permissions requirements for LOCK on a view, say "allow it if either > the calling user or the view owner has the needed permission". This > seems generally pretty messy and so I don't much like it, but we > should consider as many solutions as we can think of. This is the best of what you've listed by a strong margin, and I don't know of better options you've not listed. +1 for it. Does it work for you? I think the mess arises from LOCK TABLE serving "get locks sufficient for $ACTIONS" as a family of use cases. For views only, different $ACTIONS want different behavior. $ACTIONS==SELECT wants today's behavior; pg_get_viewdef() wants shallower recursion and caller permissions; DROP VIEW wants no recursion. > [1] https://www.postgresql.org/message-id/flat/16703-e348f58aab3cf6cc%40postgresql.org [2] For example, pg_get_viewdef('pg_user') locks pg_shadow, but "LOCK TABLE pg_user" additionally locks pg_authid and pg_db_role_setting.
On 2020-Nov-07, Noah Misch wrote: > On Sat, Nov 07, 2020 at 11:57:20AM -0500, Tom Lane wrote: > > A completely different approach we could consider is to weaken the > > permissions requirements for LOCK on a view, say "allow it if either > > the calling user or the view owner has the needed permission". This > > seems generally pretty messy and so I don't much like it, but we > > should consider as many solutions as we can think of. > > This is the best of what you've listed by a strong margin, and I don't know of > better options you've not listed. +1 for it. Does it work for you? It does sound attractive from a user complexity perspective, even if it does sound messy form an implementation perspective. > I think > the mess arises from LOCK TABLE serving "get locks sufficient for $ACTIONS" as > a family of use cases. For views only, different $ACTIONS want different > behavior. $ACTIONS==SELECT wants today's behavior; pg_get_viewdef() wants > shallower recursion and caller permissions; DROP VIEW wants no recursion. Maybe we can tackle this problem directly, by adding a clause to LOCK TABLE to indicate a purpose for the lock that the server can use to determine the level of recursion. For example LOCK TABLE xyz IN <mode> FOR <purpose> where <purpose> can be READ, DROP, DEFINE. (For back-patch purposes we could store the purpose in LockStmt->mode, which has more than enough unused bits).
On Mon, Nov 09, 2020 at 11:42:33AM -0300, Alvaro Herrera wrote: > On 2020-Nov-07, Noah Misch wrote: > > On Sat, Nov 07, 2020 at 11:57:20AM -0500, Tom Lane wrote: > > > A completely different approach we could consider is to weaken the > > > permissions requirements for LOCK on a view, say "allow it if either > > > the calling user or the view owner has the needed permission". This > > > seems generally pretty messy and so I don't much like it, but we > > > should consider as many solutions as we can think of. > > > > This is the best of what you've listed by a strong margin, and I don't know of > > better options you've not listed. +1 for it. Does it work for you? > > It does sound attractive from a user complexity perspective, even if it > does sound messy form an implementation perspective. > > > I think > > the mess arises from LOCK TABLE serving "get locks sufficient for $ACTIONS" as > > a family of use cases. For views only, different $ACTIONS want different > > behavior. $ACTIONS==SELECT wants today's behavior; pg_get_viewdef() wants > > shallower recursion and caller permissions; DROP VIEW wants no recursion. > > Maybe we can tackle this problem directly, by adding a clause to LOCK > TABLE to indicate a purpose for the lock that the server can use to > determine the level of recursion. For example > LOCK TABLE xyz IN <mode> FOR <purpose> > where <purpose> can be READ, DROP, DEFINE. Possible. Regrettably, we're not set up for it; running pg_get_viewdef() to completion is today's way to determine what it will lock. Each of these modes probably would have condensed copies of the operation they mimic, which I'd find sadder than locking somewhat more than pg_dump needs (via today's "LOCK TABLE viewname" behavior). Is it plausible to do without that duplication?