Re: pg_dump needs SELECT privileges on irrelevant extension table - Mailing list pgsql-hackers
From | Andrew Dunstan |
---|---|
Subject | Re: pg_dump needs SELECT privileges on irrelevant extension table |
Date | |
Msg-id | a185f66f-1edc-426e-41a4-091dad230c75@dunslane.net Whole thread Raw |
In response to | Re: pg_dump needs SELECT privileges on irrelevant extension table (Jacob Champion <jchampion@timescale.com>) |
Responses |
Re: pg_dump needs SELECT privileges on irrelevant extension table
|
List | pgsql-hackers |
On 2023-06-29 Th 12:24, Jacob Champion wrote:
On 3/20/23 10:43, Tom Lane wrote:I'd be more willing to consider the proposed patch if it weren't such a hack --- as you say, it doesn't fix the problem when the table has policies, so it's hardly a general-purpose solution. I fear that it's also fairly expensive: adding sub-selects to the query we must do before we can lock any tables is not appetizing, because making that window wider adds to the risk of deadlocks, dump failures, etc.(moving to -hackers and July CF) = Recap for Potential Reviewers = The timescaledb extension has an internal table that's owned by the superuser. It's not dumpable, and other users can only access its contents through a filtered view. For our cloud deployments, we additionally have that common trope where the most privileged users aren't actually superusers, but we still want them to be able to perform a subset of maintenance tasks, including pg_dumping their data. When cloud users try to dump that data, pg_dump sees that this internal table is an extension member and plans to dump ACLs, security labels, and RLS policies for it. (This behavior cannot be overridden with --exclude-table. pg_dump ignores that flag for extension members.) Dumping policies requires the use of pg_get_expr() on the backend; this, in turn, requires a lock on the table with ACCESS SHARE. So pg_dump tries to lock a table, with no policies, that it's not going to dump the schema or data for anyway, and it fails because our users don't have (and shouldn't need) SELECT access to it. For an example of this in action, I've attached a test case in v2-0001. = Proposal = Since this is affecting users on released Postgres versions, my end goal is to find a fix that's backportable. This situation looks very similar to [1], where non-superusers couldn't perform a dump because we were eagerly grabbing table locks to read (non-existent) ACLs. But that was solved with the realization that ACLs don't need locks anyway, which is unfortunately not applicable to policies. My initial patch to -bugs was a riff on a related performance fix [2], which figured out which tables had interesting ACLs and skipped that part if nothing was found. I added the same kind of subselect for RLS policies as well, but that had nasty corner cases where it would perform terribly, as Tom alluded to above. (In a cluster of 200k tables, where one single table had 10M policies, the query ground to a halt.) So v2-0002 is instead inspired by Tom's rewrite of that ACL dump logic [3]. It scans pg_policy separately, stores the tables it finds into the catalog map on the client side, and then again skips the policy dump (and therefore the lock) if no policies exist. The performance hit now scales with the size of pg_policy alone. This is a bit more invasive than the subselect, but hopefully still straightforward enough to be applicable to the back branches' old catalog map strategy. It's still not a general-purpose fix, as Tom pointed out above, but that was true of the discussion in [1] as well, so I'm optimistic. WDYT? --Jacob [1] https://postgr.es/m/CAGPqQf3Uzo-yU1suYyoZR83h6QTxXxkGTtEyeMV7EAVBqn%3DPcQ%40mail.gmail.com [2] https://git.postgresql.org/cgit/postgresql.git/commit/?id=5d589993 [3] https://git.postgresql.org/cgit/postgresql.git/commit/?id=0c9d8442
Seems reasonable at first glance. Isn't it going to save some work anyway later on, so the performance hit could end up negative?
cheers
andrew
-- Andrew Dunstan EDB: https://www.enterprisedb.com
pgsql-hackers by date: