Thread: Improve behavior of concurrent ANALYZE/VACUUM
Hi all, (In CC are the folks who have reviewed the first patch versions, Nathan and Horiguchi-san.) After TRUNCATE and REINDEX, here is the third and last thread I am spawning for the previous thread "Canceling authentication due to timeout aka Denial of Service Attack": https://www.postgresql.org/message-id/152512087100.19803.12733865831237526317%40wrigleys.postgresql.org And this time the discussion is about VACUUM/ANALYZE. In this case, we also check relation ownership after queuing for a lock, which can allow any user to potentially lock a relation which others could use, particularly with VACUUM FULL which needs an AEL (access exclusive lock). In the previous thread, we discussed a couple of approaches, but I was not happy with any of those, hence I have been spending more time in getting to a solution which has no user-facing changes, and still solves the problems folks have been complaining about, and the result is the patch attached. The patch changes a couple of things regarding ACL checks, by simply centralizing the ownership checks into a single routine used by both ANALYZE and VACUUM. This routine is then used in two more places for manual ANALYZE and VACUUM: - When specifying directly one or more relations in the command, in expand_vacuum_rel(). - When building the complete list of relations to work on in the case of a database-wide operation, in get_all_vacuum_rels(). analyze_rel() and vacuum_rel() have been using the same logic to check for relation ownership, so refactoring things into a single routine is a win in my opinion. While reviewing the code, I have of course noticed that analyze_rel() makes an effort to not produce a WARNING if both VACOPT_VACUUM and VACOPT_ANALYZE are specified in VacuumStmt->options, however we can never see that scenario as analyze_rel() never gets called at the same time as vacuum_rel() for a single relation. The patch attached includes tests which I have used to also check that correct error messages are produced for VACUUM, VACUUM ANALYZE and ANALYZE. Please note that like the previous one for TRUNCATE, I would no plans for a back-patch with the same arguments as previously. There are also serious bugs being worked on for REL_11_STABLE so I don't want to take any risk for this branch. Thoughts? -- Michael
Attachment
On Sun, Aug 12, 2018 at 10:21 PM, Michael Paquier <michael@paquier.xyz> wrote: > In the previous thread, we discussed a couple of approaches, but I was > not happy with any of those, hence I have been spending more time in > getting to a solution which has no user-facing changes, and still solves > the problems folks have been complaining about, and the result is the > patch attached. The patch changes a couple of things regarding ACL > checks, by simply centralizing the ownership checks into a single > routine used by both ANALYZE and VACUUM. This routine is then used in > two more places for manual ANALYZE and VACUUM: > - When specifying directly one or more relations in the command, in > expand_vacuum_rel(). > - When building the complete list of relations to work on in the case of > a database-wide operation, in get_all_vacuum_rels(). I feel like you're not being very clear about exactly what this new approach is. Sorry if I'm being dense. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Tue, Aug 14, 2018 at 03:26:29PM +0000, Robert Haas wrote: > I feel like you're not being very clear about exactly what this new > approach is. Sorry if I'm being dense. On HEAD, we check the ownership of the relation vacuumed or analyzed after taking a lock on it in respectively vacuum_rel() and analyze_rel(), where we already know the OID of the relation and there may be no RangeVar which we could use with RangeVarGetRelidExtended (like partitions). I don't think that we want to use again RangeVarGetRelidExtended once the relation OID is known anyway. So My proposal is to add more ownership checks when building the list of VacuumRelations, and skip the relations the user has no right to work on at an earlier stage. Looking at the code may be easier to understand than a comment, please remember that there are three code paths used to build the list of VacuumRelations (each one may be processed in its own transaction): 1) autovacuum, which specifies only one relation at a time with its OID, and we build the list in expand_vacuum_rel(), which finishes with a single element. 2) Manual VACUUM with a list of relation specified, where the list of elements is built in the second part of expand_vacuum_rel(), which is able to expand partitioned tables as well. 3) Manual VACUUM with no list specified, where the list of relations is built in get_all_vacuum_rels(). My proposal is to add two more ownership checks in 2) and 3), and also refactor the code so as we use a single routine for ANALYZE and VACUUM. This has the advantage of not making the code of ANALYZE and VACUUM diverge anymore for the existing ownership checks, and we still generate WARNINGs if a relation needs to be skipped. (Thinking about it, it could make sense to add an extra assert at the beginning of expand_vacuum_rel like I did in 6551f3d...) -- Michael
Attachment
On Tue, Aug 14, 2018 at 11:59 AM, Michael Paquier <michael@paquier.xyz> wrote: > On HEAD, we check the ownership of the relation vacuumed or analyzed > after taking a lock on it in respectively vacuum_rel() and > analyze_rel(), where we already know the OID of the relation and there > may be no RangeVar which we could use with RangeVarGetRelidExtended > (like partitions). I don't think that we want to use again > RangeVarGetRelidExtended once the relation OID is known anyway. We definitely don't want to use RangeVarGetRelidExtended more than necessary. It is important that we use that function only when necessary - that is, to look up names supplied by users - and it is also important that we look up each user-supplied name only once, lest we get different answers on different occasions, possibly introducing either outright security problems or at the least ludicrous behavior. In the case where we have an OID already, I think we could just perform a permissions test before locking the OID. It's true that permissions might be revoked after we test them and before the lock is acquired, but that doesn't seem terrible. The real point of all of this stuff is to keep users from locking objects which they never had any right to access, not to worry about what happens if permissions are concurrently revoked while we're getting the lock. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Thu, Aug 16, 2018 at 03:07:18PM -0400, Robert Haas wrote: > We definitely don't want to use RangeVarGetRelidExtended more than > necessary. It is important that we use that function only when > necessary - that is, to look up names supplied by users - and it is > also important that we look up each user-supplied name only once, lest > we get different answers on different occasions, possibly introducing > either outright security problems or at the least ludicrous behavior. Thanks, that matches my feelings about this stuff. > In the case where we have an OID already, I think we could just > perform a permissions test before locking the OID. It's true that > permissions might be revoked after we test them and before the lock is > acquired, but that doesn't seem terrible. The real point of all of > this stuff is to keep users from locking objects which they never had > any right to access, not to worry about what happens if permissions > are concurrently revoked while we're getting the lock. One thing is that neither pg_class_ownercheck nor pg_database_ownercheck are fail-safe, and would issue an ERROR when the relation does not exist, and this can happen when using multiple transactions for VACUUM FULL or such, so we cannot simply swap the owner checks before trying to lock the relation in vacuum_rel() and analyze_rel(). Or we invent new flavors of those routine able to handle missing relations, then swap the ACL checks to happen before the relations are locked. For VACUUM/ANALYZE, I tend to think that it is incorrect to include from the start in the list of relations to process all the ones a user is not an owner of, so my approach seems quite natural, at least to me. Each one of the two approaches has its good and bad sides. -- Michael
Attachment
On Mon, Aug 13, 2018 at 12:21:42AM +0200, Michael Paquier wrote: > The patch attached includes tests which I have used to also check that > correct error messages are produced for VACUUM, VACUUM ANALYZE and > ANALYZE. I have reworked the patch on this side, clarifying the use of the new common API for the logs. One thing I am wondering about is what do we want to do when VACUUM ANALYZE is used. As of HEAD, if vacuum_rel() stops, then analyze_rel() is never called, and the only log showing up to a non-owner user would be: skipping "foo" --- only superuser can vacuum it With this patch, things become perhaps more confusing by generating two WARNING log entries: skipping "foo" --- only superuser can vacuum it skipping "foo" --- only superuser can analyze it We could either combine both in a single message, or just generate the message for vacuum as HEAD does now. I have also added some simple regression tests triggering the skipping logs for shared catalogs, non-shared catalogs and non-owners. This could be a separate patch as well. Input is welcome. -- Michael
Attachment
Hi, Sorry for the delay! I looked through the latest patch. On 8/17/18, 1:43 AM, "Michael Paquier" <michael@paquier.xyz> wrote: > I have reworked the patch on this side, clarifying the use of the new > common API for the logs. One thing I am wondering about is what do we > want to do when VACUUM ANALYZE is used. As of HEAD, if vacuum_rel() > stops, then analyze_rel() is never called, and the only log showing up > to a non-owner user would be: > skipping "foo" --- only superuser can vacuum it > > With this patch, things become perhaps more confusing by generating two > WARNING log entries: > skipping "foo" --- only superuser can vacuum it > skipping "foo" --- only superuser can analyze it > > We could either combine both in a single message, or just generate the > message for vacuum as HEAD does now. I have also added some simple > regression tests triggering the skipping logs for shared catalogs, > non-shared catalogs and non-owners. This could be a separate patch as > well. I like the idea of emitting a separate WARNING for each for clarity on what operations are being skipped. However, I think it could be a bit confusing with the current wording. Perhaps something like "skipping vacuum of..." and "skipping analyze of..." would make things clearer. Another thing to keep in mind is how often only one of these messages will apply. IIUC the vast majority of VACUUM (ANALYZE) statements that need to emit such log statements would emit both. Plus, while VACUUM (ANALYZE) is clearly documented as performing both operations, I can easily see the argument that users may view it as one big command and that emitting multiple log entries could be a confusing change in behavior. In short, my vote would be to maintain the current behavior for now and to bring up any logging improvements separately. + /* + * Depending on the permission checks done while building the list + * of relations to work on, it could be possible that the list is + * empty, hence do nothing in this case. + */ + if (relations == NIL) + return; It might be better to let the command go through normally so that we don't miss any cleanup at the end (e.g. deleting vac_context). +/* + * Check if a given relation can be safely vacuumed or not. If the + * user is not the relation owner, issue a WARNING log message and return + * false to let the caller decide what to do with this relation. + */ +bool +vacuum_is_relation_owner(Oid relid, Form_pg_class reltuple, int options) +{ + Assert((options & (VACOPT_VACUUM | VACOPT_ANALYZE)) != 0); + + /* + * Check permissions. + * + * We allow the user to vacuum a table if he is superuser, the table + * owner, or the database owner (but in the latter case, only if it's not + * a shared relation). pg_class_ownercheck includes the superuser case. + * + * Note we choose to treat permissions failure as a WARNING and keep + * trying to vacuum the rest of the DB --- is this appropriate? + */ Do you think it's worth adding ANALYZE to these comments as well? + if (!(pg_class_ownercheck(relid, GetUserId()) || + (pg_database_ownercheck(MyDatabaseId, GetUserId()) && !reltuple->relisshared))) Returning true right away when the role does have permissions might be a nice way to save a level of indentation. + /* + * To check whether the relation is a partitioned table and its + * ownership, fetch its syscache entry. + */ + tuple = SearchSysCache1(RELOID, ObjectIdGetDatum(relid)); + if (!HeapTupleIsValid(tuple)) + elog(ERROR, "cache lookup failed for relation %u", relid); + classForm = (Form_pg_class) GETSTRUCT(tuple); + + /* check permissions of relation */ + if (!vacuum_is_relation_owner(relid, classForm, options)) + { + ReleaseSysCache(tuple); + + /* + * Release lock again with AccessShareLock -- see below for + * the reason why this lock is released. + */ + UnlockRelationOid(relid, AccessShareLock); + return vacrels; + } I think this actually changes the behavior for partitioned tables. Presently, we still go through and collect all the partitions in the vacrels list. With this change, we will skip collecting a table's partitions if the current role doesn't have the required permissions. Perhaps we should skip adding the current relation to vacrels if vacuum_is_relation_owner() returns false, and then we could go through the partitions and check permissions on those as well. Since we don't take any locks on the individual partitions yet, getting the relation name and calling pg_class_ownercheck() safely might be tricky, though. Nathan
On Mon, Aug 20, 2018 at 08:57:00PM +0000, Bossart, Nathan wrote: > Sorry for the delay! I looked through the latest patch. Thanks a lot for the review! > I like the idea of emitting a separate WARNING for each for clarity on > what operations are being skipped. However, I think it could be a bit > confusing with the current wording. Perhaps something like "skipping > vacuum of..." and "skipping analyze of..." would make things clearer. > Another thing to keep in mind is how often only one of these messages > will apply. IIUC the vast majority of VACUUM (ANALYZE) statements > that need to emit such log statements would emit both. Plus, while > VACUUM (ANALYZE) is clearly documented as performing both operations, > I can easily see the argument that users may view it as one big > command and that emitting multiple log entries could be a confusing > change in behavior. > > In short, my vote would be to maintain the current behavior for now > and to bring up any logging improvements separately. On the other hand, it would be useful for the user to know exactly what is getting skipped. For example if VACUUM ANALYZE is used then both operations would happen, but now the user would only know that VACUUM has been skipped, and may miss the fact that ANALYZE was not attempted. Let's do as you suggest at the end, aka if both VACOPT_VACUUM and VACOPT_ANALYZE are passed down to vacuum_is_relation_owner, then only the log for VACUUM is generated, which is consistent. Any other changes could happen later on if necessary. > + /* > + * Depending on the permission checks done while building the list > + * of relations to work on, it could be possible that the list is > + * empty, hence do nothing in this case. > + */ > + if (relations == NIL) > + return; > > It might be better to let the command go through normally so that we > don't miss any cleanup at the end (e.g. deleting vac_context). Right, that was a bad idea. Updating datfrozenxid can actually be a good thing. > +/* > + * Check if a given relation can be safely vacuumed or not. If the > + * user is not the relation owner, issue a WARNING log message and return > + * false to let the caller decide what to do with this relation. > + */ > +bool > +vacuum_is_relation_owner(Oid relid, Form_pg_class reltuple, int options) > +{ > + Assert((options & (VACOPT_VACUUM | VACOPT_ANALYZE)) != 0); > + > + /* > + * Check permissions. > + * > + * We allow the user to vacuum a table if he is superuser, the table > + * owner, or the database owner (but in the latter case, only if it's not > + * a shared relation). pg_class_ownercheck includes the superuser case. > + * > + * Note we choose to treat permissions failure as a WARNING and keep > + * trying to vacuum the rest of the DB --- is this appropriate? > + */ > > Do you think it's worth adding ANALYZE to these comments as well? Done. > + if (!(pg_class_ownercheck(relid, GetUserId()) || > + (pg_database_ownercheck(MyDatabaseId, GetUserId()) && !reltuple->relisshared))) > > Returning true right away when the role does have permissions might be > a nice way to save a level of indentation. Done. > I think this actually changes the behavior for partitioned tables. > Presently, we still go through and collect all the partitions in the > vacrels list. With this change, we will skip collecting a table's > partitions if the current role doesn't have the required permissions. > Perhaps we should skip adding the current relation to vacrels if > vacuum_is_relation_owner() returns false, and then we could go through > the partitions and check permissions on those as well. Since we don't > take any locks on the individual partitions yet, getting the relation > name and calling pg_class_ownercheck() safely might be tricky, though. Yes, that's actually intentional on my side as this keeps the logic more simple, and we avoid risks around deadlocks when working on partitions. It seems to me that it is also more intuitive to *not* scan a full partition tree if the user does not have ownership on its root if the relation is listed, instead of trying to scan all leafs to find perhaps some of them. In most data models it would matter much anyway, no? This is also more consistent with what is done for TRUNCATE where the ACLs of the parent are considered first. The documentation also actually mentions that: "To vacuum a table, one must ordinarily be the table's owner or a superuser." Perhaps we could make that clearer for partitions, with something like: "If listed explicitly, the vacuum of a partitioned table will include all its partitions if the user is the owner of the partitioned table." If we don't want to change the current behavior, then one simple solution would be close to what you mention, aka skip adding the partitioned table to the list, include *all* the partitions in the list as we cannot sanely check their ACLs at this stage, and rely on the checks already happening in vacuum_rel() and analyze_rel(). This would cause the original early lock attempts to not be solved for partitions, which is why the approach taken in the patches makes the most sense. I have split the patch into two parts: - 0001 includes new tests which generate WARNING messages for VACUUM, ANALYZE and VACUUM (ANALYZE). That's useful separately. - 0002 is the original patch discussed here. Thanks, -- Michael
Attachment
On 8/20/18, 8:29 PM, "Michael Paquier" <michael@paquier.xyz> wrote: >> In short, my vote would be to maintain the current behavior for now >> and to bring up any logging improvements separately. > > On the other hand, it would be useful for the user to know exactly what > is getting skipped. For example if VACUUM ANALYZE is used then both > operations would happen, but now the user would only know that VACUUM > has been skipped, and may miss the fact that ANALYZE was not attempted. > Let's do as you suggest at the end, aka if both VACOPT_VACUUM and > VACOPT_ANALYZE are passed down to vacuum_is_relation_owner, then only > the log for VACUUM is generated, which is consistent. Any other changes > could happen later on if necessary. Sounds good. > If we don't want to change the current behavior, then one simple > solution would be close to what you mention, aka skip adding the > partitioned table to the list, include *all* the partitions in the list > as we cannot sanely check their ACLs at this stage, and rely on the > checks already happening in vacuum_rel() and analyze_rel(). This would > cause the original early lock attempts to not be solved for partitions, > which is why the approach taken in the patches makes the most sense. I think my biggest concern with this approach is that we'd be introducing inconsistent behavior whenever there are concurrent changes. If a user never had permissions to VACUUM the partitioned table, the partitions are skipped outright. However, if the user loses permissions to VACUUM the partitioned table between expand_vacuum_rel() and vacuum_rel(), we'll still attempt to VACUUM each individual partition. I'll admit I don't have a great alternative proposal that doesn't involve adding deadlock risk or complexity, but it still seems worth mulling over. > I have split the patch into two parts: > - 0001 includes new tests which generate WARNING messages for VACUUM, > ANALYZE and VACUUM (ANALYZE). That's useful separately. 0001 looks good to me. > - 0002 is the original patch discussed here. I'd suggest even splitting 0002 into two patches: one for refactoring the existing permissions checks into vacuum_is_relation_owner() and another for the new checks. +# The role doesn't have privileges to vacuum the table, so VACUUM should +# immediately skip the table without waiting for a lock. Can we add tests for concurrent changes that cause the relation to be skipped in vacuum_rel() and analyze_rel() instead of expand_vacuum_rel()? Nathan
On Tue, Aug 21, 2018 at 04:01:50PM +0000, Bossart, Nathan wrote: > I think my biggest concern with this approach is that we'd be > introducing inconsistent behavior whenever there are concurrent > changes. If a user never had permissions to VACUUM the partitioned > table, the partitions are skipped outright. However, if the user > loses permissions to VACUUM the partitioned table between > expand_vacuum_rel() and vacuum_rel(), we'll still attempt to VACUUM > each individual partition. > > I'll admit I don't have a great alternative proposal that doesn't > involve adding deadlock risk or complexity, but it still seems worth > mulling over. That counts only for a manual vacuum/analyze listing directly the relation in question. If running a system-wide VACUUM then all the relations are still processed. This is a rather edge case in my opinion but.. I don't mind mulling over it (as you say). So please let me think over it for a couple of days. I don't see a smart solution which does not create risks of lock upgrades and deadlocks now, there may be one able to preserve the existing behavior. >> I have split the patch into two parts: >> - 0001 includes new tests which generate WARNING messages for VACUUM, >> ANALYZE and VACUUM (ANALYZE). That's useful separately. > > 0001 looks good to me. Thanks, I have pushed this one. >> - 0002 is the original patch discussed here. > > I'd suggest even splitting 0002 into two patches: one for refactoring > the existing permissions checks into vacuum_is_relation_owner() and > another for the new checks. Hmmm. The second patch changes also some comment blocks when calling vacuum_is_relation_owner(), so we finish by changing the same code areas, resulting in more code churn for no real gain. > +# The role doesn't have privileges to vacuum the table, so VACUUM should > +# immediately skip the table without waiting for a lock. > > Can we add tests for concurrent changes that cause the relation to be > skipped in vacuum_rel() and analyze_rel() instead of > expand_vacuum_rel()? Doing that deterministically with concurrent tests look difficult to me as doing ALTER TABLE OWNER TO to a relation in a first session causes a second session running VACUUM to block in expand_vacuum_rel(), be it with a plain table or a partitioned table (doing the ALTER TABLE on a leaf will block scanning the parent as well). -- Michael
Attachment
On 8/21/18, 7:44 PM, "Michael Paquier" <michael@paquier.xyz> wrote: > On Tue, Aug 21, 2018 at 04:01:50PM +0000, Bossart, Nathan wrote: >> I think my biggest concern with this approach is that we'd be >> introducing inconsistent behavior whenever there are concurrent >> changes. If a user never had permissions to VACUUM the partitioned >> table, the partitions are skipped outright. However, if the user >> loses permissions to VACUUM the partitioned table between >> expand_vacuum_rel() and vacuum_rel(), we'll still attempt to VACUUM >> each individual partition. >> >> I'll admit I don't have a great alternative proposal that doesn't >> involve adding deadlock risk or complexity, but it still seems worth >> mulling over. > > That counts only for a manual vacuum/analyze listing directly the > relation in question. If running a system-wide VACUUM then all the > relations are still processed. This is a rather edge case in my opinion > but.. I don't mind mulling over it (as you say). So please let me > think over it for a couple of days. I don't see a smart solution which > does not create risks of lock upgrades and deadlocks now, there may be > one able to preserve the existing behavior. Right. If we don't come up with anything, the behavior change for this edge case is probably reasonable as long as we update the documentation like you proposed earlier. >>> - 0002 is the original patch discussed here. >> >> I'd suggest even splitting 0002 into two patches: one for refactoring >> the existing permissions checks into vacuum_is_relation_owner() and >> another for the new checks. > > Hmmm. The second patch changes also some comment blocks when calling > vacuum_is_relation_owner(), so we finish by changing the same code > areas, resulting in more code churn for no real gain. I see. I only made this suggestion so that we could get some of the easy changes out of the way, but there's no need if it's just adding unnecessary code churn. >> +# The role doesn't have privileges to vacuum the table, so VACUUM should >> +# immediately skip the table without waiting for a lock. >> >> Can we add tests for concurrent changes that cause the relation to be >> skipped in vacuum_rel() and analyze_rel() instead of >> expand_vacuum_rel()? > > Doing that deterministically with concurrent tests look difficult to me > as doing ALTER TABLE OWNER TO to a relation in a first session causes a > second session running VACUUM to block in expand_vacuum_rel(), be it > with a plain table or a partitioned table (doing the ALTER TABLE on a > leaf will block scanning the parent as well). I think this is doable by locking the table in SHARE mode. That won't conflict with the AccessShareLock that expand_vacuum_rel() obtains, but it will conflict with the ShareUpdateExclusiveLock or AccessExclusiveLock that vacuum_rel() takes. session 1> BEGIN; LOCK test IN SHARE MODE; session 2> VACUUM test; session 1> ALTER TABLE test OWNER TO not_session_2_user; COMMIT; Nathan
On Wed, Aug 22, 2018 at 02:17:44AM +0000, Bossart, Nathan wrote: > I think this is doable by locking the table in SHARE mode. That won't > conflict with the AccessShareLock that expand_vacuum_rel() obtains, > but it will conflict with the ShareUpdateExclusiveLock or > AccessExclusiveLock that vacuum_rel() takes. Good point. Still is that really worth adding? This implies a test which has at least two roles, one switching the ownership to the other and do so back-and-forth. At least that should be on a different isolation spec file to not complicate the first one. -- Michael
Attachment
On 8/22/18, 12:37 AM, "Michael Paquier" <michael@paquier.xyz> wrote: > On Wed, Aug 22, 2018 at 02:17:44AM +0000, Bossart, Nathan wrote: >> I think this is doable by locking the table in SHARE mode. That won't >> conflict with the AccessShareLock that expand_vacuum_rel() obtains, >> but it will conflict with the ShareUpdateExclusiveLock or >> AccessExclusiveLock that vacuum_rel() takes. > > Good point. Still is that really worth adding? This implies a test > which has at least two roles, one switching the ownership to the other > and do so back-and-forth. At least that should be on a different > isolation spec file to not complicate the first one. I think so, since this is the only ownership checks we do on individual partitions. Another simple way to test this would be to create a partitioned table with a different owner than the partitions and to run VACUUM as the partitioned table owner. In this case, we would still rely on the checks in vacuum_rel() and analyze_rel(). IMO this is a reason to avoid skipping gathering the individual partitions based upon the ownership of the partitioned table. It's true that this wouldn't fix the locking issue for partitions, but the aforementioned edge case is still present with 0002 anyway. Plus, it would add a bit more consistency to partition handling in VACUUM. I've attached a patch that applies on top of 0002 that adds a simple test to exercise the checks in vacuum_rel() and analyze_rel(). + /* + * For VACUUM ANALYZE, both logs could show up, but just generate + * information for VACUUM as that would be the first one to + * process. + */ + return; We should probably return false here. Nathan
Attachment
On Wed, Aug 22, 2018 at 03:49:16PM +0000, Bossart, Nathan wrote: > I think so, since this is the only ownership checks we do on > individual partitions. Another simple way to test this would be to > create a partitioned table with a different owner than the partitions > and to run VACUUM as the partitioned table owner. In this case, we > would still rely on the checks in vacuum_rel() and analyze_rel(). IMO > this is a reason to avoid skipping gathering the individual partitions > based upon the ownership of the partitioned table. It's true that > this wouldn't fix the locking issue for partitions, but the > aforementioned edge case is still present with 0002 anyway. Plus, it > would add a bit more consistency to partition handling in VACUUM. Normal regression tests are less costly than isolation tests, so let's use them as possible. What you attached is covering only a portion of all the scenarios though, as it is as well interesting to see what happens if another user owns only the partitioned table, only one partition, and the partitioned as well as at least one partition. I have extended your patch as attached. It applies on top of HEAD. Once applied with the other patch one can easily stop the difference in behavior, and this stresses the ownership checks in vacuum_rel() and analyze_rel() as well. Perhaps we could begin by that? > We should probably return false here. Oh, my compiler complained here as well. Fixed it on my branch. -- Michael
Attachment
On 8/23/18, 12:08 AM, "Michael Paquier" <michael@paquier.xyz> wrote: > Normal regression tests are less costly than isolation tests, so let's > use them as possible. What you attached is covering only a portion of > all the scenarios though, as it is as well interesting to see what > happens if another user owns only the partitioned table, only one > partition, and the partitioned as well as at least one partition. I > have extended your patch as attached. It applies on top of HEAD. Once > applied with the other patch one can easily stop the difference in > behavior, and this stresses the ownership checks in vacuum_rel() and > analyze_rel() as well. Perhaps we could begin by that? This seems reasonable to me. I think establishing the expected behavior here is a good idea. Nathan
On Thu, Aug 23, 2018 at 09:53:57PM +0000, Bossart, Nathan wrote: > This seems reasonable to me. I think establishing the expected > behavior here is a good idea. Thanks, I have pushed the new test series, and reused it to check the new version of the main patch, which is attached. I have added a commit message and I have indented the thing. After pondering about it, I have also reworked the portion for partitioned tables so as the list of partitions processed is unchanged on HEAD, and we keep a consistent behavior compared to past versions. If VACUUM processing for partitioned tables was something new in 11, I think that we could have considered it, but changing silently something that people may rely on for more than one year now is not very appealing. I can personally imagine data models with multiple layers of partitions where the top-most parent has the most restricted access, and then things get more permitted the more down you get. For example let's imagine a table listing a population, which is split by cities. The top-most partitioned table references the whole country, which say only the president has access to. Then there are partitions which can be accessed only by the majors of each city. In this case, even if a mayor does a VACUUM FULL of its leaf partition then a full read would be blocked even for the president. The reverse is technically possible, aka the top-most parent is not really restrictive, and leafs get more and more restricted, but logically that does not make much sense as the top-most parent would be just useless for any kind of operations so as a full table scan. Still, in the first case, say that each city major uses the same application layer which vacuums the top-most parent, then we'd break something that worked in 10 and 11. -- Michael
Attachment
On 8/23/18, 9:16 PM, "Michael Paquier" <michael@paquier.xyz> wrote: > Thanks, I have pushed the new test series, and reused it to check the > new version of the main patch, which is attached. I have added a commit > message and I have indented the thing. Thanks for the new version! > After pondering about it, I have also reworked the portion for > partitioned tables so as the list of partitions processed is unchanged > on HEAD, and we keep a consistent behavior compared to past versions. > If VACUUM processing for partitioned tables was something new in 11, I > think that we could have considered it, but changing silently something > that people may rely on for more than one year now is not very > appealing. Agreed. Even though we're not fixing the issue for partitions yet, this patch should still fix the originally reported authentication issue (which I see is highlighted in your commit message). I think there's still a slight behavior change with the ordering of the "skipped" log messages in some cases, but that doesn't seem terribly important. We might be able to work around this by storing all the information we need for the log message in the VacuumRelation and waiting to emit it until vacuum_rel() or analyze_rel(), but I doubt it's worth the complexity. Without patch: postgres=> VACUUM parted1, parted2; WARNING: skipping "parted1" --- only table or database owner can vacuum it WARNING: skipping "parted1_part1" --- only table or database owner can vacuum it WARNING: skipping "parted1_part2" --- only table or database owner can vacuum it WARNING: skipping "parted2" --- only table or database owner can vacuum it WARNING: skipping "parted2_part1" --- only table or database owner can vacuum it WARNING: skipping "parted2_part2" --- only table or database owner can vacuum it VACUUM With patch: postgres=> VACUUM parted1, parted2; WARNING: skipping "parted1" --- only table or database owner can vacuum it WARNING: skipping "parted2" --- only table or database owner can vacuum it WARNING: skipping "parted1_part1" --- only table or database owner can vacuum it WARNING: skipping "parted1_part2" --- only table or database owner can vacuum it WARNING: skipping "parted2_part1" --- only table or database owner can vacuum it WARNING: skipping "parted2_part2" --- only table or database owner can vacuum it VACUUM The new version of the patch applies cleanly, builds cleanly, and 'make check-world' succeeds. Also, I'm no longer able to reproduce the authentication issue involving 'VACUUM FULL' run by non- superusers, so it looks good to me. Nathan
On Fri, Aug 24, 2018 at 05:30:01PM +0000, Bossart, Nathan wrote: > On 8/23/18, 9:16 PM, "Michael Paquier" <michael@paquier.xyz> wrote: >> Thanks, I have pushed the new test series, and reused it to check the >> new version of the main patch, which is attached. I have added a commit >> message and I have indented the thing. > > Thanks for the new version! Finally, I have been able to come back to it, and pushed the latest version. We have come a long way... I'll check the rest of the backend code for weird calls of relation_open or such. We may have other cases with similar problems. >> After pondering about it, I have also reworked the portion for >> partitioned tables so as the list of partitions processed is unchanged >> on HEAD, and we keep a consistent behavior compared to past versions. >> If VACUUM processing for partitioned tables was something new in 11, I >> think that we could have considered it, but changing silently something >> that people may rely on for more than one year now is not very >> appealing. > > Agreed. Even though we're not fixing the issue for partitions yet, > this patch should still fix the originally reported authentication > issue (which I see is highlighted in your commit message). I think > there's still a slight behavior change with the ordering of the > "skipped" log messages in some cases, but that doesn't seem terribly > important. We might be able to work around this by storing all the > information we need for the log message in the VacuumRelation and > waiting to emit it until vacuum_rel() or analyze_rel(), but I doubt > it's worth the complexity. This one is definitely not worth worrying in my opinion, we still process the same relations, and the order is preserved when using a single relation. > The new version of the patch applies cleanly, builds cleanly, and > 'make check-world' succeeds. Also, I'm no longer able to reproduce > the authentication issue involving 'VACUUM FULL' run by non- > superusers, so it looks good to me. Thanks for the help! -- Michael