Thread: Re: Assert failure due to "drop schema pg_temp_3 cascade" fortemporary tables and \d+ is not showing any info after drooping temp tableschema
Re: Assert failure due to "drop schema pg_temp_3 cascade" fortemporary tables and \d+ is not showing any info after drooping temp tableschema
From
Michael Paquier
Date:
On Tue, Dec 24, 2019 at 04:50:58PM +0530, Mahendra Singh wrote: > We can fix this problem by either one way 1) reset myTempNamespace to > invalid while drooping schema of temp table 2) should not allow to drop > temporary table schema (Please note that it is better not to cross-post on multiple lists, so I have removed pgsql-bugs from CC.) There is a little bit more to that, as we would basically need to do the work of RemoveTempRelationsCallback() once the temp schema is dropped, callback registered when the schema is correctly created at transaction commit (also we need to make sure that RemoveTempRelationsCallback is not called or unregistered if we were to authorize DROP SCHEMA on a temp schema). And then all the reset done at the beginning of AtEOXact_Namespace() would need to happen. Anyway, as dropping a temporary schema leads to an inconsistent behavior when recreating new temporary objects in a session that dropped it, that nobody has actually complained on the matter, and that in concept a temporary schema is linked to the session that created it, I think that we have a lot of arguments to just forbid the operation from happening. Please note as well that it is possible to drop temporary schemas of other sessions, still this is limited to owners of the schema. In short, let's tighten the logic, and we had better back-patch this one all the way down, 9.4 being broken. Attached is a patch to do that. The error message generated depends on the state of the session so I have not added a test for this reason, and the check is added before the ACL check. We could make the error message more generic, like "cannot drop temporary namespace". Any thoughts? -- Michael
Attachment
Re: Assert failure due to "drop schema pg_temp_3 cascade" fortemporary tables and \d+ is not showing any info after drooping temp tableschema
From
Kyotaro Horiguchi
Date:
At Wed, 25 Dec 2019 11:22:03 +0900, Michael Paquier <michael@paquier.xyz> wrote in > Anyway, as dropping a temporary schema leads to an inconsistent > behavior when recreating new temporary objects in a session that > dropped it, that nobody has actually complained on the matter, and > that in concept a temporary schema is linked to the session that Agreed. > created it, I think that we have a lot of arguments to just forbid the > operation from happening. Please note as well that it is possible to > drop temporary schemas of other sessions, still this is limited to > owners of the schema. > > In short, let's tighten the logic, and we had better back-patch this > one all the way down, 9.4 being broken. Attached is a patch to do > that. The error message generated depends on the state of the session > so I have not added a test for this reason, and the check is added > before the ACL check. We could make the error message more generic, > like "cannot drop temporary namespace". Any thoughts? Just inhibiting the action seems reasonable to me. Still the owner can drop temporary namespace on another session or pg_toast_temp_x of the current session. isTempnamespace(address.objectId) doesn't work for the purpose. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: Assert failure due to "drop schema pg_temp_3 cascade" fortemporary tables and \d+ is not showing any info after drooping temp tableschema
From
Michael Paquier
Date:
On Wed, Dec 25, 2019 at 12:18:26PM +0900, Kyotaro Horiguchi wrote: > Still the owner can drop temporary namespace on another session or > pg_toast_temp_x of the current session. Arf. Yes, this had better be isAnyTempNamespace() so as we complain about all of them. -- Michael
Attachment
Re: Assert failure due to "drop schema pg_temp_3 cascade" fortemporary tables and \d+ is not showing any info after drooping temp table schema
From
Mahendra Singh
Date:
On Wed, 25 Dec 2019 at 07:52, Michael Paquier <michael@paquier.xyz> wrote: > > On Tue, Dec 24, 2019 at 04:50:58PM +0530, Mahendra Singh wrote: > > We can fix this problem by either one way 1) reset myTempNamespace to > > invalid while drooping schema of temp table 2) should not allow to drop > > temporary table schema > > (Please note that it is better not to cross-post on multiple lists, so Sorry. I was not aware of multiple mail ids. I will take care in future mails. > I have removed pgsql-bugs from CC.) Thanks. > There is a little bit more to that, as we would basically need to do > the work of RemoveTempRelationsCallback() once the temp schema is > dropped, callback registered when the schema is correctly created at > transaction commit (also we need to make sure that > RemoveTempRelationsCallback is not called or unregistered if we were > to authorize DROP SCHEMA on a temp schema). And then all the reset > done at the beginning of AtEOXact_Namespace() would need to happen. > Thanks for quick detailed analysis. > Anyway, as dropping a temporary schema leads to an inconsistent > behavior when recreating new temporary objects in a session that > dropped it, that nobody has actually complained on the matter, and > that in concept a temporary schema is linked to the session that > created it, I think that we have a lot of arguments to just forbid the > operation from happening. Please note as well that it is possible to > drop temporary schemas of other sessions, still this is limited to > owners of the schema. Yes, you are right that we can drop temporary schema of other sessions. Even after applying your attached patch, I am getting same assert failure because I am able to drop " temporary schema" from other session so I think, we should not allow to drop any temporary schema from any session. > In short, let's tighten the logic, and we had better back-patch this > one all the way down, 9.4 being broken. Attached is a patch to do Yes, I also verified that we have to back-patch till v9.4. > that. The error message generated depends on the state of the session > so I have not added a test for this reason, and the check is added > before the ACL check. We could make the error message more generic, > like "cannot drop temporary namespace". Any thoughts? I think, we can make error message as "cannot drop temporary schema" While applying attached patch on HEAD, I got below warnings: [mahendra@localhost postgres]$ git apply drop-temp-schema-v1.patch drop-temp-schema-v1.patch:9: trailing whitespace. /* drop-temp-schema-v1.patch:10: trailing whitespace. * Prevent drop of a temporary schema as this would mess up with drop-temp-schema-v1.patch:11: trailing whitespace. * the end-of-session callback cleaning up all temporary objects. drop-temp-schema-v1.patch:12: trailing whitespace. * As the in-memory state is not cleaned up either here, upon drop-temp-schema-v1.patch:13: trailing whitespace. * recreation of a temporary schema within the same session the error: patch failed: src/backend/commands/dropcmds.c:101 error: src/backend/commands/dropcmds.c: patch does not apply I think, above warnings are due to "trailing CRs" in patch. Thanks and Regards Mahendra Singh Thalor EnterpriseDB: http://www.enterprisedb.com
Re: Assert failure due to "drop schema pg_temp_3 cascade" fortemporary tables and \d+ is not showing any info after drooping temp tableschema
From
Michael Paquier
Date:
On Wed, Dec 25, 2019 at 10:07:58AM +0530, Mahendra Singh wrote: > Yes, you are right that we can drop temporary schema of other sessions. I have mentioned that upthread, and basically we need to use isAnyTempNamespace() here. My mistake. > While applying attached patch on HEAD, I got below warnings: The patch applies cleanly for me. -- Michael
Attachment
Re: Assert failure due to "drop schema pg_temp_3 cascade" fortemporary tables and \d+ is not showing any info after drooping temp tableschema
From
Michael Paquier
Date:
On Wed, Dec 25, 2019 at 12:24:10PM +0900, Michael Paquier wrote: > Arf. Yes, this had better be isAnyTempNamespace() so as we complain > about all of them. Okay, finally coming back to that. Attached is an updated patch with polished comments and the fixed logic. -- Michael
Attachment
Re: Assert failure due to "drop schema pg_temp_3 cascade" fortemporary tables and \d+ is not showing any info after drooping temp table schema
From
Mahendra Singh
Date:
On Thu, 26 Dec 2019 at 19:23, Michael Paquier <michael@paquier.xyz> wrote: > > On Wed, Dec 25, 2019 at 12:24:10PM +0900, Michael Paquier wrote: > > Arf. Yes, this had better be isAnyTempNamespace() so as we complain > > about all of them. > > Okay, finally coming back to that. Attached is an updated patch with > polished comments and the fixed logic. Thanks Michael for patch. Patch is fixing all the issues. I think, we can add a regression test for this. postgres=# create temporary table temp(c1 int); CREATE TABLE postgres=# drop schema pg_temp_3 cascade ; ERROR: cannot drop temporary namespace "pg_temp_3" postgres=# I have one doubt. Please give me your opinion on below doubt. Let suppose, I connected 10 sessions at a time and created 1 temporary table to each session. Then it is creating schema from pg_temp_3 to pg_temp_12 (one schema for each temp table session). After that, I closed all the 10 sessions but if I connect again any session and checking all the schema, It is still showing pg_temp_3 to pg_temp_12. Is this expected behavior ? or we should not display any temp table schema after closing session. I thought that auto_vacuum wlll drop all the temp table schema but it is not drooping. Thanks and Regards Mahendra Singh Thalor EnterpriseDB: http://www.enterprisedb.com
Re: Assert failure due to "drop schema pg_temp_3 cascade" for temporary tables and \d+ is not showing any info after drooping temp table schema
From
Tom Lane
Date:
Mahendra Singh <mahi6run@gmail.com> writes: > I think, we can add a regression test for this. > postgres=# create temporary table temp(c1 int); > CREATE TABLE > postgres=# drop schema pg_temp_3 cascade ; > ERROR: cannot drop temporary namespace "pg_temp_3" > postgres=# No, we can't, because the particular temp namespace used by a given session isn't stable. > I thought that auto_vacuum wlll drop all > the temp table schema but it is not drooping. Generally speaking, once a particular pg_temp_N schema exists it's never dropped, just recycled for use by subsequent sessions. regards, tom lane
Re: Assert failure due to "drop schema pg_temp_3 cascade" fortemporary tables and \d+ is not showing any info after drooping temp table schema
From
Mahendra Singh
Date:
On Thu, 26 Dec 2019 at 23:21, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Mahendra Singh <mahi6run@gmail.com> writes: > > I think, we can add a regression test for this. > > postgres=# create temporary table temp(c1 int); > > CREATE TABLE > > postgres=# drop schema pg_temp_3 cascade ; > > ERROR: cannot drop temporary namespace "pg_temp_3" > > postgres=# > > No, we can't, because the particular temp namespace used by a given > session isn't stable. > > > I thought that auto_vacuum wlll drop all > > the temp table schema but it is not drooping. > > Generally speaking, once a particular pg_temp_N schema exists it's > never dropped, just recycled for use by subsequent sessions. Okay. Understood. Thanks for clarification. Thanks and Regards Mahendra Singh Thalor EnterpriseDB: http://www.enterprisedb.com
Re: Assert failure due to "drop schema pg_temp_3 cascade" fortemporary tables and \d+ is not showing any info after drooping temp tableschema
From
Michael Paquier
Date:
On Fri, Dec 27, 2019 at 12:33:03AM +0530, Mahendra Singh wrote: > On Thu, 26 Dec 2019 at 23:21, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> No, we can't, because the particular temp namespace used by a given >> session isn't stable. And I'd prefer keep the name of the namespace in the error message, because the information is helpful. >>> I thought that auto_vacuum wlll drop all >>> the temp table schema but it is not drooping. >> >> Generally speaking, once a particular pg_temp_N schema exists it's >> never dropped, just recycled for use by subsequent sessions. > > Okay. Understood. Thanks for clarification. Please see RemoveTempRelations() for the details, which uses PERFORM_DELETION_SKIP_ORIGINAL to avoid a drop of the temp schema, and just work on all the objects the schema includes. And committed down to 9.4. We use much more "temporary schema" in error messages actually, so I have switched to that. -- Michael
Attachment
Re: Assert failure due to "drop schema pg_temp_3 cascade" fortemporary tables and \d+ is not showing any info after drooping temp table schema
From
Robert Haas
Date:
On Fri, Dec 27, 2019 at 4:06 AM Michael Paquier <michael@paquier.xyz> wrote: > And committed down to 9.4. We use much more "temporary schema" in > error messages actually, so I have switched to that. I think this was a bad idea and that it should be reverted. It seems to me that the problem here is that you introduced a feature which had a bug, namely that it couldn't tolerate concurrency, and when somebody discovered the bug, you "fixed" it not by making the code able to tolerate concurrent activity but by preventing concurrent activity from happening in the first place. I think that's wrong on general principle. In this specific case, DROP SCHEMA on another temporary sessions's schema is a feature which has existed for a very long time and which I have used on multiple occasions to repair damaged databases. Suppose, for example, there's a catalog entry that prevents the schema from being dropped. Before this commit, you could fix it or delete the entry and then retry the drop. Now, you can't. You can maybe wait for autovacuum to retry it or something, assuming autovacuum is working and you're in multi-user mode. But even if that weren't the case, this seems like a very fragile fix. Maybe someday we'll allow multiple autovacuum workers in the same database, and the problem comes back. Maybe some user who can't drop the schema because of this arbitrary prohibition will find themselves forced to delete the pg_namespace row by hand and then crash the server. Most server code is pretty careful that to either tolerate missing system catalog tuples or elog(ERROR), not crash (e.g. cache lookup failed for ...). This code shouldn't be an exception to that rule. Also, as a matter of procedure, 3 days from first post to commit is not a lot, especially when the day something is posted is Christmas Eve. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Assert failure due to "drop schema pg_temp_3 cascade" fortemporary tables and \d+ is not showing any info after drooping temp tableschema
From
Michael Paquier
Date:
On Sun, Dec 29, 2019 at 07:37:15AM -0500, Robert Haas wrote: > I think this was a bad idea and that it should be reverted. It seems > to me that the problem here is that you introduced a feature which had > a bug, namely that it couldn't tolerate concurrency, and when somebody > discovered the bug, you "fixed" it not by making the code able to > tolerate concurrent activity but by preventing concurrent activity > from happening in the first place. I think that's wrong on general > principle. Sorry for the delay, there was a long period off here so I could not have a serious look. The behavior of the code in 246a6c8 has changed so as a non-existing temporary namespace is considered as not in use, in which case autovacuum would consider this relation to be orphaned, and it would then try to drop it. Anyway, just a revert of the patch is not a good idea either, because keeping around the old behavior allows any user to create orphaned relations that autovacuum would just ignore in 9.4~10, leading to ultimately a forced shutdown of the instance as no cleanup can happen if this goes unnoticed. This also puts pg_class into an inconsistent state as pg_class entries would include references to a namespace that does not exist for sessions still holding its own references to myTempNamespace/myTempToastNamespace. > In this specific case, DROP SCHEMA on another temporary sessions's > schema is a feature which has existed for a very long time and which I > have used on multiple occasions to repair damaged databases. Suppose, > for example, there's a catalog entry that prevents the schema from > being dropped. Before this commit, you could fix it or delete the > entry and then retry the drop. Now, you can't. You can maybe wait for > autovacuum to retry it or something, assuming autovacuum is working > and you're in multi-user mode. This behavior is broken since its introduction then per the above. If we were to allow DROP SCHEMA to work properly on temporary schema, we would need to do more than what we have now, and that does not involve just mimicking DISCARD TEMP if you really wish to be able to drop the schema entirely and not only the objects it includes. Allowing a temporary schema to be dropped only if it is owned by the current session would be simple enough to implement, but I think that allowing that to work properly for a schema owned by another session would be rather difficult to implement for little gains. Now, if you still wish to be able to do a DROP SCHEMA on a temporary schema, I have no objections to allow doing that, but under some conditions. So I would recommend to restrict it so as this operation is not allowed by default, and I think we ought to use allow_system_table_mods to control that, because if you were to do that you are an operator and you know what you are doing. Normally :) > But even if that weren't the case, this seems like a very fragile fix. > Maybe someday we'll allow multiple autovacuum workers in the same > database, and the problem comes back. Maybe some user who can't drop > the schema because of this arbitrary prohibition will find themselves > forced to delete the pg_namespace row by hand and then crash the > server. Most server code is pretty careful that to either tolerate > missing system catalog tuples or elog(ERROR), not crash (e.g. cache > lookup failed for ...). This code shouldn't be an exception to that > rule. You are right here, things could be done better in 11 and newer versions, still there are multiple ways to do that. Here are three suggestions: 1) Issue an elog(ERROR) as that's what we do usually for lookup errors and such when seeing an orphaned relation which refers to a non-existing namespace. But this would prevent autovacuum to do any kind of work and just loop over-and-over on the same error, just bloating the database involved. 2) Ignore the relation and leave it around, though we really have been fighting to make autovacuum more aggressive, so that would defeat the work done lately for that purpose. 3) Still drop the orphaned relation even if it references to a non-existing schema, generating an appropriate LOG message so as the problem comes from an incorrect lookup at the namespace name. Attached is a patch doing two things: a) Control DROP SCHEMA on a temporary namespace using allow_system_table_mods. b) Generate a non-buggy LOG message if trying to remove a temp relation referring to a temporary schema that does not exist, using "(null)" as a replacement for the schema name. My suggestion is to do a) down to 9.4 if that's thought to be helpful to have, and at least Robert visibly thinks so, then b) in 11~ as that's where 246a6c8 exists. Comments welcome. -- Michael
Attachment
Re: Assert failure due to "drop schema pg_temp_3 cascade" fortemporary tables and \d+ is not showing any info after drooping temp table schema
From
Robert Haas
Date:
On Sun, Jan 5, 2020 at 8:42 PM Michael Paquier <michael@paquier.xyz> wrote: > The behavior of the code in 246a6c8 has changed so as a non-existing > temporary namespace is considered as not in use, in which case > autovacuum would consider this relation to be orphaned, and it would > then try to drop it. Anyway, just a revert of the patch is not a good > idea either, because keeping around the old behavior allows any user > to create orphaned relations that autovacuum would just ignore in > 9.4~10, leading to ultimately a forced shutdown of the instance as no > cleanup can happen if this goes unnoticed. This also puts pg_class > into an inconsistent state as pg_class entries would include > references to a namespace that does not exist for sessions still > holding its own references to myTempNamespace/myTempToastNamespace. I'm not arguing for a revert of 246a6c8. I think we should just change this: ereport(LOG, (errmsg("autovacuum: dropping orphan temp table \"%s.%s.%s\"", get_database_name(MyDatabaseId), get_namespace_name(classForm->relnamespace), NameStr(classForm->relname)))); To look more like: char *nspname = get_namespace_name(classForm->relnamespace); if (nspname != NULL) ereport(..."autovacuum: dropping orphan temp table \"%s.%s.%s\"...) else ereport(..."autovacuum: dropping orphan temp table with OID %u"....) If we do that, then I think we can just revert a052f6cbb84e5630d50b68586cecc127e64be639 completely. As a side benefit, this would also provide some insurance against other possibly-problematic situations, like a corrupted pg_class row with a garbage value in the relnamespace field, which is something I've seen multiple times in the field. I can't quite understand your comments about why we shouldn't do that, but the reported bug is just a null pointer reference. Incredibly, autovacuum.c seems to have been using get_namespace_name() without a null check since 2006, so it's not really the fault of your patch as I had originally thought. I wonder how in the world we've managed to get away with it for as long as we have. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Assert failure due to "drop schema pg_temp_3 cascade" for temporary tables and \d+ is not showing any info after drooping temp table schema
From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes: > I'm not arguing for a revert of 246a6c8. I think we should just change this: > ... > To look more like: > char *nspname = get_namespace_name(classForm->relnamespace); > if (nspname != NULL) > ereport(..."autovacuum: dropping orphan temp table \"%s.%s.%s\"...) > else > ereport(..."autovacuum: dropping orphan temp table with OID %u"....) > If we do that, then I think we can just revert > a052f6cbb84e5630d50b68586cecc127e64be639 completely. +1 to both of those --- although I think we could still provide the table name in the null-nspname case. > autovacuum.c seems to have been using get_namespace_name() without a > null check since 2006, so it's not really the fault of your patch as I > had originally thought. I wonder how in the world we've managed to get > away with it for as long as we have. Maybe we haven't. It's not clear that infrequent autovac crashes would get reported to us, or that we'd successfully find the cause if they were. I think what you propose above is a back-patchable bug fix. regards, tom lane
Re: Assert failure due to "drop schema pg_temp_3 cascade" fortemporary tables and \d+ is not showing any info after drooping temp tableschema
From
Michael Paquier
Date:
On Mon, Jan 06, 2020 at 12:33:47PM -0500, Tom Lane wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> I'm not arguing for a revert of 246a6c8. I think we should just change this: >> ... >> To look more like: > >> char *nspname = get_namespace_name(classForm->relnamespace); >> if (nspname != NULL) >> ereport(..."autovacuum: dropping orphan temp table \"%s.%s.%s\"...) >> else >> ereport(..."autovacuum: dropping orphan temp table with OID %u"....) > >> If we do that, then I think we can just revert >> a052f6cbb84e5630d50b68586cecc127e64be639 completely. > > +1 to both of those --- although I think we could still provide the > table name in the null-nspname case. Okay for the first one, printing the OID sounds like a good idea. Like Tom, I would prefer keeping the relation name with "(null)" for the schema name. Or even better, could we just print the OID all the time? What's preventing us from showing that information in the first place? And that still looks good to have when debugging issues IMO for orphaned entries. For the second one, I would really wish that we keep the restriction put in place by a052f6c until we actually figure out how to make the operation safe in the ways we want it to work because this puts the catalogs into an inconsistent state for any object type able to use a temporary schema, like functions, domains etc. for example able to use "pg_temp" as a synonym for the temp namespace name. And any connected user is able to do that. On top of that, except for tables, these could remain as orphaned entries after a crash, no? Allowing the operation only via allow_system_table_mods gives an exit path actually if you really wish to do so, which is fine by me as startup controls that, aka an administrator. In short, I don't think that it is sane to keep in place the property, visibly accidental (?) for any user to create inconsistent catalog entries using a static state in the session which is incorrect in namespace.c, except if we make DROP SCHEMA on a temporary schema have a behavior close to DISCARD TEMP. Again, for the owner of the session that's simple, no clear idea how to do that safely when the drop is done from another session not owning the temp schema. > Maybe we haven't. It's not clear that infrequent autovac crashes would > get reported to us, or that we'd successfully find the cause if they were. > > I think what you propose above is a back-patchable bug fix. Yeah, likely it is safer to fix the logs in the long run down to 9.4. -- Michael
Attachment
Re: Assert failure due to "drop schema pg_temp_3 cascade" fortemporary tables and \d+ is not showing any info after drooping temp table schema
From
Robert Haas
Date:
On Mon, Jan 6, 2020 at 7:22 PM Michael Paquier <michael@paquier.xyz> wrote: > Okay for the first one, printing the OID sounds like a good idea. > Like Tom, I would prefer keeping the relation name with "(null)" for > the schema name. Or even better, could we just print the OID all the > time? What's preventing us from showing that information in the first > place? And that still looks good to have when debugging issues IMO > for orphaned entries. I think we should have two different messages, rather than trying to shoehorn things into one message using a fake schema name. > For the second one, I would really wish that we keep the restriction > put in place by a052f6c until we actually figure out how to make the > operation safe in the ways we want it to work because this puts > the catalogs into an inconsistent state for any object type able to > use a temporary schema, like functions, domains etc. for example able > to use "pg_temp" as a synonym for the temp namespace name. And any > connected user is able to do that. So what? > On top of that, except for tables, > these could remain as orphaned entries after a crash, no? Tables, too, although they want have storage any more. But your patch in no way prevents that. It just makes it harder to fix when it does happen. So I see no advantages of it. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Assert failure due to "drop schema pg_temp_3 cascade" for temporary tables and \d+ is not showing any info after drooping temp table schema
From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes: > On Mon, Jan 6, 2020 at 7:22 PM Michael Paquier <michael@paquier.xyz> wrote: >> For the second one, I would really wish that we keep the restriction >> put in place by a052f6c until we actually figure out how to make the >> operation safe in the ways we want it to work because this puts >> the catalogs into an inconsistent state for any object type able to >> use a temporary schema, like functions, domains etc. for example able >> to use "pg_temp" as a synonym for the temp namespace name. And any >> connected user is able to do that. > So what? I still agree with Robert that a052f6c is a bad idea. It's not the case that that's blocking "any connected user" from causing an issue. The temp schemas are always owned by the bootstrap superuser, so only a superuser could delete them. All that that patch is doing is preventing superusers from doing something that they could reasonably wish to do, and that is perfectly safe when there's not concurrent usage of the schema. We are not normally that nanny-ish, and the case for being so here seems pretty thin. regards, tom lane
Re: Assert failure due to "drop schema pg_temp_3 cascade" fortemporary tables and \d+ is not showing any info after drooping temp tableschema
From
Michael Paquier
Date:
On Tue, Jan 07, 2020 at 01:06:08PM -0500, Tom Lane wrote: > I still agree with Robert that a052f6c is a bad idea. It's not the case > that that's blocking "any connected user" from causing an issue. The > temp schemas are always owned by the bootstrap superuser, so only a > superuser could delete them. All that that patch is doing is preventing > superusers from doing something that they could reasonably wish to do, > and that is perfectly safe when there's not concurrent usage of the > schema. We are not normally that nanny-ish, and the case for being so > here seems pretty thin. Okay, I am running out of arguments then, so attached is a patch to address things. I would also prefer if we keep the relation name in the log even if the namespace is missing. -- Michael
Attachment
Re: Assert failure due to "drop schema pg_temp_3 cascade" for temporary tables and \d+ is not showing any info after drooping temp table schema
From
Tom Lane
Date:
Michael Paquier <michael@paquier.xyz> writes: > Okay, I am running out of arguments then, so attached is a patch to > address things. I would also prefer if we keep the relation name in > the log even if the namespace is missing. A couple of thoughts: * Please revert a052f6c as a separate commit specifically doing that, so that when it comes time to make the release notes, it's clear that a052f6c doesn't require documentation. * I think the check on log_min_messages <= LOG is probably wrong, since LOG sorts out of order for this purpose. Compare is_log_level_output() in elog.c. I'd suggest not bothering with trying to optimize away the get_namespace_name call here; we shouldn't be in this code path often enough for performance to matter, and nobody ever cared about it before. * I don't greatly like the notation dropping orphan temp table \"%s.(null).%s\" ... and I bet Robert won't either. Not sure offhand about a better idea --- maybe dropping orphan temp table \"%s\" with OID %u in database \"%s\" regards, tom lane
Re: Assert failure due to "drop schema pg_temp_3 cascade" fortemporary tables and \d+ is not showing any info after drooping temp tableschema
From
Michael Paquier
Date:
On Tue, Jan 07, 2020 at 07:55:17PM -0500, Tom Lane wrote: > * Please revert a052f6c as a separate commit specifically doing that, > so that when it comes time to make the release notes, it's clear that > a052f6c doesn't require documentation. Okay. Committed the revert first then. > * I think the check on log_min_messages <= LOG is probably wrong, since > LOG sorts out of order for this purpose. Compare is_log_level_output() > in elog.c. I'd suggest not bothering with trying to optimize away the > get_namespace_name call here; we shouldn't be in this code path often > enough for performance to matter, and nobody ever cared about it before. Done. > * I don't greatly like the notation > dropping orphan temp table \"%s.(null).%s\" ... > and I bet Robert won't either. Not sure offhand about a better > idea --- maybe > dropping orphan temp table \"%s\" with OID %u in database \"%s\" And done this way as per the attached. I am of course open to objections or better ideas, though this looks formulation looks pretty good to me. Robert? -- Michael
Attachment
Re: Assert failure due to "drop schema pg_temp_3 cascade" fortemporary tables and \d+ is not showing any info after drooping temp tableschema
From
Michael Paquier
Date:
On Wed, Jan 08, 2020 at 10:56:01AM +0900, Michael Paquier wrote: > And done this way as per the attached. I am of course open to > objections or better ideas, though this looks formulation looks pretty > good to me. Robert? Just to be clear here, I would like to commit this patch and backpatch with the current formulation in the error strings in the follow-up days. In 9.4~10, the error cannot be reached, but that feels safer if we begin to work again on this portion of the autovacuum code. So if you would like to object, that's the moment.. -- Michael
Attachment
Re: Assert failure due to "drop schema pg_temp_3 cascade" fortemporary tables and \d+ is not showing any info after drooping temp table schema
From
Mahendra Singh Thalor
Date:
On Thu, 9 Jan 2020 at 09:36, Michael Paquier <michael@paquier.xyz> wrote:
>
> On Wed, Jan 08, 2020 at 10:56:01AM +0900, Michael Paquier wrote:
> > And done this way as per the attached. I am of course open to
> > objections or better ideas, though this looks formulation looks pretty
> > good to me. Robert?
>
> Just to be clear here, I would like to commit this patch and backpatch
> with the current formulation in the error strings in the follow-up
> days. In 9.4~10, the error cannot be reached, but that feels safer if
> we begin to work again on this portion of the autovacuum code. So if
> you would like to object, that's the moment..
> --
Hi,
I reviewed and tested the patch. After applying patch, I am getting other assert failure.
postgres=# CREATE TEMPORARY TABLE temp (a int);
CREATE TABLE
postgres=# \d
List of relations
Schema | Name | Type | Owner
-----------+------+-------+----------
pg_temp_3 | temp | table | mahendra
(1 row)
postgres=# drop schema pg_temp_3 cascade ;
NOTICE: drop cascades to table temp
DROP SCHEMA
postgres=# \d
Did not find any relations.
postgres=# CREATE TEMPORARY TABLE temp (a int);
CREATE TABLE
postgres=# \d
WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and repeat your command.
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
postgres=#
#0 0x00007f7d749bd277 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1 0x00007f7d749be968 in __GI_abort () at abort.c:90
#2 0x0000000000eca3c4 in ExceptionalCondition (conditionName=0x114cc08 "relation->rd_backend != InvalidBackendId", errorType=0x114ca8b "FailedAssertion",
fileName=0x114c8b0 "relcache.c", lineNumber=1123) at assert.c:67
#3 0x0000000000eaacb9 in RelationBuildDesc (targetRelId=16392, insertIt=true) at relcache.c:1123
#4 0x0000000000eadf61 in RelationIdGetRelation (relationId=16392) at relcache.c:2021
#5 0x000000000049f370 in relation_open (relationId=16392, lockmode=8) at relation.c:59
#6 0x000000000064ccda in heap_drop_with_catalog (relid=16392) at heap.c:1890
#7 0x00000000006435f3 in doDeletion (object=0x2d623c0, flags=21) at dependency.c:1360
#8 0x0000000000643180 in deleteOneObject (object=0x2d623c0, depRel=0x7ffcb9636290, flags=21) at dependency.c:1261
#9 0x0000000000640d97 in deleteObjectsInList (targetObjects=0x2dce438, depRel=0x7ffcb9636290, flags=21) at dependency.c:271
#10 0x0000000000640ed6 in performDeletion (object=0x7ffcb96363b0, behavior=DROP_CASCADE, flags=21) at dependency.c:356
#11 0x0000000000aebc3d in do_autovacuum () at autovacuum.c:2269
#12 0x0000000000aea478 in AutoVacWorkerMain (argc=0, argv=0x0) at autovacuum.c:1693
#13 0x0000000000ae9cf9 in StartAutoVacWorker () at autovacuum.c:1487
#14 0x0000000000b13cdc in StartAutovacuumWorker () at postmaster.c:5562
#15 0x0000000000b131b5 in sigusr1_handler (postgres_signal_arg=10) at postmaster.c:5279
#16 <signal handler called>
#17 0x00007f7d74a7cc53 in __select_nocancel () at ../sysdeps/unix/syscall-template.S:81
#18 0x0000000000b09fc9 in ServerLoop () at postmaster.c:1691
#19 0x0000000000b09544 in PostmasterMain (argc=3, argv=0x2ce2290) at postmaster.c:1400
#20 0x0000000000974b43 in main (argc=3, argv=0x2ce2290) at main.c:210
I think, before committing 1st patch, we should fix this crash also and then we should commit all the patches.
Thanks and Regards
Mahendra Singh Thalor
EnterpriseDB: http://www.enterprisedb.com
>
> On Wed, Jan 08, 2020 at 10:56:01AM +0900, Michael Paquier wrote:
> > And done this way as per the attached. I am of course open to
> > objections or better ideas, though this looks formulation looks pretty
> > good to me. Robert?
>
> Just to be clear here, I would like to commit this patch and backpatch
> with the current formulation in the error strings in the follow-up
> days. In 9.4~10, the error cannot be reached, but that feels safer if
> we begin to work again on this portion of the autovacuum code. So if
> you would like to object, that's the moment..
> --
Hi,
I reviewed and tested the patch. After applying patch, I am getting other assert failure.
postgres=# CREATE TEMPORARY TABLE temp (a int);
CREATE TABLE
postgres=# \d
List of relations
Schema | Name | Type | Owner
-----------+------+-------+----------
pg_temp_3 | temp | table | mahendra
(1 row)
postgres=# drop schema pg_temp_3 cascade ;
NOTICE: drop cascades to table temp
DROP SCHEMA
postgres=# \d
Did not find any relations.
postgres=# CREATE TEMPORARY TABLE temp (a int);
CREATE TABLE
postgres=# \d
WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and repeat your command.
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
postgres=#
Stack trace:
(gdb) bt#0 0x00007f7d749bd277 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1 0x00007f7d749be968 in __GI_abort () at abort.c:90
#2 0x0000000000eca3c4 in ExceptionalCondition (conditionName=0x114cc08 "relation->rd_backend != InvalidBackendId", errorType=0x114ca8b "FailedAssertion",
fileName=0x114c8b0 "relcache.c", lineNumber=1123) at assert.c:67
#3 0x0000000000eaacb9 in RelationBuildDesc (targetRelId=16392, insertIt=true) at relcache.c:1123
#4 0x0000000000eadf61 in RelationIdGetRelation (relationId=16392) at relcache.c:2021
#5 0x000000000049f370 in relation_open (relationId=16392, lockmode=8) at relation.c:59
#6 0x000000000064ccda in heap_drop_with_catalog (relid=16392) at heap.c:1890
#7 0x00000000006435f3 in doDeletion (object=0x2d623c0, flags=21) at dependency.c:1360
#8 0x0000000000643180 in deleteOneObject (object=0x2d623c0, depRel=0x7ffcb9636290, flags=21) at dependency.c:1261
#9 0x0000000000640d97 in deleteObjectsInList (targetObjects=0x2dce438, depRel=0x7ffcb9636290, flags=21) at dependency.c:271
#10 0x0000000000640ed6 in performDeletion (object=0x7ffcb96363b0, behavior=DROP_CASCADE, flags=21) at dependency.c:356
#11 0x0000000000aebc3d in do_autovacuum () at autovacuum.c:2269
#12 0x0000000000aea478 in AutoVacWorkerMain (argc=0, argv=0x0) at autovacuum.c:1693
#13 0x0000000000ae9cf9 in StartAutoVacWorker () at autovacuum.c:1487
#14 0x0000000000b13cdc in StartAutovacuumWorker () at postmaster.c:5562
#15 0x0000000000b131b5 in sigusr1_handler (postgres_signal_arg=10) at postmaster.c:5279
#16 <signal handler called>
#17 0x00007f7d74a7cc53 in __select_nocancel () at ../sysdeps/unix/syscall-template.S:81
#18 0x0000000000b09fc9 in ServerLoop () at postmaster.c:1691
#19 0x0000000000b09544 in PostmasterMain (argc=3, argv=0x2ce2290) at postmaster.c:1400
#20 0x0000000000974b43 in main (argc=3, argv=0x2ce2290) at main.c:210
I think, before committing 1st patch, we should fix this crash also and then we should commit all the patches.
Thanks and Regards
Mahendra Singh Thalor
EnterpriseDB: http://www.enterprisedb.com
Re: Assert failure due to "drop schema pg_temp_3 cascade" fortemporary tables and \d+ is not showing any info after drooping temp tableschema
From
Michael Paquier
Date:
On Fri, Jan 10, 2020 at 11:56:37AM +0530, Mahendra Singh Thalor wrote: > I reviewed and tested the patch. After applying patch, I am getting other > assert failure. > > I think, before committing 1st patch, we should fix this crash also and > then we should commit all the patches. I have somewhat managed to break my environment for a couple of days so as I got zero testing done with assertions, so I missed this one. Thanks for the lookup! The environment is fixed since. This code path uses an assertion that would become incorrect once you are able to create in pg_class temporary relations which rely on a temporary schema that does not exist anymore, because its schema has been dropped it, and that's what you are doing. The assertion does not concern only autovacuum originally, as it would fail each time a session tries to build a relation descriptor for its cache with a relation using a non-existing namespace. I have not really dug if that's actually possible to trigger.. Anyway. So, on the one hand, saying that we allow orphaned temporary relations to be dropped even if their schema does not exist is what autovacuum does now more aggressively, so that can help to avoid having to clean up yourself orphaned entries from catalogs, following up with their toast entries, etc. And this approach makes the assertion lose its meaning for autovacuum. On the other hand keeping this assertion makes sure that we never try to load incorrect relcache entries, and just make autovacuum less aggressive by ignoring orphaned entries with incorrect namespace references, though the user experience in fixing the cluster means manual manipulation of the catalogs. This is something I understood we'd like to avoid as much as possible, while keeping autovacuum aggressive on the removal as that can ease the life of people fixing a cluster. So this would bring us back to a point intermediate of 246a6c8. This makes me wonder how much we should try to outsmart somebody which puts the catalogs in such a inconsistent state. Hmm. Perhaps at the end autovacuum should just ignore such entries and just don't help the user at all as this also comes with its own issues with the storage level as well as smgr.c uses rd_backend. And if the user plays with temporary namespaces like that with superuser rights, he likely knows what he is doing. Perhaps not :D, in which case autovacuum may not be the best thing to decide that. I still think we should make the log of autovacuum.c for orphaned relations more careful with its coding though, and fix it with the previous patch. The documentation of isTempNamespaceInUse() could gain in clarity, just a nit from me while looking at the surroundings. And actually I found an issue with its logic, as the routine would not consider a temp namespace in use for a session's own MyBackendId. As that's only used for autovacuum, this has no consequence, but let's be correct in hte long run. And this gives the attached after a closer lookup. Thoughts? -- Michael
Attachment
Re: Assert failure due to "drop schema pg_temp_3 cascade" fortemporary tables and \d+ is not showing any info after drooping temp tableschema
From
Michael Paquier
Date:
On Fri, Jan 10, 2020 at 05:01:25PM +0900, Michael Paquier wrote: > This makes me wonder how much we should try to outsmart somebody which > puts the catalogs in such a inconsistent state. Hmm. Perhaps at the > end autovacuum should just ignore such entries and just don't help the > user at all as this also comes with its own issues with the storage > level as well as smgr.c uses rd_backend. And if the user plays with > temporary namespaces like that with superuser rights, he likely knows > what he is doing. Perhaps not :D, in which case autovacuum may not be > the best thing to decide that. I still think we should make the log > of autovacuum.c for orphaned relations more careful with its coding > though, and fix it with the previous patch. The documentation of > isTempNamespaceInUse() could gain in clarity, just a nit from me while > looking at the surroundings. And actually I found an issue with its > logic, as the routine would not consider a temp namespace in use for a > session's own MyBackendId. As that's only used for autovacuum, this > has no consequence, but let's be correct in hte long run. > > And this gives the attached after a closer lookup. Thoughts? Thinking more about it, this has a race condition if a temporary schema is removed after collecting the OIDs in the drop phase. So the updated attached is actually much more conservative and does not need an update of the log message, without giving up on the improvements done in v11~. In 9.4~10, the code of the second phase relies on GetTempNamespaceBackendId() which causes an orphaned relation to not be dropped in the event of a missing namespace. I'll just leave that alone for a couple of days now.. -- Michael
Attachment
Re: Assert failure due to "drop schema pg_temp_3 cascade" fortemporary tables and \d+ is not showing any info after drooping temp table schema
From
Mahendra Singh Thalor
Date:
On Fri, 10 Jan 2020 at 16:37, Michael Paquier <michael@paquier.xyz> wrote: > > On Fri, Jan 10, 2020 at 05:01:25PM +0900, Michael Paquier wrote: > > This makes me wonder how much we should try to outsmart somebody which > > puts the catalogs in such a inconsistent state. Hmm. Perhaps at the > > end autovacuum should just ignore such entries and just don't help the > > user at all as this also comes with its own issues with the storage > > level as well as smgr.c uses rd_backend. And if the user plays with > > temporary namespaces like that with superuser rights, he likely knows > > what he is doing. Perhaps not :D, in which case autovacuum may not be > > the best thing to decide that. I still think we should make the log > > of autovacuum.c for orphaned relations more careful with its coding > > though, and fix it with the previous patch. The documentation of > > isTempNamespaceInUse() could gain in clarity, just a nit from me while > > looking at the surroundings. And actually I found an issue with its > > logic, as the routine would not consider a temp namespace in use for a > > session's own MyBackendId. As that's only used for autovacuum, this > > has no consequence, but let's be correct in hte long run. > > > > And this gives the attached after a closer lookup. Thoughts? > > Thinking more about it, this has a race condition if a temporary > schema is removed after collecting the OIDs in the drop phase. So the > updated attached is actually much more conservative and does not need > an update of the log message, without giving up on the improvements > done in v11~. In 9.4~10, the code of the second phase relies on > GetTempNamespaceBackendId() which causes an orphaned relation to not > be dropped in the event of a missing namespace. I'll just leave that > alone for a couple of days now.. > -- Thanks for the patch. I am not getting any crash but \d is not showing any temp table if we drop temp schema and create again temp table. postgres=# create temporary table test1 (a int); CREATE TABLE postgres=# \d List of relations Schema | Name | Type | Owner -----------+-------+-------+---------- pg_temp_3 | test1 | table | mahendra (1 row) postgres=# drop schema pg_temp_3 cascade ; NOTICE: drop cascades to table test1 DROP SCHEMA postgres=# \d Did not find any relations. postgres=# create temporary table test1 (a int); CREATE TABLE postgres=# \d Did not find any relations. postgres=# Thanks and Regards Mahendra Singh Thalor EnterpriseDB: http://www.enterprisedb.com
Re: Assert failure due to "drop schema pg_temp_3 cascade" for temporary tables and \d+ is not showing any info after drooping temp table schema
From
Tom Lane
Date:
Michael Paquier <michael@paquier.xyz> writes: > [ patch to skip tables if get_namespace_name fails ] This doesn't seem like a very good idea to me. Is there any evidence that it's fixing an actual problem? What if the table you're skipping is holding back datfrozenxid? regards, tom lane
Re: Assert failure due to "drop schema pg_temp_3 cascade" fortemporary tables and \d+ is not showing any info after drooping temp tableschema
From
Michael Paquier
Date:
On Fri, Jan 10, 2020 at 09:50:58AM -0500, Tom Lane wrote: > This doesn't seem like a very good idea to me. Is there any > evidence that it's fixing an actual problem? What if the table > you're skipping is holding back datfrozenxid? That's the point I wanted to make sure: we don't because autovacuum has never actually been able to do that and because the cluster is put in this state by a superuser after issuing DROP SCHEMA on its temporary schema, which allows many fancy things based on the inconsistent state the session is on. Please see see for example REL_10_STABLE where GetTempNamespaceBackendId() would return InvalidBackendId when the namespace does not exist, so the drop is skipped. 246a6c8 (designed to track if a backend slot is using a temp namespace or not, allowing cleanup of orphaned tables if the namespace is around, still not used yet by the session it is assigned to) has changed the logic, accidentally actually, to also allow an orphaned temp table to be dropped even if its namespace does not exist anymore. If we say that it's fine for autovacuum to allow the drop of such inconsistent pg_class entries, then we would need to either remove or relax the assertion in relcache.c:1123 (RelationBuildDesc, should only autovacuum be allowed to do so?) to begin to allow autovacuum to remove temp relations. However, this does not sound like a correct thing to do IMO. So, note that if autovacuum is allowed to do so, you basically defeat partially the purpose of the assertion added by debcec7d in relcache.c. Another thing noticeable is that If autovacuum does the pg_class entry drops, the on-disk files for the temp relations would remain until the cluster is restarted by the way. -- Michael
Attachment
Re: Assert failure due to "drop schema pg_temp_3 cascade" fortemporary tables and \d+ is not showing any info after drooping temp tableschema
From
Michael Paquier
Date:
On Fri, Jan 10, 2020 at 05:54:21PM +0530, Mahendra Singh Thalor wrote: > Thanks for the patch. I am not getting any crash but \d is not showing > any temp table if we drop temp schema and create again temp table. That's expected. As discussed on this thread, the schema has been dropped by a superuser and there are cases where it is helpful to do so, so the relation you have created after DROP SCHEMA relies on an inconsistent session state. If you actually try to use \d with a relation name that matches the one you just created, psql would just show nothing for the namespace name. -- Michael
Attachment
Re: Assert failure due to "drop schema pg_temp_3 cascade" fortemporary tables and \d+ is not showing any info after drooping temp tableschema
From
Michael Paquier
Date:
On Fri, Jan 10, 2020 at 08:07:48PM +0900, Michael Paquier wrote: > Thinking more about it, this has a race condition if a temporary > schema is removed after collecting the OIDs in the drop phase. So the > updated attached is actually much more conservative and does not need > an update of the log message, without giving up on the improvements > done in v11~. In 9.4~10, the code of the second phase relies on > GetTempNamespaceBackendId() which causes an orphaned relation to not > be dropped in the event of a missing namespace. I'll just leave that > alone for a couple of days now.. And back on that one, I still like better the solution as of the attached which skips any relations with their namespace gone missing as 246a6c87's intention was only to allow orphaned temp relations to be dropped by autovacuum when a backend slot is connected, but not using yet its own temp namespace. If we want the drop of temp relations to work properly, more thoughts are needed regarding the storage part, and I am not actually sure that it is autovacuum's job to handle that better. Any thoughts? -- Michael
Attachment
Re: Assert failure due to "drop schema pg_temp_3 cascade" fortemporary tables and \d+ is not showing any info after drooping temp table schema
From
Mahendra Singh Thalor
Date:
On Thu, 16 Jan 2020 at 09:36, Michael Paquier <michael@paquier.xyz> wrote: > > On Fri, Jan 10, 2020 at 08:07:48PM +0900, Michael Paquier wrote: > > Thinking more about it, this has a race condition if a temporary > > schema is removed after collecting the OIDs in the drop phase. So the > > updated attached is actually much more conservative and does not need > > an update of the log message, without giving up on the improvements > > done in v11~. In 9.4~10, the code of the second phase relies on > > GetTempNamespaceBackendId() which causes an orphaned relation to not > > be dropped in the event of a missing namespace. I'll just leave that > > alone for a couple of days now.. > > And back on that one, I still like better the solution as of the > attached which skips any relations with their namespace gone missing > as 246a6c87's intention was only to allow orphaned temp relations to > be dropped by autovacuum when a backend slot is connected, but not > using yet its own temp namespace. > > If we want the drop of temp relations to work properly, more thoughts > are needed regarding the storage part, and I am not actually sure that > it is autovacuum's job to handle that better. > > Any thoughts? Hi, Patch looks good to me and it is fixing the assert failure. -- Thanks and Regards Mahendra Singh Thalor EnterpriseDB: http://www.enterprisedb.com
Re: Assert failure due to "drop schema pg_temp_3 cascade" fortemporary tables and \d+ is not showing any info after drooping temp tableschema
From
Michael Paquier
Date:
On Fri, Feb 28, 2020 at 11:29:56AM +0530, Mahendra Singh Thalor wrote: > Patch looks good to me and it is fixing the assert failure. Thanks for looking at the patch. Bringing the code to act consistently with what was done in 246a6c8 still looks like the correct direction to take, in short don't drop temp relations created without an existing temp schema and ignore them instead of creating more issues with the storage, so I'd like to apply and back-patch this stuff down to 11. First, let's wait a couple of extra days. -- Michael
Attachment
Re: Assert failure due to "drop schema pg_temp_3 cascade" for temporary tables and \d+ is not showing any info after drooping temp table schema
From
Tom Lane
Date:
Michael Paquier <michael@paquier.xyz> writes: > And back on that one, I still like better the solution as of the > attached which skips any relations with their namespace gone missing > as 246a6c87's intention was only to allow orphaned temp relations to > be dropped by autovacuum when a backend slot is connected, but not > using yet its own temp namespace. Simply skipping the drop looks like basically the right fix to me. A tiny nit is that using "get_namespace_name(...) != NULL" as a test for existence of the namespace seems a bit weird/unreadable. I'd be more inclined to code that as a SearchSysCacheExists test, at least in the place where you don't actually need the namespace name. Also, I notice that isTempNamespaceInUse is already detecting the case where the namespace doesn't exist or isn't really a temp namespace. I wonder whether it'd be better to teach that to return an indicator about the namespace not being what you think it is. That would force us to look at its other callers to see if any of them have related bugs, which seems like a good thing to check --- and even if they don't, having to think about the point in future call sites might forestall new bugs. regards, tom lane
Re: Assert failure due to "drop schema pg_temp_3 cascade" for temporary tables and \d+ is not showing any info after drooping temp table schema
From
Tom Lane
Date:
I wrote: > Also, I notice that isTempNamespaceInUse is already detecting the case > where the namespace doesn't exist or isn't really a temp namespace. > I wonder whether it'd be better to teach that to return an indicator about > the namespace not being what you think it is. That would force us to look > at its other callers to see if any of them have related bugs, which seems > like a good thing to check --- and even if they don't, having to think > about the point in future call sites might forestall new bugs. After poking around, I see there aren't any other callers. But I think that the cause of this bug is clearly failure to think carefully about the different cases that isTempNamespaceInUse is recognizing, so that the right way to fix it is more like the attached. In the back branches, we could leave isTempNamespaceInUse() in place but unused, just in case somebody is calling it. I kind of doubt that anyone is, given the small usage in core, but maybe. regards, tom lane diff --git a/src/backend/catalog/namespace.c b/src/backend/catalog/namespace.c index e70243a..5ff7824 100644 --- a/src/backend/catalog/namespace.c +++ b/src/backend/catalog/namespace.c @@ -3217,7 +3217,7 @@ isOtherTempNamespace(Oid namespaceId) } /* - * isTempNamespaceInUse - is the given namespace owned and actively used + * checkTempNamespaceStatus - is the given namespace owned and actively used * by a backend? * * Note: this can be used while scanning relations in pg_class to detect @@ -3225,8 +3225,8 @@ isOtherTempNamespace(Oid namespaceId) * given database. The result may be out of date quickly, so the caller * must be careful how to handle this information. */ -bool -isTempNamespaceInUse(Oid namespaceId) +TempNamespaceStatus +checkTempNamespaceStatus(Oid namespaceId) { PGPROC *proc; int backendId; @@ -3235,25 +3235,25 @@ isTempNamespaceInUse(Oid namespaceId) backendId = GetTempNamespaceBackendId(namespaceId); - /* No such temporary namespace? */ + /* No such namespace, or its name shows it's not temp? */ if (backendId == InvalidBackendId) - return false; + return TEMP_NAMESPACE_NOT_TEMP; /* Is the backend alive? */ proc = BackendIdGetProc(backendId); if (proc == NULL) - return false; + return TEMP_NAMESPACE_IDLE; /* Is the backend connected to the same database we are looking at? */ if (proc->databaseId != MyDatabaseId) - return false; + return TEMP_NAMESPACE_IDLE; /* Does the backend own the temporary namespace? */ if (proc->tempNamespaceId != namespaceId) - return false; + return TEMP_NAMESPACE_IDLE; - /* all good to go */ - return true; + /* Yup, so namespace is busy */ + return TEMP_NAMESPACE_IN_USE; } /* diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c index 6d1f28c..5314228 100644 --- a/src/backend/postmaster/autovacuum.c +++ b/src/backend/postmaster/autovacuum.c @@ -2073,7 +2073,7 @@ do_autovacuum(void) * We just ignore it if the owning backend is still active and * using the temporary schema. */ - if (!isTempNamespaceInUse(classForm->relnamespace)) + if (checkTempNamespaceStatus(classForm->relnamespace) == TEMP_NAMESPACE_IDLE) { /* * The table seems to be orphaned -- although it might be that @@ -2243,7 +2243,7 @@ do_autovacuum(void) continue; } - if (isTempNamespaceInUse(classForm->relnamespace)) + if (checkTempNamespaceStatus(classForm->relnamespace) != TEMP_NAMESPACE_IDLE) { UnlockRelationOid(relid, AccessExclusiveLock); continue; diff --git a/src/include/catalog/namespace.h b/src/include/catalog/namespace.h index d67f93a..3e3a192 100644 --- a/src/include/catalog/namespace.h +++ b/src/include/catalog/namespace.h @@ -38,6 +38,16 @@ typedef struct _FuncCandidateList } *FuncCandidateList; /* + * Result of checkTempNamespaceStatus + */ +typedef enum TempNamespaceStatus +{ + TEMP_NAMESPACE_NOT_TEMP, /* nonexistent, or non-temp namespace */ + TEMP_NAMESPACE_IDLE, /* exists, belongs to no active session */ + TEMP_NAMESPACE_IN_USE /* belongs to some active session */ +} TempNamespaceStatus; + +/* * Structure for xxxOverrideSearchPath functions */ typedef struct OverrideSearchPath @@ -138,7 +148,7 @@ extern bool isTempToastNamespace(Oid namespaceId); extern bool isTempOrTempToastNamespace(Oid namespaceId); extern bool isAnyTempNamespace(Oid namespaceId); extern bool isOtherTempNamespace(Oid namespaceId); -extern bool isTempNamespaceInUse(Oid namespaceId); +extern TempNamespaceStatus checkTempNamespaceStatus(Oid namespaceId); extern int GetTempNamespaceBackendId(Oid namespaceId); extern Oid GetTempToastNamespace(void); extern void GetTempNamespaceState(Oid *tempNamespaceId,
Re: Assert failure due to "drop schema pg_temp_3 cascade" fortemporary tables and \d+ is not showing any info after drooping temp tableschema
From
Michael Paquier
Date:
On Fri, Feb 28, 2020 at 01:45:29PM -0500, Tom Lane wrote: > After poking around, I see there aren't any other callers. But I think > that the cause of this bug is clearly failure to think carefully about > the different cases that isTempNamespaceInUse is recognizing, so that > the right way to fix it is more like the attached. Good idea, thanks. Your suggestion looks good to me. > In the back branches, we could leave isTempNamespaceInUse() in place > but unused, just in case somebody is calling it. I kind of doubt that > anyone is, given the small usage in core, but maybe. I doubt that there are any external callers, but I'd rather leave the past API in place on back-branches. -- Michael
Attachment
Re: Assert failure due to "drop schema pg_temp_3 cascade" for temporary tables and \d+ is not showing any info after drooping temp table schema
From
Tom Lane
Date:
Michael Paquier <michael@paquier.xyz> writes: > On Fri, Feb 28, 2020 at 01:45:29PM -0500, Tom Lane wrote: >> After poking around, I see there aren't any other callers. But I think >> that the cause of this bug is clearly failure to think carefully about >> the different cases that isTempNamespaceInUse is recognizing, so that >> the right way to fix it is more like the attached. > Good idea, thanks. Your suggestion looks good to me. Will push that, thanks for looking. >> In the back branches, we could leave isTempNamespaceInUse() in place >> but unused, just in case somebody is calling it. I kind of doubt that >> anyone is, given the small usage in core, but maybe. > I doubt that there are any external callers, but I'd rather leave the > past API in place on back-branches. Agreed. regards, tom lane
Re: Assert failure due to "drop schema pg_temp_3 cascade" fortemporary tables and \d+ is not showing any info after drooping temp tableschema
From
Michael Paquier
Date:
On Fri, Feb 28, 2020 at 07:23:38PM -0500, Tom Lane wrote: > Will push that, thanks for looking. Thanks for the commit. -- Michael