Thread: DROP OWNED CASCADE vs Temp tables
I have a test where a user creates a temp table and then disconnects; concurrently, we try to run DROP OWNED BY CASCADE on the same user. This causes a race condition between temp table deletion during disconnection (RemoveTempRelations(myTempNamespace)) and the DROP OWNED BY CASCADE operation, which tries to remove the same temp table when it finds it in pg_shdepend. The result is an internal "cache lookup failed" error, as below:
DROP OWNED BY test_role CASCADE;
2020-01-07 12:35:06.524 IST [26064] ERROR: cache lookup failed for relation 41019
2020-01-07 12:35:06.524 IST [26064] STATEMENT: DROP OWNED BY test_role CASCADE;
reproduce.sql:8: ERROR: cache lookup failed for relation 41019
TEST
=====================
create database test_db;
create user test_superuser superuser;
\c test_db test_superuser
CREATE ROLE test_role nosuperuser login password 'test_pwd' ;
\c test_db test_role
CREATE TEMPORARY TABLE tmp_table(col1 int);
\c test_db test_superuser
DROP OWNED BY test_role CASCADE;
On 2020-Jan-07, Mithun Cy wrote:

> I have a test where a user creates a temp table and then disconnects;
> concurrently, we try to run DROP OWNED BY CASCADE on the same user.
> This causes a race condition between temp table deletion during
> disconnection (RemoveTempRelations(myTempNamespace)) and the DROP
> OWNED BY CASCADE operation, which tries to remove the same temp table
> when it finds it in pg_shdepend.

Cute.

This seems fiddly to handle better; maybe you'd have to have a new PERFORM_DELETION_* flag that says to ignore "missing" objects, so that when you come from shdepDropOwned you pass that flag all the way down to doDeletion(), and the objtype-specific function is called with "missing_ok" and ignores the object if it has already gone away. That's tedious, because none of the Remove* functions have the concept of missing_ok.

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
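[For concreteness, a rough sketch of the flag-threading idea. This is purely hypothetical: the flag name and value and the extra missing_ok argument are invented here, and the thread ultimately takes a different approach.]

    /* Hypothetical sketch -- PERFORM_DELETION_MISSING_OK and the extra
     * missing_ok argument below do not exist in PostgreSQL. */
    #define PERFORM_DELETION_MISSING_OK	0x0020	/* invented flag value */

    static void
    doDeletion(const ObjectAddress *object, int flags)
    {
    	bool		missing_ok = (flags & PERFORM_DELETION_MISSING_OK) != 0;

    	switch (getObjectClass(object))
    	{
    		case OCLASS_CLASS:
    			/*
    			 * Each objtype-specific removal function would have to grow
    			 * a missing_ok parameter and return quietly when the object
    			 * has already gone away (the extra argument is
    			 * hypothetical; relkind dispatch elided).
    			 */
    			heap_drop_with_catalog(object->objectId, missing_ok);
    			break;

    			/* ... and likewise for every other object class ... */

    		default:
    			break;
    	}
    }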
On Mon, Jan 13, 2020 at 07:45:06PM -0300, Alvaro Herrera wrote:
> This seems fiddly to handle better; maybe you'd have to have a new
> PERFORM_DELETION_* flag that says to ignore "missing" objects, so that
> when you come from shdepDropOwned you pass that flag all the way down
> to doDeletion(), and the objtype-specific function is called with
> "missing_ok" and ignores the object if it has already gone away.
> That's tedious, because none of the Remove* functions have the concept
> of missing_ok.

Yes, that would be invasive, and I'd rather not back-patch such a change, but I don't see a better or cleaner way to handle this correctly than the one you describe. Looking at all the subroutines that remove objects by OID, a patch along those lines would be repetitive, though not complicated to write.
--
Michael
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> On 2020-Jan-07, Mithun Cy wrote:
>> I have a test where a user creates a temp table and then disconnects;
>> concurrently, we try to run DROP OWNED BY CASCADE on the same user.
>> This causes a race condition between temp table deletion during
>> disconnection (RemoveTempRelations(myTempNamespace)) and the DROP
>> OWNED BY CASCADE operation, which tries to remove the same temp table
>> when it finds it in pg_shdepend.

> Cute.

Is this really any worse than any other attempt to issue two DROPs against the same object concurrently? Maybe we can just call it pilot error.

> This seems fiddly to handle better; maybe you'd have to have a new
> PERFORM_DELETION_* flag that says to ignore "missing" objects, so that
> when you come from shdepDropOwned you pass that flag all the way down
> to doDeletion(), and the objtype-specific function is called with
> "missing_ok" and ignores the object if it has already gone away.
> That's tedious, because none of the Remove* functions have the concept
> of missing_ok.

That seems fundamentally wrong. By the time we've queued an object for deletion in dependency.c, we have a lock on it, and we've verified that the object is still there (cf. the systable_recheck_tuple calls). If shdepDropOwned is doing it differently, I'd say shdepDropOwned is doing it wrong.

			regards, tom lane
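[For reference, the pattern Tom describes, abridged from findDependentObjects() in dependency.c; the surrounding pg_depend scan loop is elided.]

    /* Acquire the lock on the candidate object before queueing it ... */
    AcquireDeletionLock(&otherObject, 0);

    /*
     * The object might have been deleted while we waited for the lock;
     * we can check cheaply, and independently of the object's type, by
     * seeing whether the pg_depend tuple we are looking at is still live.
     */
    if (!systable_recheck_tuple(scan, tup))
    {
    	/* release the now-useless lock */
    	ReleaseDeletionLock(&otherObject);
    	/* and continue scanning for dependencies */
    	continue;
    }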
On 2020-Jan-13, Tom Lane wrote:

> That seems fundamentally wrong.  By the time we've queued an object
> for deletion in dependency.c, we have a lock on it, and we've verified
> that the object is still there (cf. the systable_recheck_tuple calls).
> If shdepDropOwned is doing it differently, I'd say shdepDropOwned is
> doing it wrong.

Hmm, it does seem to be doing it differently. Maybe it should acquire locks on all the objects in that nested loop and verify their existence, so that by the time it calls performMultipleDeletions the objects are already locked, as you say.

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2020-Jan-14, Alvaro Herrera wrote:

> Hmm, it does seem to be doing it differently.  Maybe it should acquire
> locks on all the objects in that nested loop and verify their
> existence, so that by the time it calls performMultipleDeletions the
> objects are already locked, as you say.

Yeah, this solves the reported bug.

This is not a 100% solution: there are cases where the user is removed from an ACL or from a policy, and those deletions are done directly instead of being accumulated for the mass deletion at the end.

I had to export AcquireDeletionLock (previously static in dependency.c). I wonder if I should export ReleaseDeletionLock too, for symmetry.

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
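[The shape of the fix inside shdepDropOwned() in catalog/pg_shdepend.c, reconstructed from the description above rather than quoted from the attached patch: for each pg_shdepend entry owned by the role, lock first, then recheck the tuple before queueing the object.]

    case SHARED_DEPENDENCY_OWNER:
    	/* If it's a local object, save it for the mass deletion below */
    	if (sdepForm->dbid == MyDatabaseId)
    	{
    		obj.classId = sdepForm->classid;
    		obj.objectId = sdepForm->objid;
    		obj.objectSubId = sdepForm->objsubid;

    		/*
    		 * Lock the object before queueing it, then verify that the
    		 * pg_shdepend tuple is still live.  If it is not, the object
    		 * was dropped concurrently (e.g. by temp-namespace cleanup at
    		 * disconnect), so skip it instead of queueing it.
    		 */
    		AcquireDeletionLock(&obj, 0);
    		if (!systable_recheck_tuple(scan, tuple))
    			break;		/* whether to also release the lock here is
    						 * discussed downthread */

    		add_exact_object_address(&obj, deleteobjs);
    	}
    	break;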
On 2020-Jan-23, Alvaro Herrera wrote:

> This is not a 100% solution: there are cases where the user is removed
> from an ACL or from a policy, and those deletions are done directly
> instead of being accumulated for the mass deletion at the end.
>
> I had to export AcquireDeletionLock (previously static in
> dependency.c).  I wonder if I should export ReleaseDeletionLock too,
> for symmetry.

FWIW I'm going to withhold this bugfix until after the next set of minor releases is out. I'd rather not find out later that I have no way to fix 9.4 if I break it, for a bug that has existed forever ...

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
The following review has been posted through the commitfest application:

make installcheck-world:  tested, passed
Implements feature:       tested, passed
Spec compliant:           tested, passed
Documentation:            not tested

I have tested the patch with REL_12_STABLE for the given error scenario by running the "CREATE TEMPORARY TABLE ..." and "DROP OWNED BY ..." commands concurrently using parallel background workers. I have also tested a few related scenarios, and the patch does seem to fix the reported bug. Ran make installcheck-world; no difference with and without the patch.
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> On 2020-Jan-14, Alvaro Herrera wrote:
>> Hmm, it does seem to be doing it differently.  Maybe it should acquire
>> locks on all the objects in that nested loop and verify their
>> existence, so that by the time it calls performMultipleDeletions the
>> objects are already locked, as you say.

> Yeah, this solves the reported bug.

I looked this over and think it should be fine. There will be cases where we get a deadlock error, but such risks existed anyway, since we'd have acquired all the same locks later in the process.

> This is not a 100% solution: there are cases where the user is removed
> from an ACL or from a policy, and those deletions are done directly
> instead of being accumulated for the mass deletion at the end.

Let's worry about that when and if we get field complaints.

> I had to export AcquireDeletionLock (previously static in
> dependency.c).  I wonder if I should export ReleaseDeletionLock too,
> for symmetry.

Hmmm ... there is an argument for doing ReleaseDeletionLock in the code paths where you discover that the object's been deleted. Holding a lock on a no-longer-existent object OID should be harmless from a deadlock standpoint, but it does represent useless consumption of a shared lock table entry, and that's a resource this operation could already burn with abandon.

Also, if we're exporting these, it's worth expending a bit more effort on their header comments. In particular, AcquireDeletionLock should describe its flags argument, perhaps along the lines of "Accepts the same flags as performDeletion (though currently only PERFORM_DELETION_CONCURRENTLY does anything)".

			regards, tom lane
On 2020-Apr-06, Tom Lane wrote:

> Alvaro Herrera <alvherre@2ndquadrant.com> writes:
>> On 2020-Jan-14, Alvaro Herrera wrote:
>>> Hmm, it does seem to be doing it differently.  Maybe it should
>>> acquire locks on all the objects in that nested loop and verify
>>> their existence, so that by the time it calls
>>> performMultipleDeletions the objects are already locked, as you say.
>
>> Yeah, this solves the reported bug.
>
> I looked this over and think it should be fine.  There will be cases
> where we get a deadlock error, but such risks existed anyway, since
> we'd have acquired all the same locks later in the process.

Thanks for looking again. I have pushed this to all branches, with these changes:

> Hmmm ... there is an argument for doing ReleaseDeletionLock in the
> code paths where you discover that the object's been deleted.

Added this. This of course required also exporting ReleaseDeletionLock, which closes my concern about exporting only half of that API.

> Also, if we're exporting these, it's worth expending a bit more effort
> on their header comments.  In particular, AcquireDeletionLock should
> describe its flags argument, perhaps along the lines of "Accepts the
> same flags as performDeletion (though currently only
> PERFORM_DELETION_CONCURRENTLY does anything)".

Did this too. I also changed the comment to indicate that, since these are now exported APIs, they might grow the ability to lock shared objects in the future. In fact, we have some places that use LockSharedObject directly to lock objects to drop; it seems reasonable to think that we should augment AcquireDeletionLock to handle those objects and make those places use the new API.

Lastly: right now, only performMultipleDeletions passes the flags down to AcquireDeletionLock; there are a couple of places that drop objects and call AcquireDeletionLock with flags=0. There's no bug AFAICS, because those cannot be called while running a concurrent object drop. But for correctness, those should pass flags too.

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
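[The exported declarations in catalog/dependency.h end up looking roughly like this; the header comments below paraphrase the discussion in this thread rather than quote the committed tree.]

    /*
     * AcquireDeletionLock - acquire a suitable lock for deleting an object
     *
     * Accepts the same flags as performDeletion (though currently only
     * PERFORM_DELETION_CONCURRENTLY does anything).  Shared objects are
     * not handled yet, but as an exported API this might be extended.
     */
    extern void AcquireDeletionLock(const ObjectAddress *object, int flags);

    /*
     * ReleaseDeletionLock - release a lock taken by AcquireDeletionLock
     *
     * Worth calling when we discover the object is already gone: holding
     * a lock on a no-longer-existent OID is harmless for deadlocks but
     * wastes a shared lock table entry.
     */
    extern void ReleaseDeletionLock(const ObjectAddress *object);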