DROP TABLE and concurrent modifications - Mailing list pgsql-hackers

From Neil Conway
Subject DROP TABLE and concurrent modifications
Date
Msg-id 871xoui0j6.fsf@mailbox.samurai.com
Whole thread Raw
Responses Re: DROP TABLE and concurrent modifications
Re: DROP TABLE and concurrent modifications
List pgsql-hackers
I can reproduce the following behavior with CVS HEAD.
    1. Have a process do INSERTs into a table in a tight loop (I've       attached a trivial libpq app that does this)
    2. In another session, repeatedly drop and re-create the table       that is being modified

You should see a stream of error messages from the INSERT client like:

query failed: ERROR:  relation 29118 deleted while still in use
query failed: ERROR:  relation "test_tbl" does not exist
query failed: ERROR:  relation "test_tbl" does not exist
query failed: ERROR:  relation "test_tbl" does not exist
query failed: ERROR:  relation "test_tbl" does not exist
query failed: ERROR:  relation "test_tbl" does not exist
query failed: ERROR:  relation "test_tbl" does not exist
query failed: ERROR:  relation 32430 deleted while still in use
query failed: ERROR:  relation "test_tbl" does not exist
query failed: ERROR:  relation "test_tbl" does not exist
query failed: ERROR:  relation 34206 deleted while still in use

The problem is the variant of the error message. When the error
message variant occurs, the INSERT backend is in the following state:

[ ... ]
#2  0x0824ff48 in RelationClearRelation (relation=0x40c92538, rebuild=1 '\001') at relcache.c:1711
#3  0x0825006e in RelationFlushRelation (relation=0x40c92538) at relcache.c:1775
#4  0x082501b5 in RelationCacheInvalidateEntry (relationId=17145, rnode=0x0) at relcache.c:1842
#5  0x0824d153 in LocalExecuteInvalidationMessage (msg=0xbfffeed0) at inval.c:452
#6  0x081c6af5 in ReceiveSharedInvalidMessages (invalFunction=0x824d043 <LocalExecuteInvalidationMessage>,
resetFunction=0x824d213<InvalidateSystemCaches>) at sinval.c:125
 
#7  0x0824d3c6 in AcceptInvalidationMessages () at inval.c:611
#8  0x081c8f99 in LockRelation (relation=0x40c92538, lockmode=3) at lmgr.c:143
#9  0x08089232 in relation_open (relationId=17145, lockmode=3) at heapam.c:462
#10 0x080892c9 in relation_openrv (relation=0x83956e0, lockmode=3) at heapam.c:506
#11 0x08089576 in heap_openrv (relation=0x83956e0, lockmode=3) at heapam.c:610
#12 0x080ee857 in setTargetTable (pstate=0x83955ec, relation=0x83956e0, inh=0 '\0', alsoSource=0 '\0', requiredPerms=1)
atparse_clause.c:142
 
#13 0x080d4390 in transformInsertStmt (pstate=0x83955ec, stmt=0x8395808, extras_before=0xbffff0a0,
extras_after=0xbffff09c)at analyze.c:543
 
[ ... ]

i.e. it is waiting to acquire a lock on the relation it wants to
INSERT into, but before returning from LockRelation() it receives a
shared-cache invalidation message for the relation the other backend
has just dropped. This causes it to error out in the bowels of
RelationClearRelation():
    if (RelationBuildDesc(buildinfo, relation) != relation)    {        /* Should only get here if relation was deleted
*/       FreeTupleDesc(old_att);        if (old_rulescxt)            MemoryContextDelete(old_rulescxt);
pfree(relation);       elog(ERROR, "relation %u deleted while still in use",             buildinfo.i.info_id);    }
 

Assuming my analysis is correct, is this a bug?

AFAICS it should be totally harmless, but at the least it would be
nice to display a more friendly/informative error message. Can anyone
see a way to do this without too much pain?

-Neil



pgsql-hackers by date:

Previous
From: "Simon Riggs"
Date:
Subject: Re: No Timeout in SELECT..FOR UPDATE
Next
From: Neil Conway
Date:
Subject: Re: DROP TABLE and concurrent modifications