Re: BUG #17182: Race condition on concurrent DROP and CREATE of dependent object - Mailing list pgsql-bugs

From Tom Lane
Subject Re: BUG #17182: Race condition on concurrent DROP and CREATE of dependent object
Date
Msg-id 2872252.1630851337@sss.pgh.pa.us
In response to BUG #17182: Race condition on concurrent DROP and CREATE of dependent object  (PG Bug reporting form <noreply@postgresql.org>)
Responses Re: BUG #17182: Race condition on concurrent DROP and CREATE of dependent object  (Alexander Lakhin <exclusion@gmail.com>)
List pgsql-bugs
PG Bug reporting form <noreply@postgresql.org> writes:
> As a result of the following script:
> for i in `seq 100`; do
> ( { for n in `seq 20`; do echo "DROP DOMAIN i;"; done } | psql ) >psql1.log 2>&1 &
> ( echo "
> CREATE DOMAIN i AS int;
> CREATE FUNCTION f1() RETURNS i LANGUAGE SQL RETURN 1;
> CREATE FUNCTION f2() RETURNS i LANGUAGE SQL RETURN 2;
> CREATE FUNCTION f3() RETURNS i LANGUAGE SQL RETURN 3;
> CREATE FUNCTION f4() RETURNS i LANGUAGE SQL RETURN 4;
> CREATE FUNCTION f5() RETURNS i LANGUAGE SQL RETURN 5;
> " | psql ) >psql2.log 2>&1 &
> wait
> psql -c "DROP DOMAIN i CASCADE" >psql3.log 2>&1
> done

> I get several broken functions with an invalid return type:
> SELECT f1()
> ERROR:  cache lookup failed for type 16519
> CONTEXT:  SQL function "f1" during inlining

I don't find this particularly surprising, and I'm unwilling to add the
amount of locking overhead it'd take to prevent it.

The generic problem is that a newly-created dependent object is not
protected against deletion of its referenced object(s) until we commit
its new pg_depend entries; before that, a concurrent DROP won't see
the dependencies.  I recall some discussion of trying to take an
anti-deletion lock in recordDependency, but that's too late: the
deletion might have committed and released its own lock since we
looked up the type (or other referenced object).  So the only real fix
for this would be to make every object lookup in the entire system do
the sort of dance that's done in RangeVarGetRelidExtended.  We have
agreed that the cost is worth it for tables (though I don't think
that that was without controversy, nor am I 100% convinced that
RangeVarGetRelidExtended is correct).  But I'm not excited about
extending the principle to other object types.
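
The look-up, lock, and recheck pattern described above can be sketched
roughly as follows. This is a toy, single-process Python simulation and
not PostgreSQL code: ToyCatalog, lookup_and_lock, and the after_lookup
hook are hypothetical names introduced only for illustration. The hook
stands in for a concurrent session whose DROP commits in the window
between the name lookup and the lock acquisition.

```python
import threading


class ToyCatalog:
    """Hypothetical stand-in for a system catalog: maps names to OIDs."""

    def __init__(self):
        self.names = {}        # object name -> current OID
        self.locks = {}        # OID -> lock guarding that object
        self.next_oid = 16384

    def create(self, name):
        oid = self.next_oid
        self.next_oid += 1
        self.names[name] = oid
        self.locks[oid] = threading.Lock()
        return oid

    def drop(self, name):
        # Assumption: a dropper holds the object's lock while removing it.
        oid = self.names[name]
        with self.locks[oid]:
            del self.names[name]

    def lookup(self, name):
        return self.names.get(name)


def lookup_and_lock(catalog, name, after_lookup=None):
    """Look up a name, lock the object, then recheck; retry if it changed.

    after_lookup is a test hook that runs between the lookup and the
    lock acquisition, which is where a concurrent DROP can slip in.
    """
    while True:
        oid = catalog.lookup(name)
        if oid is None:
            raise LookupError(f'type "{name}" does not exist')
        if after_lookup is not None:
            after_lookup()
        lock = catalog.locks[oid]
        lock.acquire()
        if catalog.lookup(name) == oid:
            return oid, lock   # still valid; caller holds the lock
        lock.release()         # stale OID: object was dropped, so retry


catalog = ToyCatalog()
old_oid = catalog.create("i")


def drop_and_recreate_once():
    # Simulates another session committing DROP DOMAIN i; CREATE DOMAIN i.
    if catalog.lookup("i") == old_oid:
        catalog.drop("i")
        catalog.create("i")


oid, lock = lookup_and_lock(catalog, "i", after_lookup=drop_and_recreate_once)
lock.release()
print("resolved to a different OID than first seen:", oid != old_oid)
```

Without the recheck, the caller would go on to record a dependency on
the stale OID, which is the kind of dangling reference the report shows.
The recheck costs an extra lookup plus a lock acquisition on every name
resolution, which is the overhead at issue.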

            regards, tom lane


