Michael Paquier <michael@paquier.xyz> writes:
> On Tue, Aug 25, 2020 at 01:35:11PM -0400, Bruce Momjian wrote:
>> On Tue, Aug 25, 2020 at 12:14:20PM -0400, Tom Lane wrote:
>>> I think the nature of the problem (and Robins' other report too) is pretty
>>> clear. We have a SQL or plpgsql function that's trying to access a table
>>> that is inconsistent during an ALTER TABLE operation. The function would
>>> be locked out from seeing that transient state if it were in another
>>> session, thanks to normal locking rules; but the lock acquisition rules
>>> don't prevent same-session accesses.
> There are already some safeguards to prevent directly the use of
> aggregates in USING, and here we have a function that itself calls an
> aggregate on the table.

That's an independent issue though. It stems mostly from not wanting
to use the full-scale planner or executor for subexpressions of utility
commands. (Of course, the PL function handler does so internally, but
that's its problem.)

The core issue here is that the table's catalog entries, as visible within
this transaction, don't match what's in its disk file. ALTER TABLE knows
that and is careful not to touch the old table using the new rowtype ---
but nothing else knows that. So I think we need to block other code
from touching the table-under-modification. As you say, there's not
any existing infrastructure that will serve for that directly. We might
be able to invent something comparable to the existing relcache entry
refcount, but counting exclusive opens ("there can be only one").
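
To make the "exclusive open" idea concrete, here is a toy sketch in C. It
is not PostgreSQL code: ToyRelation and the toy_* functions are hypothetical
names invented for illustration, and the rule that an exclusive open requires
a zero refcount is an assumption of the sketch, not something the existing
relcache does.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a relcache entry; NOT PostgreSQL's RelationData. */
typedef struct ToyRelation
{
    int  refcnt;          /* ordinary opens, analogous to a refcount */
    bool exclusive_open;  /* hypothetical flag: set while a rewrite is underway */
} ToyRelation;

/* Ordinary open: refused while an exclusive open is held. */
bool
toy_open(ToyRelation *rel)
{
    if (rel->exclusive_open)
        return false;     /* real code would presumably raise an error here */
    rel->refcnt++;
    return true;
}

/* Release one ordinary open. */
void
toy_close(ToyRelation *rel)
{
    assert(rel->refcnt > 0);
    rel->refcnt--;
}

/*
 * Exclusive open: "there can be only one".  Assumed rule for this sketch:
 * it succeeds only when no ordinary or exclusive open is outstanding.
 */
bool
toy_open_exclusive(ToyRelation *rel)
{
    if (rel->exclusive_open || rel->refcnt > 0)
        return false;
    rel->exclusive_open = true;
    return true;
}

/* Release the exclusive open, allowing ordinary opens again. */
void
toy_close_exclusive(ToyRelation *rel)
{
    assert(rel->exclusive_open);
    rel->exclusive_open = false;
}
```

In this model, the table-rewriting code would take the exclusive open for the
duration of the rewrite, so a function invoked from the same session would
fail in toy_open() rather than read the old file using the new rowtype.
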

regards, tom lane