On 2025-Apr-15, Tender Wang wrote:
> I thought further about the lockmode passed when calling
> find_inheritance_children in ATPrepAddPrimaryKey.
> What do we do here? We first get the OIDs of the children, then check
> whether the column in each child is marked not-null; if not, we report
> an error. We perform no operation on the children. I checked the other
> places that call find_inheritance_children: where we operate on the
> children we usually pass the lockmode to find_inheritance_children,
> and otherwise we pass NoLock.

Hmm, I'm wary of doing this, although you're perhaps right that there's
no harm. If we do need to add a not-null constraint on the children,
surely we'll acquire a stronger lock further down the execution chain.
In principle this sounds like a good idea though. (I'm not sure about
doing SearchSysCacheAttName() on a relation that might be dropped
concurrently; does dropping the child acquire a lock on its parent? I
suppose so, in which case this is okay; but still icky. What about
DETACH CONCURRENTLY?)

However, I've also been looking at this and realized that this code can
have a different structure, which may allow us to skip the
find_inheritance_children() call altogether. The reason is that we already
scan the parent's list of columns searching for not-null constraints on
each of them; we only need to run this verification on children for
columns where there is none in the parent, and then only in the case
where recursion is turned off.
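
As a rough illustration of that control flow (not the attached patch, and
not PostgreSQL code), here is a toy Python model. The Table class, the
check_add_primary_key() function, and the error text are all hypothetical
names invented for this sketch; the only thing taken from the description
above is the ordering: scan the parent's columns first, and look at the
children only for columns that lack a not-null constraint in the parent,
and only when recursion is off.

```python
from dataclasses import dataclass, field


@dataclass
class Table:
    """Toy stand-in for a relation; not a PostgreSQL structure."""
    name: str
    not_null_cols: set = field(default_factory=set)
    children: list = field(default_factory=list)


def check_add_primary_key(parent, pk_cols, recurse):
    """Return (looked_at_children, errors) for a hypothetical ADD PRIMARY KEY."""
    # Step 1: scan only the parent, collecting PK columns with no not-null
    # constraint there.
    missing = [col for col in pk_cols if col not in parent.not_null_cols]

    # Step 2: with recursion on, or with nothing missing in the parent,
    # the children are never looked up at all.
    if recurse or not missing:
        return False, []

    # Step 3: only now fetch the children, and verify just the missing columns.
    errors = []
    for child in parent.children:
        for col in missing:
            if col not in child.not_null_cols:
                errors.append(
                    f'column "{col}" of table "{child.name}" lacks not-null'
                )
    return True, errors
```

In this model, a parent whose primary-key columns are all already marked
not-null never touches its children, which is the case where the lookup
can be skipped entirely.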

So I propose the attached patch, which also has some comments to
hopefully explain what is going on and why. I ran Tom's test script a
few hundred times in a loop and I see no deadlock anymore.

Note that I also considered the idea of just not doing the check at all;
that is, if a child table doesn't have a not-null constraint, then let
    ALTER TABLE ONLY parent ADD PRIMARY KEY ( ... )
create the not-null constraint. This works fine (it breaks one
regression test query, though that would be easily fixed). But I don't like
this very much, because it means the user could be surprised by the
unexpectedly lengthy runtime of creating the primary key, only to realize
that the server is checking the child table for nulls. This is
especially bad if the user says ONLY. I think it's better if they have
the chance to create the not-null constraint of their own volition.
--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
"Learn about compilers. Then everything looks like either a compiler or
a database, and now you have two problems but one of them is fun."
https://twitter.com/thingskatedid/status/1456027786158776329