On Sat, 4 Sept 2021 at 20:28, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> We have two things that we need to worry about when considering
> reducing ALTER TABLE lock levels:
>
> 1. Is it semantically okay (which is what you claim above)?
>
> 2. Will onlooker processes see sufficiently-consistent catalog data
> if they look at the table during the change?
>
> Trying to reduce the lock level for ADD CHECK fails the second
> test, because it has to alter two different catalogs. It has
> to increment pg_class.relchecks, and it has to make an entry in
> pg_constraint. This patch makes it possible for onlookers to
> see a value of pg_class.relchecks that is inconsistent with what
> they find in pg_constraint, and then they will blow up.
Thanks for the review. I will check this consideration for any future patches.
> That happens because the systable_beginscan() in CheckConstraintFetch
> will get a new snapshot, so now it sees the new entry in pg_constraint,
> making the count of entries inconsistent with what it found in pg_class.
This is clearly important and we must now return the patch with feedback.
I've looked at other similar cases and can't find any bugs in other areas, phew!
> It's possible that this could be made safe if we replaced the exact
> "relchecks" count with a boolean "relhaschecks", analogous to the
> way indexes are handled. It's not clear to me though that the effort,
> and ensuing compatibility costs for applications that look at pg_class,
> would be repaid by having a bit more concurrency here. One could
> also worry about whether we really want to give up this consistency
> cross-check between pg_class and pg_constraint.
I will work on a patch for this and see how complex it is.
At very least I will add a longer comment patch to mention this for the future.
--
Simon Riggs http://www.EnterpriseDB.com/