Re: Getting rid of "tuple concurrently updated" elog()s withconcurrent DDLs (at least ALTER TABLE) - Mailing list pgsql-hackers

From Michael Paquier
Subject Re: Getting rid of "tuple concurrently updated" elog()s withconcurrent DDLs (at least ALTER TABLE)
Date
Msg-id 20171227075342.GA1281@paquier.xyz
Whole thread Raw
In response to Re: Getting rid of "tuple concurrently updated" elog()s withconcurrent DDLs (at least ALTER TABLE)  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Getting rid of "tuple concurrently updated" elog()s withconcurrent DDLs (at least ALTER TABLE)  (Michael Paquier <michael.paquier@gmail.com>)
List pgsql-hackers
On Tue, Dec 26, 2017 at 10:47:59PM -0800, Robert Haas wrote:
> On Tue, Dec 26, 2017 at 4:14 AM, Michael Paquier
> <michael.paquier@gmail.com> wrote:
>> On Tue, Dec 26, 2017 at 4:30 PM, Andres Freund <andres@anarazel.de> wrote:
>>> You're proposing to lock the entire relation against many forms of concurrent DDL, just to get rid of that error?
Thatseems unacceptable. 
>>> Isn't the canonical way to solve this to take object locks?
>>
>> Sure. That's where things in lmgr.c come into play, like
>> LockSharedObject(), and you could hold with an exclusive lock on a
>> given object until the end of a transaction before opening the catalog
>> relation with heap_open(), however with those you need to know the
>> object OID before taking a lock on the parent relation, right? So you
>> face problems with lock upgrades, or race conditions show up more
>> easily. I have to admit that I have not dug much into the problem yet,
>> it is easy enough to have isolation tests by the way, and I just
>> noticed that ALTER DATABASE SET can equally trigger the error.
>
> I don't understand what you mean by "you need to know the object OID
> before taking a lock on the parent relation, right?".  That seems
> wrong.

Well, the point I am trying to outline here is that it is essential to
take a lock on the object ID before opening pg_authid with heap_open()
with a low-level lock to avoid any concurrency issues when working on
the catalog relation opened.

> I think you might need something like what was done in
> b3ad5d02c9cd8a4c884cd78480f221afe8ce5590; if, after we look up the
> name and before we acquire a lock on the OID, we accept any
> invalidation messages, recheck that the object we've locked is still
> the one associated with that name.

Indeed, this bit is important, and RangeVarGet does so. Looking at
get_object_address(), it also handles locking of the object. Using that
as interface to address all the concurrency problems could be appealing,
and less invasive for a final result as many DDLs are visibly
impacted (still need to make a list here). What do you think?
--
Michael

Attachment

pgsql-hackers by date:

Previous
From: "Michelle Konzack"
Date:
Subject: Re: [table partitioning] How many partitions are possibel?
Next
From: Konstantin Knizhnik
Date:
Subject: Re: AS OF queries