On Thu, Nov 14, 2019 at 05:35:06PM -0500, Tom Lane wrote:
>Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
>> On Thu, Nov 14, 2019 at 04:36:54PM -0500, Tom Lane wrote:
>>> Hm. No, it's not enough, unless you add more logic to deal with the
>>> possibility that the stats object is gone by the time you have the
>>> table lock. Consider e.g. two concurrent DROP STATISTICS commands,
>>> or a statistics drop that's cascading from something like a drop
>>> of a relevant function and so has no earlier table lock.
>
>> Isn't that solved by RemoveObjects() doing this?
>
>> /* Get an ObjectAddress for the object. */
>> address = get_object_address(stmt->removeType,
>> object,
>> &relation,
>> AccessExclusiveLock,
>> stmt->missing_ok);
>
>Ah, I see, we already have AEL on the stats object itself. So that
>eliminates my concern about a race between two RemoveStatisticsById
>calls, but what we have instead is fear of deadlock. A DROP STATISTICS
>command will acquire AEL on the stats object but then AEL on the table,
>the opposite of what will happen during DROP TABLE, so concurrent
>executions of those will deadlock. That might be better than the
>failures Mark is seeing now, but not by much.
>
Hmmm, yeah :-(
>A correct fix I think is that the planner ought to acquire AccessShareLock
>on a stats object it's trying to use (and then recheck whether the object
>is still there). That seems rather expensive, but there may be no other
>way.
Yes, so something like for indexes, although we don't need the recheck
in that case. I think the attached patch does that (but it's 1AM here).
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services