On Tue, Apr 13, 2021 at 04:17:39PM +0000, Tom Lane wrote:
> Avoid improbable PANIC during heap_update.
>
> heap_update needs to clear any existing "all visible" flag on
> the old tuple's page (and on the new page too, if different).
> Per coding rules, to do this it must acquire pin on the appropriate
> visibility-map page while not holding exclusive buffer lock;
> which creates a race condition since someone else could set the
> flag whenever we're not holding the buffer lock. The code is
> supposed to handle that by re-checking the flag after acquiring
> buffer lock and retrying if it became set. However, one code
> path through heap_update itself, as well as one in its subroutine
> RelationGetBufferForTuple, failed to do this. The end result,
> in the unlikely event that a concurrent VACUUM did set the flag
> while we're transiently not holding lock, is a non-recurring
> "PANIC: wrong buffer passed to visibilitymap_clear" failure.
>
Hi,
This doesn't look as improbable because I saw it at least 3 times with
v15beta4.
The first time I thought it was my fault, then I tried with a commit on
september 25 (didn't remember which exactly but that doesn't seems too
relevant).
Finally I saw it again in a build with TRACE_VISIBILITYMAP defined (the
same commit).
But I haven't see it anymore on rc1. Anyway I'm attaching the backtrace
(this is from the build with TRACE_VISIBILITYMAP), the query that was
running at the time was (no changes were made to quad_poly_tbl table
nor any indexes were added to this table):
"""
update public.quad_poly_tbl set
id = public.quad_poly_tbl.id
returning
public.quad_poly_tbl.id as c0
"""
--
Jaime Casanova
Director de Servicios Profesionales
SystemGuards - Consultores de PostgreSQL