Re: [CLOBBER_CACHE]Server crashed with segfault 11 while executing clusterdb - Mailing list pgsql-hackers

From Tom Lane
Subject Re: [CLOBBER_CACHE]Server crashed with segfault 11 while executing clusterdb
Date
Msg-id 856392.1616510642@sss.pgh.pa.us
In response to Re: [CLOBBER_CACHE]Server crashed with segfault 11 while executing clusterdb  (Michael Paquier <michael@paquier.xyz>)
Responses Re: [CLOBBER_CACHE]Server crashed with segfault 11 while executing clusterdb
List pgsql-hackers
Michael Paquier <michael@paquier.xyz> writes:
> On Tue, Mar 23, 2021 at 04:12:01PM +0900, Michael Paquier wrote:
>> It takes some time to initialize a cluster under CLOBBER_CACHE_ALWAYS,
>> but the test is quick enough to reproduce.  It would be good to bisect
>> the origin point here as a first step.

> One bisect later, the winner is:
> commit: 3d351d916b20534f973eda760cde17d96545d4c4
> author: Tom Lane <tgl@sss.pgh.pa.us>
> date: Sun, 30 Aug 2020 12:21:51 -0400
> Redefine pg_class.reltuples to be -1 before the first VACUUM or ANALYZE.

> I am too tired to poke at that today, so I'll try tomorrow.  Tom may
> beat me at that though.

I think that's an artifact.  That commit didn't touch anything related to
relation opening or closing.  What it could have done, though, is change
CLUSTER's behavior on this empty table from use-an-index to use-a-seqscan,
thus causing us to follow the buggy code path where before we didn't.

The interesting question here seems to be "why didn't the existing
CLOBBER_CACHE_ALWAYS buildfarm testing catch this?".  It looks to me like
the answer is that it only happens for an empty table (or at least one
where the data pattern is such that we skip the RelationOpenSmgr call
earlier in end_heap_rewrite) and we don't happen to be exercising that
exact scenario in the regression tests.
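
To make the failure pattern concrete, here is a minimal toy model in Python. It is not PostgreSQL code; it is only a sketch under assumptions. The names RelationOpenSmgr and end_heap_rewrite come from the discussion above. Everything else (the Relation and Smgr classes, clobber_cache, the pending_page argument, and the "guarded" variant) is hypothetical and made up for illustration. The idea it shows: a handle is reopened only inside a branch that runs when there is data to flush, but is then used unconditionally afterwards, so an empty table plus a cache flush leaves the handle unset.

```python
class Smgr:
    """Hypothetical stand-in for a storage-manager handle."""
    def __init__(self):
        self.blocks = []
        self.synced = False


class Relation:
    """Hypothetical stand-in for a relcache entry with a lazily opened handle."""
    def __init__(self):
        self.rd_smgr = None


def relation_open_smgr(rel):
    # Toy analogue of RelationOpenSmgr: open the handle only if it is missing.
    if rel.rd_smgr is None:
        rel.rd_smgr = Smgr()


def clobber_cache(rel):
    # Toy analogue of a cache flush: the handle is dropped and must be reopened.
    rel.rd_smgr = None


def end_rewrite_buggy(rel, pending_page):
    # The handle is reopened only when there is a last page to write ...
    if pending_page is not None:
        relation_open_smgr(rel)
        rel.rd_smgr.blocks.append(pending_page)
    # ... but it is used here regardless of whether that branch ran.
    rel.rd_smgr.synced = True


def end_rewrite_guarded(rel, pending_page):
    # Hypothetical guard (an assumption, not the actual fix): reopen before every use.
    if pending_page is not None:
        relation_open_smgr(rel)
        rel.rd_smgr.blocks.append(pending_page)
    relation_open_smgr(rel)
    rel.rd_smgr.synced = True


def run(end_fn, pending_page):
    """Flush the cache, call end_fn, and report whether it survived."""
    rel = Relation()
    relation_open_smgr(rel)
    clobber_cache(rel)
    try:
        end_fn(rel, pending_page)
    except AttributeError:
        # Python's analogue of dereferencing a NULL pointer.
        return "crash"
    return "ok"
```

In this model, run(end_rewrite_buggy, "page") survives because the data branch reopens the handle, while run(end_rewrite_buggy, None), the empty-table case, does not. That matches the observation that a test suite without this exact scenario would never reach the unguarded use.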

            regards, tom lane


