Re: [CLOBBER_CACHE]Server crashed with segfault 11 while executing clusterdb - Mailing list pgsql-hackers

From Amul Sul
Subject Re: [CLOBBER_CACHE]Server crashed with segfault 11 while executing clusterdb
Date
Msg-id CAAJ_b94tVyuZ=HvGbznv8WNZGzT_fmOJUbFfnKskP_mF17AN_w@mail.gmail.com
Whole thread Raw
In response to Re: [CLOBBER_CACHE]Server crashed with segfault 11 while executing clusterdb  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: [CLOBBER_CACHE]Server crashed with segfault 11 while executing clusterdb  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Tue, Mar 23, 2021 at 8:59 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> I wrote:
> > Michael Paquier <michael@paquier.xyz> writes:
> >> One bisect later, the winner is:
> >> commit: 3d351d916b20534f973eda760cde17d96545d4c4
> >> author: Tom Lane <tgl@sss.pgh.pa.us>
> >> date: Sun, 30 Aug 2020 12:21:51 -0400
> >> Redefine pg_class.reltuples to be -1 before the first VACUUM or ANALYZE.
>
> > I think that's an artifact.  That commit didn't touch anything related to
> > relation opening or closing.  What it could have done, though, is change
> > CLUSTER's behavior on this empty table from use-an-index to use-a-seqscan,
> > thus causing us to follow the buggy code path where before we didn't.
>
> On closer inspection, I believe the true culprit is c6b92041d,
> which did this:
>
>          */
>         if (RelationNeedsWAL(state->rs_new_rel))
> -               heap_sync(state->rs_new_rel);
> +               smgrimmedsync(state->rs_new_rel->rd_smgr, MAIN_FORKNUM);
>
>         logical_end_heap_rewrite(state);
>
> heap_sync was careful about opening rd_smgr, the new code not so much.
>
> I read the rest of that commit and didn't see any other equivalent
> bugs, but I might've missed something.
>

I too didn't find any other place replacing heap_sync() or equivalent place from
this commit where smgr* operation reaches without necessary precautions call.
heap_sync() was calling RelationOpenSmgr() through FlushRelationBuffers() before
it reached smgrimmedsync().  So we also need to make sure of the
RelationOpenSmgr() call before smgrimmedsync() as proposed previously.

Regards,
Amul



pgsql-hackers by date:

Previous
From: Kyotaro Horiguchi
Date:
Subject: Re: Protect syscache from bloating with negative cache entries
Next
From: Mark Dilger
Date:
Subject: Re: pg_amcheck contrib application