Re: server process (PID 2964738) was terminated by signal 11: Segmentation fault - Mailing list pgsql-general

From Laurenz Albe
Subject Re: server process (PID 2964738) was terminated by signal 11: Segmentation fault
Msg-id 0fb331d6fa4743a89070e092c62020ffee3020a5.camel@cybertec.at
In response to Re: server process (PID 2964738) was terminated by signal 11: Segmentation fault  (Stefan Froehlich <postgresql@froehlich.priv.at>)
Responses Re: server process (PID 2964738) was terminated by signal 11: Segmentation fault  (Mladen Gogala <gogala.mladen@gmail.com>)
List pgsql-general
On Mon, 2022-11-07 at 11:17 +0100, Stefan Froehlich wrote:
> On Sun, Nov 06, 2022 at 09:48:32AM -0500, Tom Lane wrote:
> > Stefan Froehlich <postgresql@froehlich.priv.at> writes:
> > > > # create extension amcheck;
> > > > # select oid, relname from pg_class where relname ='faultytablename_pkey';
> > > > [returns oid 537203]
> > > > # select bt_index_check(537203, true);
> > > > server closed the connection unexpectedly
> 
> > Another idea is to try using contrib/pageinspect to examine each
> > page of the table.  Its output is just gobbledegook to most
> > people, but there's a good chance it'd fail visibly on the
> > corrupted page(s).
> 
> Fortunately I was able to identify a window of 100 records (out of
> 25 mio.) containing all the errors. After deleting and re-inserting
> those records everything seems to be ok (at least, pg_dump and
> "reindex database" work without errors).

Don't continue to work with that cluster, even if everything seems OK now.
Run "pg_dumpall" and restore the dump into a new cluster on good hardware.
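
Roughly, that could look like the sketch below.  The host names and the
dump file path are placeholders of mine, not values from this thread, and
I am assuming the new cluster has already been created with "initdb" and
is accepting connections.

```shell
#!/bin/sh
# Sketch only: hosts and dump path are hypothetical placeholders.
# Usage: migrate_cluster OLD_HOST NEW_HOST DUMP_FILE
migrate_cluster() {
    old_host=$1
    new_host=$2
    dump_file=$3

    # Dump all databases plus global objects (roles, tablespaces) from the
    # suspect cluster.  If corruption is still present, this step is where
    # it is likely to show up, so stop instead of restoring a partial dump.
    pg_dumpall -h "$old_host" -f "$dump_file" || return 1

    # Replay the SQL script into the new cluster.
    # -X skips ~/.psqlrc so local settings cannot interfere with the restore.
    psql -X -h "$new_host" -d postgres -f "$dump_file"
}
```

Called as "migrate_cluster oldbox newbox /some/path/all.sql".  The restore
typically reports that the bootstrap superuser role already exists; that
one error is harmless.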

> I suspect a bad RAM module to be the root of the problems. We'll
> see.
> 
> Side question: If it is possible to simply delete and create such
> records is it necessary that the server *core* *dumps*? There could
> be a switch adding additional safety (at the cost of performance)
> which would make troubleshooting not only much faster but
> non-invasive for the other databases on the same server as well.

Crashing is never nice.  On the other hand, adding checks and error
messages for conditions that are always true in a correct block costs
performance.  I can't speak to your specific case, but a build of
PostgreSQL configured with --enable-cassert has additional checks in
the code.  That will still crash, but the log will show which condition
was violated.
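
In case it helps, here is a minimal sketch of such a build from a source
tree using the classic configure script.  The source directory and install
prefix are placeholders of mine, and adding --enable-debug is my own
suggestion so that a core dump has usable symbols.  Such a build is
noticeably slower, so use it for diagnosis only, not in production.

```shell
#!/bin/sh
# Sketch only: SRC_DIR and PREFIX are hypothetical placeholders.
# Usage: build_cassert_postgres SRC_DIR PREFIX
build_cassert_postgres() {
    src_dir=$1
    prefix=$2

    # Run in a subshell so the caller's working directory is left unchanged.
    (
        cd "$src_dir" || exit 1
        # --enable-cassert compiles in the assertion checks;
        # --enable-debug keeps debugging symbols for core dump analysis.
        ./configure --enable-cassert --enable-debug --prefix="$prefix" &&
            make &&
            make install
    )
}
```

Installing under a separate prefix keeps the assert-enabled binaries away
from the regular installation.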

Yours,
Laurenz Albe


