Re: [HACKERS] SIGSEGV in BRIN autosummarize - Mailing list pgsql-hackers

From Alvaro Herrera
Subject Re: [HACKERS] SIGSEGV in BRIN autosummarize
Msg-id 20171018172227.qa2pbutwwvsvazg3@alvherre.pgsql
In response to Re: [HACKERS] SIGSEGV in BRIN autosummarize  (Justin Pryzby <pryzby@telsasoft.com>)
Responses Re: [HACKERS] SIGSEGV in BRIN autosummarize
List pgsql-hackers
Justin Pryzby wrote:
> On Wed, Oct 18, 2017 at 06:54:09PM +0200, Alvaro Herrera wrote:

> > And the previous code crashes in 45 minutes?  That's solid enough for
> > me; I'll clean up the patch and push in the next few days.  I think what
> > you have now should be sufficient for the time being for your production
> > system.
> 
> No - the crash happened 4 times since adding BRIN+autosummarize 6 days ago, and
> in one instance occurred twice within 3 hours (while I was trying to query logs
> for the preceding crash).

Oh, okay.  Then we don't know enough yet, ISTM.

> [pryzbyj@database ~]$ sudo grep -hE 'in postgres|Saved core' /var/log/messages*
> Oct 13 17:22:45 database kernel: postmaster[32127] general protection ip:4bd467 sp:7ffd9b349990 error:0 in postgres[400000+692000]
> Oct 13 17:22:47 database abrt[32387]: Saved core dump of pid 32127 (/usr/pgsql-10/bin/postgres) to /var/spool/abrt/ccpp-2017-10-13-17:22:47-32127(15040512 bytes)
>
> Oct 14 18:05:35 database kernel: postmaster[26500] general protection ip:84a177 sp:7ffd9b349b88 error:0 in postgres[400000+692000]
> Oct 14 18:05:35 database abrt[27564]: Saved core dump of pid 26500 (/usr/pgsql-10/bin/postgres) to /var/spool/abrt/ccpp-2017-10-14-18:05:35-26500(24137728 bytes)
>
> Oct 16 23:21:22 database kernel: postmaster[31543] general protection ip:4bd467 sp:7ffe08a94890 error:0 in postgres[400000+692000]
> Oct 16 23:21:22 database abrt[570]: Saved core dump of pid 31543 (/usr/pgsql-10/bin/postgres) to /var/spool/abrt/ccpp-2017-10-16-23:21:22-31543(25133056 bytes)
>
> Oct 17 01:58:36 database kernel: postmaster[8646]: segfault at 8 ip 000000000084a177 sp 00007ffe08a94a88 error 4 in postgres[400000+692000]
> Oct 17 01:58:38 database abrt[9192]: Saved core dump of pid 8646 (/usr/pgsql-10/bin/postgres) to /var/spool/abrt/ccpp-2017-10-17-01:58:38-8646(7692288 bytes)

Do you still have those core dumps?  If so, would you please verify the
database that autovacuum was running in?  Just open each with gdb (using
the original postgres binary, not the one you just installed) and do
"print MyDatabaseId".
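
Something along these lines might save some typing. It is only a sketch:
print_db_ids is a made-up helper name, and it assumes abrt keeps each core
in a file called "coredump" inside the dump directory, which may not match
your setup. The first argument is the path to the original postgres binary,
wherever you kept it.

```shell
# Hypothetical helper: print MyDatabaseId from each abrt dump directory.
# Assumption: each directory holds its core in a file named "coredump".
# $1 is the ORIGINAL postgres binary (not the newly installed one);
# the remaining arguments are abrt dump directories.
print_db_ids() {
    postgres_bin=$1
    shift
    for dir in "$@"; do
        echo "== $dir"
        # -batch runs the -ex command against the core and then exits
        gdb -batch -ex 'print MyDatabaseId' "$postgres_bin" "$dir/coredump"
    done
}

# Example call (paths are placeholders, adjust to your system):
#   print_db_ids /path/to/original/postgres /var/spool/abrt/ccpp-2017-10-1*
```

The number gdb prints is the database OID, which can be matched against
pg_database.oid to get the database name.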

> I'll continue running with the existing patch and come back if the issue
> recurs.

Thanks.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

