Re: BUG #14420: Parallel worker segfault - Mailing list pgsql-bugs

From Amit Kapila
Subject Re: BUG #14420: Parallel worker segfault
Date
Msg-id CAA4eK1KB=-uuXnZbPJ0noi_RxLopn1OTq=+C7GwBebK3xJ=+2A@mail.gmail.com
In response to BUG #14420: Parallel worker segfault  (rotten@windfish.net)
Responses Re: BUG #14420: Parallel worker segfault
List pgsql-bugs
On Sat, Nov 12, 2016 at 9:32 AM, Rick Otten <rotten@windfish.net> wrote:

Please keep pgsql-bugs in the loop.  That is how this community works,
and it also lets others spot something that you or I might miss.

> PostgreSQL was not started with the "-c" option.  I'll look into enabling
> that before this happens again.
>

Makes sense.

> I'll read more from the other debugging article and see if there is anything
> I can do there as well.  Thanks.
>
> There were no files generated and dropped in PGDATA this time,
> unfortunately.
>
> Sorry, I know this isn't much to go on, but it is all I know at this time.
>
> There wasn't much else that wasn't routine in the logs before or after the
> two lines I pasted below other than a bunch of warnings for the 30 or 40
> transactions that were in progress followed by this:
>

Okay, I don't think we can learn anything more from these logs.  Once a
core file is available, we can try to find the cause, but it would be
much better to have a standalone test case that reproduces the
problem.  One way to get there is to find the culprit query.  You
might want to log long-running queries, since parallelism is generally
used for such queries.
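
For example, something like the following in postgresql.conf should be
enough to capture them.  This is only a sketch; the 5-second threshold
is an assumption on my side, so pick whatever counts as long-running
for your workload:

```conf
# Log every statement that runs longer than this many milliseconds.
# 5000 (5 seconds) is an illustrative value, not a recommendation.
log_min_duration_statement = 5000
```

This setting takes effect on a configuration reload, so no restart
should be needed.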

> 2016-11-11 21:31:26.292 UTC WARNING: terminating connection because of crash
> of another server process
> 2016-11-11 21:31:26.292 UTC DETAIL: The postmaster has commanded this server
> process to roll back the current transaction and exit, because another
> server process exited abnormally and possibly corrupted shared memory.
> 2016-11-11 21:31:26.292 UTC HINT: In a moment you should be able to
> reconnect to the database and repeat your command.
> 2016-11-11 21:31:26.301 UTC WARNING: terminating connection because of crash
> of another server process
> 2016-11-11 21:31:26.301 UTC DETAIL: The postmaster has commanded this server
> process to roll back the current transaction and exit, because another
> server process exited abnormally and possibly corrupted shared memory.
> 2016-11-11 21:31:26.301 UTC HINT: In a moment you should be able to
> reconnect to the database and repeat your command.
> 2016-11-11 21:31:30.762 UTC [unknown] x.x.x.x [unknown] LOG: connection
> received: host=x.x.x.x port=47692
> 2016-11-11 21:31:30.762 UTC clarivoy x.x.x.x some_user FATAL: the database
> system is in recovery mode
> 2016-11-11 21:31:31.766 UTC LOG: all server processes terminated;
> reinitializing
> 2016-11-11 21:31:33.526 UTC LOG: database system was interrupted; last known
> up at 2016-11-11 21:29:28 UTC
> 2016-11-11 21:31:33.660 UTC LOG: database system was not properly shut down;
> automatic recovery in progress
> 2016-11-11 21:31:33.674 UTC LOG: redo starts at 1DD/4F5A0320
> 2016-11-11 21:31:33.957 UTC LOG: unexpected pageaddr 1DC/16AEC000 in log
> segment 00000001000001DD00000056, offset 11452416
> 2016-11-11 21:31:33.958 UTC LOG: redo done at 1DD/56AEB7F8
> 2016-11-11 21:31:33.958 UTC LOG: last completed transaction was at log time
> 2016-11-11 21:31:26.07448+00
> 2016-11-11 21:31:34.705 UTC LOG: MultiXact member wraparound protections are
> now enabled
> 2016-11-11 21:31:34.724 UTC LOG: autovacuum launcher started
> 2016-11-11 21:31:34.725 UTC LOG: database system is ready to accept
> connections
>
> After that the database was pretty much back to normal.  Because everything
> connects from various pgbouncer instances running elsewhere, they quickly
> reconnected and started working again without having to restart any
> applications or services.
>



--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
