Thread: Random crashes - segmentation fault

From: kjonca@poczta.onet.pl (Kamil Jońca)
This is a copy of a message sent to the debian-user mailing list:


select version();
                                                     version
------------------------------------------------------------------------------------------------------------------
PostgreSQL 12.0 (Debian 12.0-1+b1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 9.2.1-8) 9.2.1 20190909, 64-bit


kjonca@poczta.onet.pl (Kamil Jońca) writes:

> It is home PC box with debian sid.
> Recently my postgres was upgraded from version 11 to 12.
> I migrated the databases, and during the last few days I have had 2 server
> crashes.
> The crashes happened during different statements, and after each crash the
> same statements executed successfully.
> In log I have:
> ===============================================================
> 2019-11-04 00:07:38 CET LOG:  server process (PID 19244) was terminated by signal 11: Segmentation fault
> 2019-11-04 00:07:38 CET DETAIL:  Failed process was running: update queue set priority = -3 ;
> 2019-11-04 00:07:38 CET LOG:  terminating any other active server processes
> [...]
> 2019-11-04 00:07:39 CET LOG:  all server processes terminated; reinitializing
> 2019-11-04 00:07:39 CET DEBUG:  mmap(150994944) with MAP_HUGETLB failed, huge pages disabled: Cannot allocate memory
> 2019-11-04 00:07:39 CET LOG:  database system was interrupted; last known up at 2019-11-04 00:02:24 CET
> ===============================================================
> 2019-11-05 21:43:56 CET LOG:  server process (PID 23233) was terminated by signal 11: Segmentation fault
> 2019-11-05 21:43:56 CET DETAIL:  Failed process was running: SELECT po_nr FROM get_free_numbers(999);
> 2019-11-05 21:43:56 CET LOG:  terminating any other active server processes
> [...]
> 2019-11-05 21:43:57 CET LOG:  all server processes terminated; reinitializing
> 2019-11-05 21:43:57 CET DEBUG:  mmap(150994944) with MAP_HUGETLB failed, huge pages disabled: Cannot allocate memory
> 2019-11-05 21:43:58 CET LOG:  database system was interrupted; last known up at 2019-11-05 21:43:49 CET
> ===============================================================
>
> any hints?
>
> KJ


Today (13.XI.2019 - KJ)  was another crash.
Another piece of a puzzle: There is (unlogged) table with 70M+
rows. After crash this table is empty (but table itself exists.)

KJ
--
http://stopstopnop.pl/stop_stopnop.pl_o_nas.html
1.79 x 10^12 furlongs per fortnight -- it's not just a good idea, it's
the law!



Re: Random crashes - segmentation fault

From: Michael Paquier
On Wed, Nov 13, 2019 at 09:12:02AM +0100, Kamil Jońca wrote:
> Today (13.XI.2019 - KJ)  was another crash.
> Another piece of a puzzle: There is (unlogged) table with 70M+
> rows. After crash this table is empty (but table itself exists.)

Could it be possible to see a backtrace of the crash?  There is
nothing we can really do without any hint.  If you can send a
self-contained test case which allows to reproduce the problem, that's
even better.  Please note that unlogged tables are reinitialized after
a crash at the beginning of recovery.  That's their design.
--
Michael


Re: Random crashes - segmentation fault

From: kjonca@fastmail.com (Kamil Jońca)
Michael Paquier <michael@paquier.xyz> writes:

> On Wed, Nov 13, 2019 at 09:12:02AM +0100, Kamil Jońca wrote:
>> Today (13.XI.2019 - KJ)  was another crash.
>> Another piece of a puzzle: There is (unlogged) table with 70M+
>> rows. After crash this table is empty (but table itself exists.)
>
> Could it be possible to see a backtrace of the crash?  There is

Erm. Only thing I can do is to configure for debug future crashes. Could you
tell me what and how to configure to get backtrace?
This is debian sid machine with postgres packaged by debian folks.
I guess I should enable core dumps. Anything else?

> nothing we can really do without any hint.  If you can send a
> self-contained test case which allows to reproduce the problem, that's
> even better.
I am afraid I cannot. These crashes are rather rare and unpredictable.

> Please note that unlogged tables are reinitialized after
> a crash at the beginning of recovery.  That's their design.
I see. Thanks for explanation.

KJ


--
http://stopstopnop.pl/stop_stopnop.pl_o_nas.html
If I have trouble installing Linux, something is wrong. Very wrong.
        -- Linus Torvalds



Re: Random crashes - segmentation fault

From: Tom Lane
kjonca@fastmail.com (Kamil Jońca) writes:
> Michael Paquier <michael@paquier.xyz> writes:
>> Could it be possible to see a backtrace of the crash?  There is

> Erm. Only thing I can do is to configure for debug future crashes. Could you
> tell me what and how to configure to get backtrace?

There's some accumulated wisdom at

https://wiki.postgresql.org/wiki/Generating_a_stack_trace_of_a_PostgreSQL_backend

I'd suggest that before worrying about that, you should install 12.1
(official release is tomorrow) and see if the crashes go away.  There's
not much info in this report, but it could match any of several
already-fixed bugs.

            regards, tom lane



Re: Random crashes - segmentation fault

From: kjonca@fastmail.com (Kamil Jońca)
Tom Lane <tgl@sss.pgh.pa.us> writes:

> kjonca@fastmail.com (Kamil Jońca) writes:
>> Michael Paquier <michael@paquier.xyz> writes:
>>> Could it be possible to see a backtrace of the crash?  There is
>
>> Erm. Only thing I can do is to configure for debug future crashes. Could you
>> tell me what and how to configure to get backtrace?
>
> There's some accumulated wisdom at
>
> https://wiki.postgresql.org/wiki/Generating_a_stack_trace_of_a_PostgreSQL_backend
>
> I'd suggest that before worrying about that, you should install 12.1
> (official release is tomorrow) and see if the crashes go away.  There's
> not much info in this report, but it could match any of several
> already-fixed bugs.
>


Today server crashed again.
--8<---------------cut here---------------start------------->8---
2019-12-13 15:41:38 CET DEBUG:  starting background worker process "parallel worker for PID 364516"
TRAP: FailedAssertion("!(result->tdrefcount == -1)", File:
"/build/postgresql-12-123lrX/postgresql-12-12.1/build/../src/backend/utils/cache/typcache.c",Line: 2621)
 
--8<---------------cut here---------------end--------------->8---

I have core dump.

KJ

>             regards, tom lane
>

-- 
http://wolnelektury.pl/wesprzyj/teraz/



Re: Random crashes - segmentation fault

From: Michael Paquier
On Fri, Dec 13, 2019 at 04:05:34PM +0100, Kamil Jońca wrote:
> Today server crashed again.
> --8<---------------cut here---------------start------------->8---
> 2019-12-13 15:41:38 CET DEBUG:  starting background worker process
> "parallel worker for PID 364516"
> TRAP: FailedAssertion("!(result->tdrefcount == -1)", File:
> "/build/postgresql-12-123lrX/postgresql-12-12.1/build/../src/backend/utils/cache/typcache.c",
> Line: 2621)
> --8<---------------cut here---------------end--------------->8---
>
> I have core dump.

Could it be possible to see a full backtrace then?  Using gdb that
would mean to use "bt" to print the full stack.  What kind of function
is get_free_numbers() actually and what does it do?
--
Michael
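
For anyone following the thread, here is a minimal sketch of what the "bt" request above could look like when run non-interactively against an existing core dump. The binary path is an assumption based on the usual Debian layout for PostgreSQL 12, the core file location is purely hypothetical, and print_backtrace is a helper name made up for this example. The trace is only really useful if debug symbols for the server are installed (on Debian these normally come from a separate -dbgsym package).

```shell
#!/bin/sh
# Sketch: print a full backtrace from a PostgreSQL core dump with gdb.
# Both defaults are assumptions, not values taken from the thread:
#   PG_BIN    - usual location of the server binary in Debian's PostgreSQL 12 package
#   CORE_FILE - hypothetical location of the core file; adjust to the real one
PG_BIN="${PG_BIN:-/usr/lib/postgresql/12/bin/postgres}"
CORE_FILE="${CORE_FILE:-./core}"

# print_backtrace BINARY CORE  (hypothetical helper)
# Runs gdb in batch mode and writes the backtrace to stdout.
# Returns 1 without calling gdb if the core file cannot be read.
print_backtrace() {
    if [ ! -r "$2" ]; then
        echo "core file not readable: $2" >&2
        return 1
    fi
    # --batch makes gdb exit after running the given commands;
    # "bt full" prints the stack together with local variables.
    gdb --batch -ex "bt full" "$1" "$2"
}

print_backtrace "$PG_BIN" "$CORE_FILE" || echo "no backtrace produced" >&2
```

Redirecting the output to a file (for example by appending "> backtrace.txt" to the last line) gives something that can be attached to a reply on the list.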
