Thread: Random crashes - segmentation fault
This is a copy of message sent to debian-user mailing list:

select version();
 version
------------------------------------------------------------------------------------------------------------------
 PostgreSQL 12.0 (Debian 12.0-1+b1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 9.2.1-8) 9.2.1 20190909, 64-bit

kjonca@poczta.onet.pl (Kamil Jońca) writes:

> It is home PC box with debian sid.
> Recently my postgres was upgraded from version 11 to 12.
> I migrate databases, and during last few days I have had 2 server
> crashes.
> Crashes were during different statements. And after crash these
> statements executed successfully.
> In log I have:
> ===============================================================
> 2019-11-04 00:07:38 CET LOG: server process (PID 19244) was terminated by signal 11: Segmentation fault
> 2019-11-04 00:07:38 CET DETAIL: Failed process was running: update queue set priority = -3 ;
> 2019-11-04 00:07:38 CET LOG: terminating any other active server processes
> [...]
> 2019-11-04 00:07:39 CET LOG: all server processes terminated; reinitializing
> 2019-11-04 00:07:39 CET DEBUG: mmap(150994944) with MAP_HUGETLB failed, huge pages disabled: Cannot allocate memory
> 2019-11-04 00:07:39 CET LOG: database system was interrupted; last known up at 2019-11-04 00:02:24 CET
> ===============================================================
> 2019-11-05 21:43:56 CET LOG: server process (PID 23233) was terminated by signal 11: Segmentation fault
> 2019-11-05 21:43:56 CET DETAIL: Failed process was running: SELECT po_nr FROM get_free_numbers(999);
> 2019-11-05 21:43:56 CET LOG: terminating any other active server processes
> [...]
> 2019-11-05 21:43:57 CET LOG: all server processes terminated; reinitializing
> 2019-11-05 21:43:57 CET DEBUG: mmap(150994944) with MAP_HUGETLB failed, huge pages disabled: Cannot allocate memory
> 2019-11-05 21:43:58 CET LOG: database system was interrupted; last known up at 2019-11-05 21:43:49 CET
> ===============================================================
>
> any hints?
>
> KJ

Today (13.XI.2019 - KJ) was another crash.
Another piece of a puzzle: There is (unlogged) table with 70M+
rows. After crash this table is empty (but table itself exists.)

KJ

--
http://stopstopnop.pl/stop_stopnop.pl_o_nas.html
1.79 x 10^12 furlongs per fortnight -- it's not just a good idea, it's the law!

On Wed, Nov 13, 2019 at 09:12:02AM +0100, Kamil Jońca wrote:
> Today (13.XI.2019 - KJ) was another crash.
> Another piece of a puzzle: There is (unlogged) table with 70M+
> rows. After crash this table is empty (but table itself exists.)

Could it be possible to see a backtrace of the crash? There is
nothing we can really do without any hint. If you can send a
self-contained test case which allows to reproduce the problem, that's
even better.

Please note that unlogged tables are reinitialized after
a crash at the beginning of recovery. That's their design.
--
Michael

Michael Paquier <michael@paquier.xyz> writes:

> On Wed, Nov 13, 2019 at 09:12:02AM +0100, Kamil Jońca wrote:
>> Today (13.XI.2019 - KJ) was another crash.
>> Another piece of a puzzle: There is (unlogged) table with 70M+
>> rows. After crash this table is empty (but table itself exists.)
>
> Could it be possible to see a backtrace of the crash? There is

Erm. Only thing I can do is to configure for debug future crashes. Could you
tell me what and how to configure to get backtrace?
This is debian sid machine with postgres packaged by debian folks.
I guess I should enable core dumps. Anything else?

> nothing we can really do without any hint. If you can send a
> self-contained test case which allows to reproduce the problem, that's
> even better.

I am afraid I cannot. These crashes are rather rare and unpredictable.

> Please note that unlogged tables are reinitialized after
> a crash at the beginning of recovery. That's their design.

I see. Thanks for explanation.

KJ

--
http://stopstopnop.pl/stop_stopnop.pl_o_nas.html
If I have trouble installing Linux, something is wrong. Very wrong.
 -- Linus Torvalds

kjonca@fastmail.com (Kamil Jońca) writes:
> Michael Paquier <michael@paquier.xyz> writes:
>> Could it be possible to see a backtrace of the crash? There is

> Erm. Only thing I can do is to configure for debug future crashes. Could you
> tell me what and how to configure to get backtrace?

There's some accumulated wisdom at

https://wiki.postgresql.org/wiki/Generating_a_stack_trace_of_a_PostgreSQL_backend

I'd suggest that before worrying about that, you should install 12.1
(official release is tomorrow) and see if the crashes go away. There's
not much info in this report, but it could match any of several
already-fixed bugs.

 regards, tom lane

Tom Lane <tgl@sss.pgh.pa.us> writes:

> kjonca@fastmail.com (Kamil Jońca) writes:
>> Michael Paquier <michael@paquier.xyz> writes:
>>> Could it be possible to see a backtrace of the crash? There is
>
>> Erm. Only thing I can do is to configure for debug future crashes. Could you
>> tell me what and how to configure to get backtrace?
>
> There's some accumulated wisdom at
>
> https://wiki.postgresql.org/wiki/Generating_a_stack_trace_of_a_PostgreSQL_backend
>
> I'd suggest that before worrying about that, you should install 12.1
> (official release is tomorrow) and see if the crashes go away. There's
> not much info in this report, but it could match any of several
> already-fixed bugs.
>

Today server crashed again.

--8<---------------cut here---------------start------------->8---
2019-12-13 15:41:38 CET DEBUG: starting background worker process "parallel worker for PID 364516"
TRAP: FailedAssertion("!(result->tdrefcount == -1)", File: "/build/postgresql-12-123lrX/postgresql-12-12.1/build/../src/backend/utils/cache/typcache.c", Line: 2621)
--8<---------------cut here---------------end--------------->8---

I have core dump.

KJ

> regards, tom lane
>

--
http://wolnelektury.pl/wesprzyj/teraz/

On Fri, Dec 13, 2019 at 04:05:34PM +0100, Kamil Jońca wrote:
> Today server crashed again.
> --8<---------------cut here---------------start------------->8---
> 2019-12-13 15:41:38 CET DEBUG: starting background worker process
> "parallel worker for PID 364516"
> TRAP: FailedAssertion("!(result->tdrefcount == -1)", File:
> "/build/postgresql-12-123lrX/postgresql-12-12.1/build/../src/backend/utils/cache/typcache.c",
> Line: 2621)
> --8<---------------cut here---------------end--------------->8---
>
> I have core dump.

Could it be possible to see a full backtrace then? Using gdb that
would mean to use "bt" to print the full stack.

What kind of function is get_free_numbers() actually and what does it
do?
--
Michael
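
For anyone following the thread with a core file in hand, here is a rough sketch of collecting that "bt" output non-interactively. It only wraps gdb's batch mode (`--batch` and `-ex` are standard gdb options). The helper names are made up for illustration, and both paths in the usage comment are assumptions: the first is where Debian packages usually put the server binary, the second is a placeholder for wherever the core dump landed. Without debug symbols installed, the frames may not be very informative.

```python
import shutil
import subprocess
from typing import List, Optional


def gdb_backtrace_command(binary: str, core: str) -> List[str]:
    """Build a gdb batch-mode command line that runs "bt" against a core file."""
    # --batch makes gdb exit after running the commands given with -ex.
    return ["gdb", "--batch", "-ex", "bt", binary, core]


def collect_backtrace(binary: str, core: str) -> Optional[str]:
    """Run gdb and return what it printed, or None if gdb is not installed."""
    if shutil.which("gdb") is None:
        return None
    result = subprocess.run(
        gdb_backtrace_command(binary, core),
        capture_output=True,
        text=True,
        check=False,  # gdb may exit non-zero; return its output anyway
    )
    return result.stdout


# Example usage (both paths are assumptions, adjust to the local system):
# print(collect_backtrace("/usr/lib/postgresql/12/bin/postgres", "/path/to/core"))
```

The output can then be pasted into a reply as plain text, which is usually easier for list readers than an attached core file.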