Thread: How to analyze a core dump
All,

I have this message in the /var/log/message log on a server running PostgreSQL:

ID 603404 kern.notice] NOTICE: core_log: postgres[20995] core dumped: /var/coredumps/core_smffnsdb3_postgres_1000_1500_1603426561_20995

and this in the PostgreSQL log:

2020-10-23 04:17:40 GMT [5588]:[21-1] 5df751a6.15d4 >LOG: server process (PID 20995) was terminated by signal 10

How do I determine the root cause?

Thanks in advance
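A quick way to confirm that the core file and the server log entry describe the same crash is to pull the PID and timestamp out of the core file name. The sketch below is only that, a sketch: it assumes the name follows a `core_<node>_<program>_<uid>_<gid>_<epoch>_<pid>` layout, which is inferred from the path in the post and not confirmed anywhere in the thread, and `parse_core_name` is a hypothetical helper.

```python
import re
from datetime import datetime, timezone

# Assumed layout (inferred from the path in the post, not confirmed):
# core_<node>_<program>_<uid>_<gid>_<epoch seconds>_<pid>
CORE_NAME = re.compile(
    r"core_(?P<node>[^_]+)_(?P<program>[^_]+)_(?P<uid>\d+)_(?P<gid>\d+)"
    r"_(?P<epoch>\d+)_(?P<pid>\d+)$"
)


def parse_core_name(path):
    """Split a core file path into its assumed fields, or return None."""
    match = CORE_NAME.search(path)
    if match is None:
        return None
    fields = match.groupdict()
    fields["pid"] = int(fields["pid"])
    # Interpret the numeric field as seconds since the Unix epoch, in UTC.
    fields["dumped_at"] = datetime.fromtimestamp(int(fields["epoch"]), tz=timezone.utc)
    return fields


info = parse_core_name(
    "/var/coredumps/core_smffnsdb3_postgres_1000_1500_1603426561_20995"
)
print(info["pid"], info["dumped_at"].isoformat())
```

Under that assumption the PID comes out as 20995, the same PID the server log reports, and the timestamp falls shortly before the 04:17:40 GMT log entry.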
On Mon, Nov 2, 2020 at 8:41 AM S Bob <sbob@quadratum-braccas.com> wrote:
> 2020-10-23 04:17:40 GMT [5588]:[21-1] 5df751a6.15d4 >LOG: server
> process (PID 20995) was terminated by signal 10
>
> How do I determine the root cause?

Is signal 10 SIGBUS on your platform? Perhaps check the relevant man page on your platform -- "man signal.7" works for me here.

What CPU architecture and operating system are you using?

--
Peter Geoghegan
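Besides the man page, the signal number can be looked up programmatically. This is a minimal sketch that assumes Python 3 is available on the affected host (the thread does not say so); `signal_name` is a hypothetical helper name.

```python
import signal


def signal_name(number):
    """Return the symbolic name for a signal number on this platform, or None."""
    try:
        return signal.Signals(number).name
    except ValueError:
        # The number is not a signal known to this platform.
        return None


# Signal numbering is platform-specific, so run this on the host that crashed.
print(signal_name(10))
```

The answer is only meaningful on the machine in question: on Linux x86, for example, signal 10 is SIGUSR1 rather than SIGBUS.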
On 11/2/20 9:55 AM, Peter Geoghegan wrote:
> On Mon, Nov 2, 2020 at 8:41 AM S Bob <sbob@quadratum-braccas.com> wrote:
>> 2020-10-23 04:17:40 GMT [5588]:[21-1] 5df751a6.15d4 >LOG: server
>> process (PID 20995) was terminated by signal 10
>>
>> How do I determine the root cause?
> Is signal 10 SIGBUS on your platform? Perhaps check the relevant man
> page on your platform -- "man signal.7" works for me here.
>
> What CPU architecture and operating system are you using?

32 bit, Solaris 10 (ugh)
S Bob <sbob@quadratum-braccas.com> writes:
> On 11/2/20 9:55 AM, Peter Geoghegan wrote:
>> Is signal 10 SIGBUS on your platform? Perhaps check the relevant man
>> page on your platform -- "man signal.7" works for me here.
>> What CPU architecture and operating system are you using?

> 32 bit, Solaris 10

Solaris should be enough like other Unixen to presume that SIGBUS is 10. However, that doesn't get you far towards finding a root cause.

Since you have a core file, maybe you could extract a stack trace from it? We have some suggestions at

https://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_PostgreSQL_backend_on_Linux/BSD

although I'm afraid that's pretty gdb-specific, and Solaris probably has different debugging tools.

What PG version are you running, exactly?

regards, tom lane
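To make the stack-trace suggestion concrete, here is a hedged sketch that only builds the command line and does not run it. The gdb form uses gdb's batch mode. The `pstack` form is an assumption about a Solaris alternative that the thread does not name. The executable path is a placeholder, and `backtrace_command` is a hypothetical helper.

```python
import shlex


def backtrace_command(executable, core_file, use_gdb=True):
    """Build (but do not run) a command that prints a stack trace from a core file."""
    if use_gdb:
        # gdb: run non-interactively, print the backtrace, then exit.
        return ["gdb", "-batch", "-ex", "bt", executable, core_file]
    # Assumption: Solaris pstack, which reads the core file directly.
    return ["pstack", core_file]


cmd = backtrace_command(
    "/path/to/bin/postgres",  # placeholder: the postgres binary that crashed
    "/var/coredumps/core_smffnsdb3_postgres_1000_1500_1603426561_20995",
)
print(" ".join(shlex.quote(part) for part in cmd))
```

Either way, the trace is only readable if the binary matches the one that dumped core and still carries its symbols.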
On 11/2/20 11:04 AM, Tom Lane wrote:
> S Bob <sbob@quadratum-braccas.com> writes:
>> On 11/2/20 9:55 AM, Peter Geoghegan wrote:
>>> Is signal 10 SIGBUS on your platform? Perhaps check the relevant man
>>> page on your platform -- "man signal.7" works for me here.
>>> What CPU architecture and operating system are you using?
>> 32 bit, Solaris 10
> Solaris should be enough like other Unixen to presume that SIGBUS is 10.
> However, that doesn't get you far towards finding a root cause.
>
> Since you have a core file, maybe you could extract a stack trace
> from it? We have some suggestions at
>
> https://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_PostgreSQL_backend_on_Linux/BSD
>
> although I'm afraid that's pretty gdb-specific, and Solaris probably
> has different debugging tools.
>
> What PG version are you running, exactly?

This client is running Postgres version 9.0.4

I'll have a look at the link you sent, thanks!

>
> regards, tom lane
S Bob <sbob@quadratum-braccas.com> writes:
> On 11/2/20 11:04 AM, Tom Lane wrote:
>> What PG version are you running, exactly?

> This client is running Postgres version 9.0.4

TBH, the first thing you should do is upgrade to a supported PG version, or failing that, at least the last minor version in the 9.0 branch (9.0.23). It's highly likely that you are hitting a bug that was fixed years ago.

regards, tom lane
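To put a number on how far behind 9.0.4 is, the two version strings can be compared as integer tuples. This is an illustrative sketch and `parse_version` is a hypothetical helper. It relies on the pre-10 numbering scheme, in which the first two components identify the branch and the third is the minor release.

```python
def parse_version(text):
    """Turn a dotted version string such as '9.0.4' into a tuple of integers."""
    return tuple(int(part) for part in text.split("."))


running = parse_version("9.0.4")
latest_in_branch = parse_version("9.0.23")

# Same branch if the first two components agree (pre-10 numbering).
same_branch = running[:2] == latest_in_branch[:2]
# Number of minor releases between the running version and the last one.
releases_behind = latest_in_branch[2] - running[2]

print(same_branch, releases_behind)
```

Comparing the raw strings would be misleading here, since "9.0.4" sorts after "9.0.23" as text. That is why the sketch converts to integers first.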