Troubleshooting a segfault and instance crash - Mailing list pgsql-general

From	Blair Boadway
Subject	Troubleshooting a segfault and instance crash
Date	March 8, 2018 20:40:09
Msg-id	55CDE936-A7F3-41EE-B084-949AF84CD066@abebooks.com
Responses	Re: Troubleshooting a segfault and instance crash Re: Troubleshooting a segfault and instance crash
List	pgsql-general

Hello,

We’re seeing an occasional segfault on one particular database:

Mar 7 14:46:35 pgprod2 kernel:postgres[29351]: segfault at 0 ip 000000302f32868a sp 00007ffcf1547498 error 4 in libc-2.12.so[302f200000+18a000]

Mar 7 14:46:35 pgprod2 POSTGRES[21262]: [5] user=,db=,app=client= LOG: server process (PID 29351) was terminated by signal 11: Segmentation fault

It crashes the database, though the instance restarts on its own without any apparent issues. This has happened 3 times in 2 months, and each time the segfault error and memory address are the same. We’ve only seen it on one database, though we’ve seen it on both hosts of a primary/standby setup: we switched the primary over to the other host and got a segfault there, which seems to rule out a hardware issue. Oddly, the database has no issues under its normal DML workload (it is a moderately busy production OLTP system), but each segfault has happened very shortly after DDL changes were made. Most recently it happened while running a series of grants for new database users we were deploying (i.e. running a SQL script from psql on the primary host):

grant usage on schema app to app_user1;

grant usage on schema app to app_user2;

...
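Since the segfault line is identical each time, it may help to decode it. The following is a rough sketch in Python, not something we have run on the affected hosts. It assumes the usual x86 Linux kernel message layout (all numeric fields in hexadecimal) and the standard x86 page-fault error-code bits (bit 0: page present, bit 1: write access, bit 2: user mode). The function name `decode_segfault` is made up for illustration.

```python
import re

# Layout of the x86 Linux kernel segfault message, as seen in the log line above.
# All numeric fields are assumed to be printed in hexadecimal.
SEGFAULT_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) "
    r"error (?P<err>[0-9a-f]+) in (?P<obj>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_segfault(line):
    """Decode one kernel segfault log line into its main components."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        raise ValueError("not a recognised segfault line")
    ip = int(m["ip"], 16)
    base = int(m["base"], 16)
    err = int(m["err"], 16)
    return {
        "object": m["obj"],                      # mapped file containing the faulting instruction
        "fault_address": int(m["addr"], 16),     # address the process tried to access
        "offset_in_object": ip - base,           # instruction offset inside the mapped object
        "page_present": bool(err & 1),           # bit 0: fault on a present page
        "access": "write" if err & 2 else "read",   # bit 1: write vs. read
        "mode": "user" if err & 4 else "kernel",    # bit 2: user vs. kernel mode
    }


LINE = (
    "Mar 7 14:46:35 pgprod2 kernel:postgres[29351]: segfault at 0 "
    "ip 000000302f32868a sp 00007ffcf1547498 error 4 "
    "in libc-2.12.so[302f200000+18a000]"
)

info = decode_segfault(LINE)
print(info["object"], hex(info["offset_in_object"]), info["access"], info["mode"])
```

For our log line this reads as a user-mode read of address 0 (a NULL pointer dereference) from an instruction at offset 0x12868a inside libc. Given matching debug symbols, that offset could be looked up with a tool such as addr2line or gdb to find the libc function involved.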

Our setup is:

RHEL 6.9 - 2.6.32-696.16.1.el6.x86_64

PostgreSQL 9.6.5 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-18), 64-bit

Extensions - pg_cron, repmgr_funcs, pgaudit, pg_stat_statements, pg_hint_plan, pglogical

So far we can’t reproduce this on a test system. We have just added some OS configuration to collect core dumps but haven’t captured one yet. There isn’t any particular config change or extension that we can link to the problem; this system has run for months without problems since the last config changes. We’d appreciate any ideas.
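On collecting a core: one thing that may be worth checking is the core file size limit that actually applies to the running postmaster, since a soft limit of 0 there would suppress the dump regardless of shell settings. The sketch below is only an illustration and assumes a Linux `/proc/<pid>/limits` file in its usual layout. The helper names are hypothetical.

```python
def core_limit_from_proc_limits(text):
    """Return (soft, hard) core file size limits from /proc/<pid>/limits text.

    Values are returned as strings because the kernel prints either a
    number of bytes or the word 'unlimited'. Returns None if the line
    is not present.
    """
    prefix = "Max core file size"
    for line in text.splitlines():
        if line.startswith(prefix):
            soft, hard = line[len(prefix):].split()[:2]
            return soft, hard
    return None


def core_limit_for_pid(pid):
    """Read the limits of a running process, e.g. the postmaster PID."""
    with open(f"/proc/{pid}/limits") as f:
        return core_limit_from_proc_limits(f.read())
```

Calling `core_limit_for_pid` with the postmaster PID would return a pair such as `("0", "unlimited")`. A soft limit of `"0"` means no core file will be written for that process or the backends it forks.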

Regards,

Blair
