Re: Postgres process is crashing continuously in 9.1.1 - Mailing list pgsql-general

From Craig Ringer
Subject Re: Postgres process is crashing continuously in 9.1.1
Date
Msg-id 4FBB3EE5.5080908@ringerc.id.au
In response to Postgres process is crashing continuously in 9.1.1  (Jayashankar K B <Jayashankar.KB@lnties.com>)
List pgsql-general
On 05/22/2012 01:57 PM, Jayashankar K B wrote:

> Please let us know why this crash is happening and how we can fix it.

> LOG: server process (PID 4016) was terminated by signal 11: Segmentation
> fault

If you can't reproduce this crash on a more developer-friendly machine
than your embedded system, you will need to trap the crash and get a
backtrace that shows where and how the Pg backend(s) died. Your
embedded devs should hopefully have no problem with this.

You can enable core dumps and have Pg dump core when it crashes,
provided you have the storage. This works even if you can't predict
exactly when the crash will happen or which backend will crash; it just
requires enough disk space to write out a core file. If you're using a
vaguely modern Linux kernel you can point the core dump path at an NFS
volume or other network file store, so you don't need local storage.
See man 5 core

    http://linux.die.net/man/5/core

and the kernel.core_pattern sysctl. Note that you can even pipe core
dumps to a program (like, say, scp or netcat) so they never have to be
written to a file system at all, network mounted or otherwise.
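
For example, something along these lines (the mount point and the
helper program name are placeholders I've made up; adjust them for
your setup):

    # Allow core dumps in the environment that starts the postmaster,
    # e.g. in its init script, before the server is launched.
    ulimit -c unlimited

    # Write cores to a path of your choosing; %e = executable name,
    # %p = PID. See man 5 core for the full list of specifiers.
    sysctl -w kernel.core_pattern=/mnt/nfs-cores/core.%e.%p

    # Or pipe each core to a program instead of writing it out; the
    # program (a hypothetical helper here) gets the core on stdin
    # and runs as root.
    sysctl -w 'kernel.core_pattern=|/usr/local/bin/ship-core %e %p'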

Alternatively, you can attach gdb to a backend you know will crash and
trap the crash that way.

See:


http://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_PostgreSQL_backend_on_Linux/BSD
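
Roughly, the gdb session looks like this (the PID below is made up;
use whatever PID your backend actually has, e.g. as reported by
pg_backend_pid() in the session you expect to crash):

    -- in psql, in the session you expect to crash
    SELECT pg_backend_pid();

    # in a shell on the device, as root or the postgres user
    gdb -p 12345                     # the PID reported above
    (gdb) cont                       # let the backend keep running
    ... reproduce the crash from your client ...
    (gdb) bt                         # backtrace once gdb traps the SIGSEGV
    (gdb) gcore /tmp/postgres.core   # optionally save a core for later
    (gdb) detach
    (gdb) quit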

You will need PostgreSQL to have been compiled with debugging enabled
and will need the debug symbols for your libraries. On many embedded
platforms those are not included; the binaries are typically stripped.
If you're working with stripped binaries you'll get one of the useless
backtraces shown in the wiki article above.
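
A quick way to check is the file command; "not stripped" in its output
means the symbols are still present. The paths here are only examples,
use wherever your build actually installs things:

    file /usr/lib/postgresql/9.1/bin/postgres
    file /lib/libc.so.6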

If your binaries are stripped you can still produce a useful backtrace,
so long as you have access to unstripped copies of those binaries (or
detached debuginfo files) in your development environment, outside the
running embedded machine. You also need a core file: either one you let
Linux save on crash, or one you created by trapping the crash with gdb
and saving it with the "gcore /path/to/core/file/postgres.core" command.
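
For instance, run from the development machine (the host name and
paths are only examples):

    scp root@embedded-device:/tmp/postgres.core ./postgres.core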

Once you have the core file copied to your development environment, you
can debug it with gdb from there, using versions of your libraries with
full debug symbols or detached debuginfo. Note that the libraries and
PostgreSQL binaries must be EXACTLY IDENTICAL to the ones running on
the real host, except for not being stripped. You can't use binaries
that are merely the same version of the libraries; they have to be the
_same_, built with the same version of the same compiler with the same
options as the ones you were actually running. Usually they're the
exact same binaries, just copies made before you stripped them for
installation onto the embedded device. Of course, you'll be running gdb
inside your cross-compile environment to debug. Again, your embedded
developers should know how to do all this.
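
A rough sketch, assuming an ARM target and a cross gdb (the tool name
and the paths are examples only):

    arm-linux-gnueabi-gdb
    (gdb) set sysroot /path/to/unstripped/target/rootfs
    (gdb) file /path/to/unstripped/postgres
    (gdb) core-file ./postgres.core
    (gdb) bt full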

If your embedded platform doesn't have debuginfo files or unstripped
versions of your libraries, yell at whoever built it and get them to fix it.

If you don't have unstripped binaries, you can still build a debug
version of PostgreSQL and examine that; you'll just have lots of "???"
entries for the non-PostgreSQL parts of the call path. The stack trace
might be useless, but then again it might not be.
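
If you go that route, a minimal sketch for a native build is below; a
cross build additionally needs the usual --host and toolchain settings
for your environment:

    ./configure --enable-debug CFLAGS="-O0 -ggdb"
    make && make install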

--
Craig Ringer
