Thread: Postgres process is crashing continously in 9.1.1

Postgres process is crashing continously in 9.1.1

From
Jayashankar K B
Date:

Hi,

 

We are using Postgres 9.1.1 on a board with a Coldfire controller.

The postgres processes crash and restart upon executing a particular statement, and this keeps repeating. The same problem happens with Postgres 9.1.3.

It works fine until the FINANCIALTRANSACTIONID reaches 1000.

The same setup works fine on a Windows PC. We compared the configuration of the Windows PC and the board and found that the only difference is Shared Buffers, which is 32 on the PC and 24 on the board.

 

I am pasting the server log from the board here.

The instruction causing the crash is the INSERT shown on the STATEMENT line of the log.

Please let us know why this crash is happening and how we can fix it.

 

 

LOG:  redo starts at 0/D9B75B4

LOG:  record with zero length at 0/D9BBE5C

LOG:  redo done at 0/D9BBE22

LOG:  last completed transaction was at log time 2012-05-22 02:22:26.641488+00

LOG:  database system is ready to accept connections

LOG:  autovacuum launcher started

ERROR:  duplicate key value violates unique constraint "financialtransaction_pkey"

DETAIL:  Key (financialtransactionid)=(1004) already exists.

STATEMENT:  Insert into FINANCIALTRANSACTION (ATTENDANT,ENGINEHOUR,RECEIPTPRINTED,FINANCIALTRANSACTIONID) values ('0','0.0','0','1004')

LOG:  server process (PID 4016) was terminated by signal 11: Segmentation fault

LOG:  terminating any other active server processes

WARNING:  terminating connection because of crash of another server process

DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.

HINT:  In a moment you should be able to reconnect to the database and repeat your command.

FATAL:  poll() failed in statistics collector: Unknown error 516

LOG:  statistics collector process (PID 3962) exited with exit code 1

LOG:  all server processes terminated; reinitializing

LOG:  database system was interrupted; last known up at 2012-05-22 02:22:29 UTC

LOG:  database system was not properly shut down; automatic recovery in progress

LOG:  consistent recovery state reached at 0/D9BBEAA

LOG:  redo starts at 0/D9BBEAA

LOG:  record with zero length at 0/D9C07FA

LOG:  redo done at 0/D9C07C0

LOG:  last completed transaction was at log time 2012-05-22 02:23:05.372245+00

LOG:  database system is ready to accept connections

LOG:  autovacuum launcher started

ERROR:  duplicate key value violates unique constraint "financialtransaction_pkey"

DETAIL:  Key (financialtransactionid)=(1004) already exists.

STATEMENT:  Insert into FINANCIALTRANSACTION (ATTENDANT,ENGINEHOUR,RECEIPTPRINTED,FINANCIALTRANSACTIONID) values ('0','0.0','0','1004')

LOG:  server process (PID 4098) was terminated by signal 11: Segmentation fault

LOG:  terminating any other active server processes

WARNING:  terminating connection because of crash of another server process

DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.

HINT:  In a moment you should be able to reconnect to the database and repeat your command.

FATAL:  poll() failed in statistics collector: Unknown error 516

LOG:  statistics collector process (PID 4035) exited with exit code 1

LOG:  all server processes terminated; reinitializing

LOG:  database system was interrupted; last known up at 2012-05-22 02:23:08 UTC

LOG:  database system was not properly shut down; automatic recovery in progress

LOG:  consistent recovery state reached at 0/D9C0848

LOG:  redo starts at 0/D9C0848

LOG:  record with zero length at 0/D9C5218

LOG:  redo done at 0/D9C51DE

LOG:  last completed transaction was at log time 2012-05-22 02:23:49.659502+00

LOG:  database system is ready to accept connections

LOG:  autovacuum launcher started

 

 

 

Thanks and Regards

Jayashankar

Larsen & Toubro Limited

www.larsentoubro.com

This Email may contain confidential or privileged information for the intended recipient (s). If you are not the intended recipient, please do not use or disseminate the information, notify the sender and delete it from your system.

 Earth Day. Every Day.

Re: Postgres process is crashing continously in 9.1.1

From
Jayashankar K B
Date:

The difference in shared buffer size is understandable, since the Windows PC has 2GB of RAM and the board has 256MB of RAM.

Please let us know whether this shared buffer parameter has any relation to the problem we are facing.

 

Thanks and Regards

Jayashankar

 

From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Jayashankar K B
Sent: Tuesday, May 22, 2012 11:27 AM
To: pgsql-general@postgresql.org
Subject: [GENERAL] Postgres process is crashing continously in 9.1.1

 


Re: Postgres process is crashing continously in 9.1.1

From
John R Pierce
Date:
On 05/21/12 11:05 PM, Jayashankar K B wrote:
> board with Coldfire controller.

what is this board?   Coldfire is the embedded 68k-like Freescale processor?

what operating system is this under?   what sort of storage does this
embedded system use for the database?

telling us FINANCIALWHATEVERID > 1000 doesn't really do us much good
since we have no idea what your database looks like, or what your code
is doing.   the log seems to indicate there was a constraint violation
just before the exception hit.



--
john r pierce                            N 37, W 122
santa cruz ca                         mid-left coast


Re: Postgres process is crashing continously in 9.1.1

From
Jayashankar K B
Date:
Yes, the board has a Freescale Coldfire processor, which is based on the embedded 68k architecture.
The board runs a custom-built Linux based on kernel 2.6.38.
The database is stored on a 4GB SD card.

This is the table we have.
CREATE TABLE financialtransaction
(
  FINANCIALTRANSACTIONID   BIGINT NOT NULL,
  TIME_STAMP               TIMESTAMP,
  ATTENDANT                SMALLINT,
  RECEIPTPRINTED           BOOLEAN DEFAULT FALSE,
  ODOMETER                 VARCHAR(20),
  ENGINEHOUR               NUMERIC(9,2),
  CONSTRAINT financialtransaction_pkey PRIMARY KEY (FINANCIALTRANSACTIONID)
)
WITH (
  OIDS=FALSE
);
ALTER TABLE financialtransaction
  OWNER TO postgres;

When a row is written into this table, a stored procedure is triggered which inserts into another table.
But the crash happens while writing into the financialtransaction table itself, once it has more than 1000 records.
Please let me know if you need any other information.

Thanks and Regards
Jayashankar

-----Original Message-----
From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of John R Pierce
Sent: 22 May 2012 PM 12:00
To: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Postgres process is crashing continously in 9.1.1


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: Postgres process is crashing continously in 9.1.1

From
Craig Ringer
Date:
On 05/22/2012 01:57 PM, Jayashankar K B wrote:

> Please let us know why this crash is happening and how we can fix it.

> LOG: server process (PID 4016) was terminated by signal 11: Segmentation
> fault

If you can't reproduce this crash on a more developer-friendly machine
than your embedded system, what you will need to do is trap this crash
and get a backtrace that shows where and how the Pg backend(s) died.
Your embedded devs should hopefully have no problem with this.

You can enable core dumps and have Pg coredump if you have the storage.
This works even if you can't predict exactly when the crash will happen
or which backend will crash. It requires enough disk space to write out
a core file. If you're using a vaguely modern Linux kernel you can set a
core dump path on an NFS volume or other network file store to write
cores to, so you don't need local storage. See man 5 core

    http://linux.die.net/man/5/core

and the kernel.core_pattern sysctl. Note that you can even pipe core
dumps to a program (like, say, scp or netcat) so they don't have to be
written even to a network mounted file system.
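
As a rough sketch of the core dump setup described above (not a tested recipe for this board): the script below raises the core size limit for the current shell and builds a core_pattern value. The NFS mount point /mnt/nfs/cores, the helper function core_pattern_for and the ship-core script are all made-up placeholders.

```shell
#!/bin/sh
# Build a core_pattern value that writes cores into the given directory.
# %e expands to the executable name and %p to the PID of the dumping process.
core_pattern_for() {
    printf '%s/core.%%e.%%p\n' "$1"
}

# Raise the core size limit for this shell. The postmaster has to be started
# from the same shell to inherit it (pg_ctl's -c/--core-files option also
# tries to lift the soft limit when starting the server).
ulimit -c unlimited 2>/dev/null || echo "could not raise the core size limit" >&2

echo "core size limit: $(ulimit -c)"
echo "suggested pattern: $(core_pattern_for /mnt/nfs/cores)"

# To apply the pattern, run as root (the mount point is a placeholder):
#   sysctl -w kernel.core_pattern="$(core_pattern_for /mnt/nfs/cores)"
# To pipe cores to a program instead, start the pattern with '|', e.g.
#   sysctl -w kernel.core_pattern='|/usr/local/bin/ship-core %e %p'
# where ship-core is a hypothetical script that reads the core on stdin.
```

The sysctl lines are left as comments because they change system-wide settings and need root.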

Alternately, you can attach gdb to a backend you know will crash and
trap the crash that way.

See:


http://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_PostgreSQL_backend_on_Linux/BSD
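
A minimal sketch of the gdb approach, assuming gdb is available on the board and you know which backend will run the failing INSERT. The function name trap_backend_crash, the PID and the output path are placeholders; telling gdb to ignore SIGUSR1 is my assumption, added so that signal does not stop the backend before the crash.

```shell
#!/bin/sh
# Attach gdb to one backend, let it run until it stops on a signal such as
# SIGSEGV, then print a backtrace and save a core file.
# Usage: trap_backend_crash <backend-pid> <core-output-path>
trap_backend_crash() {
    pid=$1
    core_out=$2
    gdb -p "$pid" -batch \
        -ex 'handle SIGUSR1 nostop noprint pass' \
        -ex 'continue' \
        -ex 'bt full' \
        -ex "gcore $core_out"
}

# The backend PID can be read from the client session before issuing the
# failing INSERT, for example in psql:  SELECT pg_backend_pid();
# Then, on the board (PID and path are placeholders):
#   trap_backend_crash 4016 /mnt/nfs/cores/postgres.core
```

Nothing is attached until the function is called, so the script can be sourced first and the call made once the backend PID is known.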




You will need PostgreSQL to have been compiled with debugging enabled
and will need the debug symbols for your libraries. On many embedded
platforms those are not included; the binaries are typically stripped.
If you're working with stripped binaries you'll get one of the useless
backtraces shown in the wiki article above.

If your binaries are stripped you can still create a useful backtrace so
long as you have access to unstripped copies of those binaries in your
development environment, outside the running embedded machine, or you
have debuginfo files. You need a core file, either one you let Linux
save on crash, or one you created by trapping a crash with gdb and
saving it with the "gcore /path/to/core/file/postgres.core" command.

Once you have the core file and have it copied to your development
environment, you can debug it with gdb from there using versions of your
libraries with full debug symbols or detached debuginfo. Note that the
libraries and PostgreSQL binaries must be EXACTLY IDENTICAL to the ones
running on the real host except for not being stripped. You can't use
binaries that're just the same version of the libraries, they have to be
the _same_, built with the same version of the same compiler with the
same options as the ones you were actually running. Usually they're the
exact same binaries, just copies made before you stripped them for
copying onto the embedded device. Of course, you'll be running gdb
inside your cross-compile environment to debug. Again, your embedded
developers should know how to do all this.
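
For the offline analysis step, something along these lines might work. It is only a sketch: the cross-gdb name, the sysroot directory holding the unstripped libraries, and the file paths are all assumptions that depend on your toolchain.

```shell
#!/bin/sh
# Print a full backtrace from a core file using unstripped binaries.
# Usage: analyze_core <gdb> <sysroot> <postgres-binary> <core-file>
#   <gdb>      gdb built for the target architecture (cross gdb)
#   <sysroot>  directory tree with the unstripped copies of the target's libraries
analyze_core() {
    gdb_bin=$1; sysroot=$2; postgres_bin=$3; core_file=$4
    # Set the sysroot before loading the core so shared libraries resolve
    # against the unstripped copies rather than the host's own libraries.
    "$gdb_bin" -batch \
        -ex "set sysroot $sysroot" \
        -ex "core-file $core_file" \
        -ex 'bt full' \
        "$postgres_bin"
}

# Example call (every name here is a placeholder):
#   analyze_core m68k-linux-gnu-gdb ./rootfs-unstripped \
#       ./rootfs-unstripped/usr/bin/postgres ./postgres.core
```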

If your embedded platform doesn't have debuginfo files or unstripped
versions of your libraries, yell at whoever built it and get them to fix it.

If you don't have unstripped binaries, you can still build a debug
version of PostgreSQL and examine that, you'll just have lots of "???"
entries for non-PostgreSQL parts of the call path. The stack trace might
be useless, but might not be too.

--
Craig Ringer

Re: Postgres process is crashing continously in 9.1.1

From
Chris Angelico
Date:
On Tue, May 22, 2012 at 4:51 PM, Jayashankar K B
<Jayashankar.KB@lnties.com> wrote:
> On writing into this table, a stored procedure is triggered which inserts into another table.
> But crash is happening while writing into this financialtransaction table once this table has more than 1000 records.

What language is the stored procedure written in? Is it possible that
it's that procedure that segfaults? Postgres experts, do stored
procedure segfaults bring down the backend process like that?

ChrisA

Re: Postgres process is crashing continously in 9.1.1

From
Jayashankar K B
Date:
But here, the crash happens right at the INSERT statement; that is, the insert itself is failing.
Unless the insert succeeds, the stored procedure is not triggered.

Thanks and regards
Jayashankar

-----Original Message-----
From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Chris Angelico
Sent: Tuesday, May 22, 2012 3:10 PM
To: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Postgres process is crashing continously in 9.1.1


Re: Postgres process is crashing continously in 9.1.1

From
Chris Angelico
Date:
On Tue, May 22, 2012 at 8:23 PM, Jayashankar K B
<Jayashankar.KB@lnties.com> wrote:
> But here, the crash is happening right at the insert statement. That is insert itself is failing.
> Unless the insert is successful, stored procedure is not triggered.

Hmm. I wonder whether going past ID 999 and into a
four-digit number is causing stack damage that crashes the server a
few iterations later... many things are possible. I'd look at the code
of the procedure and see if there are any possible memory/stack issues.

ChrisA

Re: Postgres process is crashing continously in 9.1.1

From
Merlin Moncure
Date:
On Tue, May 22, 2012 at 5:41 AM, Chris Angelico <rosuav@gmail.com> wrote:
>
> On Tue, May 22, 2012 at 8:23 PM, Jayashankar K B
> <Jayashankar.KB@lnties.com> wrote:
> > But here, the crash is happening right at the insert statement. That is
> > insert itself is failing.
> > Unless the insert is successful, stored procedure is not triggered.
>
> Hmm. I wonder is it possible that going past ID 999 and into a
> four-digit number is causing stack damage that crashes the server a
> few iterations later... many things are possible. I'd look at the code
> of the procedure and see if there's any possible memory/stack issues.

Hm, on Linux you can check the stack size limit with ulimit -s.  If the
stack limit is set too low, a lower setting of max_stack_depth should
prevent the crash.  It's pretty hard to hit that unless you have
extraordinarily complex and/or recursive functions though.
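
A quick way to compare the two settings, as a sketch only. The PostgreSQL documentation advises keeping max_stack_depth below the kernel stack limit by a safety margin of a megabyte or so; the helper name suggest_max_stack_depth and the exact 1024kB margin used here are my own choices.

```shell
#!/bin/sh
# Suggest a max_stack_depth value given the kernel stack limit in kB
# (as printed by `ulimit -s`), leaving a 1 MB safety margin.
suggest_max_stack_depth() {
    limit_kb=$1
    margin_kb=1024
    if [ "$limit_kb" = "unlimited" ]; then
        echo "no stack limit set; max_stack_depth is not constrained by it"
    elif [ "$limit_kb" -le "$margin_kb" ]; then
        echo "stack limit of ${limit_kb}kB is too small to leave a margin"
    else
        echo "max_stack_depth = $((limit_kb - margin_kb))kB"
    fi
}

# This reports the limit of the current shell; run it in the environment
# that starts the postmaster, since the server inherits its limit from there.
echo "stack limit (kB): $(ulimit -s)"
suggest_max_stack_depth "$(ulimit -s)"

# To compare with the server's current setting (assumes psql can connect
# as the postgres user):
#   psql -U postgres -c "SHOW max_stack_depth;"
```

For example, a limit of 8192kB gives max_stack_depth = 7168kB.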

Any chance of seeing the function source?

merlin