Re: invalid memory alloc request size - Mailing list pgsql-general

From Janning Vygen
Subject Re: invalid memory alloc request size
Msg-id 200601231833.50314.vygen@gmx.de
In response to Re: invalid memory alloc request size  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: invalid memory alloc request size
List pgsql-general
On Monday, 23 January 2006 17:05, Tom Lane wrote:
> Janning Vygen <vygen@gmx.de> writes:
> > pg_dump: ERROR:  invalid memory alloc request size 18446744073709551614
> > pg_dump: SQL command to dump the contents of table "spieletipps" failed:
> > PQendcopy() failed.
>
> This looks more like a corrupt-data problem than anything else.  Have
> you tried the usual memory and disk testing programs?

No, I haven't. What are the usual memory and disk testing programs? (A few
weeks ago I wanted to start a troubleshooting guide for people like me, but I
haven't started yet. This needs to be documented.) I am not a system
administrator, and a hard disk is a black box to me.

By the way: the database is still running and serving requests.
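
A side note on the number in the error message: 18446744073709551614 is exactly 2^64 - 2, which is what -2 looks like when it is read as an unsigned 64-bit integer. That seems consistent with a garbage length value rather than a genuinely huge allocation, although this is only an interpretation of the message. A minimal Python sketch of the arithmetic (the function name `as_signed_64` is made up for illustration):

```python
def as_signed_64(value: int) -> int:
    """Reinterpret an unsigned 64-bit integer as a signed two's-complement one."""
    if not 0 <= value < 2**64:
        raise ValueError("value does not fit in 64 bits")
    # Values with the top bit set represent negative numbers.
    return value - 2**64 if value >= 2**63 else value


request_size = 18446744073709551614  # copied from the pg_dump error message
print(as_signed_64(request_size))  # prints -2
```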

> > recent thread on HACKERS but sorry guys: i dont know how to produce a
> > backtrace.
>
> Time to learn ;-)
>
>     gdb /path/to/postgres_executable /path/to/core_file
>     gdb> bt
>     gdb> q

I shouldn't run gdb while my database is up and running, should I?

I tried to find and delete the corrupted row (as you mentioned in
http://archives.postgresql.org/pgsql-admin/2006-01/msg00117.php).

I found it:

$ select sp_id from spieletipps limit 1 offset 387583;
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
!> \q

And I can get the ctid:

$ select ctid from spieletipps limit 1 offset 387583;
   ctid
-----------
 (3397,49)
(1 row)
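
For reference, a ctid is a pair (block number, item number within that block), so the value above points at item 49 in block 3397 of the table. The sketch below splits the ctid and estimates where that block starts in the table's data file. It assumes the default block size of 8192 bytes, which may not match this installation, and the names `parse_ctid` and `BLOCK_SIZE` are made up for illustration.

```python
import re
from typing import Tuple

BLOCK_SIZE = 8192  # assumption: default PostgreSQL block size; a custom build may differ


def parse_ctid(ctid: str) -> Tuple[int, int]:
    """Split a ctid such as '(3397,49)' into (block number, item number)."""
    match = re.fullmatch(r"\((\d+),(\d+)\)", ctid.strip())
    if match is None:
        raise ValueError(f"not a ctid: {ctid!r}")
    return int(match.group(1)), int(match.group(2))


block, item = parse_ctid("(3397,49)")
byte_offset = block * BLOCK_SIZE  # start of the block within the table's data file
print(block, item, byte_offset)  # prints 3397 49 27828224
```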


But when I try to delete it:
$ delete from spieletipps where ctid = '(3397,49)';
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.

How can I get rid of it? (The table has no OIDs; I created it
without OIDs.)

> The core file will be somewhere under $PGDATA, named either "core" or
> "core.nnnnn" depending on your kernel settings.  If you don't see one
> then it's probable that the postmaster was started under "ulimit -c 0".
> Put "ulimit -c unlimited" in your postgres startup script, restart,
> trigger the crash again.
>
> It's also a good idea to look in the postmaster log to see if any
> unusual messages appeared before the crash.

This is from the postmaster log:

LOG:  server process (PID 14756) was terminated by signal 11
LOG:  terminating any other active server processes
LOG:  all server processes terminated; reinitializing
FATAL:  the database system is starting up
LOG:  database system was interrupted at 2006-01-23 09:46:03 CET
LOG:  checkpoint record is at 1/D890C0E0
LOG:  redo record is at 1/D88F93E8; undo record is at 0/0; shutdown FALSE
LOG:  next transaction ID: 485068; next OID: 16882321
LOG:  database system was not properly shut down; automatic recovery in
progress
LOG:  redo starts at 1/D88F93E8
LOG:  record with zero length at 1/D8953988
LOG:  redo done at 1/D8953920
LOG:  database system is ready
LOG:  server process (PID 15198) was terminated by signal 11
LOG:  terminating any other active server processes
WARNING:  terminating connection because of crash of another server process
DETAIL:  The postmaster has commanded this server process to roll back the
current transaction and exit, because another server process exited
abnormally and possibly corrupted shared memory.
HINT:  In a moment you should be able to reconnect to the database and repeat
your command.
FATAL:  the database system is in recovery mode
LOG:  all server processes terminated; reinitializing
LOG:  database system was interrupted at 2006-01-23 09:46:15 CET
LOG:  checkpoint record is at 1/D8953988
LOG:  redo record is at 1/D8953988; undo record is at 0/0; shutdown TRUE
LOG:  next transaction ID: 485130; next OID: 16882321
LOG:  database system was not properly shut down; automatic recovery in
progress
LOG:  redo starts at 1/D89539D0
LOG:  record with zero length at 1/D8966BF8
LOG:  redo done at 1/D8966BC8
LOG:  database system is ready
LOG:  server process (PID 15400) was terminated by signal 11
LOG:  terminating any other active server processes
WARNING:  terminating connection because of crash of another server process
DETAIL:  The postmaster has commanded this server process to roll back the
current transaction and exit, because another server process exited
abnormally and possibly corrupted shared memory.
HINT:  In a moment you should be able to reconnect to the database and repeat
your command.
LOG:  all server processes terminated; reinitializing
LOG:  database system was interrupted at 2006-01-23 09:46:24 CET
LOG:  checkpoint record is at 1/D8966BF8
LOG:  redo record is at 1/D8966BF8; undo record is at 0/0; shutdown TRUE
LOG:  next transaction ID: 485183; next OID: 16882321
LOG:  database system was not properly shut down; automatic recovery in
progress
FATAL:  the database system is starting up
LOG:  redo starts at 1/D8966C40
LOG:  record with zero length at 1/D8991CC8
LOG:  redo done at 1/D8991C98
LOG:  database system is ready
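
The log above shows three backend crashes, each with signal 11. To pull such entries out of a longer log, something like the following Python sketch could be used. The regular expression is written against the message format shown here and may need adjusting for other versions or log prefixes; `find_crashes` is a made-up helper name.

```python
import re

# Matches lines such as:
# LOG:  server process (PID 14756) was terminated by signal 11
CRASH_RE = re.compile(
    r"^LOG:\s+server process \(PID (\d+)\) was terminated by signal (\d+)"
)


def find_crashes(log_text):
    """Return a list of (pid, signal) pairs for crashed server processes."""
    crashes = []
    for line in log_text.splitlines():
        match = CRASH_RE.match(line)
        if match:
            crashes.append((int(match.group(1)), int(match.group(2))))
    return crashes


sample = """\
LOG:  server process (PID 14756) was terminated by signal 11
LOG:  terminating any other active server processes
LOG:  database system is ready
LOG:  server process (PID 15198) was terminated by signal 11
"""
print(find_crashes(sample))  # prints [(14756, 11), (15198, 11)]
```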

Any further help is much appreciated.

Kind regards,
Janning

