Re: "invalid memory alloc request size" + "Could not open file "pg_clog/XXXX" - Mailing list pgsql-general

From Albe Laurenz
Subject Re: "invalid memory alloc request size" + "Could not open file "pg_clog/XXXX"
Date
Msg-id D960CB61B694CF459DCFB4B0128514C2078D88B7@exadv11.host.magwien.gv.at
In response to "invalid memory alloc request size" + "Could not open file "pg_clog/XXXX"  (scheu_postgresql <scheu.postgresql@gmail.com>)
List pgsql-general
scheu_postgresql wrote:
> On my PostgreSQL 8.4.0 server, some tables have been unavailable since
> this morning; see the examples below:
>
> --> pg_dump MY_DB > bkp_MY_DB.dmp
> pg_dump: SQL command failed
> pg_dump: Error message from server: ERROR:  invalid memory alloc request size 18446744073709551613
> pg_dump: The command was: COPY <schema>.<unavailable_table> (col1, col2, ...).
>
> --> vacuum analyze <schema>.<unavailable_table> ;
>             WARNING:  terminating connection because of crash of another server process
>             DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
>             HINT:  In a moment you should be able to reconnect to the database and repeat your command.
>
> --> select * from <schema>.<unavailable_table> ;
> ERROR:  invalid memory alloc request size 18446744073709551613
>
> --> server log file
> Feb 29 05:31:44 my_server postgres[6686]: [17-1] user=,db= LOG:  server process (PID 3887) was terminated by signal 11: Segmentation fault
> Feb 29 05:31:44 my_server postgres[6686]: [18-1] user=,db= LOG:  terminating any other active server processes
> Feb 29 05:31:44 my_server postgres[6686]: [19-1] user=,db= LOG:  all server processes terminated; reinitializing
> Feb 29 05:31:44 my_server postgres[3892]: [20-1] user=,db= LOG:  database system was interrupted; last known up at 2012-02-29 05:22:33 CET
> Feb 29 05:31:44 my_server postgres[3892]: [21-1] user=,db= LOG:  database system was not properly shut down; automatic recovery in progress
> Feb 29 05:31:44 my_server postgres[3892]: [22-1] user=,db= LOG:  redo starts at 10/67C2A3B8
> Feb 29 05:31:45 my_server postgres[3892]: [23-1] user=,db= LOG:  record with zero length at 10/68BCF990
> Feb 29 05:31:45 my_server postgres[3892]: [24-1] user=,db= LOG:  redo done at 10/68BCF960
> Feb 29 05:31:45 my_server postgres[3892]: [25-1] user=,db= LOG:  last completed transaction was at log time 2012-02-29 05:31:42.618352+01
> Feb 29 05:31:45 my_server postgres[6686]: [20-1] user=,db= LOG:  database system is ready to accept connections
> Feb 29 05:32:52 my_server postgres[4469]: [21-1] user=[unknown],db=[unknown] LOG:  incomplete startup packet
> Feb 29 05:33:52 my_server postgres[6686]: [21-1] user=,db= LOG:  server process (PID 5151) was terminated by signal 11: Segmentation fault
> Feb 29 05:33:52 my_server postgres[6686]: [22-1] user=,db= LOG:  terminating any other active server processes
> Feb 29 05:33:52 my_server postgres[6686]: [23-1] user=,db= LOG:  all server processes terminated; reinitializing
> Feb 29 05:33:52 my_server postgres[5152]: [24-1] user=,db= LOG:  database system was interrupted; last known up at 2012-02-29 05:31:45 CET
> Feb 29 05:33:52 my_server postgres[5152]: [25-1] user=,db= LOG:  database system was not properly shut down; automatic recovery in progress
> Feb 29 05:33:52 my_server postgres[5152]: [26-1] user=,db= LOG:  record with zero length at 10/68BCF9D8
> Feb 29 05:33:52 my_server postgres[5152]: [27-1] user=,db= LOG:  redo is not required
> Feb 29 05:33:52 my_server postgres[5153]: [24-1] user=match,db=MY_DB FATAL:  the database system is in recovery mode
> Feb 29 05:33:52 my_server postgres[6686]: [24-1] user=,db= LOG:  database system is ready to accept connections
> Feb 29 05:37:19 my_server postgres[6686]: [25-1] user=,db= LOG:  server process (PID 8065) was terminated by signal 11: Segmentation fault
> Feb 29 05:37:19 my_server postgres[6686]: [26-1] user=,db= LOG:  terminating any other active server processes
> Feb 29 05:37:19 my_server postgres[6686]: [27-1] user=,db= LOG:  all server processes terminated; reinitializing
> Feb 29 05:37:19 my_server postgres[8066]: [28-1] user=,db= LOG:  database system was interrupted; last known up at 2012-02-29 05:33:52 CET
> Feb 29 05:37:19 my_server postgres[8066]: [29-1] user=,db= LOG:  database system was not properly shut down; automatic recovery in progress
> Feb 29 05:37:19 my_server postgres[8066]: [30-1] user=,db= LOG:  redo starts at 10/68BCFA20
> Feb 29 05:37:19 my_server postgres[8066]: [31-1] user=,db= LOG:  record with zero length at 10/68BD5BD0
> Feb 29 05:37:19 my_server postgres[8066]: [32-1] user=,db= LOG:  redo done at 10/68BD5BA0
> Feb 29 05:37:19 my_server postgres[8066]: [33-1] user=,db= LOG:  last completed transaction was at log time 2012-02-29 05:35:44.468968+01
> Feb 29 05:37:19 my_server postgres[6686]: [28-1] user=,db= LOG:  database system is ready to accept connections
> Feb 29 05:38:27 my_server postgres[8639]: [29-1] user=[unknown],db=[unknown] LOG:  incomplete startup packet
> Feb 29 05:38:53 my_server postgres[6686]: [29-1] user=,db= LOG:  server process (PID 8809) was terminated by signal 11: Segmentation fault
>
>
> I have tried restarting PostgreSQL, but that did not solve these issues.
> I cannot back up the full database because some tables have become unreadable.
> I have 7 databases on this server, and only 2 of them have this problem.
>
> What could be the cause of the problem?

If a sequential scan fails, I would say that the table is corrupted.
The cause could be faulty hardware, a corrupted file system or a
software bug.
I notice that you are running 8.4.0, which is a really bad idea:
a number of data corruption bugs have been fixed in later releases.

Check the hardware and the file systems.
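
A side note on the number in the error message: 18446744073709551613
is 2^64 - 3, that is, -3 stored in an unsigned 64-bit integer. A
negative size would fit a damaged length field somewhere in the table
data, but that is an interpretation, not something the message states.
A minimal Python sketch of the arithmetic (the helper name
as_signed_64 is made up for this illustration):

```python
# Size copied from the "invalid memory alloc request size" error above.
reported = 18446744073709551613


def as_signed_64(value: int) -> int:
    """Reinterpret an unsigned 64-bit integer as signed (two's complement)."""
    if not 0 <= value < 2**64:
        raise ValueError("not an unsigned 64-bit value")
    return value - 2**64 if value >= 2**63 else value


print(as_signed_64(reported))  # -3
print(hex(reported))           # 0xfffffffffffffffd
```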

> Is there a way to fix it without losing data and without dropping and
> recreating the db with my nightly pg_dump backup ?

Without losing data? Not unless you can poke around in the guts of
the corrupted blocks and make sense of what you find there...

If your requirement is "no data loss", you'll have to use a different
backup strategy.
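
Poking around usually starts with finding out which parts of the
table can still be read. The following is only a sketch of that
narrowing-down step. It assumes a hypothetical function probe(lo, hi)
that you would have to write yourself and that returns True when every
row or block in the range [lo, hi) can be read without an error or a
backend crash. Nothing in it talks to PostgreSQL, and on the real
server every failed probe may cost another crash and recovery cycle.

```python
from typing import Callable, List


def find_bad_units(lo: int, hi: int,
                   probe: Callable[[int, int], bool]) -> List[int]:
    """Return the units in [lo, hi) that cannot be read, by repeated halving.

    probe(a, b) is a caller-supplied (hypothetical) check that returns
    True when every unit in [a, b) reads cleanly.
    """
    if lo >= hi or probe(lo, hi):
        return []          # empty range, or the whole range is readable
    if hi - lo == 1:
        return [lo]        # narrowed down to a single unreadable unit
    mid = (lo + hi) // 2
    return find_bad_units(lo, mid, probe) + find_bad_units(mid, hi, probe)


# Stand-in for a real check: pretend units 3 and 17 are damaged.
damaged = {3, 17}


def fake_probe(a: int, b: int) -> bool:
    return not any(a <= unit < b for unit in damaged)


print(find_bad_units(0, 32, fake_probe))  # [3, 17]
```

Whatever lies outside the reported units could then be copied to a
new table. The contents of the damaged units are still lost unless
you manage to repair them by hand.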

Yours,
Laurenz Albe
