Thread: Out of memory (Failed on request size 24)
PostgreSQL 8.0.3 running on AIX 5.3 (the same thing happens on 5.1). The DBMS had been running fine for some months, but now one of the databases isn't accessible. Any help would be greatly appreciated.

The DBMS starts up fine, but any operation on the files database (psql files, vacuumdb files, pg_dump files) yields the same result. The client responds with

> psql files
DEBUG: InitPostgres
DEBUG: StartTransaction
DEBUG: name: unnamed; blockState: DEFAULT; state: INPROGR, xid/subid/cid: 626915/1/0, nestlvl: 1, children: <>
psql: FATAL: out of memory
DETAIL: Failed on request of size 24.

The server log file contains the following:

LOG: database system was shut down at 2006-11-14 04:38:54 EST
LOG: checkpoint record is at 0/19A82450
LOG: redo record is at 0/19A82450; undo record is at 0/0; shutdown TRUE
LOG: next transaction ID: 626912; next OID: 54355
LOG: database system is ready
DEBUG: proc_exit(0)
DEBUG: shmem_exit(0)
DEBUG: exit(0)
DEBUG: reaping dead processes
DEBUG: forked new backend, pid=168724 socket=7
DEBUG: postmaster child[168724]: starting with (
DEBUG: postgres
DEBUG: -v196608
DEBUG: -p
DEBUG: files
DEBUG: )
DEBUG: InitPostgres
DEBUG: StartTransaction
DEBUG: name: unnamed; blockState: DEFAULT; state: INPROGR, xid/subid/cid: 626912/1/0, nestlvl: 1, children: <>
TopMemoryContext: 42416 total in 4 blocks; 12312 free (2 chunks); 30104 used
TopTransactionContext: 2145378304 total in 266 blocks; 928 free (14 chunks); 2145377376 used
PortalMemory: 0 total in 0 blocks; 0 free (0 chunks); 0 used
CacheMemoryContext: 516096 total in 6 blocks; 178752 free (10 chunks); 337344 used
pg_operator_oid_index: 1024 total in 1 blocks; 840 free (0 chunks); 184 used
pg_amproc_opc_proc_index: 1024 total in 1 blocks; 736 free (0 chunks); 288 used
pg_amop_opc_strat_index: 1024 total in 1 blocks; 736 free (0 chunks); 288 used
pg_index_indexrelid_index: 1024 total in 1 blocks; 840 free (0 chunks); 184 used
pg_attribute_relid_attnum_index: 1024 total in 1 blocks; 744 free (0 chunks); 280 used
pg_class_oid_index: 1024 total in 1 blocks; 840 free (0 chunks); 184 used
pg_amproc_opc_proc_index: 1024 total in 1 blocks; 736 free (0 chunks); 288 used
pg_amop_opc_strat_index: 1024 total in 1 blocks; 736 free (0 chunks); 288 used
pg_class_relname_nsp_index: 1024 total in 1 blocks; 744 free (0 chunks); 280 used
MdSmgr: 8192 total in 1 blocks; 7808 free (0 chunks); 384 used
DynaHash: 8192 total in 1 blocks; 5936 free (0 chunks); 2256 used
Operator class cache: 8192 total in 1 blocks; 1968 free (0 chunks); 6224 used
smgr relation table: 24576 total in 2 blocks; 16256 free (5 chunks); 8320 used
Portal hash: 8192 total in 1 blocks; 4032 free (0 chunks); 4160 used
Relcache by OID: 8192 total in 1 blocks; 928 free (0 chunks); 7264 used
Relcache by name: 24576 total in 2 blocks; 14208 free (5 chunks); 10368 used
LockTable (locallock hash): 24576 total in 2 blocks; 16272 free (6 chunks); 8304 used
ErrorContext: 8192 total in 1 blocks; 8160 free (7 chunks); 32 used
FATAL: out of memory
DETAIL: Failed on request of size 24.
DEBUG: proc_exit(0)
DEBUG: shmem_exit(0)
DEBUG: exit(0)
DEBUG: reaping dead processes
DEBUG: server process (PID 168724) exited with exit code 0
On Tue, Nov 14, 2006 at 05:53:08AM -0500, Rob Owen wrote:
> PostgreSQL 8.0.3 running on AIX 5.3 (same thing happens on 5.1 though).
> DBMS was running fine for some months but now one of the databases isn't accessible. Any help would be greatly appreciated.
>
> DBMS starts up fine, but any operation on the files database (psql files, vacuumdb files, pg_dump files) yields the same result. The client responds with
>
> > psql files

<snip>

Something screwed up:

> TopTransactionContext: 2145378304 total in 266 blocks; 928 free (14 chunks); 2145377376 used

That's a lot of memory. I thought there was a check on negative-sized allocations... Did "make check" pass OK?

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.
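(For reference, running the regression tests from the source tree is a one-liner; this is just a sketch, the source path is a placeholder, and on AIX GNU make is usually installed as gmake:

	cd /path/to/postgresql-8.0.3
	gmake check

It builds a temporary installation and runs the tests against it, so it can be done without touching the live data directory.)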
Thanks Martijn. I reduced a number of the buffer and connection settings, added some more tracing, and this is the result. The number (TopTransactionContext) is smaller, but still very large. Any reason why this number would suddenly go sky high? The same system was working fine just a month ago.

<2006-11-14 05:48:35 EST>LOG: 00000: database system was shut down at 2006-11-14 05:48:30 EST
<2006-11-14 05:48:35 EST>LOCATION: StartupXLOG, xlog.c:4049
<2006-11-14 05:48:35 EST>LOG: 00000: checkpoint record is at 0/19A825B8
<2006-11-14 05:48:35 EST>LOCATION: StartupXLOG, xlog.c:4132
<2006-11-14 05:48:35 EST>LOG: 00000: redo record is at 0/19A825B8; undo record is at 0/0; shutdown TRUE
<2006-11-14 05:48:35 EST>LOCATION: StartupXLOG, xlog.c:4160
<2006-11-14 05:48:35 EST>LOG: 00000: next transaction ID: 626916; next OID: 54355
<2006-11-14 05:48:35 EST>LOCATION: StartupXLOG, xlog.c:4163
<2006-11-14 05:48:35 EST>LOG: 00000: database system is ready
<2006-11-14 05:48:35 EST>LOCATION: StartupXLOG, xlog.c:4526
<2006-11-14 05:48:35 EST>DEBUG: 00000: proc_exit(0)
<2006-11-14 05:48:35 EST>LOCATION: proc_exit, ipc.c:95
<2006-11-14 05:48:35 EST>DEBUG: 00000: shmem_exit(0)
<2006-11-14 05:48:35 EST>LOCATION: shmem_exit, ipc.c:126
<2006-11-14 05:48:35 EST>DEBUG: 00000: exit(0)
<2006-11-14 05:48:35 EST>LOCATION: proc_exit, ipc.c:113
<2006-11-14 05:48:35 EST>DEBUG: 00000: reaping dead processes
<2006-11-14 05:48:35 EST>LOCATION: reaper, postmaster.c:1988
<2006-11-14 05:48:46 EST>DEBUG: 00000: forked new backend, pid=168246 socket=7
<2006-11-14 05:48:46 EST>LOCATION: BackendStartup, postmaster.c:2499
<2006-11-14 05:48:46 EST>DEBUG: 00000: postmaster child[168246]: starting with (
<2006-11-14 05:48:46 EST>LOCATION: BackendRun, postmaster.c:2829
<2006-11-14 05:48:46 EST>DEBUG: 00000: postgres
<2006-11-14 05:48:46 EST>LOCATION: BackendRun, postmaster.c:2832
<2006-11-14 05:48:46 EST>DEBUG: 00000: -v196608
<2006-11-14 05:48:46 EST>LOCATION: BackendRun, postmaster.c:2832
<2006-11-14 05:48:46 EST>DEBUG: 00000: -p
<2006-11-14 05:48:46 EST>LOCATION: BackendRun, postmaster.c:2832
<2006-11-14 05:48:46 EST>DEBUG: 00000: files
<2006-11-14 05:48:46 EST>LOCATION: BackendRun, postmaster.c:2832
<2006-11-14 05:48:46 EST>DEBUG: 00000: )
<2006-11-14 05:48:46 EST>LOCATION: BackendRun, postmaster.c:2834
<2006-11-14 05:48:46 EST>DEBUG: 00000: InitPostgres
<2006-11-14 05:48:46 EST>LOCATION: PostgresMain, postgres.c:2719
<2006-11-14 05:48:46 EST>DEBUG: 00000: StartTransaction
<2006-11-14 05:48:46 EST>LOCATION: ShowTransactionState, xact.c:3609
<2006-11-14 05:48:46 EST>DEBUG: 00000: name: unnamed; blockState: DEFAULT; state: INPROGR, xid/subid/cid: 626916/1/0, nestlvl: 1, children: <>
<2006-11-14 05:48:46 EST>LOCATION: ShowTransactionStateRec, xact.c:3634
TopMemoryContext: 32768 total in 3 blocks; 10760 free (3 chunks); 22008 used
TopTransactionContext: 1340071936 total in 170 blocks; 928 free (14 chunks); 1340071008 used
PortalMemory: 0 total in 0 blocks; 0 free (0 chunks); 0 used
CacheMemoryContext: 516096 total in 6 blocks; 178752 free (10 chunks); 337344 used
pg_operator_oid_index: 1024 total in 1 blocks; 840 free (0 chunks); 184 used
pg_amproc_opc_proc_index: 1024 total in 1 blocks; 736 free (0 chunks); 288 used
pg_amop_opc_strat_index: 1024 total in 1 blocks; 736 free (0 chunks); 288 used
pg_index_indexrelid_index: 1024 total in 1 blocks; 840 free (0 chunks); 184 used
pg_attribute_relid_attnum_index: 1024 total in 1 blocks; 744 free (0 chunks); 280 used
pg_class_oid_index: 1024 total in 1 blocks; 840 free (0 chunks); 184 used
pg_amproc_opc_proc_index: 1024 total in 1 blocks; 736 free (0 chunks); 288 used
pg_amop_opc_strat_index: 1024 total in 1 blocks; 736 free (0 chunks); 288 used
pg_class_relname_nsp_index: 1024 total in 1 blocks; 744 free (0 chunks); 280 used
MdSmgr: 8192 total in 1 blocks; 7808 free (0 chunks); 384 used
DynaHash: 8192 total in 1 blocks; 5936 free (0 chunks); 2256 used
Operator class cache: 8192 total in 1 blocks; 1968 free (0 chunks); 6224 used
smgr relation table: 24576 total in 2 blocks; 16256 free (5 chunks); 8320 used
Portal hash: 8192 total in 1 blocks; 4032 free (0 chunks); 4160 used
Relcache by OID: 8192 total in 1 blocks; 928 free (0 chunks); 7264 used
Relcache by name: 24576 total in 2 blocks; 14208 free (5 chunks); 10368 used
LockTable (locallock hash): 24576 total in 2 blocks; 16272 free (6 chunks); 8304 used
ErrorContext: 8192 total in 1 blocks; 8160 free (7 chunks); 32 used
<2006-11-14 05:50:03 EST>FATAL: 53200: out of memory
<2006-11-14 05:50:03 EST>DETAIL: Failed on request of size 24.
<2006-11-14 05:50:03 EST>LOCATION: AllocSetAlloc, aset.c:702
<2006-11-14 05:50:03 EST>DEBUG: 00000: proc_exit(0)
<2006-11-14 05:50:03 EST>LOCATION: proc_exit, ipc.c:95
<2006-11-14 05:50:03 EST>DEBUG: 00000: shmem_exit(0)
<2006-11-14 05:50:03 EST>LOCATION: shmem_exit, ipc.c:126
<2006-11-14 05:50:03 EST>DEBUG: 00000: exit(0)
<2006-11-14 05:50:03 EST>LOCATION: proc_exit, ipc.c:113
<2006-11-14 05:50:03 EST>DEBUG: 00000: reaping dead processes
<2006-11-14 05:50:03 EST>LOCATION: reaper, postmaster.c:1988
<2006-11-14 05:50:03 EST>DEBUG: 00000: server process (PID 168246) exited with exit code 0
<2006-11-14 05:50:03 EST>LOCATION: LogChildExit, postmaster.c:2349
<2006-11-14 05:53:49 EST>DEBUG: 00000: postmaster received signal 15
<2006-11-14 05:53:49 EST>LOCATION: pmdie, postmaster.c:1850
<2006-11-14 05:53:49 EST>LOG: 00000: received smart shutdown request
<2006-11-14 05:53:49 EST>LOCATION: pmdie, postmaster.c:1865
<2006-11-14 05:53:49 EST>LOG: 00000: shutting down
<2006-11-14 05:53:49 EST>LOCATION: ShutdownXLOG, xlog.c:4706
<2006-11-14 05:53:49 EST>DEBUG: 00000: reaping dead processes
<2006-11-14 05:53:49 EST>LOCATION: reaper, postmaster.c:1988
<2006-11-14 05:53:49 EST>LOG: 00000: database system is shut down
<2006-11-14 05:53:49 EST>LOCATION: ShutdownXLOG, xlog.c:4715
<2006-11-14 05:53:49 EST>DEBUG: 00000: proc_exit(0)
<2006-11-14 05:53:49 EST>LOCATION: proc_exit, ipc.c:95
<2006-11-14 05:53:49 EST>DEBUG: 00000: shmem_exit(0)
<2006-11-14 05:53:49 EST>LOCATION: shmem_exit, ipc.c:126
<2006-11-14 05:53:49 EST>DEBUG: 00000: exit(0)
<2006-11-14 05:53:49 EST>LOCATION: proc_exit, ipc.c:113
<2006-11-14 05:53:49 EST>DEBUG: 00000: reaping dead processes
<2006-11-14 05:53:49 EST>LOCATION: reaper, postmaster.c:1988
<2006-11-14 05:53:49 EST>DEBUG: 00000: proc_exit(0)
<2006-11-14 05:53:49 EST>LOCATION: proc_exit, ipc.c:95
<2006-11-14 05:53:49 EST>DEBUG: 00000: shmem_exit(0)
<2006-11-14 05:53:49 EST>LOCATION: shmem_exit, ipc.c:126
<2006-11-14 05:53:49 EST>DEBUG: 00000: exit(0)
<2006-11-14 05:53:49 EST>LOCATION: proc_exit, ipc.c:113
<2006-11-14 05:53:49 EST>LOG: 00000: logger shutting down
<2006-11-14 05:53:49 EST>LOCATION: SysLoggerMain, syslogger.c:361
<2006-11-14 05:53:49 EST>DEBUG: 00000: proc_exit(0)
<2006-11-14 05:53:49 EST>LOCATION: proc_exit, ipc.c:95
<2006-11-14 05:53:49 EST>DEBUG: 00000: shmem_exit(0)
<2006-11-14 05:53:49 EST>LOCATION: shmem_exit, ipc.c:126
<2006-11-14 05:53:49 EST>DEBUG: 00000: exit(0)
<2006-11-14 05:53:49 EST>LOCATION: proc_exit, ipc.c:113
"Rob Owen" <Rob.Owen@sas.com> writes: > PostgreSQL 8.0.3 running on AIX 5.3 (same thing happens on 5.1 though). > DBMS was running fine for some months but now one of the databases isn't accessible. Any help would be greatly appreciated. Just one database? Sounds like it might be corrupt data in that database's system catalogs. Can you get a stack trace from the point of the error to help us narrow it down? The way I usually debug startup-time failures is: export PGOPTIONS="-W 30" psql ... Now I have 30 seconds to identify the PID of the backend process in another window and do (as the postgres user) gdb /path/to/postgres PID Once you've got gdb control of the backend, do gdb> break errfinish gdb> cont ... wait for the timeout to finish elapsing, if needed ... Once gdb reports that the breakpoint has been reached, say gdb> bt ... useful info here... gdb> cont regards, tom lane
Attached to backend postmaster and got the following. Hope this helps.

Attaching to program: /nfs/silence/bigdisk/eurrow/pgsql/bin/postmaster, process 170422
[Switching to Thread 1]
0x000000000000377c in ?? ()
(gdb) break errfinish
Breakpoint 1 at 0x1000019dc
(gdb) cont
Continuing.
[Switching to Thread 1]

Breakpoint 1, 0x00000001000019dc in errfinish ()
(gdb) bt
#0  0x00000001000019dc in errfinish ()
#1  0x00000001002920d0 in reaper ()
#2  <signal handler called>
#3  0x0fffffffffffd810 in ?? ()
Cannot access memory at address 0x203fe94000000000
(gdb) cont
Continuing.

Breakpoint 1, 0x00000001000019dc in errfinish ()
(gdb) bt
#0  0x00000001000019dc in errfinish ()
#1  0x0000000100292680 in LogChildExit ()
#2  0x00000001002971a8 in CleanupBackend ()
#3  0x00000001002923d0 in reaper ()
#4  <signal handler called>
#5  0x0fffffffffffd810 in ?? ()
Cannot access memory at address 0x203fe94000000000
(gdb) cont
Continuing.
"Rob Owen" <Rob.Owen@sas.com> writes: > Attached to backend postmaster and got the following. Hope this helps. Nope, you got the postmaster itself there, you need to look at the new child process. (It should look like "postgres: startup" in ps.) regards, tom lane
Breakpoint 1, 0x00000001000019dc in errfinish () from postmaster
(gdb) bt
#0  0x00000001000019dc in errfinish () from postmaster
#1  0x000000010000a680 in AllocSetAlloc () from postmaster
#2  0x0000000100002a1c in MemoryContextAlloc () from postmaster
#3  0x0000000100108c28 in _bt_search () from postmaster
#4  0x0000000100106484 in _bt_first () from postmaster
#5  0x00000001001045b4 in btgettuple () from postmaster
#6  0x0000000100029fb0 in FunctionCall2 () from postmaster
#7  0x00000001000295f8 in index_getnext () from postmaster
#8  0x000000010002942c in systable_getnext () from postmaster
#9  0x000000010000f9a0 in ScanPgRelation () from postmaster
#10 0x0000000100011088 in RelationBuildDesc () from postmaster
#11 0x000000010000e6fc in RelationSysNameGetRelation () from postmaster
#12 0x000000010000e620 in relation_openr () from postmaster
#13 0x000000010000e44c in heap_openr () from postmaster
#14 0x0000000100041044 in RelationBuildTriggers () from postmaster
#15 0x00000001000111a4 in RelationBuildDesc () from postmaster
#16 0x000000010000e6fc in RelationSysNameGetRelation () from postmaster
#17 0x000000010000e620 in relation_openr () from postmaster
#18 0x000000010000e44c in heap_openr () from postmaster
#19 0x000000010000e1c8 in CatalogCacheInitializeCache () from postmaster
#20 0x000000010000dab8 in SearchCatCache () from postmaster
#21 0x000000010000da3c in SearchSysCache () from postmaster
#22 0x000000010028b570 in InitializeSessionUserId () from postmaster
#23 0x0000000100288ae8 in InitPostgres () from postmaster
#24 0x000000010029e2a8 in PostgresMain () from postmaster
#25 0x00000001002990e0 in BackendRun () from postmaster
#26 0x0000000100298758 in BackendStartup () from postmaster
#27 0x0000000100297db0 in ServerLoop () from postmaster
#28 0x0000000100009b90 in PostmasterMain () from postmaster
#29 0x0000000100000680 in main () from postmaster
#30 0x000000010000028c in __start () from postmaster
(gdb) cont
Continuing.
Breakpoint 1, 0x00000001000019dc in errfinish () from postmaster
(gdb) bt
#0  0x00000001000019dc in errfinish () from postmaster
#1  0x0000000100002c58 in elog_finish () from postmaster
#2  0x0000000100007aa8 in proc_exit () from postmaster
#3  0x0000000100001c5c in errfinish () from postmaster
#4  0x000000010000a680 in AllocSetAlloc () from postmaster
#5  0x0000000100002a1c in MemoryContextAlloc () from postmaster
#6  0x0000000100108c28 in _bt_search () from postmaster
#7  0x0000000100106484 in _bt_first () from postmaster
#8  0x00000001001045b4 in btgettuple () from postmaster
#9  0x0000000100029fb0 in FunctionCall2 () from postmaster
#10 0x00000001000295f8 in index_getnext () from postmaster
#11 0x000000010002942c in systable_getnext () from postmaster
#12 0x000000010000f9a0 in ScanPgRelation () from postmaster
#13 0x0000000100011088 in RelationBuildDesc () from postmaster
#14 0x000000010000e6fc in RelationSysNameGetRelation () from postmaster
#15 0x000000010000e620 in relation_openr () from postmaster
#16 0x000000010000e44c in heap_openr () from postmaster
#17 0x0000000100041044 in RelationBuildTriggers () from postmaster
#18 0x00000001000111a4 in RelationBuildDesc () from postmaster
#19 0x000000010000e6fc in RelationSysNameGetRelation () from postmaster
#20 0x000000010000e620 in relation_openr () from postmaster
#21 0x000000010000e44c in heap_openr () from postmaster
#22 0x000000010000e1c8 in CatalogCacheInitializeCache () from postmaster
#23 0x000000010000dab8 in SearchCatCache () from postmaster
#24 0x000000010000da3c in SearchSysCache () from postmaster
#25 0x000000010028b570 in InitializeSessionUserId () from postmaster
#26 0x0000000100288ae8 in InitPostgres () from postmaster
#27 0x000000010029e2a8 in PostgresMain () from postmaster
#28 0x00000001002990e0 in BackendRun () from postmaster
#29 0x0000000100298758 in BackendStartup () from postmaster
#30 0x0000000100297db0 in ServerLoop () from postmaster
#31 0x0000000100009b90 in PostmasterMain () from postmaster
#32 0x0000000100000680 in main () from postmaster
#33 0x000000010000028c in __start () from postmaster
(gdb) cont
Continuing.
Breakpoint 1, 0x00000001000019dc in errfinish () from postmaster
(gdb) bt
#0  0x00000001000019dc in errfinish () from postmaster
#1  0x0000000100002c58 in elog_finish () from postmaster
#2  0x0000000100007bcc in shmem_exit () from postmaster
#3  0x0000000100007ab4 in proc_exit () from postmaster
#4  0x0000000100001c5c in errfinish () from postmaster
#5  0x000000010000a680 in AllocSetAlloc () from postmaster
#6  0x0000000100002a1c in MemoryContextAlloc () from postmaster
#7  0x0000000100108c28 in _bt_search () from postmaster
#8  0x0000000100106484 in _bt_first () from postmaster
#9  0x00000001001045b4 in btgettuple () from postmaster
#10 0x0000000100029fb0 in FunctionCall2 () from postmaster
#11 0x00000001000295f8 in index_getnext () from postmaster
#12 0x000000010002942c in systable_getnext () from postmaster
#13 0x000000010000f9a0 in ScanPgRelation () from postmaster
#14 0x0000000100011088 in RelationBuildDesc () from postmaster
#15 0x000000010000e6fc in RelationSysNameGetRelation () from postmaster
#16 0x000000010000e620 in relation_openr () from postmaster
#17 0x000000010000e44c in heap_openr () from postmaster
#18 0x0000000100041044 in RelationBuildTriggers () from postmaster
#19 0x00000001000111a4 in RelationBuildDesc () from postmaster
#20 0x000000010000e6fc in RelationSysNameGetRelation () from postmaster
#21 0x000000010000e620 in relation_openr () from postmaster
#22 0x000000010000e44c in heap_openr () from postmaster
#23 0x000000010000e1c8 in CatalogCacheInitializeCache () from postmaster
#24 0x000000010000dab8 in SearchCatCache () from postmaster
#25 0x000000010000da3c in SearchSysCache () from postmaster
#26 0x000000010028b570 in InitializeSessionUserId () from postmaster
#27 0x0000000100288ae8 in InitPostgres () from postmaster
#28 0x000000010029e2a8 in PostgresMain () from postmaster
#29 0x00000001002990e0 in BackendRun () from postmaster
#30 0x0000000100298758 in BackendStartup () from postmaster
#31 0x0000000100297db0 in ServerLoop () from postmaster
#32 0x0000000100009b90 in PostmasterMain () from postmaster
#33 0x0000000100000680 in main () from postmaster
#34 0x000000010000028c in __start () from postmaster
(gdb) cont
Continuing.
Breakpoint 1, 0x00000001000019dc in errfinish () from postmaster
(gdb) bt
#0  0x00000001000019dc in errfinish () from postmaster
#1  0x0000000100002c58 in elog_finish () from postmaster
#2  0x0000000100007b3c in proc_exit () from postmaster
#3  0x0000000100001c5c in errfinish () from postmaster
#4  0x000000010000a680 in AllocSetAlloc () from postmaster
#5  0x0000000100002a1c in MemoryContextAlloc () from postmaster
#6  0x0000000100108c28 in _bt_search () from postmaster
#7  0x0000000100106484 in _bt_first () from postmaster
#8  0x00000001001045b4 in btgettuple () from postmaster
#9  0x0000000100029fb0 in FunctionCall2 () from postmaster
#10 0x00000001000295f8 in index_getnext () from postmaster
#11 0x000000010002942c in systable_getnext () from postmaster
#12 0x000000010000f9a0 in ScanPgRelation () from postmaster
#13 0x0000000100011088 in RelationBuildDesc () from postmaster
#14 0x000000010000e6fc in RelationSysNameGetRelation () from postmaster
#15 0x000000010000e620 in relation_openr () from postmaster
#16 0x000000010000e44c in heap_openr () from postmaster
#17 0x0000000100041044 in RelationBuildTriggers () from postmaster
#18 0x00000001000111a4 in RelationBuildDesc () from postmaster
#19 0x000000010000e6fc in RelationSysNameGetRelation () from postmaster
#20 0x000000010000e620 in relation_openr () from postmaster
#21 0x000000010000e44c in heap_openr () from postmaster
#22 0x000000010000e1c8 in CatalogCacheInitializeCache () from postmaster
#23 0x000000010000dab8 in SearchCatCache () from postmaster
#24 0x000000010000da3c in SearchSysCache () from postmaster
#25 0x000000010028b570 in InitializeSessionUserId () from postmaster
#26 0x0000000100288ae8 in InitPostgres () from postmaster
#27 0x000000010029e2a8 in PostgresMain () from postmaster
#28 0x00000001002990e0 in BackendRun () from postmaster
#29 0x0000000100298758 in BackendStartup () from postmaster
#30 0x0000000100297db0 in ServerLoop () from postmaster
#31 0x0000000100009b90 in PostmasterMain () from postmaster
#32 0x0000000100000680 in main () from postmaster
#33 0x000000010000028c in __start () from postmaster
(gdb) cont
Continuing.

Program exited normally.
(gdb)
"Rob Owen" <Rob.Owen@sas.com> writes: > Breakpoint 1, 0x00000001000019dc in errfinish () from postmaster > (gdb) bt > #0 0x00000001000019dc in errfinish () from postmaster > #1 0x000000010000a680 in AllocSetAlloc () from postmaster > #2 0x0000000100002a1c in MemoryContextAlloc () from postmaster > #3 0x0000000100108c28 in _bt_search () from postmaster > #4 0x0000000100106484 in _bt_first () from postmaster > #5 0x00000001001045b4 in btgettuple () from postmaster > #6 0x0000000100029fb0 in FunctionCall2 () from postmaster > #7 0x00000001000295f8 in index_getnext () from postmaster > #8 0x000000010002942c in systable_getnext () from postmaster > #9 0x000000010000f9a0 in ScanPgRelation () from postmaster > #10 0x0000000100011088 in RelationBuildDesc () from postmaster > #11 0x000000010000e6fc in RelationSysNameGetRelation () from postmaster I think you are in luck: this looks like the corrupted data is in one of the indexes on pg_class, so you should be able to recover by reindexing. See the man page for REINDEX for the gory details of doing this (you need the "ignore system indexes" option, and maybe some other pushups depending on your Postgres version). regards, tom lane
Thanks Tom. It's all working again now.