Thread: More AIX 5.3 fun - out of memory ?

More AIX 5.3 fun - out of memory ?

From
Gavin Hamill
Date:
Having successfully got pg 8.1.3 to run, I'm simply trying to pg_restore
a dump from our live server. However, even very soon into the restore, I
get these messages from the autovacuum daemon once per minute:

2006-03-28 14:15:35 BSTLOG:  autovacuum: processing database "laterooms"
TopMemoryContext: 50584 total in 6 blocks; 11112 free (13 chunks); 39472
used
TopTransactionContext: 8192 total in 1 blocks; 7688 free (10 chunks);
504 used
Operator class cache: 8192 total in 1 blocks; 4936 free (0 chunks); 3256
used
Autovacuum context: 122880 total in 4 blocks; 61472 free (17 chunks);
61408 used
smgr relation table: 8192 total in 1 blocks; 3904 free (0 chunks); 4288 used
Portal hash: 8192 total in 1 blocks; 5968 free (0 chunks); 2224 used
PortalMemory: 0 total in 0 blocks; 0 free (0 chunks); 0 used
Relcache by OID: 8192 total in 1 blocks; 1816 free (0 chunks); 6376 used
CacheMemoryContext: 516096 total in 6 blocks; 78480 free (0 chunks);
437616 used
pg_index_indrelid_index: 1024 total in 1 blocks; 392 free (0 chunks);
632 used
pg_attrdef_adrelid_adnum_index: 1024 total in 1 blocks; 328 free (0
chunks); 696 used
pg_autovacuum_vacrelid_index: 1024 total in 1 blocks; 392 free (0
chunks); 632 used
pg_type_typname_nsp_index: 1024 total in 1 blocks; 328 free (0 chunks);
696 used
pg_type_oid_index: 1024 total in 1 blocks; 392 free (0 chunks); 632 used
pg_trigger_tgrelid_tgname_index: 1024 total in 1 blocks; 328 free (0
chunks); 696 used
pg_statistic_relid_att_index: 1024 total in 1 blocks; 328 free (0
chunks); 696 used
pg_auth_members_member_role_index: 1024 total in 1 blocks; 328 free (0
chunks); 696 used
pg_auth_members_role_member_index: 1024 total in 1 blocks; 328 free (0
chunks); 696 used
pg_rewrite_rel_rulename_index: 1024 total in 1 blocks; 328 free (0
chunks); 696 used
pg_proc_proname_args_nsp_index: 1024 total in 1 blocks; 256 free (0
chunks); 768 used
pg_proc_oid_index: 1024 total in 1 blocks; 392 free (0 chunks); 632 used
pg_operator_oprname_l_r_n_index: 1024 total in 1 blocks; 192 free (0
chunks); 832 used
pg_operator_oid_index: 1024 total in 1 blocks; 392 free (0 chunks); 632 used
pg_opclass_oid_index: 1024 total in 1 blocks; 392 free (0 chunks); 632 used
pg_opclass_am_name_nsp_index: 1024 total in 1 blocks; 256 free (0
chunks); 768 used
pg_namespace_oid_index: 1024 total in 1 blocks; 392 free (0 chunks); 632
used
pg_namespace_nspname_index: 1024 total in 1 blocks; 392 free (0 chunks);
632 used
pg_language_oid_index: 1024 total in 1 blocks; 392 free (0 chunks); 632 used
pg_language_name_index: 1024 total in 1 blocks; 392 free (0 chunks); 632
used
pg_inherits_relid_seqno_index: 1024 total in 1 blocks; 328 free (0
chunks); 696 used
pg_index_indexrelid_index: 1024 total in 1 blocks; 392 free (0 chunks);
632 used
pg_authid_oid_index: 1024 total in 1 blocks; 392 free (0 chunks); 632 used
pg_authid_rolname_index: 1024 total in 1 blocks; 392 free (0 chunks);
632 used
pg_database_datname_index: 1024 total in 1 blocks; 392 free (0 chunks);
632 used
pg_conversion_oid_index: 1024 total in 1 blocks; 392 free (0 chunks);
632 used
pg_conversion_name_nsp_index: 1024 total in 1 blocks; 328 free (0
chunks); 696 used
pg_conversion_default_index: 1024 total in 1 blocks; 192 free (0
chunks); 832 used
pg_class_relname_nsp_index: 1024 total in 1 blocks; 328 free (0 chunks);
696 used
pg_class_oid_index: 1024 total in 1 blocks; 392 free (0 chunks); 632 used
pg_cast_source_target_index: 1024 total in 1 blocks; 328 free (0
chunks); 696 used
pg_attribute_relid_attnum_index: 1024 total in 1 blocks; 328 free (0
chunks); 696 used
pg_attribute_relid_attnam_index: 1024 total in 1 blocks; 328 free (0
chunks); 696 used
pg_amproc_opc_proc_index: 1024 total in 1 blocks; 256 free (0 chunks);
768 used
pg_amop_opr_opc_index: 1024 total in 1 blocks; 328 free (0 chunks); 696 used
pg_amop_opc_strat_index: 1024 total in 1 blocks; 256 free (0 chunks);
768 used
pg_aggregate_fnoid_index: 1024 total in 1 blocks; 392 free (0 chunks);
632 used
Per-database table: 57344 total in 3 blocks; 17120 free (9 chunks);
40224 used
Databases hash: 8192 total in 1 blocks; 3912 free (0 chunks); 4280 used
MdSmgr: 8192 total in 1 blocks; 8080 free (0 chunks); 112 used
LockTable (locallock hash): 8192 total in 1 blocks; 3912 free (0
chunks); 4280 used
Timezones: 48408 total in 2 blocks; 5968 free (0 chunks); 42440 used
Postmaster: 355016 total in 2 blocks; 5800 free (46 chunks); 349216 used
ErrorContext: 8192 total in 1 blocks; 8176 free (4 chunks); 16 used
2006-03-28 14:15:35 BSTERROR:  out of memory
2006-03-28 14:15:35 BSTDETAIL:  Failed on request of size 268435452.

Out of memory? This machine is dedicated to pg and has 16GB of RAM
installed! My AIX knowledge is next to zero, but I'm hoping this helps:

bash-3.00# svmon -U postgres

===============================================================================
User                                 Inuse      Pin     Pgsp  Virtual
postgres                             18989     5304        0    17286

...............................................................................
SYSTEM segments                      Inuse      Pin     Pgsp  Virtual
                                      6767     5292        0     6767

    Vsid      Esid Type Description              PSize  Inuse   Pin Pgsp
Virtual
       0         0 work kernel                       s   6767  5292
0  6767

...............................................................................
EXCLUSIVE segments                   Inuse      Pin     Pgsp  Virtual
                                      7557       12        0     6074

    Vsid      Esid Type Description              PSize  Inuse   Pin Pgsp
Virtual
  1e28be         3 work shared memory segment        s   4078     0
0  4078
  1213f2         1 pers code,/dev/hd2:2809           s   1482     0
-     -
  1425f4         6 work shared memory segment        s   1389     0
0  1389
   525e5         2 work process private              s    150     2
0   150
   82608         2 work process private              s    131     2
0   131
  1b25bb         2 work process private              s     99     2
0    99
  152895         2 work process private              s     74     2
0    74
  1025f0         f work shared library data          s     45     0
0    45
   227e2         2 work process private              s     33     2
0    33
  1125f1         f work shared library data          s     20     0
0    20
   225e2         2 work process private              s     19     2
0    19
  1725f7         f work shared library data          s     11     0
0    11
  172777         f work shared library data          s     10     0
0    10
   827e8         f work shared library data          s      9     0
0     9
   32603         f work shared library data          s      6     0
0     6
  1b25db         - pers /dev/hd3:24                  s      1     0
-     -
   425c4         4 work shared memory segment        s      0     0
0     0
   925e9         5 work shared memory segment        s      0     0
0     0

...............................................................................
SHARED segments                      Inuse      Pin     Pgsp  Virtual
                                      4665        0        0     4445

    Vsid      Esid Type Description              PSize  Inuse   Pin Pgsp
Virtual
   10021         d work shared library text          s   4445     0
0  4445
   f042f         1 pers code,/dev/hd10opt:4164       s    160     0
-     -
  1500d5         1 pers code,/dev/hd2:5664           s     58     0
-     -
  180058         - pers /dev/hd2:45535               s      2     0
-     -


I've checked that the 'postgres' user does not have any limits set with
smitty, and then logged out + in again, but when I restart the
postmaster, I immediately get the same errors from autovac.

This is doubtless some AIX feature that I simply don't know about,
rather than any interaction as a result of pg itself, but AIX admins
must have come across this before :(

Cheers,
Gavin.


Re: More AIX 5.3 fun - out of memory ?

From
Seneca Cunningham
Date:
Gavin Hamill wrote:
> Having successfully got pg 8.1.3 to run, I'm simply trying to pg_restore
> a dump from our live server. However, even very soon into the restore, I
> get these messages from the autovacuum daemon once per minute:
[...]
> 2006-03-28 14:15:35 BSTERROR:  out of memory
> 2006-03-28 14:15:35 BSTDETAIL:  Failed on request of size 268435452.
>
> Out of memory? This machine is dedicated to pg and has 16GB of RAM
> installed! My AIX knowledge is next to zero, but I'm hoping this helps:
[...]
> I've checked that the 'postgres' user does not have any limits set with
> smitty, and then logged out + in again, but when I restart the
> postmaster, I immediately get the same errors from autovac.
>
> This is doubtless some AIX feature that I simply don't know about,
> rather than any interaction as a result of pg itself, but AIX admins
> must have come across this before :(

You're using a 32-bit build with the default memory model, it would
seem.  You're limited to less than 256MB for the heap in that model.
Try stopping the postmaster and starting it back up with the environment
variable LDR_CNTRL set to "MAXDATA=0x80000000".  This expands the heap
to 2GB and if it helps, you can use ldedit on your postgres binaries to
make that the default (or rebuild, passing the linker
"-bmaxdata:0x80000000").

Anyway, "Large Program Support"[1] from the AIX docs give an overview of
the situation.  Chapter 3 of the redbook "Developing and Porting C and
C++ Applications on AIX"[2] goes into all the gory details of what's
happening.

[1]
http://publib.boulder.ibm.com/infocenter/pseries/topic/com.ibm.aix.doc/aixprggd/genprogc/lrg_prg_support.htm
[2] http://www.redbooks.ibm.com/abstracts/sg245674.html?Open
--
Seneca Cunningham
scunning@ca.afilias.info

Re: More AIX 5.3 fun - out of memory ?

From
Tom Lane
Date:
Gavin Hamill <gdh@laterooms.com> writes:
> Having successfully got pg 8.1.3 to run, I'm simply trying to pg_restore
> a dump from our live server. However, even very soon into the restore, I
> get these messages from the autovacuum daemon once per minute:
> 2006-03-28 14:15:35 BSTERROR:  out of memory
> 2006-03-28 14:15:35 BSTDETAIL:  Failed on request of size 268435452.

What have you got maintenance_work_mem set to?  If it's more than 1Gb,
back it off --- 8.1.3's VACUUM command doesn't handle that nicely.
(It's unlikely you really want it that high anyway.)

You might wanna put a space at the end of your log_line_prefix, too...

            regards, tom lane

Re: More AIX 5.3 fun - out of memory ?

From
Gavin Hamill
Date:
Seneca Cunningham wrote:

> You're using a 32-bit build with the default memory model, it would
> seem.  You're limited to less than 256MB for the heap in that model.
> Try stopping the postmaster and starting it back up with the
> environment variable LDR_CNTRL set to "MAXDATA=0x80000000".  This
> expands the heap to 2GB and if it helps, you can use ldedit on your
> postgres binaries to make that the default (or rebuild, passing the
> linker "-bmaxdata:0x80000000").


Yep you're right on the money with default everything, and the kernel is
indeed 32-bit. Still a bit of a mixed bag tho :)

Last night, I passed

$ LDR_CNTRL="MAXDATA=0x8000000" bin/pg_ctl start -D /usr/local/pgsql/share/

Which immediately threw back an 'shm' error that the process table was
full, so I tried with a more modest 0x4000000 request and that worked a
treat; autovac ran fine and a full pg_restore completed perfectly -
hurrah! :)

This morning, however, pg starts with either  0x8000000 or 0x4000000,
but I am right back at square one with the same messages when autovac fires:

"2006-03-29 09:18:45 BSTERROR:  out of memory
2006-03-29 09:18:45 BSTDETAIL:  Failed on request of size 268435452." so
I'm at a loss :)

Would it be better to 'Enable 64-bit application environment' as the
option in 'smitty' has kept tempting me with? Or use the AIX 64-bit
kernel instead? Or both? :)) I'm using gcc 4.1.0 to compile pg - would I
need to recompile with any particular flags for 64-bit goodness?

> Anyway, "Large Program Support"[1] from the AIX docs give an overview
> of the situation.  Chapter 3 of the redbook "Developing and porting C
> and C++ Applications on AIX"[2] goes into all the gory details of
> what's happening.
>
> [1]
> http://publib.boulder.ibm.com/infocenter/pseries/topic/com.ibm.aix.doc/aixprggd/genprogc/lrg_prg_support.htm
>


Wow, that's quite some bedtime reading - hopefully some day I will be
able to read it without feeling that my brain is about to melt :)

FAO Tom Lane - I have maintenance_work_mem set to 256MB, so that also
matches perfectly with the size of the out of memory error :)

Cheers,
Gavin.



Re: More AIX 5.3 fun - out of memory ?

From
Seneca Cunningham
Date:
Gavin Hamill wrote:
> Seneca Cunningham wrote:
>
>> You're using a 32-bit build with the default memory model, it would
>> seem.  You're limited to less than 256MB for the heap in that model.
>> Try stopping the postmaster and starting it back up with the
>> environment variable LDR_CNTRL set to "MAXDATA=0x80000000".  This
>> expands the heap to 2GB and if it helps, you can use ldedit on your
>> postgres binaries to make that the default (or rebuild, passing the
>> linker "-bmaxdata:0x80000000").
>
>
> Yep you're right on the money with default everything, and the kernel is
> indeed 32-bit. Still a bit of a mixed bag tho :)

Builds default to 32-bit even with a 64-bit kernel in use.  64-bit
binaries can be built and used even with a 32-bit kernel just so long as
the underlying hardware is 64-bit and 64-bit mode is enabled.  You have
64-bit hardware.

> Last night, I passed
>
> $ LDR_CNTRL="MAXDATA=0x8000000" bin/pg_ctl start -D /usr/local/pgsql/share/
>
> Which immediately threw back an 'shm' error that the process table was
> full, so I tried with a more modest 0x4000000 request and that worked a
> treat; autovac ran fine and a full pg_restore completed perfectly -
> hurrah! :)
>
> This morning, however, pg starts with either  0x8000000 or 0x4000000,
> but I am right back at square one with the same messages when autovac
> fires:
>
> "2006-03-29 09:18:45 BSTERROR:  out of memory
> 2006-03-29 09:18:45 BSTDETAIL:  Failed on request of size 268435452." so
> I'm at a loss :)

Are "0x8000000" and "0x4000000" typos in this message, or did you really
request less than 256MB?  Those numbers are the number of bytes
requested for the heap and memory is set aside in 256MB segments.

> Would it be better to 'Enable 64-bit application environment' as the
> option in 'smitty' has kept tempting me with? Or use the AIX 64-bit
> kernel instead? Or both? :)) I'm using gcc 4.1.0 to compile pg - would I
> need to recompile with any particular flags for 64-bit goodness?

The 64-bit application environment allows you to use 64-bit apps with a
32-bit kernel.  Your hardware is intended for use with the 64-bit kernel.

If you want a 64-bit postgres, you will need to recompile.  Set
OBJECT_MODE to 64 before running ./configure, and pass CC="gcc -maix64"
and LDFLAGS="-Wl,-bbigtoc" to ./configure.  OBJECT_MODE tells the
linker, assembler, archiver, and other AIX build tools to operate on
64-bit objects.  "-maix64" is the flag to tell gcc to use the 64-bit AIX
ABI.  "-bbigtoc" tells the linker that it is allowed to generate a toc
greater than 64KB if needed (it will be).

If you do switch to a 64-bit build, you will need to re-initdb.  I have
heard that the statistics collector of a 64-bit build may have some
problems, but it does pass "gmake check".

--
Seneca Cunningham
scunning@ca.afilias.info

Re: More AIX 5.3 fun - out of memory ?

From
Gavin Hamill
Date:
On Wed, 29 Mar 2006 11:22:11 -0500
Seneca Cunningham <scunning@ca.afilias.info> wrote:
>
> Are "0x8000000" and "0x4000000" typos in this message, or did you really
> request less than 256MB?  Those numbers are the number of bytes
> requested for the heap and memory is set aside in 256MB segments.

No, they're not typos - I think I must have had my brain out - oops :)

> If you want a 64-bit postgres, you will need to recompile.

Ahh you jewel, I passed in all the variables you specified and the recompile ran smoothly once I'd recompiled zlib and
movedthe .a library from /usr/lib/ to /usr/lib/ppc64 - the cube + earthdistance modules also compiled nicely. 

pg_restore ran perfectly and I have much more faith in the running binary now everything is 64-bit native :)

> If you do switch to a 64-bit build, you will need to re-initdb.  I have
> heard that the statistics collector of a 64-bit build may have some
> problems, but it does pass "gmake check".

Yep, zapped + restored from a dumpfile.

Thank you so much for this info! I was really beginning to worry and wondered if I should just cut my losses and
installDebian on the pSeries box. 

Cheers,
Gavin.