Thread: out of memory during query execution

out of memory during query execution

From: DANTE ALEXANDRA

Hello,

I am a PostgreSQL newbie. I work with the 8.1.0 release on AIX 5.3, with
300GB of data.
Some of the queries launched on this database finish with an "*out of
memory*". The queries which have failed contain a lot of joins (between 6
tables), sub-selects and aggregates. For these queries, the log file
contains:
psql:Q9.sql:40: ERROR:  out of memory
DETAIL:  Failed on request of size 148.

On the server used, I have 3GB of memory and 1 CPU.
The settings specified in "postgresql.conf" are:
# - Memory -
shared_buffers = 12288
#temp_buffers = 1000
#max_prepared_transactions = 5
work_mem = 65536
maintenance_work_mem = 262144
max_stack_depth = 24574

Are some of these values wrong?
Is the "out of memory" error due to insufficient available memory?
Has someone ever seen this problem?

Thank you for your help.

Regards,
Alexandra DANTE

Re: out of memory during query execution

From: Chris Browne

ALEXANDRA.DANTE@BULL.NET (DANTE ALEXANDRA) writes:
> I am a PostgreSQL newbie. I work with the 8.1.0 release on AIX 5.3,
> with 300GB of data.
> Some of the queries launched on this database finish with an "*out of
> memory*". The queries which have failed contain a lot of joins (between
> 6 tables), sub-selects and aggregates. For these queries, the log file
> contains:
> psql:Q9.sql:40: ERROR:  out of memory
> DETAIL:  Failed on request of size 148.
>
> On the server used, I have 3GB of memory and 1 CPU.
> The settings specified in "postgresql.conf" are:
> # - Memory -
> shared_buffers = 12288
> #temp_buffers = 1000
> #max_prepared_transactions = 5
> work_mem = 65536
> maintenance_work_mem = 262144
> max_stack_depth = 24574
>
> Are some of these values wrong?
> Is the "out of memory" error due to insufficient available memory?
> Has someone ever seen this problem?

We have seen this problem...

It's *probably* related to the memory model you're using.

I have thus far evaded *fully* understanding the details (and hope
that can persist!), but here are some of the things to consider:

- By default, AIX really prefers to build 32 bit binaries

- The sorta-hacks that IBM put in place on library segmentation (and
this stuff is quite ghastly) mean that any backend will likely have
quite a bit less than 2GB of even theoretically-available memory space.

The problem is probably that the memory model is throttling you to
*WAY* less than 2GB of memory.

You may want to try a 64 bit build.  With GCC, this requires something
like the following ./configure incantation...

  CC="gcc -maix64" LDFLAGS="-Wl,-bbigtoc" ./configure
--
(reverse (concatenate 'string "moc.enworbbc" "@" "enworbbc"))
http://cbbrowne.com/info/sgml.html
Rules of  the Evil Overlord  #86. "I will  make sure that  my doomsday
device is up to code and properly grounded."
<http://www.eviloverlord.com/>

Re: out of memory during query execution

From: Seneca Cunningham

Chris Browne wrote:
> The problem is probably that the memory model is throttling you to
> *WAY* less than 2GB of memory.
>
> You may want to try a 64 bit build.  With GCC, this requires something
> like the following ./configure incantation...
>
>   CC="gcc -maix64" LDFLAGS="-Wl,-bbigtoc" ./configure

You'll also want to export OBJECT_MODE=64 before the ./configure line.
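
Putting the two suggestions together, a 64-bit gcc build could be driven
like this (a sketch; I haven't verified this exact sequence end to end):

  # assumed sequence: Chris's flags plus the 64-bit object mode
  export OBJECT_MODE=64
  CC="gcc -maix64" LDFLAGS="-Wl,-bbigtoc" ./configure
  gmake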

--
Seneca Cunningham
scunning@ca.afilias.info

Re: out of memory during query execution

From: Tom Lane

DANTE ALEXANDRA <ALEXANDRA.DANTE@BULL.NET> writes:
> I am a PostgreSQL newbie. I work with the 8.1.0 release on AIX 5.3, with
> 300GB of data.
> Some of the queries launched on this database finish with an "*out of
> memory*". The queries which have failed contain a lot of joins (between 6
> tables), sub-selects and aggregates. For these queries, the log file
> contains:
> psql:Q9.sql:40: ERROR:  out of memory
> DETAIL:  Failed on request of size 148.

Hmm ... what ulimit settings are you running the postmaster under?
Could we see the EXPLAIN plans for some of the failing queries?

            regards, tom lane

Re: out of memory during query execution

From: DANTE ALEXANDRA

Hello,

Thank you for your answer.
The person in charge of building PostgreSQL 8.1.0 did a 32-bit
build and used the "cc_r" compiler.
This person did not succeed in building PostgreSQL 8.1.0 with "gcc" in
64-bit mode. Unfortunately, I don't have the errors or the logs of the
64-bit build and I can't tell you what error occurred.

The build was done in 32-bit mode, with the cc_r compiler.
To build PostgreSQL, an RPM was made and the ".spec" file contained the
following instructions:
export OBJECT_MODE=32
./configure CC=/usr/vac/bin/cc_r CFLAGS="-O2 -qmaxmem=-1 -qsrcmsg
-qlargepage" --enable-thread-safety
--without-readline --prefix=%{buildroot}%{prefix}
gmake -j 4
unset OBJECT_MODE

Do you think that my "out of memory" problems are due to the 32-bit
build?
Do you think that I must build PostgreSQL in 64-bit mode to solve this
error?

Thank you for your help.
Regards,
Alexandra DANTE



Re: out of memory during query execution

From: DANTE ALEXANDRA

Hello,

The postmaster is launched by the user "pg_810", who is not the root user.
When I launch the "ulimit -a" command, I get:
$ ulimit -a
time(seconds)        unlimited
file(blocks)         unlimited
data(kbytes)         unlimited
stack(kbytes)        unlimited
memory(kbytes)       unlimited
coredump(blocks)     unlimited
nofiles(descriptors) 2000

You will find below the explain plan of one of the queries that
finished with an "out of memory". This query contains an aggregate and
a sub-select with 6 joins:
                                                          QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
 GroupAggregate  (cost=103283274.03..103283274.07 rows=1 width=76)
   ->  Sort  (cost=103283274.03..103283274.04 rows=1 width=76)
         Sort Key: nation.n_name, date_part('year'::text, (orders.o_orderdate)::timestamp without time zone)
         ->  Nested Loop  (cost=2447049.00..103283274.02 rows=1 width=76)
               Join Filter: ("outer".s_nationkey = "inner".n_nationkey)
               ->  Nested Loop  (cost=2447049.00..103283272.45 rows=1 width=55)
                     ->  Nested Loop  (cost=2447049.00..103283267.25 rows=1 width=59)
                           ->  Hash Join  (cost=2447049.00..103256685.03 rows=4800 width=80)
                                 Hash Cond: ("outer".l_suppkey = "inner".s_suppkey)
                                 ->  Hash Join  (cost=2311445.00..102985544.04 rows=2880228 width=64)
                                       Hash Cond: ("outer".l_partkey = "inner".p_partkey)
                                       ->  Seq Scan on lineitem  (cost=0.00..69142803.64 rows=1800142464 width=56)
                                       ->  Hash  (cost=2311205.00..2311205.00 rows=96000 width=8)
                                             ->  Seq Scan on part  (cost=0.00..2311205.00 rows=96000 width=8)
                                                   Filter: ((p_name)::text ~~ '%green%'::text)
                                 ->  Hash  (cost=110525.00..110525.00 rows=3000000 width=16)
                                       ->  Seq Scan on supplier  (cost=0.00..110525.00 rows=3000000 width=16)
                           ->  Index Scan using i_ps_partkey_suppkey on partsupp  (cost=0.00..5.52 rows=1 width=27)
                                 Index Cond: ((partsupp.ps_partkey = "outer".l_partkey) AND (partsupp.ps_suppkey = "outer".l_suppkey))
                     ->  Index Scan using i_o_orderkey on orders  (cost=0.00..5.19 rows=1 width=12)
                           Index Cond: (orders.o_orderkey = "outer".l_orderkey)
               ->  Seq Scan on nation  (cost=0.00..1.25 rows=25 width=37)
(22 rows)

Regards,
Alexandra DANTE



Re: out of memory during query execution

From: Martijn van Oosterhout

On Tue, Dec 20, 2005 at 01:35:03PM +0100, DANTE ALEXANDRA wrote:
> You will find below the explain plan of one of the queries that
> finished with an "out of memory". This query contains an aggregate and
> a sub-select with 6 joins:

1. Firstly, it could be the Hash node. Does the estimated number of
matches in part (96000 rows) match reality?

2. Secondly, it looks like lineitem could use an index on partkey. Maybe
it could then use a more efficient join?

Do you have indexes on the relevant columns?
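
If not, something like this could add the one suggested above (a sketch;
the index name i_l_partkey and the database name are assumptions):

  # hypothetical index on lineitem.l_partkey, named by analogy with i_l_orderkey
  psql -d TPCH -c 'CREATE INDEX i_l_partkey ON lineitem (l_partkey);'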

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.


Re: out of memory during query execution

From: Seneca Cunningham

DANTE ALEXANDRA wrote:
> The person in charge of building PostgreSQL 8.1.0 did a 32-bit
> build and used the "cc_r" compiler.
> This person did not succeed in building PostgreSQL 8.1.0 with "gcc" in
> 64-bit mode. Unfortunately, I don't have the errors or the logs of the
> 64-bit build and I can't tell you what error occurred.

Too bad; I might have been able to determine what happened with the
gcc build.

> The build was done in 32-bit mode, with the cc_r compiler.
> To build PostgreSQL, an RPM was made and the ".spec" file contained the
> following instructions:
> export OBJECT_MODE=32
> ./configure CC=/usr/vac/bin/cc_r CFLAGS="-O2 -qmaxmem=-1 -qsrcmsg
> -qlargepage" --enable-thread-safety
> --without-readline --prefix=%{buildroot}%{prefix}
> gmake -j 4
> unset OBJECT_MODE
>
> Do you think that my "out of memory" problems are due to the 32-bit
> build?
> Do you think that I must build PostgreSQL in 64-bit mode to solve this
> error?

It is quite likely that the out of memory errors are due to your use of
the default 32-bit memory model.  In that model, a single 256MB memory
segment contains your heap, stack, thread stacks, and other per-process,
non-shared-library data.  Switching to 64-bit would stop the errors if
this is true.  It is also possible to adjust the amount of space
available to a 32-bit process' heap with the -bmaxdata linker option,
but the largest heap size that I would consider safe with 32-bit is 2GB
and comes with the cost of reducing the amount of shared memory
available to the process.

Setting OBJECT_MODE to 64 before the ./configure and gmake should result
in a 64-bit build, but I don't have a copy of IBM's compiler to test
with.  I would be interested in seeing the errors output by the 64-bit
gcc build if another build is attempted.
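
Adapting your .spec snippet, the 64-bit attempt would look something like
this (untested on my side, as noted; the CFLAGS are simply carried over
from your 32-bit build):

  # untested sketch: same configure invocation, but in 64-bit object mode
  export OBJECT_MODE=64
  ./configure CC=/usr/vac/bin/cc_r CFLAGS="-O2 -qmaxmem=-1 -qsrcmsg -qlargepage" \
      --enable-thread-safety --without-readline
  gmake
  unset OBJECT_MODE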

--
Seneca Cunningham
scunning@ca.afilias.info

Re: out of memory during query execution

From: Tom Lane

Martijn van Oosterhout <kleptog@svana.org> writes:
> On Tue, Dec 20, 2005 at 01:35:03PM +0100, DANTE ALEXANDRA wrote:
>> You will find below the explain plan of one of the queries that
>> finished with an "out of memory". This query contains an aggregate and
>> a sub-select with 6 joins:

> 1. Firstly, it could be the Hash node. Does the estimated number of
> matches in part (96000 rows) match reality?

Actually, the hash on "supplier" (3000000 rows) looks like a bigger
risk.  But if this is 8.1 then there is code in there to spill oversize
hash tables to disk, so I don't understand where the memory is going.

The "out of memory" failure should have provoked a MemoryContextStats
report in the postmaster log.  Are there a bunch of lines like
    %s: %ld total in %ld blocks; %ld free (%ld chunks); %ld used
and if so could we see 'em?
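
Something like this should pull them out (a sketch; the log file name is
an assumption, substitute wherever the postmaster's stderr goes):

  # every MemoryContextStats line contains "total in"
  grep 'total in' postmaster.log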

            regards, tom lane

Re: out of memory during query execution

From: Kevin Murphy

I'm certainly not an AIX expert, but I remember my 32-bit AIX programs
being limited to 256MB of heap by default.  When I linked, I think I had
to ask for more maximum data page space using something like:

-bmaxdata:0x40000000

(which asks for 1GB, I believe)
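
For a PostgreSQL build, that could presumably be passed through at
configure time, e.g. (an untested sketch on my part):

  # assumption: gcc-style driver, so the AIX linker flag goes through -Wl,
  LDFLAGS="-Wl,-bmaxdata:0x40000000" ./configure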

-Kevin Murphy


Re: out of memory during query execution

From: DANTE ALEXANDRA

Hello,

The part table contains 60000000 rows, so I think that the estimated
96000 matching rows in part could match reality.

Currently, the lineitem table contains only one index:
TPCH=# \d lineitem
               Table "public.lineitem"
     Column      |         Type          | Modifiers
-----------------+-----------------------+-----------
 l_orderkey      | bigint                | not null
 l_partkey       | bigint                | not null
 l_suppkey       | bigint                | not null
 l_linenumber    | bigint                | not null
 l_quantity      | numeric               |
 l_extendedprice | numeric               |
 l_discount      | numeric               |
 l_tax           | numeric               | not null
 l_returnflag    | character(1)          |
 l_linestatus    | character(1)          |
 l_shipdate      | date                  |
 l_commitdate    | date                  |
 l_receiptdate   | date                  |
 l_shipinstruct  | character(25)         |
 l_shipmode      | character(10)         |
 l_comment       | character varying(44) |
Indexes:
    "i_l_orderkey" btree (l_orderkey), tablespace "tb_index"
Tablespace: "tb_lit"

I think I will try to optimize PostgreSQL in a second step by creating
appropriate indexes.
I don't think this index is on a relevant column for this query.

Regards,
Alexandra DANTE


Re: out of memory during query execution

From: Tom Lane

Kevin Murphy <murphy@genome.chop.edu> writes:
> I'm certainly not an AIX expert, but I remember my 32-bit AIX programs
> being limited to 256MB of heap by default.

Hmm ... if that's the case then it'd probably explain the problem.
Alexandra had work_mem set to 64MB, so the two hashes and sort would
think they could use 3/4ths of the available heap; given that there
are other needs and our management of memory-use limitations is fairly
sloppy, that could easily translate into running out.

So the answer is either to increase the available heap or reduce
work_mem to a smaller fraction of it.
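
For example, in postgresql.conf (a sketch; the exact value is a judgment
call for your workload):

  # work_mem is in KB on 8.1: 16384 = 16MB instead of the current 64MB
  work_mem = 16384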

            regards, tom lane

Re: out of memory during query execution

From: DANTE ALEXANDRA

Hello,

Thank you for all the answers I've got on this problem.

Tom, I've checked this morning whether the logfile contains lines like
 %s: %ld total in %ld blocks; %ld free (%ld chunks); %ld used
and it does.

As I launched a series of queries, I hope that the following lines are the
exact lines for the query that finished with an "out of memory":
TopMemoryContext: 40960 total in 4 blocks; 11248 free (9 chunks); 29712 used
Operator class cache: 8192 total in 1 blocks; 4936 free (0 chunks); 3256 used
Type information cache: 8192 total in 1 blocks; 1864 free (0 chunks); 6328 used
TopTransactionContext: 8192 total in 1 blocks; 7856 free (0 chunks); 336 used
MessageContext: 253952 total in 5 blocks; 7784 free (8 chunks); 246168 used
smgr relation table: 8192 total in 1 blocks; 2872 free (0 chunks); 5320 used
Portal hash: 8192 total in 1 blocks; 3912 free (0 chunks); 4280 used
PortalMemory: 8192 total in 1 blocks; 8040 free (0 chunks); 152 used
PortalHeapMemory: 1024 total in 1 blocks; 880 free (0 chunks); 144 used
ExecutorState: 21586616 total in 25 blocks; 14624 free (33 chunks); 21571992 used
HashTableContext: 8192 total in 1 blocks; 8128 free (2 chunks); 64 used
HashBatchContext: 134152352 total in 22 blocks; 14859064 free (55434 chunks); 119293288 used
HashTableContext: 8192 total in 1 blocks; 8128 free (0 chunks); 64 used
HashBatchContext: 108527800 total in 16 blocks; 1016512 free (24 chunks); 107511288 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 8192 total in 1 blocks; 8128 free (0 chunks); 64 used
AggContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
Relcache by OID: 8192 total in 1 blocks; 3376 free (0 chunks); 4816 used
CacheMemoryContext: 516096 total in 6 blocks; 108608 free (0 chunks); 407488 used
i_o_orderkey: 1024 total in 1 blocks; 392 free (0 chunks); 632 used
i_ps_partkey_suppkey: 1024 total in 1 blocks; 328 free (0 chunks); 696 used
i_l_orderkey: 1024 total in 1 blocks; 392 free (0 chunks); 632 used
pg_index_indrelid_index: 1024 total in 1 blocks; 392 free (0 chunks); 632 used
pg_type_typname_nsp_index: 1024 total in 1 blocks; 328 free (0 chunks); 696 used
pg_type_oid_index: 1024 total in 1 blocks; 392 free (0 chunks); 632 used
pg_trigger_tgrelid_tgname_index: 1024 total in 1 blocks; 328 free (0 chunks); 696 used
pg_statistic_relid_att_index: 1024 total in 1 blocks; 328 free (0 chunks); 696 used
pg_auth_members_member_role_index: 1024 total in 1 blocks; 328 free (0 chunks); 696 used
pg_auth_members_role_member_index: 1024 total in 1 blocks; 328 free (0 chunks); 696 used
pg_rewrite_rel_rulename_index: 1024 total in 1 blocks; 328 free (0 chunks); 696 used
pg_proc_proname_args_nsp_index: 1024 total in 1 blocks; 256 free (0 chunks); 768 used
pg_proc_oid_index: 1024 total in 1 blocks; 392 free (0 chunks); 632 used
pg_operator_oprname_l_r_n_index: 1024 total in 1 blocks; 192 free (0 chunks); 832 used
pg_operator_oid_index: 1024 total in 1 blocks; 392 free (0 chunks); 632 used
pg_opclass_oid_index: 1024 total in 1 blocks; 392 free (0 chunks); 632 used
pg_opclass_am_name_nsp_index: 1024 total in 1 blocks; 256 free (0 chunks); 768 used
pg_namespace_oid_index: 1024 total in 1 blocks; 392 free (0 chunks); 632 used
pg_namespace_nspname_index: 1024 total in 1 blocks; 392 free (0 chunks); 632 used
pg_language_oid_index: 1024 total in 1 blocks; 392 free (0 chunks); 632 used
pg_language_name_index: 1024 total in 1 blocks; 392 free (0 chunks); 632 used
pg_inherits_relid_seqno_index: 1024 total in 1 blocks; 328 free (0 chunks); 696 used
pg_index_indexrelid_index: 1024 total in 1 blocks; 392 free (0 chunks); 632 used
pg_authid_oid_index: 1024 total in 1 blocks; 392 free (0 chunks); 632 used
pg_authid_rolname_index: 1024 total in 1 blocks; 392 free (0 chunks); 632 used
pg_database_datname_index: 1024 total in 1 blocks; 392 free (0 chunks); 632 used
pg_conversion_oid_index: 1024 total in 1 blocks; 392 free (0 chunks); 632 used
pg_conversion_name_nsp_index: 1024 total in 1 blocks; 328 free (0 chunks); 696 used
pg_conversion_default_index: 1024 total in 1 blocks; 192 free (0 chunks); 832 used
pg_class_relname_nsp_index: 1024 total in 1 blocks; 328 free (0 chunks); 696 used
pg_class_oid_index: 1024 total in 1 blocks; 392 free (0 chunks); 632 used
pg_cast_source_target_index: 1024 total in 1 blocks; 328 free (0 chunks); 696 used
pg_attribute_relid_attnum_index: 1024 total in 1 blocks; 328 free (0 chunks); 696 used
pg_attribute_relid_attnam_index: 1024 total in 1 blocks; 328 free (0 chunks); 696 used
pg_amproc_opc_proc_index: 1024 total in 1 blocks; 256 free (0 chunks); 768 used
pg_amop_opr_opc_index: 1024 total in 1 blocks; 328 free (0 chunks); 696 used
pg_amop_opc_strat_index: 1024 total in 1 blocks; 256 free (0 chunks); 768 used
pg_aggregate_fnoid_index: 1024 total in 1 blocks; 392 free (0 chunks); 632 used
MdSmgr: 24576 total in 2 blocks; 9712 free (1 chunks); 14864 used
LockTable (locallock hash): 8192 total in 1 blocks; 3912 free (0 chunks); 4280 used
Timezones: 48408 total in 2 blocks; 5968 free (0 chunks); 42440 used
ErrorContext: 8192 total in 1 blocks; 8176 free (0 chunks); 16 used
ERROR:  out of memory
DETAIL:  Failed on request of size 148.

I hope this answers your question.
Thank you for your help.

Regards,
Alexandra DANTE


Re: out of memory during query execution

From: DANTE ALEXANDRA

Hello,

According to several answers, my problem could be due to the 32-bit build.
I will try to make a 64-bit build in January, in order to give you
more details about any errors I encounter.

I have carefully kept your advice, and I hope I will be able to give you
an answer in January.

Thank you.
Regards,
Alexandra DANTE


Re: out of memory during query execution

From: Martijn van Oosterhout

On Wed, Dec 21, 2005 at 10:22:05AM +0100, DANTE ALEXANDRA wrote:
> Hello,
>
> Thank you for all the answers I've got on this problem.
>
> Tom, I've checked this morning whether the logfile contains lines like
> %s: %ld total in %ld blocks; %ld free (%ld chunks); %ld used
> and it does.

<snip>
> ExecutorState: 21586616 total in 25 blocks; 14624 free (33 chunks); 21571992 used
> HashTableContext: 8192 total in 1 blocks; 8128 free (2 chunks); 64 used
> HashBatchContext: 134152352 total in 22 blocks; 14859064 free (55434 chunks); 119293288 used
> HashTableContext: 8192 total in 1 blocks; 8128 free (0 chunks); 64 used
> HashBatchContext: 108527800 total in 16 blocks; 1016512 free (24 chunks); 107511288 used

These are by far the largest, and they seem to confirm that your total
memory usage is limited to not much more than 256MB, so what that other
poster said may be relevant. Whether that's a reasonable amount for a
hash table to use in your context, I'll leave to someone else...

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.


Re: out of memory during query execution

From: DANTE ALEXANDRA

Hello,

Just to give you some background, I will explain why I have set
"work_mem" to 64MB.

I noticed that some of the queries finished with an "out of memory"
because the file system on which temporary files were created was too
small. Consequently, I increased the size of this file system and set
work_mem to 64MB. The query still spilled into temporary files but did
not finish with an "out of memory".

The next steps for me will be to:
- decrease work_mem
- try to build PostgreSQL in 32-bit mode with the option "-bmaxdata:0x40000000"
- try to build PostgreSQL in 64-bit mode.

I will give you the results as soon as possible.

Thank you for your help.
Regards,
Alexandra DANTE



Re: out of memory during query execution

From: DANTE ALEXANDRA

Hello Tom,

I've got other questions on the work_mem parameter.
On the "http://www.powerpostgresql.com/Downloads/annotated_conf_80.html"
web site, I've read that work_mem specifies the amount of memory to
be used by internal sort operations and hash tables before switching to
temporary disk files. Moreover, for a complex query, several sort or
hash operations might be running in parallel; each one will be allowed
to use as much memory as this value specifies before it starts to spill
into temporary files.

In my case, does this mean that each of the two hashes and the sort will
take 64MB, so 192MB?
What do you mean by "so the two hashes and sort would think
they could use 3/4ths of the available heap"?

Last question: how can I see that my 32-bit AIX program is limited to
256MB of heap, given that the user "pg_810" who launches the postmaster
gets the following when I execute the "ulimit -a" command:
$ ulimit -a
time(seconds)        unlimited
file(blocks)         unlimited
data(kbytes)         unlimited
stack(kbytes)        unlimited
memory(kbytes)       unlimited
coredump(blocks)     unlimited
nofiles(descriptors) 2000

Thank you very much for your help.
Regards,
Alexandra DANTE



Re: out of memory during query execution

From: Tom Lane

DANTE ALEXANDRA <ALEXANDRA.DANTE@BULL.NET> writes:
> In my case, does this mean that each of the two hashes and the sort will
> take 64MB, so 192MB?
> What do you mean by "so the two hashes and sort would think
> they could use 3/4ths of the available heap"?

Right, exactly.  In this particular case I think the top-level sort is
not going to be using much memory because it won't see very many rows,
but potentially it could try to eat 64MB just like each of the hashes;
three such operations at 64MB apiece would be 192MB, or 3/4ths of a
256MB heap.

Your log entries show that the hashes are actually eating over 100MB
apiece.  The memory space estimation for work_mem is not completely
accurate, and is not intended to be, but I would have liked to think
it would be closer than a factor-of-2 error.  Might be worth looking
into exactly what's happening there.

> Last question: how can I see that my 32-bit AIX program is limited to
> 256MB of heap,

For that you need to talk to an AIX expert, which I'm not.

            regards, tom lane

Re: out of memory during query execution

From: Seneca Cunningham

DANTE ALEXANDRA wrote:
> Last question: how can I see that my 32-bit AIX program is limited to
> 256MB of heap, given that the user "pg_810" who launches the postmaster
> gets the following when I execute the "ulimit -a" command:
> $ ulimit -a
> time(seconds)        unlimited
> file(blocks)         unlimited
> data(kbytes)         unlimited
> stack(kbytes)        unlimited
> memory(kbytes)       unlimited
> coredump(blocks)     unlimited
> nofiles(descriptors) 2000

It's actually less than 256MB of heap.  When I saw your ./configure, I
noticed that no special options were passed to the linker, so you're
using the default 32-bit memory model.  Even an unlimited ulimit does
not allow for more memory than the model sets aside.

You can try setting the environment variable LDR_CNTRL to
MAXDATA=0x40000000 (where the first digit is the number of 256MB
segments to allocate to heap, max 8) before starting the postmaster, at
the cost of reducing the amount of shared memory addressable by postgres.
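
In shell terms, something like this (a sketch; the pg_ctl invocation and
data directory are assumptions about your installation):

  # assumption: MAXDATA=0x40000000 gives 4 x 256MB data segments of heap
  export LDR_CNTRL=MAXDATA=0x40000000
  pg_ctl start -D /path/to/data
  unset LDR_CNTRL    # keep the setting from leaking into other programs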

Diagram of default memory model (Figure 3-3):
  <http://www.redbooks.ibm.com/redbooks/SG245674/images/11-08-05.jpg>

The redbook it's from, "Developing and Porting C and C++ Applications on
AIX":
  <http://www.redbooks.ibm.com/redbooks/SG245674/>

--
Seneca Cunningham
scunning@ca.afilias.info