Thread: out of memory during query execution
Hello,

I am a PostgreSQL newbie. I work with the 8.1.0 release on AIX 5.3, with 300GB of data. Some of the queries launched on this database finish with an "out of memory" error. The queries that fail contain a lot of joins (between 6 tables), sub-selects, and aggregates. For these queries, the log file contains:

  psql:Q9.sql:40: ERROR: out of memory
  DETAIL: Failed on request of size 148.

On the server used, I have 3GB of memory and 1 CPU. The settings specified in postgresql.conf are:

  # - Memory -
  shared_buffers = 12288
  #temp_buffers = 1000
  #max_prepared_transactions = 5
  work_mem = 65536
  maintenance_work_mem = 262144
  max_stack_depth = 24574

Are some of these values wrong? Is the "out of memory" error due to insufficient available memory? Has someone ever seen this problem?

Thank you for your help.

Regards,
Alexandra DANTE
ALEXANDRA.DANTE@BULL.NET (DANTE ALEXANDRA) writes:
> I am a PostgreSQL newbie. I work with the 8.1.0 release on AIX 5.3,
> with 300GB of data. Some of the queries launched on this database
> finish with an "out of memory" error. [...]
>
>   psql:Q9.sql:40: ERROR: out of memory
>   DETAIL: Failed on request of size 148.
>
> Are some of these values wrong? Is the "out of memory" error due to
> insufficient available memory? Has someone ever seen this problem?

We have seen this problem...

It's *probably* related to the memory model you're using.

I have thus far evaded *fully* understanding the details (and hope that
can persist!), but here are some of the things to consider:

- By default, AIX really prefers to build 32-bit binaries.

- The sorta-hacks that IBM put in place on library segmentation (and
  this stuff is quite ghastly) mean that any backend will likely have
  quite a bit less than 2GB of even theoretically-available memory
  space.

The problem is probably that the memory model is throttling you to
*WAY* less than 2GB of memory.

You may want to try a 64-bit build. With GCC, this requires something
like the following ./configure incantation:

  CC="gcc -maix64" LDFLAGS="-Wl,-bbigtoc" ./configure

-- 
(reverse (concatenate 'string "moc.enworbbc" "@" "enworbbc"))
http://cbbrowne.com/info/sgml.html
Chris Browne wrote:
> The problem is probably that the memory model is throttling you to
> *WAY* less than 2GB of memory.
>
> You may want to try a 64-bit build. With GCC, this requires something
> like the following ./configure incantation...
>
>   CC="gcc -maix64" LDFLAGS="-Wl,-bbigtoc" ./configure

You'll also want to export OBJECT_MODE=64 before the ./configure line.

-- 
Seneca Cunningham
scunning@ca.afilias.info
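A quick way to check which object mode a finished build actually produced is to inspect the binary itself. A sketch only, assuming a default install prefix; the exact output wording varies by AIX level:

  # path illustrative; point at your actual install
  file /usr/local/pgsql/bin/postgres
  # a 64-bit build reports something like "64-bit XCOFF executable";
  # a 32-bit build reports "executable (RISC System/6000)"

  # the loader header also records the maxdata value linked in
  dump -X32_64 -ov /usr/local/pgsql/bin/postgres | grep -i maxdata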
DANTE ALEXANDRA <ALEXANDRA.DANTE@BULL.NET> writes:
> I am a PostgreSQL newbie. I work with the 8.1.0 release on AIX 5.3,
> with 300GB of data. Some of the queries launched on this database
> finish with an "out of memory" error. [...]
>
>   psql:Q9.sql:40: ERROR: out of memory
>   DETAIL: Failed on request of size 148.

Hmm ... what ulimit settings are you running the postmaster under?
Could we see the EXPLAIN plans for some of the failing queries?

			regards, tom lane
Hello,

Thank you for your answer.

The person in charge of building PostgreSQL 8.1.0 did a 32-bit build and used the "cc_r" compiler. This person did not succeed in building PostgreSQL 8.1.0 with "gcc" in 64-bit mode. Unfortunately, I don't have the errors or the logs of the 64-bit build, so I can't tell you what error occurred.

The build was done in 32-bit mode, with the cc_r compiler. To build PostgreSQL, an RPM was made, and the ".spec" file contained the following instructions:

  export OBJECT_MODE=32
  ./configure CC=/usr/vac/bin/cc_r CFLAGS="-O2 -qmaxmem=-1 -qsrcmsg -qlargepage" --enable-thread-safety --without-readline --prefix=%{buildroot}%{prefix}
  gmake -j 4
  unset OBJECT_MODE

Do you think that my "out of memory" problems are due to the 32-bit build? Do you think that I must build PostgreSQL in 64-bit mode to solve this error?

Thank you for your help.

Regards,
Alexandra DANTE

Chris Browne a écrit :
> The problem is probably that the memory model is throttling you to
> *WAY* less than 2GB of memory.
>
> You may want to try a 64-bit build. [...]
Hello,

The postmaster is launched by the user "pg_810", who is not the root user. When I launch the "ulimit -a" command, I get:

  $ ulimit -a
  time(seconds)        unlimited
  file(blocks)         unlimited
  data(kbytes)         unlimited
  stack(kbytes)        unlimited
  memory(kbytes)       unlimited
  coredump(blocks)     unlimited
  nofiles(descriptors) 2000

You will find below the explain plan of one of the queries which finished with "out of memory". This query contains an aggregate and a sub-select with 6 joins:

                                   QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
 GroupAggregate  (cost=103283274.03..103283274.07 rows=1 width=76)
   ->  Sort  (cost=103283274.03..103283274.04 rows=1 width=76)
         Sort Key: nation.n_name, date_part('year'::text, (orders.o_orderdate)::timestamp without time zone)
         ->  Nested Loop  (cost=2447049.00..103283274.02 rows=1 width=76)
               Join Filter: ("outer".s_nationkey = "inner".n_nationkey)
               ->  Nested Loop  (cost=2447049.00..103283272.45 rows=1 width=55)
                     ->  Nested Loop  (cost=2447049.00..103283267.25 rows=1 width=59)
                           ->  Hash Join  (cost=2447049.00..103256685.03 rows=4800 width=80)
                                 Hash Cond: ("outer".l_suppkey = "inner".s_suppkey)
                                 ->  Hash Join  (cost=2311445.00..102985544.04 rows=2880228 width=64)
                                       Hash Cond: ("outer".l_partkey = "inner".p_partkey)
                                       ->  Seq Scan on lineitem  (cost=0.00..69142803.64 rows=1800142464 width=56)
                                       ->  Hash  (cost=2311205.00..2311205.00 rows=96000 width=8)
                                             ->  Seq Scan on part  (cost=0.00..2311205.00 rows=96000 width=8)
                                                   Filter: ((p_name)::text ~~ '%green%'::text)
                                 ->  Hash  (cost=110525.00..110525.00 rows=3000000 width=16)
                                       ->  Seq Scan on supplier  (cost=0.00..110525.00 rows=3000000 width=16)
                           ->  Index Scan using i_ps_partkey_suppkey on partsupp  (cost=0.00..5.52 rows=1 width=27)
                                 Index Cond: ((partsupp.ps_partkey = "outer".l_partkey) AND (partsupp.ps_suppkey = "outer".l_suppkey))
                     ->  Index Scan using i_o_orderkey on orders  (cost=0.00..5.19 rows=1 width=12)
                           Index Cond: (orders.o_orderkey = "outer".l_orderkey)
               ->  Seq Scan on nation  (cost=0.00..1.25 rows=25 width=37)
(22 rows)

Regards,
Alexandra DANTE

Tom Lane a écrit :
> Hmm ... what ulimit settings are you running the postmaster under?
> Could we see the EXPLAIN plans for some of the failing queries?
On Tue, Dec 20, 2005 at 01:35:03PM +0100, DANTE ALEXANDRA wrote:
> You will find below the explain plan of one of the queries which
> finished with "out of memory". This query contains an aggregate and a
> sub-select with 6 joins:

1. Firstly, it could be the Hash node. Does the estimated number of
matches in part (96000 rows) match reality?

2. Secondly, it looks like lineitem could use an index on partkey. Maybe
it could then use a more efficient join? Do you have indexes on the
relevant columns?

Have a nice day,
-- 
Martijn van Oosterhout <kleptog@svana.org>   http://svana.org/kleptog/
DANTE ALEXANDRA wrote:
> The person in charge of building PostgreSQL 8.1.0 did a 32-bit build
> and used the "cc_r" compiler. This person did not succeed in building
> PostgreSQL 8.1.0 with "gcc" in 64-bit mode. Unfortunately, I don't
> have the errors or the logs of the 64-bit build, so I can't tell you
> what error occurred.

Too bad, I may have been able to determine what had happened with the
gcc build.

> [...]
>
> Do you think that my "out of memory" problems are due to the 32-bit
> build? Do you think that I must build PostgreSQL in 64-bit mode to
> solve this error?

It is quite likely that the out of memory errors are due to your use of
the default 32-bit memory model. In that model, a single 256MB memory
segment contains your heap, stack, thread stacks, and other per-process,
non-shared-library data. Switching to 64-bit would stop the errors if
this is true. It is also possible to adjust the amount of space
available to a 32-bit process' heap with the -bmaxdata linker option,
but the largest heap size that I would consider safe with 32-bit is 2GB,
and it comes with the cost of reducing the amount of shared memory
available to the process.

Setting OBJECT_MODE to 64 before the ./configure and gmake should result
in a 64-bit build, but I don't have a copy of IBM's compiler to test
with. I would be interested in seeing the errors output by the 64-bit
gcc build if another build is attempted.

-- 
Seneca Cunningham
scunning@ca.afilias.info
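Concretely, this suggestion amounts to flipping one line in the .spec recipe quoted earlier in the thread. A sketch only, untested (as noted above, verifying it needs IBM's compiler):

  export OBJECT_MODE=64
  ./configure CC=/usr/vac/bin/cc_r CFLAGS="-O2 -qmaxmem=-1 -qsrcmsg -qlargepage" \
      --enable-thread-safety --without-readline --prefix=%{buildroot}%{prefix}
  gmake -j 4
  unset OBJECT_MODE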
Martijn van Oosterhout <kleptog@svana.org> writes:
> On Tue, Dec 20, 2005 at 01:35:03PM +0100, DANTE ALEXANDRA wrote:
>> You will find below the explain plan of one of the queries which
>> finished with "out of memory". [...]

> 1. Firstly, it could be the Hash node. Does the estimated number of
> matches in part (96000 rows) match reality?

Actually, the hash on "supplier" (3000000 rows) looks like a bigger
risk. But if this is 8.1 then there is code in there to spill oversize
hash tables to disk, so I don't understand where the memory is going.

The "out of memory" failure should have provoked a MemoryContextStats
report in the postmaster log. Are there a bunch of lines like

  %s: %ld total in %ld blocks; %ld free (%ld chunks); %ld used

and if so could we see 'em?

			regards, tom lane
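Those report lines are easy to pull out of the server log. A sketch; the log file name is illustrative and depends on how logging is configured:

  # list memory-context dump lines with their line numbers
  grep -n "total in .* blocks" /path/to/postmaster.log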
I'm certainly not an AIX expert, but I remember my 32-bit AIX programs
being limited to 256MB of heap by default. When I linked, I think I had
to ask for more maximum data page space using something like:

  -bmaxdata:0x40000000

(which asks for 1GB, I believe)

-Kevin Murphy
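For a PostgreSQL build, that linker flag could be passed through configure roughly as follows. A hypothetical sketch, not from the thread; with cc_r the -b options pass straight to the linker, while a gcc build would need the -Wl, prefix instead:

  # 32-bit build with an enlarged heap: 0x40000000 reserves 4 x 256MB
  # data segments (1GB). Untested; flags as discussed above.
  export OBJECT_MODE=32
  LDFLAGS="-bmaxdata:0x40000000" ./configure CC=/usr/vac/bin/cc_r
  gmake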
Hello,

The part table contains 60000000 rows, so I think that the estimated 96000 matching rows in part could match reality.

Currently, the lineitem table contains only one index:

  TPCH=# \d lineitem
                Table "public.lineitem"
       Column      |         Type          | Modifiers
  -----------------+-----------------------+-----------
   l_orderkey      | bigint                | not null
   l_partkey       | bigint                | not null
   l_suppkey       | bigint                | not null
   l_linenumber    | bigint                | not null
   l_quantity      | numeric               |
   l_extendedprice | numeric               |
   l_discount      | numeric               |
   l_tax            | numeric              | not null
   l_returnflag    | character(1)          |
   l_linestatus    | character(1)          |
   l_shipdate      | date                  |
   l_commitdate    | date                  |
   l_receiptdate   | date                  |
   l_shipinstruct  | character(25)         |
   l_shipmode      | character(10)         |
   l_comment       | character varying(44) |
  Indexes:
      "i_l_orderkey" btree (l_orderkey), tablespace "tb_index"
  Tablespace: "tb_lit"

I think I will try to optimize PostgreSQL as a second step, by creating appropriate indexes. I don't think that the existing index is on a relevant column for this query.

Regards,
Alexandra DANTE

Martijn van Oosterhout a écrit :
> 2. Secondly, it looks like lineitem could use an index on partkey.
> Maybe it could then use a more efficient join? Do you have indexes on
> the relevant columns?
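Following Martijn's suggestion, the missing index could be created along these lines. A sketch only; the index name is made up here, and building it on a 1.8-billion-row lineitem table will take considerable time and disk space:

  # hypothetical index on the join column suggested above
  psql TPCH -c 'CREATE INDEX i_l_partkey ON lineitem (l_partkey) TABLESPACE tb_index;'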
Kevin Murphy <murphy@genome.chop.edu> writes:
> I'm certainly not an AIX expert, but I remember my 32-bit AIX programs
> being limited to 256MB of heap by default.

Hmm ... if that's the case then it'd probably explain the problem.
Alexandra had work_mem set to 64MB, so the two hashes and sort would
think they could use 3/4ths of the available heap; given that there are
other needs and our management of memory-use limitations is fairly
sloppy, that could easily translate into running out.

So the answer is either to increase the available heap or reduce
work_mem to a smaller fraction of it.

			regards, tom lane
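Reducing work_mem needs no rebuild and can be done per session. A sketch, with the 16MB value purely illustrative (in 8.1 the setting is an integer number of KB):

  # re-run the failing script with a smaller per-operation memory budget
  psql TPCH <<'EOF'
  SET work_mem = 16384;
  \i Q9.sql
  EOF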
Hello,

Thank you for all the answers I've got on this problem.

Tom, I've checked this morning whether the logfile contains lines like

  %s: %ld total in %ld blocks; %ld free (%ld chunks); %ld used

and it is the case. As I launch a series of queries, I hope that the following lines are the exact lines for the query which finished with an "out of memory":

  TopMemoryContext: 40960 total in 4 blocks; 11248 free (9 chunks); 29712 used
  Operator class cache: 8192 total in 1 blocks; 4936 free (0 chunks); 3256 used
  Type information cache: 8192 total in 1 blocks; 1864 free (0 chunks); 6328 used
  TopTransactionContext: 8192 total in 1 blocks; 7856 free (0 chunks); 336 used
  MessageContext: 253952 total in 5 blocks; 7784 free (8 chunks); 246168 used
  smgr relation table: 8192 total in 1 blocks; 2872 free (0 chunks); 5320 used
  Portal hash: 8192 total in 1 blocks; 3912 free (0 chunks); 4280 used
  PortalMemory: 8192 total in 1 blocks; 8040 free (0 chunks); 152 used
  PortalHeapMemory: 1024 total in 1 blocks; 880 free (0 chunks); 144 used
  ExecutorState: 21586616 total in 25 blocks; 14624 free (33 chunks); 21571992 used
  HashTableContext: 8192 total in 1 blocks; 8128 free (2 chunks); 64 used
  HashBatchContext: 134152352 total in 22 blocks; 14859064 free (55434 chunks); 119293288 used
  HashTableContext: 8192 total in 1 blocks; 8128 free (0 chunks); 64 used
  HashBatchContext: 108527800 total in 16 blocks; 1016512 free (24 chunks); 107511288 used
  ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
  ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
  ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
  ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
  ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
  ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
  ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
  ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
  ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
  ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
  ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
  ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
  ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
  ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
  ExprContext: 8192 total in 1 blocks; 8128 free (0 chunks); 64 used
  AggContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
  ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
  ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
  Relcache by OID: 8192 total in 1 blocks; 3376 free (0 chunks); 4816 used
  CacheMemoryContext: 516096 total in 6 blocks; 108608 free (0 chunks); 407488 used
  i_o_orderkey: 1024 total in 1 blocks; 392 free (0 chunks); 632 used
  i_ps_partkey_suppkey: 1024 total in 1 blocks; 328 free (0 chunks); 696 used
  i_l_orderkey: 1024 total in 1 blocks; 392 free (0 chunks); 632 used
  pg_index_indrelid_index: 1024 total in 1 blocks; 392 free (0 chunks); 632 used
  pg_type_typname_nsp_index: 1024 total in 1 blocks; 328 free (0 chunks); 696 used
  pg_type_oid_index: 1024 total in 1 blocks; 392 free (0 chunks); 632 used
  pg_trigger_tgrelid_tgname_index: 1024 total in 1 blocks; 328 free (0 chunks); 696 used
  pg_statistic_relid_att_index: 1024 total in 1 blocks; 328 free (0 chunks); 696 used
  pg_auth_members_member_role_index: 1024 total in 1 blocks; 328 free (0 chunks); 696 used
  pg_auth_members_role_member_index: 1024 total in 1 blocks; 328 free (0 chunks); 696 used
  pg_rewrite_rel_rulename_index: 1024 total in 1 blocks; 328 free (0 chunks); 696 used
  pg_proc_proname_args_nsp_index: 1024 total in 1 blocks; 256 free (0 chunks); 768 used
  pg_proc_oid_index: 1024 total in 1 blocks; 392 free (0 chunks); 632 used
  pg_operator_oprname_l_r_n_index: 1024 total in 1 blocks; 192 free (0 chunks); 832 used
  pg_operator_oid_index: 1024 total in 1 blocks; 392 free (0 chunks); 632 used
  pg_opclass_oid_index: 1024 total in 1 blocks; 392 free (0 chunks); 632 used
  pg_opclass_am_name_nsp_index: 1024 total in 1 blocks; 256 free (0 chunks); 768 used
  pg_namespace_oid_index: 1024 total in 1 blocks; 392 free (0 chunks); 632 used
  pg_namespace_nspname_index: 1024 total in 1 blocks; 392 free (0 chunks); 632 used
  pg_language_oid_index: 1024 total in 1 blocks; 392 free (0 chunks); 632 used
  pg_language_name_index: 1024 total in 1 blocks; 392 free (0 chunks); 632 used
  pg_inherits_relid_seqno_index: 1024 total in 1 blocks; 328 free (0 chunks); 696 used
  pg_index_indexrelid_index: 1024 total in 1 blocks; 392 free (0 chunks); 632 used
  pg_authid_oid_index: 1024 total in 1 blocks; 392 free (0 chunks); 632 used
  pg_authid_rolname_index: 1024 total in 1 blocks; 392 free (0 chunks); 632 used
  pg_database_datname_index: 1024 total in 1 blocks; 392 free (0 chunks); 632 used
  pg_conversion_oid_index: 1024 total in 1 blocks; 392 free (0 chunks); 632 used
  pg_conversion_name_nsp_index: 1024 total in 1 blocks; 328 free (0 chunks); 696 used
  pg_conversion_default_index: 1024 total in 1 blocks; 192 free (0 chunks); 832 used
  pg_class_relname_nsp_index: 1024 total in 1 blocks; 328 free (0 chunks); 696 used
  pg_class_oid_index: 1024 total in 1 blocks; 392 free (0 chunks); 632 used
  pg_cast_source_target_index: 1024 total in 1 blocks; 328 free (0 chunks); 696 used
  pg_attribute_relid_attnum_index: 1024 total in 1 blocks; 328 free (0 chunks); 696 used
  pg_attribute_relid_attnam_index: 1024 total in 1 blocks; 328 free (0 chunks); 696 used
  pg_amproc_opc_proc_index: 1024 total in 1 blocks; 256 free (0 chunks); 768 used
  pg_amop_opr_opc_index: 1024 total in 1 blocks; 328 free (0 chunks); 696 used
  pg_amop_opc_strat_index: 1024 total in 1 blocks; 256 free (0 chunks); 768 used
  pg_aggregate_fnoid_index: 1024 total in 1 blocks; 392 free (0 chunks); 632 used
  MdSmgr: 24576 total in 2 blocks; 9712 free (1 chunks); 14864 used
  LockTable (locallock hash): 8192 total in 1 blocks; 3912 free (0 chunks); 4280 used
  Timezones: 48408 total in 2 blocks; 5968 free (0 chunks); 42440 used
  ErrorContext: 8192 total in 1 blocks; 8176 free (0 chunks); 16 used
  ERROR: out of memory
  DETAIL: Failed on request of size 148.

I hope this answers your question. Thank you for your help.

Regards,
Alexandra DANTE

Tom Lane a écrit :
> The "out of memory" failure should have provoked a MemoryContextStats
> report in the postmaster log. Are there a bunch of lines like
>   %s: %ld total in %ld blocks; %ld free (%ld chunks); %ld used
> and if so could we see 'em?
Hello,

According to several answers, my problem could be due to the 32-bit build. I will try to make a 64-bit build in January, in order to give you more explanation about any errors I encounter. I have taken careful note of your advice, and I hope I will be able to give you an answer in January.

Thank you.

Regards,
Alexandra DANTE

Seneca Cunningham a écrit :
> It is quite likely that the out of memory errors are due to your use
> of the default 32-bit memory model. [...] I would be interested in
> seeing the errors output by the 64-bit gcc build if another build is
> attempted.
On Wed, Dec 21, 2005 at 10:22:05AM +0100, DANTE ALEXANDRA wrote:
> Tom, I've checked this morning whether the logfile contains lines like
>   %s: %ld total in %ld blocks; %ld free (%ld chunks); %ld used
> and it is the case.

<snip>

> ExecutorState: 21586616 total in 25 blocks; 14624 free (33 chunks); 21571992 used
> HashTableContext: 8192 total in 1 blocks; 8128 free (2 chunks); 64 used
> HashBatchContext: 134152352 total in 22 blocks; 14859064 free (55434 chunks); 119293288 used
> HashTableContext: 8192 total in 1 blocks; 8128 free (0 chunks); 64 used
> HashBatchContext: 108527800 total in 16 blocks; 1016512 free (24 chunks); 107511288 used

These are by far the largest and seem to confirm that your total memory
usage is limited to not much more than 256MB, so what that other poster
said may be relevant. Whether that is a reasonable amount for a hash
table to use in your context, I'll leave to someone else...

Have a nice day,
-- 
Martijn van Oosterhout <kleptog@svana.org>   http://svana.org/kleptog/
Hello,

Just to give you some background, I will explain why I set work_mem to 64MB. I noticed that some of the queries finished with an "out of memory" because the file system on which temporary files were created was too small. Consequently, I increased the size of this file system and set work_mem to 64MB. The query still spilled into temporary files but no longer finished with an "out of memory".

The next steps for me will be to:
- decrease work_mem
- try to build PostgreSQL in 32-bit mode with the option "-bmaxdata:0x40000000"
- try to build PostgreSQL in 64-bit mode

I will give you the results as soon as possible.

Thank you for your help.

Regards,
Alexandra DANTE

Tom Lane a écrit :
> So the answer is either to increase the available heap or reduce
> work_mem to a smaller fraction of it.
Hello Tom,

I have other questions about the work_mem parameter. On the "http://www.powerpostgresql.com/Downloads/annotated_conf_80.html" web site, I've read that work_mem specifies the amount of memory to be used by internal sort operations and hash tables before switching to temporary disk files. Moreover, for a complex query, several sort or hash operations might run in parallel; each one will be allowed to use as much memory as this value specifies before it starts to spill into temporary files.

In my case, does this mean that each of the two hashes and the sort will take 64MB, so 192MB in total?

What do you mean by "so the two hashes and sort would think they could use 3/4ths of the available heap"?

Last question: how can I see that my 32-bit AIX program is limited to 256MB of heap, given that the user "pg_810" used to launch the postmaster gets the following when I execute the "ulimit -a" command:

  $ ulimit -a
  time(seconds)        unlimited
  file(blocks)         unlimited
  data(kbytes)         unlimited
  stack(kbytes)        unlimited
  memory(kbytes)       unlimited
  coredump(blocks)     unlimited
  nofiles(descriptors) 2000

Thank you very much for your help.

Regards,
Alexandra DANTE

Tom Lane a écrit :
> Hmm ... if that's the case then it'd probably explain the problem.
> Alexandra had work_mem set to 64MB, so the two hashes and sort would
> think they could use 3/4ths of the available heap; [...]
DANTE ALEXANDRA <ALEXANDRA.DANTE@BULL.NET> writes:
> In my case, does this mean that each of the two hashes and the sort
> will take 64MB, so 192MB in total?
> What do you mean by "so the two hashes and sort would think they could
> use 3/4ths of the available heap"?

Right, exactly. In this particular case I think the top-level sort is
not going to be using much memory because it won't see very many rows,
but potentially it could try to eat 64MB just like each of the hashes.

Your log entries show that the hashes are actually eating over 100MB
apiece. The memory space estimation for work_mem is not completely
accurate, and is not intended to be, but I would have liked to think it
would be closer than a factor-of-2 error. Might be worth looking into
exactly what's happening there.

> Last question: how can I see that my 32-bit AIX program is limited to
> 256MB of heap,

For that you need to talk to an AIX expert, which I'm not.

			regards, tom lane
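As a concrete check of that factor-of-2 point, here is the arithmetic using the numbers from the log excerpt earlier in the thread (added for clarity; not part of the original exchange):

  work_mem budget per hash:      65536 KB      =  64 MB
  HashBatchContext #1 (actual):  134152352 B   = ~128 MB
  HashBatchContext #2 (actual):  108527800 B   = ~104 MB
  plus ExecutorState:            21586616 B    =  ~21 MB
                                 total         = ~253 MB

which lands almost exactly at the 256MB ceiling of the default 32-bit AIX data segment described below.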
DANTE ALEXANDRA wrote:
> Last question: how can I see that my 32-bit AIX program is limited to
> 256MB of heap, given that the user "pg_810" used to launch the
> postmaster gets the following when I execute the "ulimit -a" command:
>
>   $ ulimit -a
>   time(seconds)        unlimited
>   file(blocks)         unlimited
>   data(kbytes)         unlimited
>   stack(kbytes)        unlimited
>   memory(kbytes)       unlimited
>   coredump(blocks)     unlimited
>   nofiles(descriptors) 2000

It's actually less than 256MB of heap. When I saw your ./configure, I
noticed that no special options were passed to the linker, so you're
using the default 32-bit memory model. Even an unlimited ulimit does
not allow for more memory than the model sets aside. You can try
setting the environment variable LDR_CNTRL to MAXDATA=0x40000000 (where
the first digit is the number of 256MB segments to allocate to heap,
max 8) before starting the postmaster, at the cost of reducing the
amount of shared memory addressable by postgres.

Diagram of the default memory model (Figure 3-3):
<http://www.redbooks.ibm.com/redbooks/SG245674/images/11-08-05.jpg>

The redbook it's from, "Developing and Porting C and C++ Applications
on AIX":
<http://www.redbooks.ibm.com/redbooks/SG245674/>

-- 
Seneca Cunningham
scunning@ca.afilias.info
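In practice, that environment-variable approach would look something like this. A sketch only; the data directory path is illustrative, and (per the note above) the variable must be in the postmaster's environment at startup so the backends it forks are covered:

  # hypothetical startup: 0x40000000 = 4 x 256MB segments (1GB) of heap
  export LDR_CNTRL=MAXDATA=0x40000000
  pg_ctl -D /home/pg_810/data start
  unset LDR_CNTRL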