Prevent out of memory errors by reducing work_mem?

From: Jan Strube
Hi,

I'm getting an out of memory error running the following query over 6
tables (the *BASE* tables have over 1 million rows each) on PostgreSQL
9.1. The machine has 4GB RAM:

SELECT *
FROM dtng."Z_BASE" zb
LEFT JOIN dtng."Z_BASE_COUNTRY" zbc ON zb."ISIN" = zbc."ISIN"
LEFT JOIN dtng."PRODUCT_TYPES" pt ON zb."ID_PRODUCT_TYPE" = pt."ID_PRODUCT_TYPE"
JOIN voola.export_product_groups epg ON pt."ID_PRODUCT_GROUP" = epg.id_product_group
FULL OUTER JOIN warrants."W_BASE" wb ON zb."ISIN" = wb."ISIN"
LEFT JOIN warrants."W_BASE_COUNTRY" wbc ON wb."ISIN" = wbc."ISIN"
WHERE coalesce(zbc."ID_COUNTRY", wbc."ID_COUNTRY") = 'DE'

This is the query plan:

Hash Right Join  (cost=1498106.22..2102918.77 rows=7487 width=2708)
  Hash Cond: ((wbc."ISIN")::bpchar = (wb."ISIN")::bpchar)
  Filter: ((COALESCE(zbc."ID_COUNTRY", wbc."ID_COUNTRY"))::bpchar = 'DE'::bpchar)
  ->  Seq Scan on "W_BASE_COUNTRY" wbc  (cost=0.00..45668.41 rows=1497341 width=160)
  ->  Hash  (cost=1015864.28..1015864.28 rows=1474955 width=2548)
        ->  Hash Full Join  (cost=420009.31..1015864.28 rows=1474955 width=2548)
              Hash Cond: ((zb."ISIN")::bpchar = (wb."ISIN")::bpchar)
              ->  Hash Right Join  (cost=266400.82..518612.27 rows=505517 width=2341)
                    Hash Cond: ((zbc."ISIN")::bpchar = (zb."ISIN")::bpchar)
                    ->  Seq Scan on "Z_BASE_COUNTRY" zbc  (cost=0.00..47831.60 rows=1614860 width=106)
                    ->  Hash  (cost=120372.86..120372.86 rows=505517 width=2235)
                          ->  Hash Join  (cost=662.44..120372.86 rows=505517 width=2235)
                                Hash Cond: (zb."ID_PRODUCT_TYPE" = pt."ID_PRODUCT_TYPE")
                                ->  Seq Scan on "Z_BASE" zb  (cost=0.00..106484.75 rows=1634275 width=1377)
                                ->  Hash  (cost=631.12..631.12 rows=2506 width=858)
                                      ->  Hash Join  (cost=2.64..631.12 rows=2506 width=858)
                                            Hash Cond: (pt."ID_PRODUCT_GROUP" = (epg.id_product_group)::bpchar)
                                            ->  Seq Scan on "PRODUCT_TYPES" pt  (cost=0.00..573.03 rows=8103 width=853)
                                            ->  Hash  (cost=1.73..1.73 rows=73 width=5)
                                                  ->  Seq Scan on export_product_groups epg  (cost=0.00..1.73 rows=73 width=5)
              ->  Hash  (cost=93399.55..93399.55 rows=1474955 width=207)
                    ->  Seq Scan on "W_BASE" wb  (cost=0.00..93399.55 rows=1474955 width=207)

I tried reducing work_mem from 8MB to 64kB to force the hash joins to
use temporary files instead of working memory, as described in Section
18.4.1 of the documentation, but that didn't help.
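
For reference, the session-level change looked like this (a sketch; the
same setting can of course also be changed in postgresql.conf):

SET work_mem = '64kB';  -- minimum allowed value; should make the hash joins batch to temp files

The query was then re-run in the same session.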

I know the query is poorly written, and I have already rewritten it to
use less memory. My question is whether (and how) it is possible to
prevent the out of memory error at the price of speed, i.e. by using
disk instead. I have already searched the internet for an answer, but
without luck.
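
One idea I had was to disable hash joins entirely for the session,
hoping the planner would fall back to merge or nested loop plans, but I
don't know whether that is a sensible approach:

SET enable_hashjoin = off;  -- planner avoids hash join plans for this session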

Thanks a lot,
Jan

P.S.: Here is the Postgres log:

TopMemoryContext: 2181968 total in 13 blocks; 9888 free (12 chunks); 2172080 used
TopTransactionContext: 8192 total in 1 blocks; 7112 free (0 chunks); 1080 used
Type information cache: 24576 total in 2 blocks; 11888 free (5 chunks); 12688 used
Record information cache: 24576 total in 2 blocks; 15984 free (5 chunks); 8592 used
TableSpace cache: 8192 total in 1 blocks; 3216 free (0 chunks); 4976 used
Operator lookup cache: 24576 total in 2 blocks; 11888 free (5 chunks); 12688 used
Prepared Queries: 24576 total in 2 blocks; 15984 free (5 chunks); 8592 used
MessageContext: 8192 total in 1 blocks; 6976 free (0 chunks); 1216 used
Operator class cache: 8192 total in 1 blocks; 1680 free (0 chunks); 6512 used
smgr relation table: 24576 total in 2 blocks; 5696 free (4 chunks); 18880 used
TransactionAbortContext: 32768 total in 1 blocks; 32736 free (0 chunks); 32 used
Portal hash: 8192 total in 1 blocks; 1680 free (0 chunks); 6512 used
PortalMemory: 8192 total in 1 blocks; 7888 free (0 chunks); 304 used
PortalHoldContext: 24576 total in 2 blocks; 15888 free (4 chunks); 8688 used
PortalHeapMemory: 534144 total in 68 blocks; 6656 free (14 chunks); 527488 used
ExecutorState: 3442985408 total in 412394 blocks; 5173848 free (16 chunks); 3437811560 used
HashTableContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
HashBatchContext: 16384 total in 1 blocks; 2304 free (0 chunks); 14080 used
HashTableContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
HashBatchContext: 262144 total in 5 blocks; 74976 free (3 chunks); 187168 used
HashTableContext: 8192 total in 1 blocks; 6080 free (0 chunks); 2112 used
HashBatchContext: 4194304 total in 9 blocks; 79984 free (11 chunks); 4114320 used
HashTableContext: 8192 total in 1 blocks; 7104 free (0 chunks); 1088 used
HashBatchContext: 8421424 total in 12 blocks; 2017200 free (13 chunks); 6404224 used
HashTableContext: 67166304 total in 5 blocks; 57248 free (15 chunks); 67109056 used
HashBatchContext: 234881024 total in 37 blocks; 259488 free (62 chunks); 234621536 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 8192 total in 1 blocks; 8160 free (0 chunks); 32 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 8192 total in 1 blocks; 8160 free (0 chunks); 32 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 8192 total in 1 blocks; 8160 free (0 chunks); 32 used
ExprContext: 8192 total in 1 blocks; 8160 free (0 chunks); 32 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
Relcache by OID: 24576 total in 2 blocks; 12832 free (3 chunks); 11744 used
CacheMemoryContext: 1390112 total in 23 blocks; 70696 free (1 chunks); 1319416 used
unnamed prepared statement: 516096 total in 6 blocks; 192440 free (4 chunks); 323656 used
tmp_idx: 2048 total in 1 blocks; 72 free (0 chunks); 1976 used
PRODUCT_TYPES_pkey: 2048 total in 1 blocks; 728 free (0 chunks); 1320 used
PRODUCT_TYPES_ID_PRODUCT_TYPE_key: 2048 total in 1 blocks; 712 free (0 chunks); 1336 used
W_BASE_COUNTRY_NSIN: 2048 total in 1 blocks; 776 free (0 chunks); 1272 used
W_BASE_COUNTRY_ISIN: 2048 total in 1 blocks; 728 free (0 chunks); 1320 used
W_BASE_COUNTRY_pkey: 2048 total in 1 blocks; 712 free (0 chunks); 1336 used
W_BASE_pkey: 2048 total in 1 blocks; 728 free (0 chunks); 1320 used
Z_BASE_COUNTRY_STATUS: 2048 total in 1 blocks; 776 free (0 chunks); 1272 used
Z_BASE_COUNTRY_NSIN_ID_COUNTRY: 2048 total in 1 blocks; 176 free (0 chunks); 1872 used
Z_BASE_COUNTRY_NSIN: 2048 total in 1 blocks; 776 free (0 chunks); 1272 used
Z_BASE_COUNTRY_ISIN: 2048 total in 1 blocks; 728 free (0 chunks); 1320 used
Z_BASE_COUNTRY_ID_COUNTRY: 2048 total in 1 blocks; 776 free (0 chunks); 1272 used
Z_BASE_COUNTRY_pkey: 2048 total in 1 blocks; 712 free (0 chunks); 1336 used
Z_BASE_IS_INDICATIVE: 2048 total in 1 blocks; 776 free (0 chunks); 1272 used
Z_BASE_ID_PRODUCT_TYPE_index: 2048 total in 1 blocks; 728 free (0 chunks); 1320 used
Z_BASE_ID_ISSUER_GROUP_index: 2048 total in 1 blocks; 776 free (0 chunks); 1272 used
Z_BASE_ID_ISSUER_GROUP_ID_PRODUCT_TYPE_index: 2048 total in 1 blocks; 712 free (0 chunks); 1336 used
Z_BASE_pkey: 2048 total in 1 blocks; 728 free (0 chunks); 1320 used
pg_constraint_conrelid_index: 2048 total in 1 blocks; 728 free (0 chunks); 1320 used
CachedPlan: 1024 total in 1 blocks; 784 free (0 chunks); 240 used
CachedPlanSource: 1024 total in 1 blocks; 656 free (0 chunks); 368 used
pg_attrdef_oid_index: 2048 total in 1 blocks; 776 free (0 chunks); 1272 used
pg_attrdef_adrelid_adnum_index: 2048 total in 1 blocks; 664 free (0 chunks); 1384 used
pg_settings: 23552 total in 5 blocks; 3040 free (1 chunks); 20512 used
pg_toast_2619_index: 2048 total in 1 blocks; 664 free (0 chunks); 1384 used
pg_index_indrelid_index: 2048 total in 1 blocks; 728 free (0 chunks); 1320 used
CachedPlan: 1024 total in 1 blocks; 784 free (0 chunks); 240 used
CachedPlanSource: 1024 total in 1 blocks; 664 free (1 chunks); 360 used
pg_db_role_setting_databaseid_rol_index: 2048 total in 1 blocks; 664 free (0 chunks); 1384 used
pg_opclass_am_name_nsp_index: 3072 total in 2 blocks; 1496 free (2 chunks); 1576 used
pg_foreign_data_wrapper_name_index: 3072 total in 2 blocks; 1768 free (4 chunks); 1304 used
pg_enum_oid_index: 3072 total in 2 blocks; 1768 free (4 chunks); 1304 used
pg_class_relname_nsp_index: 3072 total in 2 blocks; 1656 free (3 chunks); 1416 used
pg_foreign_server_oid_index: 3072 total in 2 blocks; 1768 free (4 chunks); 1304 used
pg_statistic_relid_att_inh_index: 3072 total in 2 blocks; 1496 free (2 chunks); 1576 used
pg_cast_source_target_index: 3072 total in 2 blocks; 1656 free (3 chunks); 1416 used
pg_language_name_index: 3072 total in 2 blocks; 1768 free (4 chunks); 1304 used
pg_collation_oid_index: 3072 total in 2 blocks; 1768 free (4 chunks); 1304 used
pg_amop_fam_strat_index: 3072 total in 2 blocks; 1496 free (2 chunks); 1576 used
pg_index_indexrelid_index: 3072 total in 2 blocks; 1720 free (3 chunks); 1352 used
pg_ts_template_tmplname_index: 3072 total in 2 blocks; 1704 free (4 chunks); 1368 used
pg_ts_config_map_index: 3072 total in 2 blocks; 1544 free (2 chunks); 1528 used
pg_opclass_oid_index: 3072 total in 2 blocks; 1720 free (3 chunks); 1352 used
pg_foreign_data_wrapper_oid_index: 3072 total in 2 blocks; 1768 free (4 chunks); 1304 used
pg_ts_dict_oid_index: 3072 total in 2 blocks; 1768 free (4 chunks); 1304 used
pg_conversion_default_index: 3072 total in 2 blocks; 1544 free (2 chunks); 1528 used
pg_operator_oprname_l_r_n_index: 3072 total in 2 blocks; 1496 free (2 chunks); 1576 used
pg_trigger_tgrelid_tgname_index: 3072 total in 2 blocks; 1656 free (3 chunks); 1416 used
pg_enum_typid_label_index: 3072 total in 2 blocks; 1704 free (4 chunks); 1368 used
pg_ts_config_oid_index: 3072 total in 2 blocks; 1768 free (4 chunks); 1304 used
pg_user_mapping_oid_index: 3072 total in 2 blocks; 1768 free (4 chunks); 1304 used
pg_opfamily_am_name_nsp_index: 3072 total in 2 blocks; 1544 free (2 chunks); 1528 used
pg_foreign_table_relid_index: 3072 total in 2 blocks; 1768 free (4 chunks); 1304 used
pg_type_oid_index: 3072 total in 2 blocks; 1720 free (3 chunks); 1352 used
pg_aggregate_fnoid_index: 3072 total in 2 blocks; 1768 free (4 chunks); 1304 used
pg_constraint_oid_index: 3072 total in 2 blocks; 1768 free (4 chunks); 1304 used
pg_rewrite_rel_rulename_index: 3072 total in 2 blocks; 1656 free (3 chunks); 1416 used
pg_ts_parser_prsname_index: 3072 total in 2 blocks; 1704 free (4 chunks); 1368 used
pg_ts_config_cfgname_index: 3072 total in 2 blocks; 1704 free (4 chunks); 1368 used
pg_ts_parser_oid_index: 3072 total in 2 blocks; 1768 free (4 chunks); 1304 used
pg_operator_oid_index: 3072 total in 2 blocks; 1720 free (3 chunks); 1352 used
pg_namespace_nspname_index: 3072 total in 2 blocks; 1720 free (3 chunks); 1352 used
pg_ts_template_oid_index: 3072 total in 2 blocks; 1768 free (4 chunks); 1304 used
pg_amop_opr_fam_index: 3072 total in 2 blocks; 1496 free (2 chunks); 1576 used
pg_default_acl_role_nsp_obj_index: 3072 total in 2 blocks; 1544 free (2 chunks); 1528 used
pg_collation_name_enc_nsp_index: 3072 total in 2 blocks; 1544 free (2 chunks); 1528 used
pg_ts_dict_dictname_index: 3072 total in 2 blocks; 1704 free (4 chunks); 1368 used
pg_type_typname_nsp_index: 3072 total in 2 blocks; 1656 free (3 chunks); 1416 used
pg_opfamily_oid_index: 3072 total in 2 blocks; 1768 free (4 chunks); 1304 used
pg_class_oid_index: 3072 total in 2 blocks; 1720 free (3 chunks); 1352 used
pg_proc_proname_args_nsp_index: 3072 total in 2 blocks; 1496 free (2 chunks); 1576 used
pg_attribute_relid_attnum_index: 3072 total in 2 blocks; 1656 free (3 chunks); 1416 used
pg_proc_oid_index: 3072 total in 2 blocks; 1720 free (3 chunks); 1352 used
pg_language_oid_index: 3072 total in 2 blocks; 1768 free (4 chunks); 1304 used
pg_namespace_oid_index: 3072 total in 2 blocks; 1720 free (3 chunks); 1352 used
pg_amproc_fam_proc_index: 3072 total in 2 blocks; 1496 free (2 chunks); 1576 used
pg_foreign_server_name_index: 3072 total in 2 blocks; 1768 free (4 chunks); 1304 used
pg_attribute_relid_attnam_index: 3072 total in 2 blocks; 1704 free (4 chunks); 1368 used
pg_conversion_oid_index: 3072 total in 2 blocks; 1768 free (4 chunks); 1304 used
pg_user_mapping_user_server_index: 3072 total in 2 blocks; 1704 free (4 chunks); 1368 used
pg_conversion_name_nsp_index: 3072 total in 2 blocks; 1704 free (4 chunks); 1368 used
pg_authid_oid_index: 3072 total in 2 blocks; 1720 free (3 chunks); 1352 used
pg_auth_members_member_role_index: 3072 total in 2 blocks; 1656 free (3 chunks); 1416 used
pg_tablespace_oid_index: 3072 total in 2 blocks; 1720 free (3 chunks); 1352 used
pg_database_datname_index: 3072 total in 2 blocks; 1720 free (3 chunks); 1352 used
pg_auth_members_role_member_index: 3072 total in 2 blocks; 1704 free (4 chunks); 1368 used
pg_database_oid_index: 3072 total in 2 blocks; 1720 free (3 chunks); 1352 used
pg_authid_rolname_index: 3072 total in 2 blocks; 1720 free (3 chunks); 1352 used
MdSmgr: 8192 total in 1 blocks; 5024 free (0 chunks); 3168 used
LOCALLOCK hash: 24576 total in 2 blocks; 13920 free (4 chunks); 10656 used
Timezones: 83472 total in 2 blocks; 3744 free (0 chunks); 79728 used
ErrorContext: 8192 total in 1 blocks; 8160 free (0 chunks); 32 used
<2013-01-25 10:17:20 CET - strube> ERROR: out of memory
<2013-01-25 10:17:20 CET - strube> DETAIL: Failed on request of size 8240.



Re: Prevent out of memory errors by reducing work_mem?

From: Tom Lane
Jan Strube <js@deriva.de> writes:
> I'm getting an out of memory error running the following query over 6
> tables (the *BASE* tables have over 1 million rows each) on PostgreSQL
> 9.1. The machine has 4GB RAM:

It looks to me like you're suffering an executor memory leak that's
probably unrelated to the hash joins as such.  The leak is in the
ExecutorState context:

> ExecutorState: 3442985408 total in 412394 blocks; 5173848 free (16
> chunks); 3437811560 used

while the subsidiary HashXYZ contexts don't look like they're going
beyond what they've been told to.

So the first question is 9.1.what?  We've fixed execution-time memory
leaks as recently as 9.1.7.
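
You can check the exact minor release with:

SELECT version();
-- reports something like "PostgreSQL 9.1.x on x86_64-unknown-linux-gnu, ..."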

If you're on 9.1.7, or if after updating you can still reproduce the
problem, please see if you can create a self-contained test case.
My guess is it would have to do with the specific data types and
operators being used in the query, but not so much with the specific
data, so you probably could create a test case that just uses tables
filled with generated random data.
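
Something along these lines might do (just a sketch with made-up table
and column names; the point is to mimic the bpchar join keys at a
similar scale):

-- two tables of generated random data joined on bpchar keys
CREATE TABLE t1 AS
    SELECT md5(random()::text)::char(12) AS isin
    FROM generate_series(1, 1000000);
CREATE TABLE t2 AS
    SELECT md5(random()::text)::char(12) AS isin
    FROM generate_series(1, 1000000);
ANALYZE t1;
ANALYZE t2;
SELECT * FROM t1 FULL OUTER JOIN t2 USING (isin);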

            regards, tom lane



Re: Prevent out of memory errors by reducing work_mem?

From: Jan Strube
Hi,

You are right: we were running 9.1.4, and after upgrading to 9.1.7 the
error disappeared.

Thanks a lot,
Jan Strube


>> I'm getting an out of memory error running the following query over 6
>> tables (the *BASE* tables have over 1 million rows each) on PostgreSQL
>> 9.1. The machine has 4GB RAM:
> It looks to me like you're suffering an executor memory leak that's
> probably unrelated to the hash joins as such.  The leak is in the
> ExecutorState context:
>
>> ExecutorState: 3442985408 total in 412394 blocks; 5173848 free (16
>> chunks); 3437811560 used
> while the subsidiary HashXYZ contexts don't look like they're going
> beyond what they've been told to.
>
> So the first question is 9.1.what?  We've fixed execution-time memory
> leaks as recently as 9.1.7.
>
> If you're on 9.1.7, or if after updating you can still reproduce the
> problem, please see if you can create a self-contained test case.
> My guess is it would have to do with the specific data types and
> operators being used in the query, but not so much with the specific
> data, so you probably could create a test case that just uses tables
> filled with generated random data.
>
>             regards, tom lane
>