Thread: Prevent out of memory errors by reducing work_mem?
Hi,

I'm getting an out of memory error running the following query over 6 tables (the *BASE* tables have over 1 million rows each) on Postgresql 9.1. The machine has 4GB RAM:

SELECT *
FROM dtng."Z_BASE" zb
LEFT JOIN dtng."Z_BASE_COUNTRY" zbc ON zb."ISIN" = zbc."ISIN"
LEFT JOIN dtng."PRODUCT_TYPES" pt ON zb."ID_PRODUCT_TYPE" = pt."ID_PRODUCT_TYPE"
JOIN voola.export_product_groups epg ON pt."ID_PRODUCT_GROUP" = epg.id_product_group
FULL OUTER JOIN warrants."W_BASE" wb ON zb."ISIN" = wb."ISIN"
LEFT JOIN warrants."W_BASE_COUNTRY" wbc ON wb."ISIN" = wbc."ISIN"
WHERE coalesce(zbc."ID_COUNTRY", wbc."ID_COUNTRY") = 'DE'

This is the query plan:

Hash Right Join  (cost=1498106.22..2102918.77 rows=7487 width=2708)
  Hash Cond: ((wbc."ISIN")::bpchar = (wb."ISIN")::bpchar)
  Filter: ((COALESCE(zbc."ID_COUNTRY", wbc."ID_COUNTRY"))::bpchar = 'DE'::bpchar)
  ->  Seq Scan on "W_BASE_COUNTRY" wbc  (cost=0.00..45668.41 rows=1497341 width=160)
  ->  Hash  (cost=1015864.28..1015864.28 rows=1474955 width=2548)
        ->  Hash Full Join  (cost=420009.31..1015864.28 rows=1474955 width=2548)
              Hash Cond: ((zb."ISIN")::bpchar = (wb."ISIN")::bpchar)
              ->  Hash Right Join  (cost=266400.82..518612.27 rows=505517 width=2341)
                    Hash Cond: ((zbc."ISIN")::bpchar = (zb."ISIN")::bpchar)
                    ->  Seq Scan on "Z_BASE_COUNTRY" zbc  (cost=0.00..47831.60 rows=1614860 width=106)
                    ->  Hash  (cost=120372.86..120372.86 rows=505517 width=2235)
                          ->  Hash Join  (cost=662.44..120372.86 rows=505517 width=2235)
                                Hash Cond: (zb."ID_PRODUCT_TYPE" = pt."ID_PRODUCT_TYPE")
                                ->  Seq Scan on "Z_BASE" zb  (cost=0.00..106484.75 rows=1634275 width=1377)
                                ->  Hash  (cost=631.12..631.12 rows=2506 width=858)
                                      ->  Hash Join  (cost=2.64..631.12 rows=2506 width=858)
                                            Hash Cond: (pt."ID_PRODUCT_GROUP" = (epg.id_product_group)::bpchar)
                                            ->  Seq Scan on "PRODUCT_TYPES" pt  (cost=0.00..573.03 rows=8103 width=853)
                                            ->  Hash  (cost=1.73..1.73 rows=73 width=5)
                                                  ->  Seq Scan on export_product_groups epg  (cost=0.00..1.73 rows=73 width=5)
              ->  Hash  (cost=93399.55..93399.55 rows=1474955 width=207)
                    ->  Seq Scan on "W_BASE" wb  (cost=0.00..93399.55 rows=1474955 width=207)

I tried reducing work_mem from 8MB to 64kB to force the hash joins to use temporary files instead of working memory, as described in chapter 18.4.1 of the documentation, but that didn't help. I know that the query is poorly written and have already rewritten it to use less memory. My question is whether, and how, it is possible to prevent the out of memory error at the price of speed (disk usage). I have already searched the internet for an answer, but without luck.
Thanks a lot,
Jan

P.S.: Here is the Postgres log:

TopMemoryContext: 2181968 total in 13 blocks; 9888 free (12 chunks); 2172080 used
TopTransactionContext: 8192 total in 1 blocks; 7112 free (0 chunks); 1080 used
Type information cache: 24576 total in 2 blocks; 11888 free (5 chunks); 12688 used
Record information cache: 24576 total in 2 blocks; 15984 free (5 chunks); 8592 used
TableSpace cache: 8192 total in 1 blocks; 3216 free (0 chunks); 4976 used
Operator lookup cache: 24576 total in 2 blocks; 11888 free (5 chunks); 12688 used
Prepared Queries: 24576 total in 2 blocks; 15984 free (5 chunks); 8592 used
MessageContext: 8192 total in 1 blocks; 6976 free (0 chunks); 1216 used
Operator class cache: 8192 total in 1 blocks; 1680 free (0 chunks); 6512 used
smgr relation table: 24576 total in 2 blocks; 5696 free (4 chunks); 18880 used
TransactionAbortContext: 32768 total in 1 blocks; 32736 free (0 chunks); 32 used
Portal hash: 8192 total in 1 blocks; 1680 free (0 chunks); 6512 used
PortalMemory: 8192 total in 1 blocks; 7888 free (0 chunks); 304 used
PortalHoldContext: 24576 total in 2 blocks; 15888 free (4 chunks); 8688 used
PortalHeapMemory: 534144 total in 68 blocks; 6656 free (14 chunks); 527488 used
ExecutorState: 3442985408 total in 412394 blocks; 5173848 free (16 chunks); 3437811560 used
HashTableContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
HashBatchContext: 16384 total in 1 blocks; 2304 free (0 chunks); 14080 used
HashTableContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
HashBatchContext: 262144 total in 5 blocks; 74976 free (3 chunks); 187168 used
HashTableContext: 8192 total in 1 blocks; 6080 free (0 chunks); 2112 used
HashBatchContext: 4194304 total in 9 blocks; 79984 free (11 chunks); 4114320 used
HashTableContext: 8192 total in 1 blocks; 7104 free (0 chunks); 1088 used
HashBatchContext: 8421424 total in 12 blocks; 2017200 free (13 chunks); 6404224 used
HashTableContext: 67166304 total in 5 blocks; 57248 free (15 chunks); 67109056 used
HashBatchContext: 234881024 total in 37 blocks; 259488 free (62 chunks); 234621536 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 8192 total in 1 blocks; 8160 free (0 chunks); 32 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 8192 total in 1 blocks; 8160 free (0 chunks); 32 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 8192 total in 1 blocks; 8160 free (0 chunks); 32 used
ExprContext: 8192 total in 1 blocks; 8160 free (0 chunks); 32 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
Relcache by OID: 24576 total in 2 blocks; 12832 free (3 chunks); 11744 used
CacheMemoryContext: 1390112 total in 23 blocks; 70696 free (1 chunks); 1319416 used
unnamed prepared statement: 516096 total in 6 blocks; 192440 free (4 chunks); 323656 used
tmp_idx: 2048 total in 1 blocks; 72 free (0 chunks); 1976 used
PRODUCT_TYPES_pkey: 2048 total in 1 blocks; 728 free (0 chunks); 1320 used
PRODUCT_TYPES_ID_PRODUCT_TYPE_key: 2048 total in 1 blocks; 712 free (0 chunks); 1336 used
W_BASE_COUNTRY_NSIN: 2048 total in 1 blocks; 776 free (0 chunks); 1272 used
W_BASE_COUNTRY_ISIN: 2048 total in 1 blocks; 728 free (0 chunks); 1320 used
W_BASE_COUNTRY_pkey: 2048 total in 1 blocks; 712 free (0 chunks); 1336 used
W_BASE_pkey: 2048 total in 1 blocks; 728 free (0 chunks); 1320 used
Z_BASE_COUNTRY_STATUS: 2048 total in 1 blocks; 776 free (0 chunks); 1272 used
Z_BASE_COUNTRY_NSIN_ID_COUNTRY: 2048 total in 1 blocks; 176 free (0 chunks); 1872 used
Z_BASE_COUNTRY_NSIN: 2048 total in 1 blocks; 776 free (0 chunks); 1272 used
Z_BASE_COUNTRY_ISIN: 2048 total in 1 blocks; 728 free (0 chunks); 1320 used
Z_BASE_COUNTRY_ID_COUNTRY: 2048 total in 1 blocks; 776 free (0 chunks); 1272 used
Z_BASE_COUNTRY_pkey: 2048 total in 1 blocks; 712 free (0 chunks); 1336 used
Z_BASE_IS_INDICATIVE: 2048 total in 1 blocks; 776 free (0 chunks); 1272 used
Z_BASE_ID_PRODUCT_TYPE_index: 2048 total in 1 blocks; 728 free (0 chunks); 1320 used
Z_BASE_ID_ISSUER_GROUP_index: 2048 total in 1 blocks; 776 free (0 chunks); 1272 used
Z_BASE_ID_ISSUER_GROUP_ID_PRODUCT_TYPE_index: 2048 total in 1 blocks; 712 free (0 chunks); 1336 used
Z_BASE_pkey: 2048 total in 1 blocks; 728 free (0 chunks); 1320 used
pg_constraint_conrelid_index: 2048 total in 1 blocks; 728 free (0 chunks); 1320 used
CachedPlan: 1024 total in 1 blocks; 784 free (0 chunks); 240 used
CachedPlanSource: 1024 total in 1 blocks; 656 free (0 chunks); 368 used
pg_attrdef_oid_index: 2048 total in 1 blocks; 776 free (0 chunks); 1272 used
pg_attrdef_adrelid_adnum_index: 2048 total in 1 blocks; 664 free (0 chunks); 1384 used
pg_settings: 23552 total in 5 blocks; 3040 free (1 chunks); 20512 used
pg_toast_2619_index: 2048 total in 1 blocks; 664 free (0 chunks); 1384 used
pg_index_indrelid_index: 2048 total in 1 blocks; 728 free (0 chunks); 1320 used
CachedPlan: 1024 total in 1 blocks; 784 free (0 chunks); 240 used
CachedPlanSource: 1024 total in 1 blocks; 664 free (1 chunks); 360 used
pg_db_role_setting_databaseid_rol_index: 2048 total in 1 blocks; 664 free (0 chunks); 1384 used
pg_opclass_am_name_nsp_index: 3072 total in 2 blocks; 1496 free (2 chunks); 1576 used
pg_foreign_data_wrapper_name_index: 3072 total in 2 blocks; 1768 free (4 chunks); 1304 used
pg_enum_oid_index: 3072 total in 2 blocks; 1768 free (4 chunks); 1304 used
pg_class_relname_nsp_index: 3072 total in 2 blocks; 1656 free (3 chunks); 1416 used
pg_foreign_server_oid_index: 3072 total in 2 blocks; 1768 free (4 chunks); 1304 used
pg_statistic_relid_att_inh_index: 3072 total in 2 blocks; 1496 free (2 chunks); 1576 used
pg_cast_source_target_index: 3072 total in 2 blocks; 1656 free (3 chunks); 1416 used
pg_language_name_index: 3072 total in 2 blocks; 1768 free (4 chunks); 1304 used
pg_collation_oid_index: 3072 total in 2 blocks; 1768 free (4 chunks); 1304 used
pg_amop_fam_strat_index: 3072 total in 2 blocks; 1496 free (2 chunks); 1576 used
pg_index_indexrelid_index: 3072 total in 2 blocks; 1720 free (3 chunks); 1352 used
pg_ts_template_tmplname_index: 3072 total in 2 blocks; 1704 free (4 chunks); 1368 used
pg_ts_config_map_index: 3072 total in 2 blocks; 1544 free (2 chunks); 1528 used
pg_opclass_oid_index: 3072 total in 2 blocks; 1720 free (3 chunks); 1352 used
pg_foreign_data_wrapper_oid_index: 3072 total in 2 blocks; 1768 free (4 chunks); 1304 used
pg_ts_dict_oid_index: 3072 total in 2 blocks; 1768 free (4 chunks); 1304 used
pg_conversion_default_index: 3072 total in 2 blocks; 1544 free (2 chunks); 1528 used
pg_operator_oprname_l_r_n_index: 3072 total in 2 blocks; 1496 free (2 chunks); 1576 used
pg_trigger_tgrelid_tgname_index: 3072 total in 2 blocks; 1656 free (3 chunks); 1416 used
pg_enum_typid_label_index: 3072 total in 2 blocks; 1704 free (4 chunks); 1368 used
pg_ts_config_oid_index: 3072 total in 2 blocks; 1768 free (4 chunks); 1304 used
pg_user_mapping_oid_index: 3072 total in 2 blocks; 1768 free (4 chunks); 1304 used
pg_opfamily_am_name_nsp_index: 3072 total in 2 blocks; 1544 free (2 chunks); 1528 used
pg_foreign_table_relid_index: 3072 total in 2 blocks; 1768 free (4 chunks); 1304 used
pg_type_oid_index: 3072 total in 2 blocks; 1720 free (3 chunks); 1352 used
pg_aggregate_fnoid_index: 3072 total in 2 blocks; 1768 free (4 chunks); 1304 used
pg_constraint_oid_index: 3072 total in 2 blocks; 1768 free (4 chunks); 1304 used
pg_rewrite_rel_rulename_index: 3072 total in 2 blocks; 1656 free (3 chunks); 1416 used
pg_ts_parser_prsname_index: 3072 total in 2 blocks; 1704 free (4 chunks); 1368 used
pg_ts_config_cfgname_index: 3072 total in 2 blocks; 1704 free (4 chunks); 1368 used
pg_ts_parser_oid_index: 3072 total in 2 blocks; 1768 free (4 chunks); 1304 used
pg_operator_oid_index: 3072 total in 2 blocks; 1720 free (3 chunks); 1352 used
pg_namespace_nspname_index: 3072 total in 2 blocks; 1720 free (3 chunks); 1352 used
pg_ts_template_oid_index: 3072 total in 2 blocks; 1768 free (4 chunks); 1304 used
pg_amop_opr_fam_index: 3072 total in 2 blocks; 1496 free (2 chunks); 1576 used
pg_default_acl_role_nsp_obj_index: 3072 total in 2 blocks; 1544 free (2 chunks); 1528 used
pg_collation_name_enc_nsp_index: 3072 total in 2 blocks; 1544 free (2 chunks); 1528 used
pg_ts_dict_dictname_index: 3072 total in 2 blocks; 1704 free (4 chunks); 1368 used
pg_type_typname_nsp_index: 3072 total in 2 blocks; 1656 free (3 chunks); 1416 used
pg_opfamily_oid_index: 3072 total in 2 blocks; 1768 free (4 chunks); 1304 used
pg_class_oid_index: 3072 total in 2 blocks; 1720 free (3 chunks); 1352 used
pg_proc_proname_args_nsp_index: 3072 total in 2 blocks; 1496 free (2 chunks); 1576 used
pg_attribute_relid_attnum_index: 3072 total in 2 blocks; 1656 free (3 chunks); 1416 used
pg_proc_oid_index: 3072 total in 2 blocks; 1720 free (3 chunks); 1352 used
pg_language_oid_index: 3072 total in 2 blocks; 1768 free (4 chunks); 1304 used
pg_namespace_oid_index: 3072 total in 2 blocks; 1720 free (3 chunks); 1352 used
pg_amproc_fam_proc_index: 3072 total in 2 blocks; 1496 free (2 chunks); 1576 used
pg_foreign_server_name_index: 3072 total in 2 blocks; 1768 free (4 chunks); 1304 used
pg_attribute_relid_attnam_index: 3072 total in 2 blocks; 1704 free (4 chunks); 1368 used
pg_conversion_oid_index: 3072 total in 2 blocks; 1768 free (4 chunks); 1304 used
pg_user_mapping_user_server_index: 3072 total in 2 blocks; 1704 free (4 chunks); 1368 used
pg_conversion_name_nsp_index: 3072 total in 2 blocks; 1704 free (4 chunks); 1368 used
pg_authid_oid_index: 3072 total in 2 blocks; 1720 free (3 chunks); 1352 used
pg_auth_members_member_role_index: 3072 total in 2 blocks; 1656 free (3 chunks); 1416 used
pg_tablespace_oid_index: 3072 total in 2 blocks; 1720 free (3 chunks); 1352 used
pg_database_datname_index: 3072 total in 2 blocks; 1720 free (3 chunks); 1352 used
pg_auth_members_role_member_index: 3072 total in 2 blocks; 1704 free (4 chunks); 1368 used
pg_database_oid_index: 3072 total in 2 blocks; 1720 free (3 chunks); 1352 used
pg_authid_rolname_index: 3072 total in 2 blocks; 1720 free (3 chunks); 1352 used
MdSmgr: 8192 total in 1 blocks; 5024 free (0 chunks); 3168 used
LOCALLOCK hash: 24576 total in 2 blocks; 13920 free (4 chunks); 10656 used
Timezones: 83472 total in 2 blocks; 3744 free (0 chunks); 79728 used
ErrorContext: 8192 total in 1 blocks; 8160 free (0 chunks); 32 used
<2013-01-25 10:17:20 CET - strube> ERROR: out of memory
<2013-01-25 10:17:20 CET - strube> DETAIL: Failed on request of size 8240.
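For reference, the work_mem reduction described in the question can be scoped to a single transaction with SET LOCAL, so other sessions keep their normal setting; enable_hashjoin is a further planner setting for trading speed against memory by steering the planner away from hash joins. The following is only an illustrative sketch (the values and the commented-out line are not from the original post):

BEGIN;
SET LOCAL work_mem = '64kB';         -- hash joins split into more batches and spill to temp files
-- SET LOCAL enable_hashjoin = off;  -- optional: prefer merge/nested-loop joins instead of hashing
SELECT *
FROM dtng."Z_BASE" zb
LEFT JOIN dtng."Z_BASE_COUNTRY" zbc ON zb."ISIN" = zbc."ISIN"
LEFT JOIN dtng."PRODUCT_TYPES" pt ON zb."ID_PRODUCT_TYPE" = pt."ID_PRODUCT_TYPE"
JOIN voola.export_product_groups epg ON pt."ID_PRODUCT_GROUP" = epg.id_product_group
FULL OUTER JOIN warrants."W_BASE" wb ON zb."ISIN" = wb."ISIN"
LEFT JOIN warrants."W_BASE_COUNTRY" wbc ON wb."ISIN" = wbc."ISIN"
WHERE coalesce(zbc."ID_COUNTRY", wbc."ID_COUNTRY") = 'DE';
COMMIT;

As the replies below show, the error in this particular case turned out to be an executor memory leak, so lowering work_mem alone could not have fixed it.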
Jan Strube <js@deriva.de> writes:
> I'm getting an out of memory error running the following query over 6
> tables (the *BASE* tables have over 1 million rows each) on Postgresql
> 9.1. The machine has 4GB RAM:

It looks to me like you're suffering an executor memory leak that's probably unrelated to the hash joins as such. The leak is in the ExecutorState context:

> ExecutorState: 3442985408 total in 412394 blocks; 5173848 free (16
> chunks); 3437811560 used

while the subsidiary HashXYZ contexts don't look like they're going beyond what they've been told to.

So the first question is 9.1.what? We've fixed execution-time memory leaks as recently as 9.1.7.

If you're on 9.1.7, or if after updating you can still reproduce the problem, please see if you can create a self-contained test case. My guess is it would have to do with the specific data types and operators being used in the query, but not so much with the specific data, so you probably could create a test case that just uses tables filled with generated random data.

			regards, tom lane
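A self-contained test case of the kind suggested here could start from tables filled with generated data. A minimal sketch, with invented table and column names that only mimic the shape of the original schema (fixed-length char keys like the bpchar comparisons in the plan, plus a wide payload column):

-- Hypothetical tables, not the poster's schema.
CREATE TABLE test_base AS
SELECT md5(g::text)::char(12)          AS isin,             -- join key, bpchar like the plan
       (g % 100)                       AS id_product_type,
       repeat(md5(random()::text), 10) AS payload           -- wide column to imitate the large row width
FROM generate_series(1, 1000000) AS g;

CREATE TABLE test_base_country AS
SELECT md5(g::text)::char(12)                                 AS isin,
       (CASE WHEN g % 2 = 0 THEN 'DE' ELSE 'US' END)::char(2) AS id_country
FROM generate_series(1, 1500000) AS g;

ANALYZE test_base;
ANALYZE test_base_country;

-- Then rebuild the same join shape (LEFT JOIN / FULL OUTER JOIN on the char keys)
-- against these tables and watch whether ExecutorState keeps growing.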
Hi,

you are right. We were running 9.1.4, and after upgrading to 9.1.7 the error disappeared.

Thanks a lot,
Jan Strube

>> I'm getting an out of memory error running the following query over 6
>> tables (the *BASE* tables have over 1 million rows each) on Postgresql
>> 9.1. The machine has 4GB RAM:
>
> It looks to me like you're suffering an executor memory leak that's
> probably unrelated to the hash joins as such. The leak is in the
> ExecutorState context:
>
>> ExecutorState: 3442985408 total in 412394 blocks; 5173848 free (16
>> chunks); 3437811560 used
>
> while the subsidiary HashXYZ contexts don't look like they're going
> beyond what they've been told to.
>
> So the first question is 9.1.what? We've fixed execution-time memory
> leaks as recently as 9.1.7.
>
> If you're on 9.1.7, or if after updating you can still reproduce the
> problem, please see if you can create a self-contained test case.
> My guess is it would have to do with the specific data types and
> operators being used in the query, but not so much with the specific
> data, so you probably could create a test case that just uses tables
> filled with generated random data.
>
> 			regards, tom lane
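A quick way to confirm which minor release a server is actually running before and after such an update (a generic check, not from the original thread):

SELECT version();       -- full version string, e.g. 'PostgreSQL 9.1.7 on ...'
SHOW server_version;    -- just the version number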