Thread: Out of memory on SELECT in 8.3.5
We've been having persistent out-of-memory errors in our production 8.3 deployment, which is now running 8.3.5. I'm not sure the query here is the cause of the problem, but this is our most recent example which triggered an out-of-memory error for us. Perhaps our configuration needs tweaking somewhere...

shared_buffers = 4000MB
#temp_buffers = 8MB
work_mem = 128MB
maintenance_work_mem = 256MB
#max_stack_depth = 2MB

I was wondering if somebody could spot anything odd-looking in the query plan, or in the memory parameters dumped to the log, that might be causing this?

explain SELECT
    s.sale_date,
    s.vin,
    s.dealer_region as sale_region,
    s.dealer_area as sale_area,
    s.dealer_code as sale_dealer,
    d.name as sale_dealer_name,
    s.model as sale_model,
    s.sale_type,
    ml.lead_id,
    l.created,
    lrd.reporting_date as lead_date,
    ml.lead_pos as lead_order,
    ml.lost_sale,
    ml.range,
    lrd.dealer_region as lead_region,
    lrd.dealer_area as lead_area,
    lrd.dealer_code as lead_dealer,
    lrd.model as lead_model,
    l.source as lead_source,
    lrd.oem_type as lead_type,
    lrd.oem_subtype as lead_subtype,
    lrd.unique_all as lead_unique_all,
    lrd.unique_region as lead_unique_region,
    lrd.unique_area as lead_unique_area,
    lrd.unique_dealer as lead_unique_dealer,
    lrd.customer_fname,
    lrd.customer_lname,
    lrdm.item_value as lead_tracking_code,
    address_street.street as owner_street,
    address.city as owner_city,
    address.state as owner_state,
    address.postal_code as owner_postal_code,
    address.country as owner_country,
    email.list_value as owner_email,
    phone.list_value as owner_phone
FROM mb_sale s
LEFT OUTER JOIN dealer d on d.dealer_code = s.dealer_code
LEFT OUTER JOIN mb_lead ml on ml.sale_id = s.id
LEFT OUTER JOIN lead_reporting_data lrd ON lrd.lead_id = ml.lead_id
LEFT OUTER JOIN lead_reporting_meta lrdm ON lrdm.lead_id = ml.lead_id
    AND lrdm.item_key = '[ALS:prospectid]TrackingCode' AND lrdm.pos=1
LEFT OUTER JOIN lead l ON l.id = ml.lead_id
LEFT OUTER JOIN lead_reporting_address address ON address.lead_id = ml.lead_id
LEFT OUTER JOIN lead_reporting_street address_street ON address_street.address_id = address.id
    AND address_street.pos = 0
LEFT OUTER JOIN lead_reporting_list_data phone ON phone.lead_id = ml.lead_id
    AND phone.list_type = 'p' AND phone.pos = 0
LEFT OUTER JOIN lead_reporting_list_data email ON email.lead_id = ml.lead_id
    AND email.list_type = 'e' AND email.pos = 0
WHERE ((s.sale_type = 'd' and ml.lost_sale = FALSE)
    OR (s.sale_type = 'c' and ml.lead_pos = 0)
    OR (s.sale_type = '0'))
    AND s.sale_date BETWEEN date('2009-01-01') AND date('2009-01-31')
ORDER BY s.sale_date, s.vin

                                                              QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------
Sort  (cost=2297525.76..2297626.60 rows=40336 width=234)
  Sort Key: s.sale_date, s.vin
  ->  Hash Left Join  (cost=1450193.63..2294440.10 rows=40336 width=234)
      Hash Cond: (ml.lead_id = phone.lead_id)
    ->  Hash Left Join  (cost=1341176.18..2140101.73 rows=40336 width=219)
        Hash Cond: (ml.lead_id = email.lead_id)
      ->  Nested Loop Left Join  (cost=1237098.64..2000317.19 rows=40336 width=204)
        ->  Nested Loop Left Join  (cost=1237098.64..1806747.26 rows=40336 width=141)
          ->  Nested Loop Left Join  (cost=1237098.64..1633400.88 rows=40336 width=133)
            ->  Nested Loop Left Join  (cost=1237098.64..1413899.40 rows=40336 width=118)
              ->  Merge Left Join  (cost=1237098.64..1238143.69 rows=40336 width=89)
                  Merge Cond: (ml.lead_id = lrdm.lead_id)
                ->  Sort  (cost=46384.08..46484.92 rows=40336 width=78)
                    Sort Key: ml.lead_id
                  ->  Hash Left Join  (cost=9430.33..43298.42 rows=40336 width=78)
                      Hash Cond: ((s.dealer_code)::text = (d.dealer_code)::text)
                    ->  Hash Left Join  (cost=9094.04..42407.50 rows=40336 width=60)
                        Hash Cond: (s.id = ml.sale_id)
                        Filter: (((s.sale_type = 'd'::bpchar) AND (NOT ml.lost_sale)) OR ((s.sale_type = 'c'::bpchar) AND (ml.lead_pos = 0)) OR (s.sale_type = '0'::bpchar))
                      ->  Index Scan using mb_sale_sale_date_idx on mb_sale s  (cost=0.00..14027.94 rows=43804 width=50)
                          Index Cond: ((sale_date >= '2009-01-01'::date) AND (sale_date <= '2009-01-31'::date))
                      ->  Hash  (cost=5577.35..5577.35 rows=281335 width=26)
                        ->  Seq Scan on mb_lead ml  (cost=0.00..5577.35 rows=281335 width=26)
                    ->  Hash  (cost=321.13..321.13 rows=1213 width=23)
                      ->  Seq Scan on dealer d  (cost=0.00..321.13 rows=1213 width=23)
                ->  Sort  (cost=1190714.57..1191036.34 rows=128709 width=19)
                    Sort Key: lrdm.lead_id
                  ->  Bitmap Heap Scan on lead_reporting_meta lrdm  (cost=118675.76..1179791.19 rows=128709 width=19)
                      Recheck Cond: (item_key = '[ALS:prospectid]TrackingCode'::text)
                      Filter: (pos = 1)
                    ->  Bitmap Index Scan on lead_reporting_meta_item_key_idx  (cost=0.00..118643.58 rows=1476580 width=0)
                        Index Cond: (item_key = '[ALS:prospectid]TrackingCode'::text)
              ->  Index Scan using lead_reporting_address_lead_id_idx on lead_reporting_address address  (cost=0.00..4.34 rows=1 width=37)
                  Index Cond: (address.lead_id = ml.lead_id)
            ->  Index Scan using lead_pkey on lead l  (cost=0.00..5.43 rows=1 width=23)
                Index Cond: (l.id = ml.lead_id)
          ->  Index Scan using lead_reporting_street_address_id_idx on lead_reporting_street address_street  (cost=0.00..4.29 rows=1 width=24)
              Index Cond: (address_street.address_id = address.id)
              Filter: (address_street.pos = 0)
        ->  Index Scan using lead_reporting_data_pkey on lead_reporting_data lrd  (cost=0.00..4.79 rows=1 width=71)
            Index Cond: (lrd.lead_id = ml.lead_id)
      ->  Hash  (cost=85837.99..85837.99 rows=1459164 width=23)
        ->  Seq Scan on lead_reporting_list_data email  (cost=0.00..85837.99 rows=1459164 width=23)
            Filter: ((list_type = 'e'::bpchar) AND (pos = 0))
    ->  Hash  (cost=85837.99..85837.99 rows=1854357 width=23)
      ->  Seq Scan on lead_reporting_list_data phone  (cost=0.00..85837.99 rows=1854357 width=23)
          Filter: ((list_type = 'p'::bpchar) AND (pos = 0))
(47 rows)

TopMemoryContext: 154528 total in 18 blocks; 5240 free (20 chunks); 149288 used
TopTransactionContext: 8192 total in 1 blocks; 7624 free (0 chunks); 568 used
PL/PgSQL function context: 8192 total in 1 blocks; 5512 free (4 chunks); 2680 used
PL/PgSQL function context: 57344 total in 3 blocks; 42720 free (11 chunks); 14624 used
PL/PgSQL function context: 8192 total in 1 blocks; 4648 free (7 chunks); 3544 used
Rendezvous variable hash: 8192 total in 1 blocks; 1680 free (0 chunks); 6512 used
PLpgSQL function cache: 24520 total in 2 blocks; 3744 free (0 chunks); 20776 used
Record information cache: 24576 total in 2 blocks; 15984 free (5 chunks); 8592 used
CFuncHash: 8192 total in 1 blocks; 1680 free (0 chunks); 6512 used
RI compare cache: 24576 total in 2 blocks; 15984 free (5 chunks); 8592 used
RI query cache: 24576 total in 2 blocks; 11888 free (5 chunks); 12688 used
Type information cache: 24576 total in 2 blocks; 11888 free (5 chunks); 12688 used
Operator class cache: 8192 total in 1 blocks; 1680 free (0 chunks); 6512 used
Operator lookup cache: 24576 total in 2 blocks; 11888 free (5 chunks); 12688 used
Prepared Queries: 24576 total in 2 blocks; 15984 free (5 chunks); 8592 used
MessageContext: 8192 total in 1 blocks; 6976 free (0 chunks); 1216 used
smgr relation table: 24576 total in 2 blocks; 3584 free (4 chunks); 20992 used
TransactionAbortContext: 32768 total in 1 blocks; 32736 free (0 chunks); 32 used
Portal hash: 8192 total in 1 blocks; 1680 free (0 chunks); 6512 used
PortalMemory: 8192 total in 1 blocks; 7888 free (0 chunks); 304 used
PortalHeapMemory: 1021952 total in 126 blocks; 11008 free (181 chunks); 1010944 used
ExecutorState: 4972936 total in 6 blocks; 38456 free (11 chunks); 4934480 used
HashTableContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
HashBatchContext: 148889648 total in 28 blocks; 2532544 free (5 chunks); 146357104 used
HashTableContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
HashBatchContext: 144695344 total in 27 blocks; 828368 free (12 chunks); 143866976 used
TIDBitmap: 50356272 total in 16 blocks; 7517392 free (41 chunks); 42838880 used
TupleSort: 142649352 total in 25 blocks; 92237584 free (1339345 chunks); 50411768 used
HashTableContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
HashBatchContext: 131072 total in 4 blocks; 20064 free (3 chunks); 111008 used
HashTableContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
HashBatchContext: 25419824 total in 13 blocks; 2650448 free (5 chunks); 22769376 used
TupleSort: 9953328 total in 11 blocks; 1277840 free (9 chunks); 8675488 used
TupleSort: 1138736 total in 8 blocks; 768 free (8 chunks); 1137968 used
ExprContext: 8192 total in 1 blocks; 8160 free (0 chunks); 32 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 8192 total in 1 blocks; 8160 free (0 chunks); 32 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 8192 total in 1 blocks; 8160 free (0 chunks); 32 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 8192 total in 1 blocks; 8160 free (0 chunks); 32 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 8192 total in 1 blocks; 8160 free (0 chunks); 32 used
ExprContext: 8192 total in 1 blocks; 8160 free (0 chunks); 32 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 8192 total in 1 blocks; 8160 free (0 chunks); 32 used
ExprContext: 8192 total in 1 blocks; 8160 free (0 chunks); 32 used
JoinRelHashTable: 8192 total in 1 blocks; 624 free (0 chunks); 7568 used
Relcache by OID: 24576 total in 2 blocks; 10752 free (3 chunks); 13824 used
CacheMemoryContext: 1341680 total in 21 blocks; 174632 free (83 chunks); 1167048 used
unnamed prepared statement: 516096 total in 6 blocks; 231360 free (2 chunks); 284736 used
CachedPlan: 3072 total in 2 blocks; 1160 free (0
chunks); 1912 used pg_statistic_relid_att_index: 2048 total in 1 blocks; 608 free (0 chunks); 1440 used locks_pkey: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used lead_reporting_list_data_value_lower_idx: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used lead_reporting_list_data_value_idx: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used lead_reporting_list_data_subtype_idx: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used lead_reporting_list_data_type_idx: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used lead_reporting_list_data_lead_id_idx: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used lead_reporting_list_data_pkey: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used lead_reporting_street_street_lower_idx: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used lead_reporting_street_address_id_idx: 2048 total in 1 blocks; 704 free (0 chunks); 1344 used lead_reporting_street_pkey: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used lead_reporting_address_postal_code_idx: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used lead_reporting_address_state_idx: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used lead_reporting_address_city_lower_idx: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used lead_reporting_address_lead_id_idx: 2048 total in 1 blocks; 704 free (0 chunks); 1344 used lead_reporting_address_pkey: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used mb_lead_sale_id_idx: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used mb_lead_lead_id_idx: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used mb_lead_pkey: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used mb_sale_dealer_region_area_idx: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used mb_sale_dealer_code_idx: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used mb_sale_sale_date_idx: 2048 total in 1 blocks; 704 free (0 chunks); 1344 used mb_sale_type_idx: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used mb_sale_model_idx: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used mb_sale_dealer_region_idx: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used mb_sale_dealer_area_idx: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used mb_sale_vin_key: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used mb_sale_pkey: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used CachedPlan: 3072 total in 2 blocks; 648 free (0 chunks); 2424 used pg_toast_17933_index: 2048 total in 1 blocks; 608 free (0 chunks); 1440 used config_xml_name_key: 2048 total in 1 blocks; 704 free (0 chunks); 1344 used config_xml_pkey: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used CachedPlan: 1024 total in 1 blocks; 312 free (0 chunks); 712 used CachedPlanSource: 1024 total in 1 blocks; 56 free (0 chunks); 968 used SPI Plan: 1024 total in 1 blocks; 832 free (0 chunks); 192 used CachedPlan: 1024 total in 1 blocks; 336 free (0 chunks); 688 used CachedPlanSource: 1024 total in 1 blocks; 80 free (0 chunks); 944 used SPI Plan: 1024 total in 1 blocks; 808 free (0 chunks); 216 used CachedPlanSource: 1024 total in 1 blocks; 24 free (0 chunks); 1000 used SPI Plan: 1024 total in 1 blocks; 808 free (0 chunks); 216 used CachedPlan: 7168 total in 3 blocks; 3888 free (1 chunks); 3280 used CachedPlan: 7168 total in 3 blocks; 3920 free (0 chunks); 3248 used CachedPlanSource: 3072 total in 2 blocks; 888 free (0 chunks); 2184 used SPI Plan: 1024 total in 1 blocks; 808 free (0 chunks); 216 used CachedPlan: 7168 total in 3 blocks; 3704 free (1 chunks); 3464 used CachedPlanSource: 3072 
total in 2 blocks; 896 free (0 chunks); 2176 used SPI Plan: 1024 total in 1 blocks; 808 free (0 chunks); 216 used CachedPlan: 3072 total in 2 blocks; 1456 free (1 chunks); 1616 used CachedPlanSource: 3072 total in 2 blocks; 1664 free (0 chunks); 1408 used SPI Plan: 1024 total in 1 blocks; 808 free (0 chunks); 216 used CachedPlan: 1024 total in 1 blocks; 336 free (0 chunks); 688 used CachedPlanSource: 1024 total in 1 blocks; 80 free (0 chunks); 944 used SPI Plan: 1024 total in 1 blocks; 808 free (0 chunks); 216 used CachedPlan: 3072 total in 2 blocks; 312 free (0 chunks); 2760 used CachedPlanSource: 3072 total in 2 blocks; 1568 free (0 chunks); 1504 used SPI Plan: 1024 total in 1 blocks; 800 free (0 chunks); 224 used cl_event_first_response_pkey: 2048 total in 1 blocks; 704 free (0 chunks); 1344 used CachedPlan: 1024 total in 1 blocks; 336 free (0 chunks); 688 used CachedPlanSource: 1024 total in 1 blocks; 80 free (0 chunks); 944 used SPI Plan: 1024 total in 1 blocks; 808 free (0 chunks); 216 used CachedPlan: 1024 total in 1 blocks; 336 free (0 chunks); 688 used CachedPlanSource: 1024 total in 1 blocks; 80 free (0 chunks); 944 used SPI Plan: 1024 total in 1 blocks; 808 free (0 chunks); 216 used CachedPlan: 3072 total in 2 blocks; 1728 free (0 chunks); 1344 used CachedPlanSource: 3072 total in 2 blocks; 1536 free (0 chunks); 1536 used SPI Plan: 1024 total in 1 blocks; 800 free (0 chunks); 224 used CachedPlan: 15360 total in 4 blocks; 7104 free (1 chunks); 8256 used CachedPlanSource: 7168 total in 3 blocks; 2976 free (2 chunks); 4192 used SPI Plan: 1024 total in 1 blocks; 808 free (0 chunks); 216 used CachedPlan: 3072 total in 2 blocks; 1568 free (0 chunks); 1504 used CachedPlanSource: 3072 total in 2 blocks; 1352 free (0 chunks); 1720 used SPI Plan: 1024 total in 1 blocks; 808 free (0 chunks); 216 used CachedPlan: 3072 total in 2 blocks; 368 free (0 chunks); 2704 used CachedPlanSource: 3072 total in 2 blocks; 40 free (0 chunks); 3032 used SPI Plan: 1024 total in 1 blocks; 808 free (0 chunks); 216 used CachedPlan: 7168 total in 3 blocks; 3704 free (1 chunks); 3464 used CachedPlanSource: 3072 total in 2 blocks; 896 free (0 chunks); 2176 used SPI Plan: 1024 total in 1 blocks; 808 free (0 chunks); 216 used cl_event_crm_idx: 2048 total in 1 blocks; 704 free (0 chunks); 1344 used cl_event_lead_id_idx: 2048 total in 1 blocks; 704 free (0 chunks); 1344 used cl_event_event_name_idx: 2048 total in 1 blocks; 704 free (0 chunks); 1344 used cl_event_crm_lead_id_idx: 2048 total in 1 blocks; 704 free (0 chunks); 1344 used cl_event_pkey: 2048 total in 1 blocks; 704 free (0 chunks); 1344 used locality_data_postal_code_norm_idx: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used locality_data_postal_code_key: 2048 total in 1 blocks; 392 free (0 chunks); 1656 used locality_data_pkey: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used lead_reporting_meta_item_value_key_idx: 2048 total in 1 blocks; 608 free (0 chunks); 1440 used lead_reporting_meta_item_key_idx: 2048 total in 1 blocks; 704 free (0 chunks); 1344 used lead_reporting_meta_item_type_idx: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used lead_reporting_meta_lead_id_idx: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used lead_reporting_meta_pkey: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used CachedPlan: 1024 total in 1 blocks; 312 free (0 chunks); 712 used CachedPlanSource: 1024 total in 1 blocks; 56 free (0 chunks); 968 used SPI Plan: 1024 total in 1 blocks; 832 free (0 chunks); 192 used CachedPlan: 1024 total in 1 blocks; 
312 free (0 chunks); 712 used CachedPlanSource: 1024 total in 1 blocks; 56 free (0 chunks); 968 used SPI Plan: 1024 total in 1 blocks; 832 free (0 chunks); 192 used CachedPlanSource: 3072 total in 2 blocks; 1248 free (1 chunks); 1824 used SPI Plan: 1024 total in 1 blocks; 808 free (0 chunks); 216 used pg_toast_18404_index: 2048 total in 1 blocks; 608 free (0 chunks); 1440 used pg_attrdef_adrelid_adnum_index: 2048 total in 1 blocks; 608 free (0 chunks); 1440 used CachedPlan: 3072 total in 2 blocks; 936 free (0 chunks); 2136 used CachedPlanSource: 3072 total in 2 blocks; 1424 free (0 chunks); 1648 used SPI Plan: 1024 total in 1 blocks; 832 free (0 chunks); 192 used CachedPlanSource: 3072 total in 2 blocks; 360 free (0 chunks); 2712 used SPI Plan: 1024 total in 1 blocks; 800 free (0 chunks); 224 used CachedPlan: 7168 total in 3 blocks; 3888 free (1 chunks); 3280 used CachedPlanSource: 3072 total in 2 blocks; 360 free (0 chunks); 2712 used SPI Plan: 1024 total in 1 blocks; 800 free (0 chunks); 224 used CachedPlan: 7168 total in 3 blocks; 3704 free (1 chunks); 3464 used CachedPlanSource: 3072 total in 2 blocks; 896 free (0 chunks); 2176 used SPI Plan: 1024 total in 1 blocks; 808 free (0 chunks); 216 used region_pkey: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used pg_toast_18005_index: 2048 total in 1 blocks; 608 free (0 chunks); 1440 used request_state_idx: 2048 total in 1 blocks; 704 free (0 chunks); 1344 used lead_id_idx: 2048 total in 1 blocks; 704 free (0 chunks); 1344 used lead_billing_contention_pkey: 2048 total in 1 blocks; 704 free (0 chunks); 1344 used lead_reporting_data_delivery_meth_idx: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used lead_reporting_data_typeflags_idx: 2048 total in 1 blocks; 600 free (0 chunks); 1448 used lead_reporting_data_unique_dealer_idx: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used lead_reporting_data_unique_area_idx: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used lead_reporting_data_unique_region_idx: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used lead_reporting_data_unique_all_idx: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used lead_reporting_data_oem_subtype_idx: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used lead_reporting_data_oem_type_idx: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used lead_reporting_data_dealer_area_idx: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used lead_reporting_data_customer_lname_idx: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used lead_reporting_data_customer_fname_idx: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used lead_reporting_data_customer_postal_code_idx: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used lead_reporting_data_dealer_region_area_idx: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used lead_reporting_data_dealer_region_idx: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used lead_reporting_data_dealer_code_idx: 2048 total in 1 blocks; 704 free (0 chunks); 1344 used lead_reporting_data_reporting_date_idx: 2048 total in 1 blocks; 704 free (0 chunks); 1344 used lead_reporting_data_model_idx: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used lead_reporting_data_pkey: 2048 total in 1 blocks; 704 free (0 chunks); 1344 used lead_reporting: 23552 total in 5 blocks; 7240 free (0 chunks); 16312 used lead_processing_state_step_idx: 2048 total in 1 blocks; 656 free (0 chunks); 1392 used reporting_modified_idx: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used reporting_last_processed_date_idx: 2048 total in 1 
blocks; 752 free (0 chunks); 1296 used reporting_date_idx: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used lead_created_idx: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used lead_processing_step_idx: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used lead_destination_idx: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used lead_modified_idx: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used lead_source_idx: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used lead_pkey: 2048 total in 1 blocks; 704 free (0 chunks); 1344 used metadata_key_value_idx: 2048 total in 1 blocks; 656 free (0 chunks); 1392 used metadata_pkey: 2048 total in 1 blocks; 704 free (0 chunks); 1344 used dealer_metadata_pkey: 2048 total in 1 blocks; 608 free (0 chunks); 1440 used xml_schema_pkey: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used ehcache_expiration_idx: 2048 total in 1 blocks; 704 free (0 chunks); 1344 used ehcache_pkey: 2048 total in 1 blocks; 608 free (0 chunks); 1440 used external_system_user_name_key: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used external_system_name_key: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used external_system_pkey: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used dealer_external_system_id_idx: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used dealer_dealer_code_key: 2048 total in 1 blocks; 704 free (0 chunks); 1344 used dealer_pkey: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used CachedPlan: 1024 total in 1 blocks; 816 free (0 chunks); 208 used CachedPlanSource: 1024 total in 1 blocks; 664 free (1 chunks); 360 used xslt_style_sheet_pkey: 2048 total in 1 blocks; 704 free (0 chunks); 1344 used pg_index_indrelid_index: 2048 total in 1 blocks; 704 free (0 chunks); 1344 used CachedPlan: 1024 total in 1 blocks; 816 free (0 chunks); 208 used CachedPlanSource: 1024 total in 1 blocks; 664 free (1 chunks); 360 used pg_ts_dict_oid_index: 3072 total in 2 blocks; 1744 free (3 chunks); 1328 used pg_aggregate_fnoid_index: 3072 total in 2 blocks; 1696 free (2 chunks); 1376 used pg_language_name_index: 3072 total in 2 blocks; 1744 free (3 chunks); 1328 used pg_ts_dict_dictname_index: 3072 total in 2 blocks; 1648 free (2 chunks); 1424 used pg_namespace_nspname_index: 3072 total in 2 blocks; 1696 free (2 chunks); 1376 used pg_opfamily_oid_index: 3072 total in 2 blocks; 1744 free (3 chunks); 1328 used pg_opclass_oid_index: 3072 total in 2 blocks; 1696 free (2 chunks); 1376 used pg_ts_parser_prsname_index: 3072 total in 2 blocks; 1648 free (2 chunks); 1424 used pg_amop_fam_strat_index: 3072 total in 2 blocks; 1384 free (2 chunks); 1688 used pg_opclass_am_name_nsp_index: 3072 total in 2 blocks; 1576 free (3 chunks); 1496 used pg_trigger_tgrelid_tgname_index: 3072 total in 2 blocks; 1600 free (2 chunks); 1472 used pg_cast_source_target_index: 3072 total in 2 blocks; 1600 free (2 chunks); 1472 used pg_auth_members_role_member_index: 3072 total in 2 blocks; 1648 free (2 chunks); 1424 used pg_attribute_relid_attnum_index: 3072 total in 2 blocks; 1600 free (2 chunks); 1472 used pg_ts_config_cfgname_index: 3072 total in 2 blocks; 1648 free (2 chunks); 1424 used pg_authid_oid_index: 3072 total in 2 blocks; 1696 free (2 chunks); 1376 used pg_ts_config_oid_index: 3072 total in 2 blocks; 1744 free (3 chunks); 1328 used pg_conversion_default_index: 3072 total in 2 blocks; 1432 free (3 chunks); 1640 used pg_language_oid_index: 3072 total in 2 blocks; 1696 free (2 chunks); 1376 used pg_enum_oid_index: 3072 total in 2 blocks; 1744 free (3 chunks); 
1328 used pg_proc_proname_args_nsp_index: 3072 total in 2 blocks; 1576 free (3 chunks); 1496 used pg_ts_parser_oid_index: 3072 total in 2 blocks; 1744 free (3 chunks); 1328 used pg_database_oid_index: 3072 total in 2 blocks; 1696 free (2 chunks); 1376 used pg_conversion_name_nsp_index: 3072 total in 2 blocks; 1648 free (2 chunks); 1424 used pg_class_relname_nsp_index: 3072 total in 2 blocks; 1600 free (2 chunks); 1472 used pg_attribute_relid_attnam_index: 3072 total in 2 blocks; 1648 free (2 chunks); 1424 used pg_class_oid_index: 3072 total in 2 blocks; 1696 free (2 chunks); 1376 used pg_amproc_fam_proc_index: 3072 total in 2 blocks; 1384 free (2 chunks); 1688 used pg_operator_oprname_l_r_n_index: 3072 total in 2 blocks; 1384 free (2 chunks); 1688 used pg_index_indexrelid_index: 3072 total in 2 blocks; 1696 free (2 chunks); 1376 used pg_type_oid_index: 3072 total in 2 blocks; 1696 free (2 chunks); 1376 used pg_rewrite_rel_rulename_index: 3072 total in 2 blocks; 1600 free (2 chunks); 1472 used pg_authid_rolname_index: 3072 total in 2 blocks; 1696 free (2 chunks); 1376 used pg_auth_members_member_role_index: 3072 total in 2 blocks; 1648 free (2 chunks); 1424 used pg_enum_typid_label_index: 3072 total in 2 blocks; 1648 free (2 chunks); 1424 used pg_constraint_oid_index: 3072 total in 2 blocks; 1696 free (2 chunks); 1376 used pg_conversion_oid_index: 3072 total in 2 blocks; 1744 free (3 chunks); 1328 used pg_ts_template_tmplname_index: 3072 total in 2 blocks; 1648 free (2 chunks); 1424 used pg_ts_config_map_index: 3072 total in 2 blocks; 1624 free (3 chunks); 1448 used pg_namespace_oid_index: 3072 total in 2 blocks; 1696 free (2 chunks); 1376 used pg_type_typname_nsp_index: 3072 total in 2 blocks; 1600 free (2 chunks); 1472 used pg_operator_oid_index: 3072 total in 2 blocks; 1696 free (2 chunks); 1376 used pg_amop_opr_fam_index: 3072 total in 2 blocks; 1600 free (2 chunks); 1472 used pg_proc_oid_index: 3072 total in 2 blocks; 1696 free (2 chunks); 1376 used pg_opfamily_am_name_nsp_index: 3072 total in 2 blocks; 1624 free (3 chunks); 1448 used pg_ts_template_oid_index: 3072 total in 2 blocks; 1744 free (3 chunks); 1328 used MdSmgr: 24576 total in 2 blocks; 16192 free (0 chunks); 8384 used LOCALLOCK hash: 24576 total in 2 blocks; 11872 free (5 chunks); 12704 used Timezones: 53584 total in 2 blocks; 3744 free (0 chunks); 49840 used ErrorContext: 8192 total in 1 blocks; 8160 free (0 chunks); 32 used 2009-02-08 20:11:27 CST lms_nna ERROR: out of memory 2009-02-08 20:11:27 CST lms_nna DETAIL: Failed on request of size 243.
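For readers comparing these numbers against the settings quoted at the top of the post, the values the server is actually running with can be read back from any live session. A minimal sketch using the standard pg_settings view (nothing here is specific to this installation):

    -- Show the memory-related settings in effect, with their units.
    SELECT name, setting, unit
    FROM pg_settings
    WHERE name IN ('shared_buffers', 'temp_buffers', 'work_mem',
                   'maintenance_work_mem', 'max_stack_depth', 'max_connections');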
* Matt Magoffin (postgresql.org@msqr.us) wrote:
> We've been having persistent out-of-memory errors in our production
> 8.3 deployment, which is now running 8.3.5. I'm not sure the query here is
> the cause of the problem, but this is our most recent example which
> triggered an out-of-memory error for us.

erm.. How much memory do you have in the system?

> Perhaps our configuration needs tweaking somewhere...
>
> shared_buffers = 4000MB

I hope you've got a fair bit more than 4G of memory if you're going to
use 4G for shared buffers... Once that memory is dedicated to shared
buffers it's not going to be available for other usage.

> #temp_buffers = 8MB
> work_mem = 128MB
> maintenance_work_mem = 256MB
> #max_stack_depth = 2MB

Those are bumped up from the defaults, but don't look too bad to me.
Do realize that 'work_mem' is the amount PG is allowed to use for each
action and that complex queries can end up using many times that amount
(but I doubt that's the real issue here..).

Thanks,

Stephen
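For illustration, the per-operation nature of work_mem that Stephen describes can be reproduced with a toy sort; this sketch uses a generated table rather than anything from the poster's schema:

    -- work_mem is a per-sort/per-hash budget, not a per-query cap.
    SET work_mem = '1MB';
    EXPLAIN ANALYZE
    SELECT g, random() AS r FROM generate_series(1, 500000) AS g ORDER BY 2;
    -- With 1MB the Sort node reports an external (on-disk) merge; with a
    -- larger setting the same sort runs in memory.  A plan containing
    -- several Sort/Hash nodes can use up to work_mem for each of them.
    SET work_mem = '64MB';
    EXPLAIN ANALYZE
    SELECT g, random() AS r FROM generate_series(1, 500000) AS g ORDER BY 2;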
> erm.. How much memory do you have in the system?

This system has 16GB of RAM, and Postgres is basically the only service
running on the box.

>> shared_buffers = 4000MB
>
> I hope you've got a fair bit more than 4G of memory if you're going to
> use 4G for shared buffers... Once that memory is dedicated to shared
> buffers it's not going to be available for other usage.

Yep, we've got 16GB to work with here. I should have also mentioned the
architecture in my original post, sorry. SELECT version() returns this:

PostgreSQL 8.3.5 on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC)
4.1.2 20070626 (Red Hat 4.1.2-14)

Regards,

Matt
* Matt Magoffin (postgresql.org@msqr.us) wrote:
> Yep, we've got 16GB to work with here. I should have also mentioned the
> architecture in my original post, sorry. SELECT version() returns this:
>
> PostgreSQL 8.3.5 on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC)
> 4.1.2 20070626 (Red Hat 4.1.2-14)

Does the result from 'free' look reasonable on this box? How many PG
processes are running, generally? Do you see the PG process running this
query growing to fill the available memory? That query really didn't look
like it was chewing up all that much memory to me..

Have you tried reducing your work_mem setting to see if that makes the
errors go away? It might cause a different plan which might be less
efficient for some queries, not sure if you have anything which would be
dramatically affected by that.. Do the row counts in the plan match up
with what you'd expect? Can you provide output from 'explain analyze'
on a successful run?

Thanks,

Stephen
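A sketch of how the suggested test can be run without touching postgresql.conf, so only the current session is affected (the trivial SELECT is just a placeholder for the full query posted above):

    BEGIN;
    SET LOCAL work_mem = '2MB';   -- session/transaction local only
    EXPLAIN ANALYZE SELECT 1;     -- substitute the full SELECT from the first post
    COMMIT;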
>> PostgreSQL 8.3.5 on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC)
>> 4.1.2 20070626 (Red Hat 4.1.2-14)
>
> Does the result from 'free' look reasonable on this box?

I think so:

             total       used       free     shared    buffers     cached
Mem:      16432296   16273964     158332          0     173536   14321340
-/+ buffers/cache:    1779088   14653208
Swap:      2096440        560    2095880

> How many PG processes are running, generally?

I see about 90 currently, of which 24 are Slony connections and the rest
are client connections.

> Do you see the PG process running this query growing to fill the
> available memory? That query really didn't look like it was chewing up
> all that much memory to me..

Just running top, it does appear to chew through a fair amount of memory.
Here's a snapshot from top of the postgres process running this query
from just before it ran out of memory:

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 4486 postgres  18   0 4576m 3.6g 3.3g R   90 23.1   0:34.23 postgres: postgres lms_nna [local] EXPLAIN

These values did start out low, for example the RES memory started in the
130MB range, then climbed to the 3.6GB you see here.

> Have you tried reducing your work_mem setting to see if that makes the
> errors go away? It might cause a different plan which might be less
> efficient for some queries, not sure if you have anything which would be
> dramatically affected by that.. Do the row counts in the plan match up
> with what you'd expect? Can you provide output from 'explain analyze'
> on a successful run?

I set the work_mem to 2MB, and the query does actually complete (explain
analyze output below), so does this mean that the query simply uses too
much memory for sorting/joining, and we'd have to either allocate enough
work_mem to allow the query to complete, or a smaller work_mem as shown
here to make the query use slower disk-based sorting? The row counts are
matching what we'd expect from this query.
                                                              QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------
Sort  (cost=2345301.47..2345402.31 rows=40336 width=234) (actual time=203429.526..203599.840 rows=49317 loops=1)
  Sort Key: s.sale_date, s.vin
  Sort Method: external merge  Disk: 4776kB
  ->  Hash Left Join  (cost=1469244.28..2337665.81 rows=40336 width=234) (actual time=162362.791..202883.728 rows=49317 loops=1)
      Hash Cond: (ml.lead_id = phone.lead_id)
    ->  Hash Left Join  (cost=1349360.82..2159151.44 rows=40336 width=219) (actual time=155499.666..194941.423 rows=49317 loops=1)
        Hash Cond: (ml.lead_id = email.lead_id)
      ->  Nested Loop Left Join  (cost=1236733.28..1999980.90 rows=40336 width=204) (actual time=141890.479..186344.717 rows=49317 loops=1)
        ->  Nested Loop Left Join  (cost=1236733.28..1806406.02 rows=40336 width=141) (actual time=141890.419..166782.070 rows=49317 loops=1)
          ->  Nested Loop Left Join  (cost=1236733.28..1586899.03 rows=40336 width=126) (actual time=141890.368..166124.253 rows=49317 loops=1)
            ->  Nested Loop Left Join  (cost=1236733.28..1413543.83 rows=40336 width=118) (actual time=141890.281..156284.612 rows=49317 loops=1)
              ->  Merge Left Join  (cost=1236733.28..1237778.33 rows=40336 width=89) (actual time=141890.184..143717.900 rows=49317 loops=1)
                  Merge Cond: (ml.lead_id = lrdm.lead_id)
                ->  Sort  (cost=43356.21..43457.05 rows=40336 width=78) (actual time=1722.385..1794.668 rows=49317 loops=1)
                    Sort Key: ml.lead_id
                    Sort Method: external merge  Disk: 4152kB
                  ->  Hash Left Join  (cost=11354.33..38475.05 rows=40336 width=78) (actual time=919.319..1578.556 rows=49317 loops=1)
                      Hash Cond: ((s.dealer_code)::text = (d.dealer_code)::text)
                    ->  Hash Left Join  (cost=11018.04..37584.13 rows=40336 width=60) (actual time=914.936..1445.926 rows=49317 loops=1)
                        Hash Cond: (s.id = ml.sale_id)
                        Filter: (((s.sale_type = 'd'::bpchar) AND (NOT ml.lost_sale)) OR ((s.sale_type = 'c'::bpchar) AND (ml.lead_pos = 0)) OR (s.sale_type = '0'::bpchar))
                      ->  Index Scan using mb_sale_sale_date_idx on mb_sale s  (cost=0.00..14027.94 rows=43804 width=50) (actual time=0.149..126.835 rows=49152 loops=1)
                          Index Cond: ((sale_date >= '2009-01-01'::date) AND (sale_date <= '2009-01-31'::date))
                      ->  Hash  (cost=5577.35..5577.35 rows=281335 width=26) (actual time=914.089..914.089 rows=281335 loops=1)
                        ->  Seq Scan on mb_lead ml  (cost=0.00..5577.35 rows=281335 width=26) (actual time=0.026..394.384 rows=281335 loops=1)
                    ->  Hash  (cost=321.13..321.13 rows=1213 width=23) (actual time=4.343..4.343 rows=1213 loops=1)
                      ->  Seq Scan on dealer d  (cost=0.00..321.13 rows=1213 width=23) (actual time=0.013..2.451 rows=1213 loops=1)
                ->  Materialize  (cost=1193377.08..1194985.94 rows=128709 width=19) (actual time=135323.691..140326.709 rows=1442527 loops=1)
                  ->  Sort  (cost=1193377.08..1193698.85 rows=128709 width=19) (actual time=135323.681..137742.900 rows=1442488 loops=1)
                      Sort Key: lrdm.lead_id
                      Sort Method: external merge  Disk: 50312kB
                    ->  Bitmap Heap Scan on lead_reporting_meta lrdm  (cost=118695.77..1179811.20 rows=128709 width=19) (actual time=90238.108..126737.875 rows=1460904 loops=1)
                        Recheck Cond: (item_key = '[ALS:prospectid]TrackingCode'::text)
                        Filter: (pos = 1)
                      ->  Bitmap Index Scan on lead_reporting_meta_item_key_idx  (cost=0.00..118663.60 rows=1476580 width=0) (actual time=90223.734..90223.734 rows=1476483 loops=1)
                          Index Cond: (item_key = '[ALS:prospectid]TrackingCode'::text)
              ->  Index Scan using lead_reporting_address_lead_id_idx on lead_reporting_address address  (cost=0.00..4.35 rows=1 width=37) (actual time=0.251..0.252 rows=0 loops=49317)
                  Index Cond: (address.lead_id = ml.lead_id)
            ->  Index Scan using lead_reporting_street_address_id_idx on lead_reporting_street address_street  (cost=0.00..4.29 rows=1 width=24) (actual time=0.196..0.197 rows=0 loops=49317)
                Index Cond: (address_street.address_id = address.id)
                Filter: (address_street.pos = 0)
          ->  Index Scan using lead_pkey on lead l  (cost=0.00..5.43 rows=1 width=23) (actual time=0.010..0.010 rows=0 loops=49317)
              Index Cond: (l.id = ml.lead_id)
        ->  Index Scan using lead_reporting_data_pkey on lead_reporting_data lrd  (cost=0.00..4.79 rows=1 width=71) (actual time=0.393..0.393 rows=0 loops=49317)
            Index Cond: (lrd.lead_id = ml.lead_id)
      ->  Hash  (cost=85837.99..85837.99 rows=1459164 width=23) (actual time=7607.947..7607.947 rows=1521294 loops=1)
        ->  Seq Scan on lead_reporting_list_data email  (cost=0.00..85837.99 rows=1459164 width=23) (actual time=0.030..5063.069 rows=1521294 loops=1)
            Filter: ((list_type = 'e'::bpchar) AND (pos = 0))
    ->  Hash  (cost=85837.99..85837.99 rows=1854357 width=23) (actual time=6863.050..6863.050 rows=1803678 loops=1)
      ->  Seq Scan on lead_reporting_list_data phone  (cost=0.00..85837.99 rows=1854357 width=23) (actual time=0.020..3920.096 rows=1803678 loops=1)
          Filter: ((list_type = 'p'::bpchar) AND (pos = 0))
Total runtime: 203657.960 ms
(52 rows)
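The "external merge  Disk: ..." lines above are where sorts spilled to temp files. A sketch of one way to make those spills visible in routine operation, using the standard 8.3 log_temp_files setting (changing it requires superuser):

    -- 0 means "log every temporary file, with its size, when it is removed";
    -- -1 (the default) disables the logging.
    SET log_temp_files = 0;
    -- ... run the query; each spill is then recorded in the server log
    -- with its size, which is where figures like the 50MB merge come from.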
* Matt Magoffin (postgresql.org@msqr.us) wrote:
> > Does the result from 'free' look reasonable on this box?
>
> I think so:
>
>              total       used       free     shared    buffers     cached
> Mem:      16432296   16273964     158332          0     173536   14321340
> -/+ buffers/cache:    1779088   14653208
> Swap:      2096440        560    2095880

That certainly looks fine.. And you've got 14G or so which should be
available for this query. Was this near the time the query was running?
Could you give us what 'free' returns when the query is close to the
out-of-memory error? I'd expect the 2nd row under 'free' to be getting
low for the allocation to fail.

> Just running top, it does appear to chew through a fair amount of memory.
> Here's a snapshot from top of the postgres process running this query
> from just before it ran out of memory:
>
>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>  4486 postgres  18   0 4576m 3.6g 3.3g R   90 23.1   0:34.23 postgres: postgres lms_nna [local] EXPLAIN
>
> These values did start out low, for example the RES memory started in the
> 130MB range, then climbed to the 3.6GB you see here.

Uhh.. I saw that your system was 64-bit, but is your PG process
compiled as 64bit? Maybe you're hitting an artificial 32-bit limit,
which isn't exactly helped by your shared_buffers being set up so high
to begin with? Run 'file' on your postgres binary, like so:

sfrost@snowman:/home/sfrost> file /usr/lib/postgresql/8.3/bin/postgres
/usr/lib/postgresql/8.3/bin/postgres: ELF 64-bit LSB executable, x86-64,
version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux
2.6.8, stripped

(that's on Debian, your postgres binary will probably be somewhere else).

> I set the work_mem to 2MB, and the query does actually complete (explain
> analyze output below), so does this mean that the query simply uses too
> much memory for sorting/joining, and we'd have to either allocate enough
> work_mem to allow the query to complete, or a smaller work_mem as shown
> here to make the query use slower disk-based sorting? The row counts are
> matching what we'd expect from this query.

Allocating more work_mem won't help in this situation. PG's out-of-memory
error is only ever caused by an actual allocation failure. The work_mem,
to PG, is more of a guideline than any kind of hard limit. Based on the
explain analyze, I continue to feel that your query isn't actually all
that big in terms of memory usage (the disk-based sorting taking place
was in the 50M range or so, from what I saw, and you should have plenty
of memory for such a query..).

Check your binary.. I suspect that's where your issue is. Now, I don't
recall offhand if you can just recompile-in-place or if you need to do a
dump/restore, but I'd definitely do a backup just in case and keep it
handy (or maybe use it just to be safe) and see if PG complains when it's
started up.

Thanks,

Stephen
> * Matt Magoffin (postgresql.org@msqr.us) wrote:
>> Just running top, it does appear to chew through a fair amount of memory.
>> Here's a snapshot from top of the postgres process running this query
>> from just before it ran out of memory:
>>
>>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>>  4486 postgres  18   0 4576m 3.6g 3.3g R   90 23.1   0:34.23 postgres: postgres lms_nna [local] EXPLAIN
>>
>> These values did start out low, for example the RES memory started in the
>> 130MB range, then climbed to the 3.6GB you see here.

That is almost certainly meaningless; it just reflects the process
touching a larger and larger fraction of shared buffers over its
existence. The number to pay attention to is the non-shared memory size
(VIRT - SHR is probably the right number here).

Stephen Frost <sfrost@snowman.net> writes:
> Uhh.. I saw that your system was 64-bit, but is your PG process
> compiled as 64bit? Maybe you're hitting an artificial 32-bit limit,
> which isn't exactly helped by your shared_buffers being set up so high
> to begin with? Run 'file' on your postgres binary, like so:

I think it must be compiled 64-bit, or he'd not be able to get
shared_buffers that high to start with. However, it's possible that the
postmaster's been started under a ulimit setting that constrains each
backend to just a few hundred meg of per-process memory.

			regards, tom lane
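A sketch of one way to look at the "VIRT minus SHR" figure Tom mentions without relying on top, by reading the kernel's per-process accounting (4486 is the backend PID from the top snapshot earlier in the thread; substitute your own):

    # Private vs. mostly-shared memory of one backend.
    grep -E '^Vm(Peak|Size|RSS|Data|Stk)' /proc/4486/status
    # VmData (plus stack) roughly tracks memory the backend has allocated
    # privately, while most of VmRSS here is just shared_buffers pages the
    # process happens to have touched.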
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Stephen Frost <sfrost@snowman.net> writes:
> > Uhh.. I saw that your system was 64-bit, but is your PG process
> > compiled as 64bit? Maybe you're hitting an artificial 32-bit limit,
> > which isn't exactly helped by your shared_buffers being set up so high
> > to begin with? Run 'file' on your postgres binary, like so:
>
> I think it must be compiled 64-bit, or he'd not be able to get
> shared_buffers that high to start with. However, it's possible that the
> postmaster's been started under a ulimit setting that constrains each
> backend to just a few hundred meg of per-process memory.

I'm not so sure.. He has it as '4000MB', which would leave 96M free,
which doesn't seem that far from where his query is ending up at based
on the memory info he provided and the size of the sorts being done.
The ulimit is certainly something else to check tho.

Thanks,

Stephen
>>              total       used       free     shared    buffers     cached
>> Mem:      16432296   16273964     158332          0     173536   14321340
>> -/+ buffers/cache:    1779088   14653208
>> Swap:      2096440        560    2095880
>
> That certainly looks fine.. And you've got 14G or so which should be
> available for this query. Was this near the time the query was running?
> Could you give us what 'free' returns when the query is close to the
> out-of-memory error? I'd expect the 2nd row under 'free' to be getting
> low for the allocation to fail.

It was near the time... here's free just before executing the query:

             total       used       free     shared    buffers     cached
Mem:      16432296   16342056      90240          0     116868   14561200
-/+ buffers/cache:    1663988   14768308
Swap:      2096440        448    2095992

And then here is free close to the time the query runs out of memory (but
while the query is still running):

             total       used       free     shared    buffers     cached
Mem:      16432296   16348260      84036          0      41344   14167384
-/+ buffers/cache:    2139532   14292764
Swap:      2096440        448    2095992

> Uhh.. I saw that your system was 64-bit, but is your PG process
> compiled as 64bit? Maybe you're hitting an artificial 32-bit limit,
> which isn't exactly helped by your shared_buffers being set up so high
> to begin with?

Yes, it's compiled as 64-bit:

/opt/lms/postgres-8.3_64/bin/postgres: ELF 64-bit LSB executable, AMD
x86-64, version 1 (SYSV), for GNU/Linux 2.6.9, dynamically linked (uses
shared libs), for GNU/Linux 2.6.9, not stripped

We've had nagging memory-related issues with 8.3 that manifest themselves
like memory leaks... some posts I've made in the past have led to some
leaks getting fixed... but I've not been able to track down more specific
causes. It's just that over time Postgres seems to accumulate memory and
then we start to see out-of-memory errors like this one. Again, this query
could have nothing to do with the root cause, but this is just the example
I have to work with at the moment.

Regards,

Matt
> I think it must be compiled 64-bit, or he'd not be able to get
> shared_buffers that high to start with. However, it's possible that the
> postmaster's been started under a ulimit setting that constrains each
> backend to just a few hundred meg of per-process memory.

Here's the output of ulimit -a by the "postgres" user the database is
running under:

[postgres@170226-db7 ~]$ ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
max nice                        (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 139264
max locked memory       (kbytes, -l) 32
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
max rt priority                 (-r) 0
stack size              (kbytes, -s) 10240
cpu time               (seconds, -t) unlimited
max user processes              (-u) 139264
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

I think this means it does not have an artificial memory limit imposed,
but is there a specific setting beyond these I could check, do you think?

Regards,

Matt
Stephen Frost <sfrost@snowman.net> writes:
> * Tom Lane (tgl@sss.pgh.pa.us) wrote:
>> I think it must be compiled 64-bit, or he'd not be able to get
>> shared_buffers that high to start with.

> I'm not so sure.. He has it as '4000MB', which would leave 96M free,
> which doesn't seem that far from where his query is ending up at based
> on the memory info he provided and the size of the sorts being done.
> The ulimit is certainly something else to check tho.

Hmm ... a gig here, a gig there, pretty soon you're talking about real
memory? He's got several sorts and hashes that are each taking over
100MB according to the memory context dump, so it seems impossible that
it all fits into a strict 32-bit address space. There's surely no harm
in double-checking the executable's file type though.

			regards, tom lane
> Hmm ... a gig here, a gig there, pretty soon you're talking about real
> memory? He's got several sorts and hashes that are each taking over
> 100MB according to the memory context dump, so it seems impossible that
> it all fits into a strict 32-bit address space. There's surely no harm
> in double-checking the executable's file type though.

I did reply to this in a different email in this thread, but just to be
safe, yes, I did verify Postgres is a 64-bit binary:

[postgres@170226-db7 ~]$ file /opt/lms/postgres-8.3_64/bin/postgres
/opt/lms/postgres-8.3_64/bin/postgres: ELF 64-bit LSB executable, AMD
x86-64, version 1 (SYSV), for GNU/Linux 2.6.9, dynamically linked (uses
shared libs), for GNU/Linux 2.6.9, not stripped

Regards,

Matt
"Matt Magoffin" <postgresql.org@msqr.us> writes: > Here's the output of ulimit -a by the "postgres" user the database is > running under: > ... > I think this means it does not have an artificial memory limit imposed, Agreed, that ulimit isn't reflecting any such limit, but is that really the same environment the postmaster gets started in? I wouldn't trust a system startup script to be launched in the same environment that a login shell gets. You might try adding ulimit -a >/tmp/something to the startup script to find out for sure. regards, tom lane
* Matt Magoffin (postgresql.org@msqr.us) wrote:
> > I think it must be compiled 64-bit, or he'd not be able to get
> > shared_buffers that high to start with. However, it's possible that the
> > postmaster's been started under a ulimit setting that constrains each
> > backend to just a few hundred meg of per-process memory.
>
> Here's the output of ulimit -a by the "postgres" user the database is
> running under:
[...]
> I think this means it does not have an artificial memory limit imposed,
> but is there a specific setting beyond these I could check, do you think?

How about cat /proc/<pid>/limits for the postmaster?

And maybe:
	status
	stat
	maps

Though I'm kinda grasping at straws here, to be honest. I've had PG up
and running through >16G of memory at a time before.

Thanks,

Stephen
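A sketch of the /proc checks Stephen lists, for a single process. The PID is a placeholder; the postmaster's PID can be taken from the first line of postmaster.pid under the data directory, and /proc/<pid>/limits may not exist on older kernels:

    PGPID=4698    # placeholder; e.g. PGPID=$(head -1 /var/lib/pgsql/data/postmaster.pid)
    cat /proc/$PGPID/limits 2>/dev/null || echo "no limits file on this kernel"
    cat /proc/$PGPID/status
    cat /proc/$PGPID/stat
    cat /proc/$PGPID/maps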
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Hmm ... a gig here, a gig there, pretty soon you're talking about real
> memory? He's got several sorts and hashes that are each taking over
> 100MB according to the memory context dump, so it seems impossible that
> it all fits into a strict 32-bit address space. There's surely no harm
> in double-checking the executable's file type though.

Argh, it's late and I'm not thinking. Yeah, I had seen the 100M hashes
and just wasn't thinking clearly. Sorry for the noise.

Stephen
"Matt Magoffin" <postgresql.org@msqr.us> writes: > We've had nagging memory-related issues with 8.3 that manifest themselves > like memory leaks... some posts I've made in the past have led to some > leaks getting fixed... but I've not been able to track down more specific > causes. It's just that over time Postgres seems to accumulate memory and > then we start to see out-of-memory errors like this one. Again, this query > could have nothing to do with the root cause, but this is just the example > I have to work with at the moment. Yeah, I remember we found a few xml-related leaks based on your reports. However, there's not anything here to suggest that this query is straining the capabilities of a 64-bit system with lots o RAM. It seems certain you're hitting some artificial process-size limit, and the only one I know about is ulimit. I wasn't aware of /proc/<pid>/limits before, but now that I've heard of it, checking that for the postmaster and/or a backend seems like a great idea. regards, tom lane
> How about cat /proc/<pid>/limits for the postmaster? > And maybe: > status > stat > maps > > Though I'm kinda grasping at straws here, to be honest. I've had PG up > and running through >16G of memory at a time before. There is no /prod/<pid>/limits file, but here are status: Name: postmaster State: S (sleeping) SleepAVG: 95% Tgid: 4698 Pid: 4698 PPid: 1 TracerPid: 0 Uid: 26 26 26 26 Gid: 26 26 26 26 FDSize: 1024 Groups: 26 VmPeak: 4301216 kB VmSize: 4301204 kB VmLck: 0 kB VmHWM: 85980 kB VmRSS: 85980 kB VmData: 940 kB VmStk: 84 kB VmExe: 4480 kB VmLib: 3512 kB VmPTE: 284 kB StaBrk: 00ade000 kB Brk: 0c3db000 kB StaStk: 7fff21fecf40 kB Threads: 1 SigQ: 0/139264 SigPnd: 0000000000000000 ShdPnd: 0000000000000000 SigBlk: 0000000000000000 SigIgn: 0000000001303000 SigCgt: 0000000000014a07 CapInh: 0000000000000000 CapPrm: 0000000000000000 CapEff: 0000000000000000 Cpus_allowed: 7fffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff Mems_allowed: 00000000,00000001 stat: 4698 (postmaster) S 1 4686 4686 0 -1 4202496 27092033 7777059343 31 2809 6684 47834 13037706 1828097 15 0 1 0 9296 4404432896 21495 18446744073709551615 4194304 8778268 140733763735360 18446744073709551615 232627404115 0 0 19935232 84487 0 0 0 17 2 0 0 69 maps: 00400000-00860000 r-xp 00000000 08:03 1120168 /opt/lms/64/postgres/8.3/bin/postgres 00a5f000-00a6b000 rw-p 0045f000 08:03 1120168 /opt/lms/64/postgres/8.3/bin/postgres 00a6b000-00ade000 rw-p 00a6b000 00:00 0 0c39c000-0c3db000 rw-p 0c39c000 00:00 0 3012200000-3012234000 r-xp 00000000 08:07 5237058 /usr/lib64/libxslt.so.1.1.17 3012234000-3012433000 ---p 00034000 08:07 5237058 /usr/lib64/libxslt.so.1.1.17 3012433000-3012435000 rw-p 00033000 08:07 5237058 /usr/lib64/libxslt.so.1.1.17 3012a00000-3012a05000 r-xp 00000000 08:07 457029 /lib64/libcrypt-2.5.so 3012a05000-3012c04000 ---p 00005000 08:07 457029 /lib64/libcrypt-2.5.so 3012c04000-3012c05000 r--p 00004000 08:07 457029 /lib64/libcrypt-2.5.so 3012c05000-3012c06000 rw-p 00005000 08:07 457029 /lib64/libcrypt-2.5.so 3012c06000-3012c34000 rw-p 3012c06000 00:00 0 3629600000-362961a000 r-xp 00000000 08:07 457227 /lib64/ld-2.5.so 3629819000-362981a000 r--p 00019000 08:07 457227 /lib64/ld-2.5.so 362981a000-362981b000 rw-p 0001a000 08:07 457227 /lib64/ld-2.5.so 3629a00000-3629b46000 r-xp 00000000 08:07 457228 /lib64/libc-2.5.so 3629b46000-3629d46000 ---p 00146000 08:07 457228 /lib64/libc-2.5.so 3629d46000-3629d4a000 r--p 00146000 08:07 457228 /lib64/libc-2.5.so 3629d4a000-3629d4b000 rw-p 0014a000 08:07 457228 /lib64/libc-2.5.so 3629d4b000-3629d50000 rw-p 3629d4b000 00:00 0 3629e00000-3629e02000 r-xp 00000000 08:07 457229 /lib64/libdl-2.5.so 3629e02000-362a002000 ---p 00002000 08:07 457229 /lib64/libdl-2.5.so 362a002000-362a003000 r--p 00002000 08:07 457229 /lib64/libdl-2.5.so 362a003000-362a004000 rw-p 00003000 08:07 457229 /lib64/libdl-2.5.so 362aa00000-362aa82000 r-xp 00000000 08:07 457235 /lib64/libm-2.5.so 362aa82000-362ac81000 ---p 00082000 08:07 457235 /lib64/libm-2.5.so 362ac81000-362ac82000 r--p 00081000 08:07 457235 /lib64/libm-2.5.so 362ac82000-362ac83000 rw-p 00082000 08:07 457235 /lib64/libm-2.5.so 362b600000-362b614000 r-xp 00000000 08:07 5231431 /usr/lib64/libz.so.1.2.3 362b614000-362b813000 ---p 00014000 08:07 5231431 /usr/lib64/libz.so.1.2.3 362b813000-362b814000 rw-p 00013000 08:07 5231431 /usr/lib64/libz.so.1.2.3 362ba00000-362bb33000 r-xp 00000000 08:07 5235374 /usr/lib64/libxml2.so.2.6.26 362bb33000-362bd33000 ---p 00133000 08:07 5235374 /usr/lib64/libxml2.so.2.6.26 362bd33000-362bd3c000 rw-p 
00133000 08:07 5235374 /usr/lib64/libxml2.so.2.6.26 362bd3c000-362bd3d000 rw-p 362bd3c000 00:00 0 2aaaaaaab000-2aaaaaaac000 rw-p 2aaaaaaab000 00:00 0 2aaaaaabd000-2aaaaaac1000 rw-p 2aaaaaabd000 00:00 0 2aaaaaac1000-2aaaaaacb000 r-xp 00000000 08:07 456988 /lib64/libnss_files-2.5.so 2aaaaaacb000-2aaaaacca000 ---p 0000a000 08:07 456988 /lib64/libnss_files-2.5.so 2aaaaacca000-2aaaaaccb000 r--p 00009000 08:07 456988 /lib64/libnss_files-2.5.so 2aaaaaccb000-2aaaaaccc000 rw-p 0000a000 08:07 456988 /lib64/libnss_files-2.5.so 2aaaaaccc000-2aaaae299000 r--p 00000000 08:07 5229018 /usr/lib/locale/locale-archive 2aaaae299000-2aabafc41000 rw-s 00000000 00:08 0 /SYSV0063da81 (deleted) 7fff21fda000-7fff21fef000 rw-p 7fff21fda000 00:00 0 [stack] ffffffffff600000-ffffffffffe00000 ---p 00000000 00:00 0 [vdso] Hope this helps, Matt
> Agreed, that ulimit isn't reflecting any such limit, but is that really
> the same environment the postmaster gets started in? I wouldn't trust
> a system startup script to be launched in the same environment that a
> login shell gets. You might try adding
> 	ulimit -a >/tmp/something
> to the startup script to find out for sure.

Our startup script is calling su -l postgres ... So I thought the limits
would be the same (from the -l flag)? I then tried to mimic this with the
following:

[root@170226-db7 ~]# su -l postgres -c "ulimit -a"
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
max nice                        (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 139264
max locked memory       (kbytes, -l) 32
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
max rt priority                 (-r) 0
stack size              (kbytes, -s) 10240
cpu time               (seconds, -t) unlimited
max user processes              (-u) 139264
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

which I think should accurately reflect what the postmaster environment
sees.

Regards,

Matt
* Matt Magoffin (postgresql.org@msqr.us) wrote:
> There is no /prod/<pid>/limits file, but here are

erp, that stinks. Must be on an older kernel? I've got it under (a
Debian-built) 2.6.26. I can't recall if there's another way to get
limit info for an active process.. Could use Tom's suggestion of
echo'ing ulimit -a out to a file somewhere during database start-up.

Thanks,

Stephen
> Yeah, I remember we found a few xml-related leaks based on your reports.
> However, there's not anything here to suggest that this query is
> straining the capabilities of a 64-bit system with lots o RAM. It seems
> certain you're hitting some artificial process-size limit, and the only
> one I know about is ulimit.
>
> I wasn't aware of /proc/<pid>/limits before, but now that I've heard
> of it, checking that for the postmaster and/or a backend seems like
> a great idea.

This doesn't seem to exist for any process on this box:

[root@170226-db7 ~]# ls /proc/*/limit*
ls: /proc/*/limit*: No such file or directory

If this were a system-defined process-size limit, then should the query
still run out of memory after restarting Postgres? Most likely we'll have
to restart Postgres soon, and I'll retry this query after doing so. Based
on past experience, I'd expect the query to complete at that time.

From what we experience, Postgres seems to be slowly accumulating memory
in the fashion of a small memory leak, and things start to fail with
out-of-memory errors after the server has been running for some time
(e.g. roughly 4-6 weeks). Restarting Postgres clears out the problems
(after a restart we can immediately run queries that were failing before
the restart)... but then the cycle starts again. I just bring this up
wondering if there is something possibly accumulating within Postgres
that isn't getting freed and might cause an out-of-memory error like this
in some way.

Regards,

Matt
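A crude way to tell a slow, weeks-long accumulation apart from ordinary shared-buffer usage is to snapshot per-process memory on an interval and compare over time; a sketch (path and interval are arbitrary, and process sizes must be read with Tom's earlier shared-buffers caveat in mind):

    while true; do
        { date; ps -o pid,vsz,rss,etime,args -C postgres; echo; } >> /var/tmp/pg-mem.log
        sleep 3600
    done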
> erp, that stinks. Must be on an older kernel? I've got it under (a > Debian-built) 2.6.26. I can't recall if there's another way to get > limit info for an active process.. Could use Tom's suggestion of > echo'ing ulimit -a out to a file somewhere during database start-up. Yes, this is a RHEL 5 box with kernel 2.6.18... must not exist for this ancient kernel :-) Regards, Matt
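For anyone following along on a kernel new enough to have the file (it exists on Stephen's 2.6.26 but not this 2.6.18 box), the check would look roughly like this; the postmaster.pid path is an assumption about where the data directory lives:

# per-process limits for the running postmaster, where available
PGPID=$(head -1 /var/lib/pgsql/data/postmaster.pid)
cat /proc/$PGPID/limits    # shows max address space, max open files, etc.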
> [root@170226-db7 ~]# su -l postgres -c "ulimit -a" > core file size (blocks, -c) 0 > data seg size (kbytes, -d) unlimited > max nice (-e) 0 > file size (blocks, -f) unlimited > pending signals (-i) 139264 > max locked memory (kbytes, -l) 32 > max memory size (kbytes, -m) unlimited > open files (-n) 1024 > pipe size (512 bytes, -p) 8 > POSIX message queues (bytes, -q) 819200 > max rt priority (-r) 0 > stack size (kbytes, -s) 10240 > cpu time (seconds, -t) unlimited > max user processes (-u) 139264 > virtual memory (kbytes, -v) unlimited > file locks (-x) unlimited I just noticed something: the "open files" limit lists 1024, which is the default for this system. A quick count of open data files currently in use by Postgres returns almost 7000, though. [root@170226-db7 ~]# lsof -u postgres |egrep '(/pg_data|/pg_index|/pg_log)' |wc -l 6749 We have 100+ postgres processes running, so for an individual process, could the 1024 file limit be doing anything to this query? Or would I see an explicit error message regarding this condition? Regards, Matt
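Since the rlimit is per process, a per-backend count is more telling than an lsof total across every postgres process; something like the following, where 12345 stands in for one backend's PID:

# descriptors held by a single backend, rather than the whole postgres user
ls /proc/12345/fd | wc -l
# or the same count via lsof
lsof -p 12345 | wc -l

Note that Postgres also caps itself with max_files_per_process (1000 by default), so an individual backend normally stays just under the 1024 rlimit anyway.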
On Mon, Feb 9, 2009 at 8:23 AM, Matt Magoffin <postgresql.org@msqr.us> wrote:
> I just noticed something: the "open files" limit lists 1024, which is the
> default for this system. A quick count of open data files currently in use
> by Postgres returns almost 7000, though.
>
> [root@170226-db7 ~]# lsof -u postgres |egrep '(/pg_data|/pg_index|/pg_log)' |wc -l
> 6749
>
> We have 100+ postgres processes running, so for an individual process,
> could the 1024 file limit be doing anything to this query? Or would I see
> an explicit error message regarding this condition?

You would get an "Open files rlimit 1024 reached for uid xxxx" message in syslog (which you should check out anyhow).

I wonder if it isn't just another one of those 'this only happens on 64-bit machines' problems :)

-- GJ
>> We have 100+ postgres processes running, so for an individual process, >> could the 1024 file limit be doing anything to this query? Or would I >> see >> an explicit error message regarding this condition? > > you would get one of "Open files rlimit 1024 reached for uid xxxx" in > syslog (which you should checkout anyhow). No... nothing like this in syslog. Regards, Matt
On Mon, Feb 9, 2009 at 8:53 AM, Matt Magoffin <postgresql.org@msqr.us> wrote:
>>> We have 100+ postgres processes running, so for an individual process,
>>> could the 1024 file limit be doing anything to this query? Or would I
>>> see an explicit error message regarding this condition?
>>
>> you would get one of "Open files rlimit 1024 reached for uid xxxx" in
>> syslog (which you should check out anyhow).
>
> No... nothing like this in syslog.

So did the backend crash on this one, or did it just produce an 'out of memory' message?

-- GJ
Matt Magoffin wrote:
> We have 100+ postgres processes running, so for an individual process,
> could the 1024 file limit be doing anything to this query? Or would I see
> an explicit error message regarding this condition?

With 100 concurrent postgres connections, if they all did something requiring large amounts of work_mem, you could allocate 100 * 125MB (I believe that's what you said it was set to?), which is something like 12GB :-O

In fact, a single query that's doing multiple sorts of large datasets for a messy join (or other similar activity) can involve several instances of work_mem. Multiply that by 100 queries, and ouch.

Have you considered using a connection pool to reduce the postgres process count?
On Mon, Feb 9, 2009 at 2:17 AM, John R Pierce <pierce@hogranch.com> wrote: > Matt Magoffin wrote: >> >> We have 100+ postgres processes running, so for an individual process, >> could the 1024 file limit be doing anything to this query? Or would I see >> an explicit error message regarding this condition? >> >> > > with 100 concurrent postgres connections, if they all did something > requiring large amounts of work_mem, you could allocate 100 * 125MB (I > believe thats what you said it was set to?) which is like 12GB :-O > > in fact a single query thats doing multiple sorts of large datasets for a > messy join (or other similar activity) can involve several instances of > workmem. multiply that by 100 queries, and ouch. > > have you considered using a connection pool to reduce the postgres process > count? No matter what I am pretty conservative with work_mem for these reasons. Plus, I tested most of our queries and raising work_mem above 16Meg had no real positive effect on most queries. If I have a single reporting query that can use work_mem over that I set it and run that query by itself (from things like cron jobs) rather than just leaving work_mem really high. High work_mem is a bit of a foot gun.
"Matt Magoffin" <postgresql.org@msqr.us> writes: > I just noticed something: the "open files" limit lists 1024, which is the > default for this system. A quick count of open data files currently in use > by Postgres returns almost 7000, though. In principle, any problem with that ought to result in explicit error messages like "too many open files". In practice, what can happen is that Postgres can run the kernel out of file table slots (which are a finite system-wide resource) and then everything else on the box starts to fall over. Postgres itself will often survive just fine, since it has a preprogrammed response to ENFILE to close some other file it doesn't need right now and try again. Most other userland code will just croak if open() starts failing. Which is a longwinded way of saying "that doesn't seem to match the symptoms, but ..." If you're not dead certain that your kernel is configured to allow *well* north of 7000 open files, you might consider cutting max_files_per_process in half at your next restart. I am wondering if there could be some kind of memory-pressure effect inside the kernel proper that causes lots of open files to translate into unexpected denials of requests for more process memory. It's kind of grasping at straws, admittedly. But it seems like we've eliminated the possibility of a material memory leak inside the Postgres backend process (even if you don't trust the memory context dump on its own, the reported process VSIZE backs it up), and the other printout you showed said that there was hardly any usage of swap space, which seems to defeat the theory that there's system-wide memory pressure in the normal sense. OTOH ... wait a minute. Have you maybe got the system configured to start denying memory requests before it gets into significant swapping? We typically suggest setting vm.overcommit_memory=2 on Linux, but I'm not sure whether that results in the kernel trying to avoid using swap at all. Maybe this really is an issue with system-wide total memory use rather than any one process. regards, tom lane
* Tom Lane (tgl@sss.pgh.pa.us) wrote: > OTOH ... wait a minute. Have you maybe got the system configured to > start denying memory requests before it gets into significant swapping? > We typically suggest setting vm.overcommit_memory=2 on Linux, but > I'm not sure whether that results in the kernel trying to avoid using > swap at all. Maybe this really is an issue with system-wide total > memory use rather than any one process. Err, yea, overcommit_memory=2, a small amount of swap space, and a low overcommit_ratio could cause this to happen... The default ratio is 50 though, which should mean, on this system, there is about 10G available for user processes, but his usage shows only a bit over 2G being used outside of buffers/caches (based on free output).. Matt, can you provide the output from these: cat /proc/sys/vm/overcommit_memory cat /proc/sys/vm/overcommit_ratio cat /proc/meminfo ? Thanks, Stephen
> So did the backend crashed on this one, or just produced 'out of > memory ' message ? No crash, just the error message. -- m@
> Err, yea, overcommit_memory=2, a small amount of swap space, and a low
> overcommit_ratio could cause this to happen... The default
> ratio is 50 though, which should mean, on this system, there is about
> 10G available for user processes, but his usage shows only a bit over 2G
> being used outside of buffers/caches (based on free output)..
>
> Matt, can you provide the output from these:
> cat /proc/sys/vm/overcommit_memory
> cat /proc/sys/vm/overcommit_ratio
> cat /proc/meminfo

Sure, here you go:

[root@170226-db7 ~]# cat /proc/sys/vm/overcommit_memory
2
[root@170226-db7 ~]# cat /proc/sys/vm/overcommit_ratio
50
[root@170226-db7 ~]# cat /proc/meminfo
MemTotal: 16432296 kB
MemFree: 160188 kB
Buffers: 164764 kB
Cached: 14006456 kB
SwapCached: 0 kB
Active: 6984144 kB
Inactive: 8471456 kB
HighTotal: 0 kB
HighFree: 0 kB
LowTotal: 16432296 kB
LowFree: 160188 kB
SwapTotal: 2096440 kB
SwapFree: 2095992 kB
Dirty: 540 kB
Writeback: 0 kB
AnonPages: 1270076 kB
Mapped: 4258192 kB
Slab: 211448 kB
PageTables: 559248 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
CommitLimit: 10312588 kB
Committed_AS: 9760756 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 263556 kB
VmallocChunk: 34359474475 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
Hugepagesize: 2048 kB

Regards,
Matt
* Matt Magoffin (postgresql.org@msqr.us) wrote: > [root@170226-db7 ~]# cat /proc/meminfo > CommitLimit: 10312588 kB > Committed_AS: 9760756 kB I suspect this may be it... Apparently, while you're only using about 2G, you've got 10G or so of outstanding commitments, and Linux is refusing to allocate more. You probably want to up your overcommit_ratio, esp. in light of the fact that you've only got 2G of swap on this box. I'd probably say up it to 80, which would give you 14.8G of commitable memory, leaving some room in-memory (1.2G) for cache/buffers and whatnot. Alternativly, you could go for 90, which would allow commits up to 16.4G, so if everyone used all their memory, you'd be into swap. Enjoy, Stephen
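If you do raise the ratio, it can be changed on the fly and then persisted across reboots. With overcommit_memory=2 the kernel's limit is roughly swap + RAM * overcommit_ratio / 100, which is where the ~14.8G figure above comes from:

sysctl -w vm.overcommit_ratio=80                       # takes effect immediately
echo "vm.overcommit_ratio = 80" >> /etc/sysctl.conf    # so it survives a reboot

# sanity check afterwards: CommitLimit in /proc/meminfo should jump accordingly
grep -E 'CommitLimit|Committed_AS' /proc/meminfo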
> Which is a longwinded way of saying "that doesn't seem to match the
> symptoms, but ..." If you're not dead certain that your kernel is
> configured to allow *well* north of 7000 open files, you might consider
> cutting max_files_per_process in half at your next restart.

I think it is; looking at /proc/sys/fs/file-max I get

[root@170226-db7 64]# cat /proc/sys/fs/file-max
1573080

> OTOH ... wait a minute. Have you maybe got the system configured to
> start denying memory requests before it gets into significant swapping?
> We typically suggest setting vm.overcommit_memory=2 on Linux, but
> I'm not sure whether that results in the kernel trying to avoid using
> swap at all. Maybe this really is an issue with system-wide total
> memory use rather than any one process.

We do have vm.overcommit_memory set to 2 on this system.

I should add that since yesterday, many queries, and even simple attempts to start a transaction, have been failing on this system. I just saw one of our applications attempt to initiate a transaction (this is a Java app, so through JDBC here) and it resulted in an out-of-memory error. This is all I see in the logs:

[2009-02-09 13:52:13 CST lms_kia FATAL: out of memory
TopMemoryContext: 53952 total in 6 blocks; 7048 free (7 chunks); 46904 used
Relcache by OID: 8192 total in 1 blocks; 1680 free (0 chunks); 6512 used
CacheMemoryContext: 91024 total in 4 blocks; 3856 free (9 chunks); 87168 used
MdSmgr: 0 total in 0 blocks; 0 free (0 chunks); 0 used
LOCALLOCK hash: 8192 total in 1 blocks; 3744 free (0 chunks); 4448 used
Timezones: 53584 total in 2 blocks; 3744 free (0 chunks); 49840 used
ErrorContext: 8192 total in 1 blocks; 8160 free (4 chunks); 32 used
2009-02-09 13:52:29 CST lms_kia FATAL: out of memory
2009-02-09 13:52:29 CST lms_kia DETAIL: Failed on request of size 2456.
TopMemoryContext: 97184 total in 11 blocks; 2216 free (7 chunks); 94968 used TopTransactionContext: 8192 total in 1 blocks; 7648 free (0 chunks); 544 used Prepared Queries: 24576 total in 2 blocks; 15984 free (5 chunks); 8592 used Type information cache: 24576 total in 2 blocks; 11888 free (5 chunks); 12688 used Operator class cache: 8192 total in 1 blocks; 1680 free (0 chunks); 6512 used Operator lookup cache: 24576 total in 2 blocks; 11888 free (5 chunks); 12688 used MessageContext: 8192 total in 1 blocks; 6976 free (0 chunks); 1216 used smgr relation table: 24576 total in 2 blocks; 11840 free (4 chunks); 12736 used TransactionAbortContext: 32768 total in 1 blocks; 32736 free (0 chunks); 32 used Portal hash: 8192 total in 1 blocks; 1680 free (0 chunks); 6512 used PortalMemory: 8192 total in 1 blocks; 7888 free (0 chunks); 304 used PortalHeapMemory: 122880 total in 17 blocks; 7640 free (102 chunks); 115240 used ExecutorState: 89312 total in 4 blocks; 1960 free (2 chunks); 87352 used HashTableContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used HashBatchContext: 125820976 total in 26 blocks; 496 free (6 chunks); 125820480 used HashTableContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used HashBatchContext: 524336 total in 1 blocks; 0 free (0 chunks); 524336 used TIDBitmap: 516096 total in 6 blocks; 222528 free (20 chunks); 293568 used TupleSort: 32816 total in 2 blocks; 7360 free (0 chunks); 25456 used ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used ExprContext: 8192 total in 1 blocks; 8096 free (0 chunks); 96 used ExprContext: 8192 total in 1 blocks; 8160 free (1 chunks); 32 used ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used AggContext: 8192 total in 1 blocks; 8016 free (0 chunks); 176 used TupleHashTable: 8192 total in 1 blocks; 3744 free (0 chunks); 4448 used ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used Relcache by OID: 24576 total in 2 blocks; 13872 free (3 chunks); 10704 used CacheMemoryContext: 1341680 total in 21 blocks; 502840 free (2 chunks); 838840 used unnamed prepared statement: 57344 total in 3 blocks; 2008 free (2 chunks); 55336 used cl_event_crm_idx: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used cl_event_lead_id_idx: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used cl_event_event_name_idx: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used cl_event_crm_lead_id_idx: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used cl_event_pkey: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used cl_event_first_response_pkey: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used lead_reporting_data_delivery_meth_idx: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used lead_reporting_data_typeflags_idx: 2048 total in 1 blocks; 600 free (0 chunks); 1448 used lead_reporting_data_unique_dealer_idx: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used lead_reporting_data_unique_area_idx: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used lead_reporting_data_unique_region_idx: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used lead_reporting_data_unique_all_idx: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used lead_reporting_data_oem_subtype_idx: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used 
lead_reporting_data_oem_type_idx: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used lead_reporting_data_dealer_area_idx: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used lead_reporting_data_customer_lname_idx: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used lead_reporting_data_customer_fname_idx: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used lead_reporting_data_customer_postal_code_idx: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used lead_reporting_data_dealer_region_area_idx: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used lead_reporting_data_dealer_region_idx: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used lead_reporting_data_dealer_code_idx: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used lead_reporting_data_reporting_date_idx: 2048 total in 1 blocks; 704 free (0 chunks); 1344 used lead_reporting_data_model_idx: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used lead_reporting_data_pkey: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used config_xml_name_key: 2048 total in 1 blocks; 704 free (0 chunks); 1344 used config_xml_pkey: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used CachedPlan: 1024 total in 1 blocks; 816 free (0 chunks); 208 used CachedPlanSource: 1024 total in 1 blocks; 664 free (1 chunks); 360 used ehcache_expiration_idx: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used ehcache_pkey: 2048 total in 1 blocks; 608 free (0 chunks); 1440 used CachedPlan: 1024 total in 1 blocks; 816 free (0 chunks); 208 used CachedPlanSource: 1024 total in 1 blocks; 664 free (1 chunks); 360 used region_pkey: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used dealer_external_system_id_idx: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used dealer_dealer_code_key: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used dealer_pkey: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used xslt_style_sheet_pkey: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used pg_index_indrelid_index: 2048 total in 1 blocks; 704 free (0 chunks); 1344 used pg_ts_dict_oid_index: 3072 total in 2 blocks; 1744 free (3 chunks); 1328 used pg_aggregate_fnoid_index: 3072 total in 2 blocks; 1696 free (2 chunks); 1376 used pg_language_name_index: 3072 total in 2 blocks; 1744 free (3 chunks); 1328 used pg_statistic_relid_att_index: 3072 total in 2 blocks; 1600 free (2 chunks); 1472 used pg_ts_dict_dictname_index: 3072 total in 2 blocks; 1648 free (2 chunks); 1424 used pg_namespace_nspname_index: 3072 total in 2 blocks; 1696 free (2 chunks); 1376 used pg_opfamily_oid_index: 3072 total in 2 blocks; 1744 free (3 chunks); 1328 used pg_opclass_oid_index: 3072 total in 2 blocks; 1696 free (2 chunks); 1376 used pg_ts_parser_prsname_index: 3072 total in 2 blocks; 1648 free (2 chunks); 1424 used pg_amop_fam_strat_index: 3072 total in 2 blocks; 1384 free (2 chunks); 1688 used pg_opclass_am_name_nsp_index: 3072 total in 2 blocks; 1576 free (3 chunks); 1496 used pg_trigger_tgrelid_tgname_index: 3072 total in 2 blocks; 1600 free (2 chunks); 1472 used pg_cast_source_target_index: 3072 total in 2 blocks; 1600 free (2 chunks); 1472 used pg_auth_members_role_member_index: 3072 total in 2 blocks; 1648 free (2 chunks); 1424 used pg_attribute_relid_attnum_index: 3072 total in 2 blocks; 1600 free (2 chunks); 1472 used pg_ts_config_cfgname_index: 3072 total in 2 blocks; 1648 free (2 chunks); 1424 used pg_authid_oid_index: 3072 total in 2 blocks; 1696 free (2 chunks); 1376 used pg_ts_config_oid_index: 3072 total in 2 blocks; 1744 free (3 chunks); 1328 used 
pg_conversion_default_index: 3072 total in 2 blocks; 1432 free (3 chunks); 1640 used pg_language_oid_index: 3072 total in 2 blocks; 1744 free (3 chunks); 1328 used pg_enum_oid_index: 3072 total in 2 blocks; 1744 free (3 chunks); 1328 used pg_proc_proname_args_nsp_index: 3072 total in 2 blocks; 1576 free (3 chunks); 1496 used pg_ts_parser_oid_index: 3072 total in 2 blocks; 1744 free (3 chunks); 1328 used pg_database_oid_index: 3072 total in 2 blocks; 1696 free (2 chunks); 1376 used pg_conversion_name_nsp_index: 3072 total in 2 blocks; 1648 free (2 chunks); 1424 used pg_class_relname_nsp_index: 3072 total in 2 blocks; 1600 free (2 chunks); 1472 used pg_attribute_relid_attnam_index: 3072 total in 2 blocks; 1648 free (2 chunks); 1424 used pg_class_oid_index: 3072 total in 2 blocks; 1696 free (2 chunks); 1376 used pg_amproc_fam_proc_index: 3072 total in 2 blocks; 1384 free (2 chunks); 1688 used pg_operator_oprname_l_r_n_index: 3072 total in 2 blocks; 1384 free (2 chunks); 1688 used pg_index_indexrelid_index: 3072 total in 2 blocks; 1696 free (2 chunks); 1376 used pg_type_oid_index: 3072 total in 2 blocks; 1696 free (2 chunks); 1376 used pg_rewrite_rel_rulename_index: 3072 total in 2 blocks; 1648 free (2 chunks); 1424 used pg_authid_rolname_index: 3072 total in 2 blocks; 1696 free (2 chunks); 1376 used pg_auth_members_member_role_index: 3072 total in 2 blocks; 1648 free (2 chunks); 1424 used pg_enum_typid_label_index: 3072 total in 2 blocks; 1648 free (2 chunks); 1424 used pg_constraint_oid_index: 3072 total in 2 blocks; 1744 free (3 chunks); 1328 used pg_conversion_oid_index: 3072 total in 2 blocks; 1744 free (3 chunks); 1328 used pg_ts_template_tmplname_index: 3072 total in 2 blocks; 1648 free (2 chunks); 1424 used pg_ts_config_map_index: 3072 total in 2 blocks; 1624 free (3 chunks); 1448 used pg_namespace_oid_index: 3072 total in 2 blocks; 1696 free (2 chunks); 1376 used pg_type_typname_nsp_index: 3072 total in 2 blocks; 1600 free (2 chunks); 1472 used pg_operator_oid_index: 3072 total in 2 blocks; 1696 free (2 chunks); 1376 used pg_amop_opr_fam_index: 3072 total in 2 blocks; 1600 free (2 chunks); 1472 used pg_proc_oid_index: 3072 total in 2 blocks; 1696 free (2 chunks); 1376 used pg_opfamily_am_name_nsp_index: 3072 total in 2 blocks; 1624 free (3 chunks); 1448 used pg_ts_template_oid_index: 3072 total in 2 blocks; 1744 free (3 chunks); 1328 used MdSmgr: 8192 total in 1 blocks; 6752 free (0 chunks); 1440 used LOCALLOCK hash: 24576 total in 2 blocks; 15984 free (5 chunks); 8592 used Timezones: 53584 total in 2 blocks; 3744 free (0 chunks); 49840 used ErrorContext: 8192 total in 1 blocks; 8160 free (0 chunks); 32 used Does this provide any useful information? I have other queries that are failing as well, and I can provide explain output for those if it might help. Regards, Matt
> * Matt Magoffin (postgresql.org@msqr.us) wrote: >> [root@170226-db7 ~]# cat /proc/meminfo >> CommitLimit: 10312588 kB >> Committed_AS: 9760756 kB > > I suspect this may be it... Apparently, while you're only using about > 2G, you've got 10G or so of outstanding commitments, and Linux is > refusing to allocate more. > > You probably want to up your overcommit_ratio, esp. in light of the fact > that you've only got 2G of swap on this box. I'd probably say up it to > 80, which would give you 14.8G of commitable memory, leaving some room > in-memory (1.2G) for cache/buffers and whatnot. Alternativly, you could > go for 90, which would allow commits up to 16.4G, so if everyone used > all their memory, you'd be into swap. Thanks for the advice. Should we have more than 2GB of swap available? I thought the goal for a Postgres system was to avoid swap use at all cost? Would it be better for us to add more swap, or adjust this overcommit_ratio as you discuss? Regards, Matt
> I suspect this may be it... Apparently, while you're only using about > 2G, you've got 10G or so of outstanding commitments, and Linux is > refusing to allocate more. > > You probably want to up your overcommit_ratio, esp. in light of the fact > that you've only got 2G of swap on this box. I'd probably say up it to > 80, which would give you 14.8G of commitable memory, leaving some room > in-memory (1.2G) for cache/buffers and whatnot. Alternativly, you could > go for 90, which would allow commits up to 16.4G, so if everyone used > all their memory, you'd be into swap. Also, by adjusting this, would I possibly just be delaying the problem we currently have (i.e. over time, we start to run out of memory)? I just wonder why the system is reaching this limit at all... do you feel it is quite normal for a system with this memory configuration to be configured with the ratio set to 80? I'm not terribly familiar with these VM parameters, so I apologize if I sound vague. Regards, Matt
* Matt Magoffin (postgresql.org@msqr.us) wrote:
> Thanks for the advice. Should we have more than 2GB of swap available? I
> thought the goal for a Postgres system was to avoid swap use at all cost?
> Would it be better for us to add more swap, or adjust this
> overcommit_ratio as you discuss?

You do want to avoid swap, but at the same time you don't want to end up with lots of real memory unavailable for use (as here...). After thinking about this for a bit, my suggestion would be to add swap up to the amount of memory you have, for starters. That would give you 24G of committable memory. I would then pay attention to free, swap usage, and /proc/meminfo. If you start using a lot of swap, you might want to adjust PG to use less memory; if you see a lot of free memory but you're close to your commit limit, up the ratio some.

Basically, the swap would be there just in case all those folks with outstanding committed memory decide to actually use it. That's not all *that* likely to happen, in the end.

Thanks,
Stephen
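A rough sketch of adding that much swap without repartitioning, assuming a local filesystem has room for a swap file; the path and size here are placeholders:

# create and enable a 14GB swap file, bringing total swap close to the 16GB of RAM
dd if=/dev/zero of=/swapfile1 bs=1M count=14336
chmod 600 /swapfile1
mkswap /swapfile1
swapon /swapfile1
# and add it to /etc/fstab so it persists:
#   /swapfile1  none  swap  sw  0 0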
On Mon, Feb 9, 2009 at 1:25 PM, Matt Magoffin <postgresql.org@msqr.us> wrote: >> * Matt Magoffin (postgresql.org@msqr.us) wrote: >>> [root@170226-db7 ~]# cat /proc/meminfo >>> CommitLimit: 10312588 kB >>> Committed_AS: 9760756 kB >> >> I suspect this may be it... Apparently, while you're only using about >> 2G, you've got 10G or so of outstanding commitments, and Linux is >> refusing to allocate more. >> >> You probably want to up your overcommit_ratio, esp. in light of the fact >> that you've only got 2G of swap on this box. I'd probably say up it to >> 80, which would give you 14.8G of commitable memory, leaving some room >> in-memory (1.2G) for cache/buffers and whatnot. Alternativly, you could >> go for 90, which would allow commits up to 16.4G, so if everyone used >> all their memory, you'd be into swap. > > Thanks for the advice. Should we have more than 2GB of swap available? I > thought the goal for a Postgres system was to avoid swap use at all cost? > Would it be better for us to add more swap, or adjust this > overcommit_ratio as you discuss? I'd do both. But only after I'd reduced work_mem. Given that reducing work_mem removed the problem, it looks to me like pgsql is requesting several large blocks of ram, then only using a small port of them. But overcommit set to 2 means that the OS will not allow an overcommit of memory to these allocations, the allocations fail, and you get your error.
On Mon, Feb 9, 2009 at 1:32 PM, Matt Magoffin <postgresql.org@msqr.us> wrote: >> I suspect this may be it... Apparently, while you're only using about >> 2G, you've got 10G or so of outstanding commitments, and Linux is >> refusing to allocate more. >> >> You probably want to up your overcommit_ratio, esp. in light of the fact >> that you've only got 2G of swap on this box. I'd probably say up it to >> 80, which would give you 14.8G of commitable memory, leaving some room >> in-memory (1.2G) for cache/buffers and whatnot. Alternativly, you could >> go for 90, which would allow commits up to 16.4G, so if everyone used >> all their memory, you'd be into swap. > > Also, by adjusting this, would I possibly just be delaying the problem we > currently have (i.e. over time, we start to run out of memory)? I just > wonder why the system is reaching this limit at all... do you feel it is > quite normal for a system with this memory configuration to be configured > with the ratio set to 80? I'm not terribly familiar with these VM > parameters, so I apologize if I sound vague. I think that you're fixing a symptom, but ignoring the cause. Twiddling VM parameters may help out, but this problem of too much memory allocated is the real issue, so yeah, you're just putting off the inevitable. Test your individual queries with varying settings for work_mem. Find the weighted-averaged knee of diminishing returns and set work_mem closer to that. I'm willing to be it's way less than 128M. 2M to 16M are the numbers I come up with the most.
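One way to run the kind of work_mem sweep Scott describes, assuming psql access from a shell; the database name, the list of values, and report_query.sql (a file holding EXPLAIN ANALYZE of one representative query) are all placeholders:

# time one representative query at several work_mem settings; repeat per query of interest
for wm in 2MB 4MB 8MB 16MB 32MB 64MB 128MB; do
  echo -n "work_mem=$wm: "
  psql -d lms_kia -At -c "SET work_mem = '$wm'; $(cat report_query.sql)" | tail -1
done

With -At the last line printed for each run is the plan's "Total runtime" figure, which makes it easy to see where larger work_mem stops paying off.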
> with 100 concurrent postgres connections, if they all did something > requiring large amounts of work_mem, you could allocate 100 * 125MB (I > believe thats what you said it was set to?) which is like 12GB :-O > > in fact a single query thats doing multiple sorts of large datasets for > a messy join (or other similar activity) can involve several instances > of workmem. multiply that by 100 queries, and ouch. > > have you considered using a connection pool to reduce the postgres > process count? We do have a connection pool here, and as I mentioned about a quarter of these are Slony-controlled processes for replication. Most connections are not doing complex queries of this sort, in fact when this query runs it is the only query running, most connections are idle (kept open by a connection pool) or performing short insert/update operations. But I see your point about the large work_mem, it was set that high to help speed up big queries such as this one. -- m@
* Scott Marlowe (scott.marlowe@gmail.com) wrote: > I'd do both. But only after I'd reduced work_mem. Given that > reducing work_mem removed the problem, it looks to me like pgsql is > requesting several large blocks of ram, then only using a small port > of them. But overcommit set to 2 means that the OS will not allow an > overcommit of memory to these allocations, the allocations fail, and > you get your error. Reducing work_mem solved the problem by reducing the memory used, sure. In the end, however, we actually want to make the other 14G or so in the box useful for something, not just try to fit everything in under that 10G committed limit. PG allocates the memory it needs, it doesn't just allocate whatever work_mem is set to (in fact, I don't believe work_mem is ever directly used in allocations). I'm not against looking at lowering work_mem to something else, but you don't want to suddenly get shifted over to disk-based sorts or similar when your data set grows too big if there's a bunch of unused memory in the box. Thanks, Stephen
* Scott Marlowe (scott.marlowe@gmail.com) wrote: > I think that you're fixing a symptom, but ignoring the cause. > Twiddling VM parameters may help out, but this problem of too much > memory allocated is the real issue, so yeah, you're just putting off > the inevitable. I don't think changing work_mem down is actually going to reduce the memory allocated without changing the plan to something less optimal. In the end, all of this is putting off the inevitable, if you get enough PGs going and enough requests and whatnot, you're going to start running out of memory again. Same if you get larger data sets that take up more hash table space or similar. Eventually you might need a bigger box, but let's try to get everything in the current box to at least be used first.. Thanks, Stephen
> I'd do both. But only after I'd reduced work_mem. Given that > reducing work_mem removed the problem, it looks to me like pgsql is > requesting several large blocks of ram, then only using a small port > of them. But overcommit set to 2 means that the OS will not allow an > overcommit of memory to these allocations, the allocations fail, and > you get your error. OK, I've re-configured work_mem, and set vm.overcommit_ratio to 80. Without restarting Postgres, I was now able to run that big query posted earlier in this thread that failed... so the overcommit_ratio adjustment helped there. The EXPLAIN ANALYZE for that I've included below, where I can see it did use in-memory sorting for some of the sorts. QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Sort (cost=2297842.92..2297943.76 rows=40336 width=234) (actual time=370440.041..370483.133 rows=49317 loops=1) Sort Key: s.sale_date, s.vin Sort Method: quicksort Memory: 9357kB -> Hash Left Join (cost=1450365.72..2294757.26 rows=40336 width=234) (actual time=293212.004..369857.956 rows=49317 loops=1) Hash Cond: (ml.lead_id = phone.lead_id) -> Hash Left Join (cost=1341348.27..2140418.89 rows=40336 width=219) (actual time=286374.489..362880.702 rows=49317 loops=1) Hash Cond: (ml.lead_id = email.lead_id) -> Nested Loop Left Join (cost=1237270.73..2000634.35 rows=40336 width=204) (actual time=278652.051..355022.014 rows=49317 loops=1) -> Nested Loop Left Join (cost=1237270.73..1807035.53 rows=40336 width=141) (actual time=278635.414..323774.871 rows=49317 loops=1) -> Nested Loop Left Join (cost=1237270.73..1587513.22 rows=40336 width=126) (actual time=278632.427..317952.620 rows=49317 loops=1) -> Nested Loop Left Join (cost=1237270.73..1414103.19 rows=40336 width=118) (actual time=278606.034..297951.038 rows=49317 loops=1) -> Merge Left Join (cost=1237270.73..1238315.78 rows=40336 width=89) (actual time=278587.812..279498.796 rows=49317 loops=1) Merge Cond: (ml.lead_id = lrdm.lead_id) -> Sort (cost=46384.08..46484.92 rows=40336 width=78) (actual time=6200.810..6240.526 rows=49317 loops=1) Sort Key: ml.lead_id Sort Method: quicksort Memory: 8472kB -> Hash Left Join (cost=9430.33..43298.42 rows=40336 width=78) (actual time=1079.869..6084.010 rows=49317 loops=1) Hash Cond: ((s.dealer_code)::text = (d.dealer_code)::text) -> Hash Left Join (cost=9094.04..42407.50 rows=40336 width=60) (actual time=1074.170..5947.646 rows=49317 loops=1) Hash Cond: (s.id = ml.sale_id) Filter: (((s.sale_type = 'd'::bpchar) AND (NOT ml.lost_sale)) OR ((s.sale_type = 'c'::bpchar) AND (ml.lead_pos = 0)) OR (s.sale_type = '0'::bpchar)) -> Index Scan using mb_sale_sale_date_idx on mb_sale s (cost=0.00..14027.94 rows=43804 width=50) (actual time=55.663..4683.901 rows=49152 loops=1) Index Cond: ((sale_date >= '2009-01-01'::date) AND (sale_date <= '2009-01-31'::date)) -> Hash (cost=5577.35..5577.35 rows=281335 width=26) (actual time=1018.108..1018.108 rows=281335 loops=1) -> Seq Scan on mb_lead ml (cost=0.00..5577.35 rows=281335 width=26) (actual time=3.451..516.245 rows=281335 loops=1) -> Hash (cost=321.13..321.13 rows=1213 width=23) (actual time=5.577..5.577 rows=1213 loops=1) -> Seq Scan on dealer d (cost=0.00..321.13 rows=1213 width=23) (actual time=0.056..3.684 rows=1213 loops=1) -> Sort (cost=1190886.66..1191208.43 rows=128709 width=19) (actual time=270075.460..271851.519 rows=1442527 loops=1) 
Sort Key: lrdm.lead_id Sort Method: external sort Disk: 56072kB -> Bitmap Heap Scan on lead_reporting_meta lrdm (cost=118847.85..1179963.28 rows=128709 width=19) (actual time=103684.796..261544.708 rows=1462381 loops=1) Recheck Cond: (item_key = '[ALS:prospectid]TrackingCode'::text) Filter: (pos = 1) -> Bitmap Index Scan on lead_reporting_meta_item_key_idx (cost=0.00..118815.67 rows=1476580 width=0) (actual time=102982.150..102982.150 rows=1484068 loops=1) Index Cond: (item_key = '[ALS:prospectid]TrackingCode'::text) -> Index Scan using lead_reporting_address_lead_id_idx on lead_reporting_address address (cost=0.00..4.35 rows=1 width=37) (actual time=0.370..0.371 rows=0 loops=49317) Index Cond: (address.lead_id = ml.lead_id) -> Index Scan using lead_reporting_street_address_id_idx on lead_reporting_street address_street (cost=0.00..4.29 rows=1 width=24) (actual time=0.402..0.403 rows=0 loops=49317) Index Cond: (address_street.address_id = address.id) Filter: (address_street.pos = 0) -> Index Scan using lead_pkey on lead l (cost=0.00..5.43 rows=1 width=23) (actual time=0.114..0.115 rows=0 loops=49317) Index Cond: (l.id = ml.lead_id) -> Index Scan using lead_reporting_data_pkey on lead_reporting_data lrd (cost=0.00..4.79 rows=1 width=71) (actual time=0.630..0.630 rows=0 loops=49317) Index Cond: (lrd.lead_id = ml.lead_id) -> Hash (cost=85837.99..85837.99 rows=1459164 width=23) (actual time=7719.918..7719.918 rows=1522674 loops=1) -> Seq Scan on lead_reporting_list_data email (cost=0.00..85837.99 rows=1459164 width=23) (actual time=6.258..5105.843 rows=1522674 loops=1) Filter: ((list_type = 'e'::bpchar) AND (pos = 0)) -> Hash (cost=85837.99..85837.99 rows=1854357 width=23) (actual time=6834.882..6834.882 rows=1805273 loops=1) -> Seq Scan on lead_reporting_list_data phone (cost=0.00..85837.99 rows=1854357 width=23) (actual time=0.027..3917.300 rows=1805273 loops=1) Filter: ((list_type = 'p'::bpchar) AND (pos = 0)) Total runtime: 370595.083 ms (51 rows)
> I don't think changing work_mem down is actually going to reduce the > memory allocated without changing the plan to something less optimal. > In the end, all of this is putting off the inevitable, if you get enough > PGs going and enough requests and whatnot, you're going to start running > out of memory again. Same if you get larger data sets that take up more > hash table space or similar. Eventually you might need a bigger box, > but let's try to get everything in the current box to at least be used > first.. Yes... and indeed changing vm.overcommit_ratio to 80 does allow that previously-failing query to execute successfully. Do you think this is also what caused the out-of-memory error we saw today just when a transaction was initiated? Regards, Matt
* Matt Magoffin (postgresql.org@msqr.us) wrote: > Also, by adjusting this, would I possibly just be delaying the problem we > currently have (i.e. over time, we start to run out of memory)? I just > wonder why the system is reaching this limit at all... do you feel it is > quite normal for a system with this memory configuration to be configured > with the ratio set to 80? I'm not terribly familiar with these VM > parameters, so I apologize if I sound vague. In terms of 'normal', I tend to build my systems with equal swap and memory, and then leave the overcommit ratio at the default. Still, I think the best thing is to just keep tabs on how much memory has been committed, how much is actually being used, how much swap you're using, and what your commit limit is. You can then fine-tune things as needed. Stephen
* Matt Magoffin (postgresql.org@msqr.us) wrote:
> Yes... and indeed changing vm.overcommit_ratio to 80 does allow that
> previously-failing query to execute successfully. Do you think this is
> also what caused the out-of-memory error we saw today just when a
> transaction was initiated?

Almost certainly... You were running close to the edge *system-wide* on memory allocations based on your meminfo results. You'll want to keep an eye on the limit vs. the committed amount going forward so as to avoid this happening in the future. It's just about the same as watching how you're doing on overall memory usage in the box or on swap usage. Set up some nagios monitoring, and alarm once Committed_AS/CommitLimit goes over 90%.

Thanks,
Stephen
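A trivial version of that check, suitable for cron or for wrapping as a nagios plugin; the 90% threshold is just the suggested alarm point above:

#!/bin/sh
# compare outstanding commitments against the kernel's commit limit
limit=$(awk '/^CommitLimit:/ {print $2}' /proc/meminfo)
committed=$(awk '/^Committed_AS:/ {print $2}' /proc/meminfo)
pct=$((committed * 100 / limit))
echo "Committed_AS is ${pct}% of CommitLimit"
if [ "$pct" -ge 90 ]; then
    echo "WARNING: over 90% of the commit limit is in use"
    exit 1
fi
exit 0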
On Mon, Feb 9, 2009 at 2:01 PM, Matt Magoffin <postgresql.org@msqr.us> wrote: >> I don't think changing work_mem down is actually going to reduce the >> memory allocated without changing the plan to something less optimal. >> In the end, all of this is putting off the inevitable, if you get enough >> PGs going and enough requests and whatnot, you're going to start running >> out of memory again. Same if you get larger data sets that take up more >> hash table space or similar. Eventually you might need a bigger box, >> but let's try to get everything in the current box to at least be used >> first.. > > Yes... and indeed changing vm.overcommit_ratio to 80 does allow that > previously-failing query to execute successfully. Do you think this is > also what caused the out-of-memory error we saw today just when a > transaction was initiated? Curious, what's the explain analyze look like for that one?
On Mon, Feb 9, 2009 at 1:57 PM, Stephen Frost <sfrost@snowman.net> wrote: > * Scott Marlowe (scott.marlowe@gmail.com) wrote: >> I think that you're fixing a symptom, but ignoring the cause. >> Twiddling VM parameters may help out, but this problem of too much >> memory allocated is the real issue, so yeah, you're just putting off >> the inevitable. > > I don't think changing work_mem down is actually going to reduce the > memory allocated without changing the plan to something less optimal. > In the end, all of this is putting off the inevitable, if you get enough > PGs going and enough requests and whatnot, you're going to start running > out of memory again. Same if you get larger data sets that take up more > hash table space or similar. Eventually you might need a bigger box, > but let's try to get everything in the current box to at least be used > first.. That's not always true. It's not uncommon to have the same plan just spill to disk later. If the dataset is big enough that you're gonna need to spill to disk anyway, then the difference in performance is often pretty small. Again, it's why I was saying he needs to benchmark ALL his queries, by weight, and see what work_mem gives optimal results without catering to just one or two big queries. If needs be, set work_mem for the connection big and avoid the foot gun that is large global work_mem. Far better to waste a few gigs of memory than to start having midday failures under max load due to oom conditions.
>> Yes... and indeed changing vm.overcommit_ratio to 80 does allow that
>> previously-failing query to execute successfully. Do you think this is
>> also what caused the out-of-memory error we saw today just when a
>> transaction was initiated?
>
> Curious, what's the explain analyze look like for that one?

Do you mean the transaction initiation? I'm not sure how to get an EXPLAIN for that; the application never got to do anything. From the application side it failed with out-of-memory while trying to open the connection. The most precise information I have is that in the JDBC driver, it failed at:

Caused by: org.postgresql.util.PSQLException: FATAL: out of memory
    at org.postgresql.core.v3.ConnectionFactoryImpl.readStartupMessages(ConnectionFactoryImpl.java:444)
    at org.postgresql.core.v3.ConnectionFactoryImpl.openConnectionImpl(ConnectionFactoryImpl.java:99)
    at org.postgresql.core.ConnectionFactory.openConnection(ConnectionFactory.java:66)
    at org.postgresql.jdbc2.AbstractJdbc2Connection.<init>(AbstractJdbc2Connection.java:124)
    at org.postgresql.jdbc3.AbstractJdbc3Connection.<init>(AbstractJdbc3Connection.java:30)
    at org.postgresql.jdbc3g.Jdbc3gConnection.<init>(Jdbc3gConnection.java:24)
    at org.postgresql.Driver.makeConnection(Driver.java:386)
    at org.postgresql.Driver.connect(Driver.java:260)

And the corresponding error in the Postgres log is:

2009-02-09 13:52:13 CST lms_kia FATAL: out of memory
TopMemoryContext: 53952 total in 6 blocks; 7048 free (7 chunks); 46904 used
Relcache by OID: 8192 total in 1 blocks; 1680 free (0 chunks); 6512 used
CacheMemoryContext: 91024 total in 4 blocks; 3856 free (9 chunks); 87168 used
MdSmgr: 0 total in 0 blocks; 0 free (0 chunks); 0 used
LOCALLOCK hash: 8192 total in 1 blocks; 3744 free (0 chunks); 4448 used
Timezones: 53584 total in 2 blocks; 3744 free (0 chunks); 49840 used
ErrorContext: 8192 total in 1 blocks; 8160 free (4 chunks); 32 used
2009-02-09 13:52:29 CST lms_kia FATAL: out of memory
2009-02-09 13:52:29 CST lms_kia DETAIL: Failed on request of size 2456.
Stephen Frost <sfrost@snowman.net> writes: > * Matt Magoffin (postgresql.org@msqr.us) wrote: >> Yes... and indeed changing vm.overcommit_ratio to 80 does allow that >> previously-failing query to execute successfully. Do you think this is >> also what caused the out-of-memory error we saw today just when a >> transaction was initiated? > Almost certainly.. You were running close to the edge *system-wide* on > memory allocations based on your meminfo results. You'll want to keep > an eye on the limit vs. the committed amount going forward so as to > avoid this happening in the future. It's just about the same as > watching how you're doing on overall memory usage in the box or on swap > usage. Set up some nagios monitoring, and alarm once > Committed_As/CommitLimit goes over 90%. IIRC, our documentation points out vm.overcommit_memory as a critical parameter, but says nothing about overcommit_ratio nor how to find out how close you are to hitting that limit. Without necessarily being prescriptive about the best strategy, should we add some info to let people know about these things? This thread was mostly news to me, so I suspect it's news to some other folk as well. regards, tom lane
On Mon, Feb 9, 2009 at 2:40 PM, Matt Magoffin <postgresql.org@msqr.us> wrote:
>>> Yes... and indeed changing vm.overcommit_ratio to 80 does allow that
>>> previously-failing query to execute successfully. Do you think this is
>>> also what caused the out-of-memory error we saw today just when a
>>> transaction was initiated?
>>
>> Curious, what's the explain analyze look like for that one?
>
> Do you mean the transaction initiation? I'm not sure how to get an EXPLAIN
> for that, the application never got to do anything, from the application
> side it failed with out-of-memory while trying to open the connection. Or,
> the most precise I have is that in the JDBC driver, it failed at

No, explain analyze for the query that wouldn't execute before but now does, with, I assume, a large work_mem. I'd like to see how it differs from the one with smaller work_mem.
> No, explain analyze for the query that wouldn't execute before but now > does, with, I assume, a large work_mem. I'd like to see how it > differes from the one with smaller work_mem. Ah, I pasted that in an earlier email, sent February 10, 2009 9:58:00 AM GMT+13:00... that plan was the one using still the 128MB of work_mem after changing the overcommit_ratio to 80. Regards, Matt
On Mon, Feb 9, 2009 at 3:57 PM, Matt Magoffin <postgresql.org@msqr.us> wrote:
>> No, explain analyze for the query that wouldn't execute before but now
>> does, with, I assume, a large work_mem. I'd like to see how it
>> differs from the one with smaller work_mem.
>
> Ah, I pasted that in an earlier email, sent February 10, 2009 9:58:00 AM
> GMT+13:00... that plan was the one using still the 128MB of work_mem after
> changing the overcommit_ratio to 80.

Oops, my bad. I thought that was just explain. I'll go look.
On Mon, Feb 9, 2009 at 1:58 PM, Matt Magoffin <postgresql.org@msqr.us> wrote: I wonder if this is the problem, or part of it. This part of the explain analyze on down, there's 1.4M rows, when the planner seems to expect the number of rows to be chopped down quite a bit more when it goes from the bitmap index scan to the bitmap heap scan. I'm wondering if you're hitting one of those issues where pgsql thinks it can fit some operation into work_mem and it can't and the allocation fails. Just a guess, I'd run that by someone else before I took it as fact. > -> Sort > (cost=1190886.66..1191208.43 > rows=128709 width=19) (actual > time=270075.460..271851.519 > rows=1442527 loops=1) > Sort Key: lrdm.lead_id > Sort Method: external > sort Disk: 56072kB > -> Bitmap Heap Scan on > lead_reporting_meta > lrdm > (cost=118847.85..1179963.28 > rows=128709 width=19) > (actual > time=103684.796..261544.708 > rows=1462381 loops=1) > Recheck Cond: > (item_key = > '[ALS:prospectid]TrackingCode'::text) > Filter: (pos = 1) > -> Bitmap Index > Scan on > lead_reporting_meta_item_key_idx > > (cost=0.00..118815.67 > rows=1476580 > width=0) (actual > time=102982.150..102982.150 > rows=1484068 > loops=1) > Index Cond: > (item_key = > '[ALS:prospectid]TrackingCode'::text) > -> Index Scan using > lead_reporting_address_lead_id_idx > on lead_reporting_address address > (cost=0.00..4.35 rows=1 width=37) > (actual time=0.370..0.371 rows=0 > loops=49317) > Index Cond: (address.lead_id > = ml.lead_id) > -> Index Scan using > lead_reporting_street_address_id_idx on > lead_reporting_street address_street > (cost=0.00..4.29 rows=1 width=24) (actual > time=0.402..0.403 rows=0 loops=49317) > Index Cond: > (address_street.address_id = > address.id) > Filter: (address_street.pos = 0) > -> Index Scan using lead_pkey on lead l > (cost=0.00..5.43 rows=1 width=23) (actual > time=0.114..0.115 rows=0 loops=49317) > Index Cond: (l.id = ml.lead_id) > -> Index Scan using lead_reporting_data_pkey on > lead_reporting_data lrd (cost=0.00..4.79 rows=1 > width=71) (actual time=0.630..0.630 rows=0 > loops=49317) > Index Cond: (lrd.lead_id = ml.lead_id) > -> Hash (cost=85837.99..85837.99 rows=1459164 width=23) > (actual time=7719.918..7719.918 rows=1522674 loops=1) > -> Seq Scan on lead_reporting_list_data email > (cost=0.00..85837.99 rows=1459164 width=23) (actual > time=6.258..5105.843 rows=1522674 loops=1) > Filter: ((list_type = 'e'::bpchar) AND (pos = 0)) > -> Hash (cost=85837.99..85837.99 rows=1854357 width=23) (actual > time=6834.882..6834.882 rows=1805273 loops=1) > -> Seq Scan on lead_reporting_list_data phone > (cost=0.00..85837.99 rows=1854357 width=23) (actual > time=0.027..3917.300 rows=1805273 loops=1) > Filter: ((list_type = 'p'::bpchar) AND (pos = 0)) > Total runtime: 370595.083 ms > (51 rows)