Re: Out of memory on SELECT in 8.3.5 - Mailing list pgsql-general
From | Matt Magoffin |
---|---|
Subject | Re: Out of memory on SELECT in 8.3.5 |
Date | |
Msg-id | 51355.192.168.1.106.1234157867.squirrel@msqr.us Whole thread Raw |
In response to | Re: Out of memory on SELECT in 8.3.5 (Stephen Frost <sfrost@snowman.net>) |
Responses |
Re: Out of memory on SELECT in 8.3.5
|
List | pgsql-general |
>> PostgreSQL 8.3.5 on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) >> 4.1.2 20070626 (Red Hat 4.1.2-14) > > Does the result from 'free' look reasonable on this box? I think so: total used free shared buffers cached Mem: 16432296 16273964 158332 0 173536 14321340 -/+ buffers/cache: 1779088 14653208 Swap: 2096440 560 2095880 > How many PG processes are running, generally? I see about 90 currently, of which 24 are Slony connections and the rest are client connections. > Do you see the PG process running this query growing to fill the available memory? That query really > didn't look like it was chewing up all that much memory to me.. Just running top, it does appear to chew through a fair amount of memory. Here's a snapshot from top of the postgres processing running this query from just before it ran out of memory: PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 4486 postgres 18 0 4576m 3.6g 3.3g R 90 23.1 0:34.23 postgres: postgres lms_nna [local] EXPLAIN These values did start out low, for example the RES memory started in the 130MB range, then climbed to the 3.6GB you see here. > Have > you tried reducing your work_mem setting to see if that makes the errors > go away? It might cause a different plan which might be less efficient > for some queries, not sure if you have anything which would be > dramatically affected by that..Do the row counts in the plan match up > with what you'd expect? Can you provide output from 'explain analyze' > on a successful run? I set the work_mem to 2MB, and the query does actually complete (explain analyze output below), so does this mean that the query simply uses too much memory for sorting/joining, and we'd have to either allocate enough work_mem to allow the query to complete, or a smaller work_mem as shown here to make the query use slower disk-based sorting? The row counts are matching what we'd expect from this query. QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ Sort (cost=2345301.47..2345402.31 rows=40336 width=234) (actual time=203429.526..203599.840 rows=49317 loops=1) Sort Key: s.sale_date, s.vin Sort Method: external merge Disk: 4776kB -> Hash Left Join (cost=1469244.28..2337665.81 rows=40336 width=234) (actual time=162362.791..202883.728 rows=49317 loops=1) Hash Cond: (ml.lead_id = phone.lead_id) -> Hash Left Join (cost=1349360.82..2159151.44 rows=40336 width=219) (actual time=155499.666..194941.423 rows=49317 loops=1) Hash Cond: (ml.lead_id = email.lead_id) -> Nested Loop Left Join (cost=1236733.28..1999980.90 rows=40336 width=204) (actual time=141890.479..186344.717 rows=49317 loops=1) -> Nested Loop Left Join (cost=1236733.28..1806406.02 rows=40336 width=141) (actual time=141890.419..166782.070 rows=49317 loops=1) -> Nested Loop Left Join (cost=1236733.28..1586899.03 rows=40336 width=126) (actual time=141890.368..166124.253 rows=49317 loops=1) -> Nested Loop Left Join (cost=1236733.28..1413543.83 rows=40336 width=118) (actual time=141890.281..156284.612 rows=49317 loops=1) -> Merge Left Join (cost=1236733.28..1237778.33 rows=40336 width=89) (actual time=141890.184..143717.900 rows=49317 loops=1) Merge Cond: (ml.lead_id = lrdm.lead_id) -> Sort (cost=43356.21..43457.05 rows=40336 width=78) (actual time=1722.385..1794.668 rows=49317 loops=1) Sort Key: ml.lead_id Sort Method: external merge Disk: 4152kB -> Hash Left Join (cost=11354.33..38475.05 rows=40336 width=78) (actual time=919.319..1578.556 rows=49317 loops=1) Hash Cond: ((s.dealer_code)::text = (d.dealer_code)::text) -> Hash Left Join (cost=11018.04..37584.13 rows=40336 width=60) (actual time=914.936..1445.926 rows=49317 loops=1) Hash Cond: (s.id = ml.sale_id) Filter: (((s.sale_type = 'd'::bpchar) AND (NOT ml.lost_sale)) OR ((s.sale_type = 'c'::bpchar) AND (ml.lead_pos = 0)) OR (s.sale_type = '0'::bpchar)) -> Index Scan using mb_sale_sale_date_idx on mb_sale s (cost=0.00..14027.94 rows=43804 width=50) (actual time=0.149..126.835 rows=49152 loops=1) Index Cond: ((sale_date >= '2009-01-01'::date) AND (sale_date <= '2009-01-31'::date)) -> Hash (cost=5577.35..5577.35 rows=281335 width=26) (actual time=914.089..914.089 rows=281335 loops=1) -> Seq Scan on mb_lead ml (cost=0.00..5577.35 rows=281335 width=26) (actual time=0.026..394.384 rows=281335 loops=1) -> Hash (cost=321.13..321.13 rows=1213 width=23) (actual time=4.343..4.343 rows=1213 loops=1) -> Seq Scan on dealer d (cost=0.00..321.13 rows=1213 width=23) (actual time=0.013..2.451 rows=1213 loops=1) -> Materialize (cost=1193377.08..1194985.94 rows=128709 width=19) (actual time=135323.691..140326.709 rows=1442527 loops=1) -> Sort (cost=1193377.08..1193698.85 rows=128709 width=19) (actual time=135323.681..137742.900 rows=1442488 loops=1) Sort Key: lrdm.lead_id Sort Method: external merge Disk: 50312kB -> Bitmap Heap Scan on lead_reporting_meta lrdm (cost=118695.77..1179811.20 rows=128709 width=19) (actual time=90238.108..126737.875 rows=1460904 loops=1) Recheck Cond: (item_key = '[ALS:prospectid]TrackingCode'::text) Filter: (pos = 1) -> Bitmap Index Scan on lead_reporting_meta_item_key_idx (cost=0.00..118663.60 rows=1476580 width=0) (actual time=90223.734..90223.734 rows=1476483 loops=1) Index Cond: (item_key = '[ALS:prospectid]TrackingCode'::text) -> Index Scan using lead_reporting_address_lead_id_idx on lead_reporting_address address (cost=0.00..4.35 rows=1 width=37) (actual time=0.251..0.252 rows=0 loops=49317) Index Cond: (address.lead_id = ml.lead_id) -> Index Scan using lead_reporting_street_address_id_idx on lead_reporting_street address_street (cost=0.00..4.29 rows=1 width=24) (actual time=0.196..0.197 rows=0 loops=49317) Index Cond: (address_street.address_id = address.id) Filter: (address_street.pos = 0) -> Index Scan using lead_pkey on lead l (cost=0.00..5.43 rows=1 width=23) (actual time=0.010..0.010 rows=0 loops=49317) Index Cond: (l.id = ml.lead_id) -> Index Scan using lead_reporting_data_pkey on lead_reporting_data lrd (cost=0.00..4.79 rows=1 width=71) (actual time=0.393..0.393 rows=0 loops=49317) Index Cond: (lrd.lead_id = ml.lead_id) -> Hash (cost=85837.99..85837.99 rows=1459164 width=23) (actual time=7607.947..7607.947 rows=1521294 loops=1) -> Seq Scan on lead_reporting_list_data email (cost=0.00..85837.99 rows=1459164 width=23) (actual time=0.030..5063.069 rows=1521294 loops=1) Filter: ((list_type = 'e'::bpchar) AND (pos = 0)) -> Hash (cost=85837.99..85837.99 rows=1854357 width=23) (actual time=6863.050..6863.050 rows=1803678 loops=1) -> Seq Scan on lead_reporting_list_data phone (cost=0.00..85837.99 rows=1854357 width=23) (actual time=0.020..3920.096 rows=1803678 loops=1) Filter: ((list_type = 'p'::bpchar) AND (pos = 0)) Total runtime: 203657.960 ms (52 rows)
pgsql-general by date: