Thread: Help with query plan inconsistencies
Hello, I am using postgres 7.4.2 as a backend for geocode data for a mapping application. My question is why can't I get a consistent use of my indexes during a query, I tend to get a lot of seq scan results. I use a standard query: SELECT lat, long, mac, status FROM ( SELECT text(mac) as mac, lat long, CASE status WHEN 0 THEN 0 WHEN 1 THEN 1 ELSE -1 END as status FROM cable_billing LEFT OUTER JOIN davic USING(mac) WHERE boxtype='d' )AS FOO WHERE (long>=X1) AND (long<=X2) AND (lat>=Y1) AND (lat<=Y2) Where X1,X2,Y1,Y2 are the coordinates for the rectangle of the map viewing area. QUERY PLAN #1 & #2 are from when I get a view from 10 miles out, sometimes it uses the index(#1) and most of the time not(#2). I do run into plans that seq scan both sides of the join. QUERY PLAN #3 is when I view from 5 miles out, and I have much greater chance of getting index scans ( about 90% of the time). I have listed information about the database below. Cable_billing ~500,000 rows updated once per day Davic ~500,000 rows, about 100 rows update per minute Any info or suggestions would be appreciated. Woody twc-ral-overview=# \d cable_billing; Table "public.cable_billing" Column | Type | Modifiers -----------------+------------------------+----------- cable_billingid | integer | not null mac | macaddr | not null account | integer | number | character varying(10) | address | character varying(200) | region | character varying(30) | division | integer | franchise | integer | node | character varying(10) | lat | numeric | long | numeric | trunk | character varying(5) | ps | character varying(5) | fd | character varying(5) | le | character varying(5) | update | integer | boxtype | character(1) | Indexes: cable_billing_pkey primary key btree (mac), cable_billing_account_index btree (account), cable_billing_lat_long_idx btree (lat, long), cable_billing_node_index btree (node), cable_billing_region_index btree (region) twc-ral-overview=# \d davic Table "public.davic" Column | Type | Modifiers ---------+-----------------------+----------- davicid | integer | not null mac | macaddr | not null source | character varying(20) | status | smallint | updtime | integer | type | character varying(10) | avail1 | integer | Indexes: davic_pkey primary key btree (mac) twc-ral-overview=# vacuum analyze; VACUUM twc-ral-overview=# explain analyze SELECT lat, long, mac, status FROM (SELECT text(mac) as mac, lat, long, CASE status WHEN 0 THEN 0 WHEN 1 THEN 1 ELSE -1 END as status FROM cable_billing LEFT OUTER JOIN davic USING(mac) WHERE boxtype='d') AS foo WHERE (long>=-78.70723462816063) AND (long<=-78.53096764204116) AND (lat>=35.57411187866667) AND (lat<=35.66366331376857); QUERY PLAN #1 ---------------------------------------------------------------------------- ---------------------------------------------------------------------------- ----- Nested Loop Left Join (cost=0.00..23433.18 rows=1871 width=34) (actual time=0.555..5095.434 rows=3224 loops=1) -> Index Scan using cable_billing_lat_long_idx on cable_billing (cost=0.00..12145.85 rows=1871 width=32) (actual time=0.431..249.931 rows=3224 loops=1) Index Cond: ((lat >= 35.57411187866667) AND (lat <= 35.66366331376857) AND (long >= -78.70723462816063) AND (long <= -78.53096764204116)) Filter: (boxtype = 'd'::bpchar) -> Index Scan using davic_pkey on davic (cost=0.00..6.01 rows=1 width=8) (actual time=1.476..1.480 rows=1 loops=3224) Index Cond: ("outer".mac = davic.mac) Total runtime: 5100.028 ms (7 rows) twc-ral-overview=# vacuum analyze; VACUUM twc-ral-overview=# explain analyze SELECT lat, long, mac, status FROM (SELECT text(mac) as mac, lat, long, CASE status WHEN 0 THEN 0 WHEN 1 THEN 1 ELSE -1 END as status FROM cable_billing LEFT OUTER JOIN davic USING(mac) WHERE boxtype='d') AS foo WHERE (long>=-78.87878592206046) AND (long<=-78.70220280717479) AND (lat>=35.71703190638861) AND (lat<=35.80658335998006); QUERY PLAN #2 ---------------------------------------------------------------------------- ---------------------------------------------------------------------------- ------------------- Nested Loop Left Join (cost=0.00..76468.90 rows=9223 width=34) (actual time=0.559..17387.427 rows=19997 loops=1) -> Seq Scan on cable_billing (cost=0.00..20837.76 rows=9223 width=32) (actual time=0.290..7117.799 rows=19997 loops=1) Filter: ((boxtype = 'd'::bpchar) AND (long >= -78.87878592206046) AND (long <= -78.70220280717479) AND (lat >= 35.71703190638861) AND (lat <= 35.80658335998006)) -> Index Scan using davic_pkey on davic (cost=0.00..6.01 rows=1 width=8) (actual time=0.455..0.461 rows=1 loops=19997) Index Cond: ("outer".mac = davic.mac) Total runtime: 17416.501 ms (6 rows) twc-ral-overview=# explain analyze SELECT lat, long, mac, status FROM (SELECT text(mac) as mac, lat, long, CASE status WHEN 0 THEN 0 WHEN 1 THEN 1 ELSE -1 END as status FROM cable_billing LEFT OUTER JOIN davic USING(mac) WHERE boxtype='d') AS foo WHERE (long>=-78.83419423836857) AND (long<=-78.7467945148866) AND (lat>=35.73964586635293) AND (lat<=35.783969313080604); QUERY PLAN #3 ---------------------------------------------------------------------------- ---------------------------------------------------------------------------- ----- Nested Loop Left Join (cost=0.00..29160.02 rows=2327 width=34) (actual time=0.279..510.773 rows=5935 loops=1) -> Index Scan using cable_billing_lat_long_idx on cable_billing (cost=0.00..15130.08 rows=2326 width=32) (actual time=0.197..274.115 rows=5935 loops=1) Index Cond: ((lat >= 35.73964586635293) AND (lat <= 35.783969313080604) AND (long >= -78.83419423836857) AND (long <= -78.7467945148866)) Filter: (boxtype = 'd'::bpchar) -> Index Scan using davic_pkey on davic (cost=0.00..6.01 rows=1 width=8) (actual time=0.021..0.024 rows=1 loops=5935) Index Cond: ("outer".mac = davic.mac) Total runtime: 516.782 ms (7 rows) ---------------------------------------------------------------------------- ----------------------------------- iglass Networks 211-A S. Salem St. (919) 387-3550 x813 P.O. Box 651 (919) 387-3570 fax Apex, NC 27502 http://www.iglass.net
I'm going to ask because someone else surely will: Do you regularily vacuum/analyze the database? Woody Woodring wrote: > Hello, > > I am using postgres 7.4.2 as a backend for geocode data for a mapping > application. My question is why can't I get a consistent use of my indexes > during a query, I tend to get a lot of seq scan results. > > I use a standard query: > > SELECT lat, long, mac, status FROM ( > SELECT text(mac) as mac, lat long, CASE status WHEN 0 THEN 0 WHEN 1 THEN > 1 ELSE -1 END > as status FROM cable_billing LEFT OUTER JOIN davic USING(mac) WHERE > boxtype='d' > )AS FOO WHERE (long>=X1) AND (long<=X2) AND (lat>=Y1) AND (lat<=Y2) > > Where X1,X2,Y1,Y2 are the coordinates for the rectangle of the map viewing > area. > > QUERY PLAN #1 & #2 are from when I get a view from 10 miles out, sometimes > it uses the index(#1) and most of the time not(#2). I do run into plans > that seq scan both sides of the join. > > QUERY PLAN #3 is when I view from 5 miles out, and I have much greater > chance of getting index scans ( about 90% of the time). > > I have listed information about the database below. > > Cable_billing ~500,000 rows updated once per day > Davic ~500,000 rows, about 100 rows update per minute > > Any info or suggestions would be appreciated. > > Woody > > > twc-ral-overview=# \d cable_billing; > Table "public.cable_billing" > Column | Type | Modifiers > -----------------+------------------------+----------- > cable_billingid | integer | not null > mac | macaddr | not null > account | integer | > number | character varying(10) | > address | character varying(200) | > region | character varying(30) | > division | integer | > franchise | integer | > node | character varying(10) | > lat | numeric | > long | numeric | > trunk | character varying(5) | > ps | character varying(5) | > fd | character varying(5) | > le | character varying(5) | > update | integer | > boxtype | character(1) | > Indexes: cable_billing_pkey primary key btree (mac), > cable_billing_account_index btree (account), > cable_billing_lat_long_idx btree (lat, long), > cable_billing_node_index btree (node), > cable_billing_region_index btree (region) > > twc-ral-overview=# \d davic > Table "public.davic" > Column | Type | Modifiers > ---------+-----------------------+----------- > davicid | integer | not null > mac | macaddr | not null > source | character varying(20) | > status | smallint | > updtime | integer | > type | character varying(10) | > avail1 | integer | > Indexes: davic_pkey primary key btree (mac) > > > > twc-ral-overview=# vacuum analyze; > VACUUM > twc-ral-overview=# explain analyze SELECT lat, long, mac, status FROM > (SELECT text(mac) as mac, lat, long, CASE status WHEN 0 THEN 0 WHEN 1 THEN 1 > ELSE -1 END as status FROM cable_billing LEFT OUTER JOIN davic USING(mac) > WHERE boxtype='d') AS foo WHERE (long>=-78.70723462816063) AND > (long<=-78.53096764204116) AND (lat>=35.57411187866667) AND > (lat<=35.66366331376857); > QUERY PLAN #1 > > ---------------------------------------------------------------------------- > ---------------------------------------------------------------------------- > ----- > Nested Loop Left Join (cost=0.00..23433.18 rows=1871 width=34) (actual > time=0.555..5095.434 rows=3224 loops=1) > -> Index Scan using cable_billing_lat_long_idx on cable_billing > (cost=0.00..12145.85 rows=1871 width=32) (actual time=0.431..249.931 > rows=3224 loops=1) > Index Cond: ((lat >= 35.57411187866667) AND (lat <= > 35.66366331376857) AND (long >= -78.70723462816063) AND (long <= > -78.53096764204116)) > Filter: (boxtype = 'd'::bpchar) > -> Index Scan using davic_pkey on davic (cost=0.00..6.01 rows=1 > width=8) (actual time=1.476..1.480 rows=1 loops=3224) > Index Cond: ("outer".mac = davic.mac) > Total runtime: 5100.028 ms > (7 rows) > > > > twc-ral-overview=# vacuum analyze; > VACUUM > twc-ral-overview=# explain analyze SELECT lat, long, mac, status FROM > (SELECT text(mac) as mac, lat, long, CASE status WHEN 0 THEN 0 WHEN 1 THEN 1 > ELSE -1 END as status FROM cable_billing LEFT OUTER JOIN davic USING(mac) > WHERE boxtype='d') AS foo WHERE (long>=-78.87878592206046) AND > (long<=-78.70220280717479) AND (lat>=35.71703190638861) AND > (lat<=35.80658335998006); > QUERY PLAN #2 > > ---------------------------------------------------------------------------- > ---------------------------------------------------------------------------- > ------------------- > Nested Loop Left Join (cost=0.00..76468.90 rows=9223 width=34) (actual > time=0.559..17387.427 rows=19997 loops=1) > -> Seq Scan on cable_billing (cost=0.00..20837.76 rows=9223 width=32) > (actual time=0.290..7117.799 rows=19997 loops=1) > Filter: ((boxtype = 'd'::bpchar) AND (long >= -78.87878592206046) > AND (long <= -78.70220280717479) AND (lat >= 35.71703190638861) AND (lat <= > 35.80658335998006)) > -> Index Scan using davic_pkey on davic (cost=0.00..6.01 rows=1 > width=8) (actual time=0.455..0.461 rows=1 loops=19997) > Index Cond: ("outer".mac = davic.mac) > Total runtime: 17416.501 ms > (6 rows) > > > > twc-ral-overview=# explain analyze SELECT lat, long, mac, status FROM > (SELECT text(mac) as mac, lat, long, CASE status WHEN 0 THEN 0 WHEN 1 THEN 1 > ELSE -1 END as status FROM cable_billing LEFT OUTER JOIN davic USING(mac) > WHERE boxtype='d') AS foo WHERE (long>=-78.83419423836857) AND > (long<=-78.7467945148866) AND (lat>=35.73964586635293) AND > (lat<=35.783969313080604); > QUERY PLAN #3 > > ---------------------------------------------------------------------------- > ---------------------------------------------------------------------------- > ----- > Nested Loop Left Join (cost=0.00..29160.02 rows=2327 width=34) (actual > time=0.279..510.773 rows=5935 loops=1) > -> Index Scan using cable_billing_lat_long_idx on cable_billing > (cost=0.00..15130.08 rows=2326 width=32) (actual time=0.197..274.115 > rows=5935 loops=1) > Index Cond: ((lat >= 35.73964586635293) AND (lat <= > 35.783969313080604) AND (long >= -78.83419423836857) AND (long <= > -78.7467945148866)) > Filter: (boxtype = 'd'::bpchar) > -> Index Scan using davic_pkey on davic (cost=0.00..6.01 rows=1 > width=8) (actual time=0.021..0.024 rows=1 loops=5935) > Index Cond: ("outer".mac = davic.mac) > Total runtime: 516.782 ms > (7 rows) > > ---------------------------------------------------------------------------- > ----------------------------------- > > iglass Networks > 211-A S. Salem St. > (919) 387-3550 x813 > P.O. Box 651 > (919) 387-3570 fax > Apex, NC 27502 > http://www.iglass.net > > > ---------------------------(end of broadcast)--------------------------- > TIP 5: Have you checked our extensive FAQ? > > http://www.postgresql.org/docs/faqs/FAQ.html >
I currently have it set up to vacuum/analyze every 2 hours. However my QUERY PLAN #1 & #2 in my example I ran my explain immediately after a vacuum/analyze. -----Original Message----- From: pgsql-performance-owner@postgresql.org [mailto:pgsql-performance-owner@postgresql.org] On Behalf Of Joseph Shraibman Sent: Tuesday, March 23, 2004 2:17 PM To: pgsql-performance@postgresql.org Subject: Re: [PERFORM] Help with query plan inconsistencies I'm going to ask because someone else surely will: Do you regularily vacuum/analyze the database? Woody Woodring wrote: > Hello, > > I am using postgres 7.4.2 as a backend for geocode data for a mapping > application. My question is why can't I get a consistent use of my > indexes during a query, I tend to get a lot of seq scan results. > > I use a standard query: > > SELECT lat, long, mac, status FROM ( > SELECT text(mac) as mac, lat long, CASE status WHEN 0 THEN 0 WHEN 1 > THEN 1 ELSE -1 END > as status FROM cable_billing LEFT OUTER JOIN davic USING(mac) > WHERE boxtype='d' )AS FOO WHERE (long>=X1) AND (long<=X2) AND > (lat>=Y1) AND (lat<=Y2) > > Where X1,X2,Y1,Y2 are the coordinates for the rectangle of the map > viewing area. > > QUERY PLAN #1 & #2 are from when I get a view from 10 miles out, > sometimes it uses the index(#1) and most of the time not(#2). I do > run into plans that seq scan both sides of the join. > > QUERY PLAN #3 is when I view from 5 miles out, and I have much greater > chance of getting index scans ( about 90% of the time). > > I have listed information about the database below. > > Cable_billing ~500,000 rows updated once per day > Davic ~500,000 rows, about 100 rows update per minute > > Any info or suggestions would be appreciated. > > Woody > > > twc-ral-overview=# \d cable_billing; > Table "public.cable_billing" > Column | Type | Modifiers > -----------------+------------------------+----------- > cable_billingid | integer | not null > mac | macaddr | not null > account | integer | > number | character varying(10) | > address | character varying(200) | > region | character varying(30) | > division | integer | > franchise | integer | > node | character varying(10) | > lat | numeric | > long | numeric | > trunk | character varying(5) | > ps | character varying(5) | > fd | character varying(5) | > le | character varying(5) | > update | integer | > boxtype | character(1) | > Indexes: cable_billing_pkey primary key btree (mac), > cable_billing_account_index btree (account), > cable_billing_lat_long_idx btree (lat, long), > cable_billing_node_index btree (node), > cable_billing_region_index btree (region) > > twc-ral-overview=# \d davic > Table "public.davic" > Column | Type | Modifiers > ---------+-----------------------+----------- > davicid | integer | not null > mac | macaddr | not null > source | character varying(20) | > status | smallint | > updtime | integer | > type | character varying(10) | > avail1 | integer | > Indexes: davic_pkey primary key btree (mac) > > > > twc-ral-overview=# vacuum analyze; > VACUUM > twc-ral-overview=# explain analyze SELECT lat, long, mac, status FROM > (SELECT text(mac) as mac, lat, long, CASE status WHEN 0 THEN 0 WHEN 1 > THEN 1 ELSE -1 END as status FROM cable_billing LEFT OUTER JOIN davic > USING(mac) WHERE boxtype='d') AS foo WHERE (long>=-78.70723462816063) > AND > (long<=-78.53096764204116) AND (lat>=35.57411187866667) AND > (lat<=35.66366331376857); > QUERY PLAN #1 > > ---------------------------------------------------------------------- > ------ > ---------------------------------------------------------------------------- > ----- > Nested Loop Left Join (cost=0.00..23433.18 rows=1871 width=34) (actual > time=0.555..5095.434 rows=3224 loops=1) > -> Index Scan using cable_billing_lat_long_idx on cable_billing > (cost=0.00..12145.85 rows=1871 width=32) (actual time=0.431..249.931 > rows=3224 loops=1) > Index Cond: ((lat >= 35.57411187866667) AND (lat <= > 35.66366331376857) AND (long >= -78.70723462816063) AND (long <= > -78.53096764204116)) > Filter: (boxtype = 'd'::bpchar) > -> Index Scan using davic_pkey on davic (cost=0.00..6.01 rows=1 > width=8) (actual time=1.476..1.480 rows=1 loops=3224) > Index Cond: ("outer".mac = davic.mac) > Total runtime: 5100.028 ms > (7 rows) > > > > twc-ral-overview=# vacuum analyze; > VACUUM > twc-ral-overview=# explain analyze SELECT lat, long, mac, status FROM > (SELECT text(mac) as mac, lat, long, CASE status WHEN 0 THEN 0 WHEN 1 > THEN 1 ELSE -1 END as status FROM cable_billing LEFT OUTER JOIN davic > USING(mac) WHERE boxtype='d') AS foo WHERE (long>=-78.87878592206046) > AND > (long<=-78.70220280717479) AND (lat>=35.71703190638861) AND > (lat<=35.80658335998006); > QUERY PLAN #2 > > ---------------------------------------------------------------------- > ------ > ---------------------------------------------------------------------------- > ------------------- > Nested Loop Left Join (cost=0.00..76468.90 rows=9223 width=34) (actual > time=0.559..17387.427 rows=19997 loops=1) > -> Seq Scan on cable_billing (cost=0.00..20837.76 rows=9223 width=32) > (actual time=0.290..7117.799 rows=19997 loops=1) > Filter: ((boxtype = 'd'::bpchar) AND (long >= -78.87878592206046) > AND (long <= -78.70220280717479) AND (lat >= 35.71703190638861) AND (lat <= > 35.80658335998006)) > -> Index Scan using davic_pkey on davic (cost=0.00..6.01 rows=1 > width=8) (actual time=0.455..0.461 rows=1 loops=19997) > Index Cond: ("outer".mac = davic.mac) > Total runtime: 17416.501 ms > (6 rows) > > > > twc-ral-overview=# explain analyze SELECT lat, long, mac, status FROM > (SELECT text(mac) as mac, lat, long, CASE status WHEN 0 THEN 0 WHEN 1 > THEN 1 ELSE -1 END as status FROM cable_billing LEFT OUTER JOIN davic > USING(mac) WHERE boxtype='d') AS foo WHERE (long>=-78.83419423836857) > AND > (long<=-78.7467945148866) AND (lat>=35.73964586635293) AND > (lat<=35.783969313080604); > QUERY PLAN #3 > > ---------------------------------------------------------------------- > ------ > ---------------------------------------------------------------------------- > ----- > Nested Loop Left Join (cost=0.00..29160.02 rows=2327 width=34) (actual > time=0.279..510.773 rows=5935 loops=1) > -> Index Scan using cable_billing_lat_long_idx on cable_billing > (cost=0.00..15130.08 rows=2326 width=32) (actual time=0.197..274.115 > rows=5935 loops=1) > Index Cond: ((lat >= 35.73964586635293) AND (lat <= > 35.783969313080604) AND (long >= -78.83419423836857) AND (long <= > -78.7467945148866)) > Filter: (boxtype = 'd'::bpchar) > -> Index Scan using davic_pkey on davic (cost=0.00..6.01 rows=1 > width=8) (actual time=0.021..0.024 rows=1 loops=5935) > Index Cond: ("outer".mac = davic.mac) > Total runtime: 516.782 ms > (7 rows) > > ---------------------------------------------------------------------- > ------ > ----------------------------------- > > iglass Networks > 211-A S. Salem St. > (919) 387-3550 x813 > P.O. Box 651 > (919) 387-3570 fax > Apex, NC 27502 > http://www.iglass.net > > > ---------------------------(end of > broadcast)--------------------------- > TIP 5: Have you checked our extensive FAQ? > > http://www.postgresql.org/docs/faqs/FAQ.html > ---------------------------(end of broadcast)--------------------------- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to majordomo@postgresql.org so that your message can get through to the mailing list cleanly
On Tuesday 23 March 2004 18:49, Woody Woodring wrote: > Hello, > > I am using postgres 7.4.2 as a backend for geocode data for a mapping > application. My question is why can't I get a consistent use of my indexes > during a query, I tend to get a lot of seq scan results. I'm not sure it wants to be using the indexes all of the time. > Nested Loop Left Join (cost=0.00..23433.18 rows=1871 width=34) (actual > time=0.555..5095.434 rows=3224 loops=1) > Total runtime: 5100.028 ms > Nested Loop Left Join (cost=0.00..76468.90 rows=9223 width=34) (actual > time=0.559..17387.427 rows=19997 loops=1) > Total runtime: 17416.501 ms > Nested Loop Left Join (cost=0.00..29160.02 rows=2327 width=34) (actual > time=0.279..510.773 rows=5935 loops=1) > Total runtime: 516.782 ms #1 = 630 rows/sec (with index on cable_billing) #2 = 1,148 rows/sec (without index) #3 = 11,501 rows/sec (with index) The third case is so much faster, I suspect the data wasn't cached at the beginning of this run. In any case #2 is faster than #1. If the planner is getting things wrong, you're not showing it here. -- Richard Huxton Archonet Ltd