Thread: Nested loop in simple query taking long time
Hello list,

Usually I can see what is wrong with queries, but I can't figure out why
this query is slow. Below are the query and the EXPLAIN ANALYZE output.
Any help would be appreciated.

EXPLAIN ANALYZE
SELECT computer_name
  FROM tbl_computer
 INNER JOIN tbl_share ON pk_computer_id = tbl_share.fk_computer_id
 INNER JOIN tbl_archive ON pk_share_id = tbl_archive.fk_share_id
  LEFT OUTER JOIN tbl_job ON fk_job_id = pk_job_id
  LEFT OUTER JOIN tbl_job_group ON fk_job_group_id = pk_job_group_id
 WHERE archive_complete IS TRUE
   AND (job_group_type != 'R' OR job_group_type IS NULL)
 GROUP BY computer_name
 ORDER BY computer_name;

Group  (cost=1171.27..19099.89 rows=16 width=11) (actual time=4184.019..40929.159 rows=14 loops=1)
  ->  Nested Loop Left Join  (cost=1171.27..19099.61 rows=112 width=11) (actual time=4184.018..40928.994 rows=192 loops=1)
        Join Filter: (tbl_archive.fk_job_id = tbl_job.pk_job_id)
        Filter: ((tbl_job_group.job_group_type <> 'R'::bpchar) OR (tbl_job_group.job_group_type IS NULL))
        ->  Nested Loop  (cost=1151.65..18960.06 rows=123 width=19) (actual time=4080.070..40821.217 rows=192 loops=1)
              ->  Nested Loop  (cost=1.18..43.19 rows=16 width=19) (actual time=59.167..91.739 rows=16 loops=1)
                    Join Filter: (tbl_computer.pk_computer_id = tbl_share.fk_computer_id)
                    ->  Index Scan using computer_name on tbl_computer  (cost=0.00..36.25 rows=16 width=19) (actual time=48.844..81.210 rows=16 loops=1)
                    ->  Materialize  (cost=1.18..1.34 rows=16 width=16) (actual time=0.645..0.650 rows=16 loops=16)
                          ->  Seq Scan on tbl_share  (cost=0.00..1.16 rows=16 width=16) (actual time=10.312..10.318 rows=16 loops=1)
              ->  Bitmap Heap Scan on tbl_archive  (cost=1150.47..1182.20 rows=8 width=16) (actual time=1564.082..2545.570 rows=12 loops=16)
                    Recheck Cond: (tbl_share.pk_share_id = tbl_archive.fk_share_id)
                    Filter: (archive_complete IS TRUE)
                    ->  Bitmap Index Scan on tbl_archive_idx1  (cost=0.00..1150.47 rows=8 width=0) (actual time=1505.456..1505.456 rows=86053 loops=16)
                          Index Cond: (tbl_share.pk_share_id = tbl_archive.fk_share_id)
        ->  Materialize  (cost=19.62..20.01 rows=39 width=13) (actual time=0.385..0.549 rows=39 loops=192)
              ->  Hash Left Join  (cost=11.65..19.58 rows=39 width=13) (actual time=73.747..103.889 rows=39 loops=1)
                    Hash Cond: (tbl_job.fk_job_group_id = tbl_job_group.pk_job_group_id)
                    ->  Seq Scan on tbl_job  (cost=0.00..7.39 rows=39 width=16) (actual time=54.284..84.396 rows=39 loops=1)
                    ->  Hash  (cost=11.29..11.29 rows=29 width=13) (actual time=0.119..0.119 rows=29 loops=1)
                          ->  Seq Scan on tbl_job_group  (cost=0.00..11.29 rows=29 width=13) (actual time=0.013..0.087 rows=29 loops=1)
Total runtime: 40929.241 ms

Cheers,
Henke
Henrik Zagerholm <henke@mac.se> writes:
> Usually I can see what is wrong with queries, but I can't figure out
> why this query is slow.

Seems the main problem is here:

>   ->  Bitmap Index Scan on tbl_archive_idx1  (cost=0.00..1150.47 rows=8
>       width=0) (actual time=1505.456..1505.456 rows=86053 loops=16)
>         Index Cond: (tbl_share.pk_share_id = tbl_archive.fk_share_id)

Why is this scan finding so many more rows than the planner expects?
Perhaps increasing the stats targets on these columns would help it make
a better estimate.

			regards, tom lane
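For reference, a minimal sketch of that suggestion (the columns follow the
index condition in the plan above; the target of 100 is only an example,
the per-column default at the time being 10):

    ALTER TABLE tbl_archive ALTER COLUMN fk_share_id SET STATISTICS 100;
    ALTER TABLE tbl_share   ALTER COLUMN pk_share_id SET STATISTICS 100;
    ANALYZE tbl_archive;  -- re-gather stats so the new target takes effect
    ANALYZE tbl_share;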
On 5 Dec 2007, at 16:25, Tom Lane wrote:

> Henrik Zagerholm <henke@mac.se> writes:
>> Usually I can see what is wrong with queries, but I can't figure out
>> why this query is slow.
>
> Seems the main problem is here:
>
>>   ->  Bitmap Index Scan on tbl_archive_idx1  (cost=0.00..1150.47 rows=8
>>       width=0) (actual time=1505.456..1505.456 rows=86053 loops=16)
>>         Index Cond: (tbl_share.pk_share_id = tbl_archive.fk_share_id)
>
> Why is this scan finding so many more rows than the planner expects?
> Perhaps increasing the stats targets on these columns would help it
> make a better estimate.

This is really weird. That table's primary key sequence is at 1220 and
the number of rows right now is 139. There have never been that many
rows in tbl_archive. Could the index or stats be really, really corrupt?

Thanks,
Henke
Henrik Zagerholm <henke@mac.se> writes:
> On 5 Dec 2007, at 16:25, Tom Lane wrote:
>> Henrik Zagerholm <henke@mac.se> writes:
>>>   ->  Bitmap Index Scan on tbl_archive_idx1  (cost=0.00..1150.47 rows=8
>>>       width=0) (actual time=1505.456..1505.456 rows=86053 loops=16)
>>>         Index Cond: (tbl_share.pk_share_id = tbl_archive.fk_share_id)
>> Why is this scan finding so many more rows than the planner expects?

> This is really weird. That table's primary key sequence is at 1220 and
> the number of rows right now is 139. There have never been that many
> rows in tbl_archive. Could the index or stats be really, really corrupt?

I wonder how long it's been since you vacuumed that table? The rowcount
from the bitmap indexscan would include tuple IDs that are in the index
but prove to be dead upon arrival at the heap.

			regards, tom lane
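One quick way to check for that from psql is a verbose vacuum, whose
output reports how many dead ("removable") row versions it found in the
table and its indexes:

    VACUUM VERBOSE tbl_archive;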
On 6 Dec 2007, at 18:12, Tom Lane wrote:

> Henrik Zagerholm <henke@mac.se> writes:
>> This is really weird. That table's primary key sequence is at 1220 and
>> the number of rows right now is 139. There have never been that many
>> rows in tbl_archive. Could the index or stats be really, really corrupt?
>
> I wonder how long it's been since you vacuumed that table? The rowcount
> from the bitmap indexscan would include tuple IDs that are in the index
> but prove to be dead upon arrival at the heap.

I actually have autovacuum on for the whole database, but I may have been
a little too restrictive with the autovacuum settings.

I did a VACUUM FULL and a REINDEX and now it's fine again.

I think I have a clue why it's so off. We update a value in that table
about 2-3 million times per night, and as each UPDATE creates a new row
version it becomes bloated pretty fast. The table had a size of 765 MB
including indexes, and after VACUUM FULL and REINDEX it went down to
80 kB... I guess I need a routine REINDEX on this table. Thank god it's
not big. :)

Thanks Tom!

Cheers,
Henke
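For the record, the cleanup Henrik describes is (both commands take
exclusive locks, so they are best run while the table is quiet):

    VACUUM FULL tbl_archive;    -- compacts the table; on 8.x this can itself
                                -- bloat the indexes, hence the REINDEX below
    REINDEX TABLE tbl_archive;  -- rebuilds every index on the table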
Henrik wrote:
> I think I have a clue why it's so off. We update a value in that table
> about 2-3 million times per night, and as each UPDATE creates a new row
> version it becomes bloated pretty fast. The table had a size of 765 MB
> including indexes, and after VACUUM FULL and REINDEX it went down to
> 80 kB... I guess I need a routine REINDEX on this table. Thank god it's
> not big. :)

I suggest you put a lone VACUUM on that table in cron, say once every 5
minutes, and you should be fine. You shouldn't need a reindex at all.

-- 
Alvaro Herrera                        http://www.PlanetPostgreSQL.org/
"Right now the sectors on the hard disk run clockwise, but I heard a
rumor that you can squeeze 0.2% more throughput by running them
counterclockwise. It's worth the effort. Recommended." (Gerry Pourwelle)
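A sketch of that setup, with the database name and schedule as
placeholders:

    -- crontab entry (mydb stands in for the real database):
    --   */5 * * * *  psql -d mydb -c 'VACUUM tbl_archive'
    VACUUM tbl_archive;  -- a plain VACUUM marks dead rows reusable
                         -- without the exclusive lock VACUUM FULL takes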
Future Enhancement?

If the column's new value can fit in the space already being used by the
existing value, just change the column value in place and leave the
record alone. Would reduce the need for vacuum in many cases.
Gauthier, Dave wrote:
> Future Enhancement?
> If the column's new value can fit in the space already being used by
> the existing value, just change the column value in place and leave the
> record alone. Would reduce the need for vacuum in many cases.

That's in 8.3. Not exactly like that (because it doesn't work in Postgres
due to MVCC), but something close. Look for HOT in the release notes.

-- 
Alvaro Herrera    Valdivia, Chile   ICBM: S 39º 49' 18.1", W 73º 13' 56.4"
Syntax error: function hell() needs an argument.
Please choose what hell you want to involve.
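Once on 8.3, one way to check whether HOT is actually taking effect for a
given table is the n_tup_hot_upd counter added to pg_stat_user_tables in
that release:

    SELECT relname, n_tup_upd, n_tup_hot_upd
      FROM pg_stat_user_tables
     WHERE relname = 'tbl_archive';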
On 6 Dec 2007, at 22:18, Alvaro Herrera wrote:

> Gauthier, Dave wrote:
>> Future Enhancement?
>> If the column's new value can fit in the space already being used by
>> the existing value, just change the column value in place and leave
>> the record alone. Would reduce the need for vacuum in many cases.
>
> That's in 8.3. Not exactly like that (because it doesn't work in
> Postgres due to MVCC), but something close. Look for HOT in the
> release notes.

I want 8.3 now! =)

I did a test last night. The table grew from 16 kB to 648 kB. The index
grew from 64 kB to 6056 kB. A REINDEX took the index back down to 64 kB.
The good thing is that REINDEX takes about 1 sec, so I can do that quite
often. :)

Thanks for all the input!

Henke
On 6 Dec 2007, at 20:26, Alvaro Herrera wrote:

> Henrik wrote:
>> I think I have a clue why it's so off. We update a value in that table
>> about 2-3 million times per night, and as each UPDATE creates a new
>> row version it becomes bloated pretty fast. The table had a size of
>> 765 MB including indexes, and after VACUUM FULL and REINDEX it went
>> down to 80 kB... I guess I need a routine REINDEX on this table. Thank
>> god it's not big. :)
>
> I suggest you put a lone VACUUM on that table in cron, say once every 5
> minutes, and you should be fine. You shouldn't need a reindex at all.

Instead of cron, can't I just have really aggressive autovacuum settings
on this table?

Thanks,
Henke
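On 8.2 that would mean a row in the pg_autovacuum system catalog; a
hedged sketch, with purely illustrative thresholds (negative cost values
fall back to the global settings):

    INSERT INTO pg_autovacuum
                (vacrelid, enabled,
                 vac_base_thresh, vac_scale_factor,
                 anl_base_thresh, anl_scale_factor,
                 vac_cost_delay, vac_cost_limit)
         VALUES ('tbl_archive'::regclass, true,
                 50, 0.01,   -- vacuum once 50 rows + 1% of the table are dead
                 50, 0.01,   -- analyze on the same schedule
                 -1, -1);    -- use the global cost-based delay settings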
Henrik wrote:
> On 6 Dec 2007, at 20:26, Alvaro Herrera wrote:
>> I suggest you put a lone VACUUM on that table in cron, say once every
>> 5 minutes, and you should be fine. You shouldn't need a reindex at all.

> Instead of cron, can't I just have really aggressive autovacuum
> settings on this table?

Not on 8.2 (or earlier), because it can only process one table at a time,
so if it ever takes much longer than 5 minutes for vacuuming other
tables, this table will be neglected for at least that long -- and
probably a lot longer, actually.

On 8.3 this should be somewhat less of a problem, because autovacuum can
process more than one table at a time. It will still be a problem if
there are several tables that take much longer than 5 minutes, but it
should be much less severe.

-- 
Alvaro Herrera                    http://www.flickr.com/photos/alvherre/
"XML!" Exclaimed C++. "What are you doing here? You're not a programming
language."
"Tell that to the people who use me," said XML.
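For reference, a sketch of the 8.3 knob in question; the values are
illustrative, not a recommendation:

    # postgresql.conf on 8.3 (8.2 and earlier run a single autovacuum
    # process, so there is no equivalent setting there)
    autovacuum = on
    autovacuum_max_workers = 3   # concurrent autovacuum worker processes
    autovacuum_naptime = 1min    # how often each database is checked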