Re: sequence scan problem - Mailing list pgsql-performance
From | John Beaver |
---|---|
Subject | Re: sequence scan problem |
Date | |
Msg-id | 4868BC74.6020006@gmail.com Whole thread Raw |
In response to | Re: sequence scan problem (Jeremy Harris <jgh@wizmail.org>) |
Responses |
Re: sequence scan problem
|
List | pgsql-performance |
Ok, here's the explain analyze result. Again, this is Postgres 8.3.3 and I vacuumed-analyzed both tables directly after they were created. # explain analyze select fls.function_verified, fls.score, fls.go_category_group_ref, fs1.gene_ref, fs1.function_verified_exactly, fs2.gene_ref, fs2.function_verified_exactly from functional_linkage_scores fls, gene_prediction_view fs1, gene_prediction_view fs2 where fls.gene_ref1 = fs1.gene_ref and fls.gene_ref2 = fs2.gene_ref and fs1.go_term_ref = 2 and fs2.go_term_ref = 2; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ Merge Join (cost=1399203593.41..6702491234.74 rows=352770803726 width=22) (actual time=6370194.467..22991303.434 rows=15610535128 loops=1) Merge Cond: (fs2.gene_ref = fls.gene_ref2) -> Index Scan using gene_prediction_view_gene_ref on gene_prediction_view fs2 (cost=0.00..12111899.77 rows=197899 width=5) (actual time=29.592..469838.583 rows=180629 loops=1) Index Cond: (go_term_ref = 2) -> Materialize (cost=1399069432.20..1483728633.52 rows=6772736105 width=21) (actual time=6370164.864..16623552.417 rows=15610535121 loops=1) -> Sort (cost=1399069432.20..1416001272.47 rows=6772736105 width=21) (actual time=6370164.860..13081970.248 rows=1897946790 loops=1) Sort Key: fls.gene_ref2 Sort Method: external merge Disk: 61192240kB -> Merge Join (cost=40681244.97..154286110.62 rows=6772736105 width=21) (actual time=592112.778..2043161.851 rows=1897946790 loops=1) Merge Cond: (fs1.gene_ref = fls.gene_ref1) -> Index Scan using gene_prediction_view_gene_ref on gene_prediction_view fs1 (cost=0.00..12111899.77 rows=197899 width=5) (actual time=0.015..246613.129 rows=180644 loops=1) Index Cond: (go_term_ref = 2) -> Materialize (cost=40586010.10..43490582.70 rows=232365808 width=20) (actual time=592112.755..1121366.375 rows=1897946783 loops=1) -> Sort (cost=40586010.10..41166924.62 rows=232365808 width=20) (actual time=592112.721..870349.308 rows=232241678 loops=1) Sort Key: fls.gene_ref1 Sort Method: external merge Disk: 7260856kB -> Seq Scan on functional_linkage_scores fls (cost=0.00..3928457.08 rows=232365808 width=20) (actual time=14.221..86455.902 rows=232241678 loops=1) Total runtime: 24183346.271 ms (18 rows) Jeremy Harris wrote: > John Beaver wrote: >> I'm having a strange problem with a query. The query is fairly >> simple, with a few constants and two joins. All relevant columns >> should be indexed, and I'm pretty sure there aren't any type >> conversion issues. But the query plan includes a fairly heavy seq >> scan. The only possible complication is that the tables involved are >> fairly large - hundreds of millions of rows each. >> >> Can anyone explain this? There should only ever be a maximum of about >> 50 rows returned when the query is executed. > > You didn't say when you last vacuumed? > If there should only be 50 rows returned then the estimates from the > planner are way out. > > If that doesn't help, we'll need version info, and (if you can afford > the time) an "explain analyze" > > Cheers, > Jeremy >
pgsql-performance by date: