Thread: question about executing JOINs
We're using Postgres 7.2.1 in a biology lab.  Some of our joins involve selecting from two tables where the fields we're joining on aren't equal, just nearby:

    select "tName" as "chrom",
        [...stuff deleted...]
        "tStart" - "snp_start" as "distance_from_start",
        "snp_start" - "tEnd" as "distance_from_end"
    from ucsc_ref_seq_ali_hg12
    join ucsc_snp_tsc_hg12
        on ucsc_snp_tsc_hg12."snp_chrom" = ucsc_ref_seq_ali_hg12."tName"
    where "snp_start" >= "tStart" - 1000000
        and "snp_start" <= "tEnd" + 1000000;

The problem is that the planner comes up with:

    Merge Join  (cost=201234.34..17463319.54 rows=85230539 width=51)
      ->  Sort  (cost=1870.74..1870.74 rows=15165 width=29)
            ->  Seq Scan on ucsc_ref_seq_ali_hg12  (cost=0.00..817.65 rows=15165 width=29)
      ->  Sort  (cost=199363.59..199363.59 rows=1145280 width=22)
            ->  Seq Scan on ucsc_snp_tsc_hg12  (cost=0.00..20996.80 rows=1145280 width=22)

which hadn't finished after 45 minutes, and presumably would have taken much longer.  However, by temporarily doing

    set enable_mergejoin = off;
    set enable_hashjoin = off;

the planner says

    Nested Loop  (cost=0.00..198183643.06 rows=85230539 width=51)
      ->  Seq Scan on ucsc_ref_seq_ali_hg12  (cost=0.00..817.65 rows=15165 width=29)
      ->  Index Scan using ucsc_snp_tsc_hg12_chrom_start on ucsc_snp_tsc_hg12  (cost=0.00..12962.39 rows=4713 width=22)

which takes four minutes.  So, in practice, we just save the output of that in a table (using CREATE TABLE AS), which works.  It's just a bit awkward to have to fiddle with those switches; it would be nice if the planner were a bit cleverer in this case.  Is there a runtime parameter that seems likely to fix this (perhaps cpu_index_tuple_cost)?

The planner is assuming that it's a cross join: that we need all pairs of records.  It's not taking into account the WHERE clause, which restricts the result to a tiny fraction of those pairs.  Perhaps the planner should assume that a nested loop over an index scan only looks at 1% of its records?  That would make cross joins more expensive.
But most of my cross joins have been accidental (when I've left out a join condition), so I don't really mind them taking longer, because I'll just stop them after 20 minutes anyway :)

Note that this isn't a question of join order (I think); it's just a question of which way to execute the join.

Thanks,
Josh

--
Josh Burdick
jburdick@gradient.cis.upenn.edu
http://www.cis.upenn.edu/~jburdick
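[The workaround described above can be sketched as a single psql session.  The saved table name snp_near_refseq is made up for illustration; the query is the one from the original post, with the elided columns still elided.]

```sql
-- Steer the planner away from merge and hash joins for this session only
set enable_mergejoin = off;
set enable_hashjoin = off;

-- Materialize the nested-loop result once, so later queries hit a plain table
create table snp_near_refseq as   -- hypothetical name for the saved result
select "tName" as "chrom",
       -- [...other selected columns, elided in the original post...]
       "tStart" - "snp_start" as "distance_from_start",
       "snp_start" - "tEnd" as "distance_from_end"
from ucsc_ref_seq_ali_hg12
join ucsc_snp_tsc_hg12
  on ucsc_snp_tsc_hg12."snp_chrom" = ucsc_ref_seq_ali_hg12."tName"
where "snp_start" >= "tStart" - 1000000
  and "snp_start" <= "tEnd" + 1000000;

-- Restore the defaults so the rest of the session plans normally
set enable_mergejoin = on;
set enable_hashjoin = on;
```

The two SET commands only affect the current session, but resetting them explicitly avoids surprises if the connection is reused.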
Josh Burdick <jburdick@gradient.cis.upenn.edu> writes:
> Nested Loop  (cost=0.00..198183643.06 rows=85230539 width=51)
>   ->  Seq Scan on ucsc_ref_seq_ali_hg12  (cost=0.00..817.65 rows=15165 width=29)
>   ->  Index Scan using ucsc_snp_tsc_hg12_chrom_start on ucsc_snp_tsc_hg12  (cost=0.00..12962.39 rows=4713 width=22)

What are the actual numbers of rows involved?  (EXPLAIN ANALYZE output would be far more useful than plain EXPLAIN.)  Do you have a feeling for the number of rows that would be produced by just the JOIN/ON condition (no constraint on snp_start)?  How does that compare to EXPLAIN's estimate of that number of rows?

> The planner is assuming that it's a cross join: that we need all
> pairs of records.  It's not taking into account the WHERE clause which
> restricts to a tiny fraction of the records.

No it isn't, and yes it is, but it evidently is making a bad estimate of the fraction of rows eliminated by those clauses.  I'd like to find out just what its estimate of that fraction is and what the correct value would be.

> Perhaps the planner should assume that a nested loop over an index
> scan only looks at 1% of its records?

Arbitrary assumptions designed to fix one example tend to break other examples ...

			regards, tom lane
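[Tom's two requests amount to running something like the following against the tables from the original post.  This is a sketch; count(*) is used in both queries just to keep the output small.]

```sql
-- 1. Actual vs. estimated rows at each plan node for the full query
explain analyze
select count(*)
from ucsc_ref_seq_ali_hg12
join ucsc_snp_tsc_hg12
  on ucsc_snp_tsc_hg12."snp_chrom" = ucsc_ref_seq_ali_hg12."tName"
where "snp_start" >= "tStart" - 1000000
  and "snp_start" <= "tEnd" + 1000000;

-- 2. Rows produced by the JOIN/ON condition alone (no snp_start constraint),
--    to compare against the planner's row estimate for the bare join
select count(*)
from ucsc_ref_seq_ali_hg12
join ucsc_snp_tsc_hg12
  on ucsc_snp_tsc_hg12."snp_chrom" = ucsc_ref_seq_ali_hg12."tName";
```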
Tom Lane wrote:

> Josh Burdick <jburdick@gradient.cis.upenn.edu> writes:
>> Nested Loop  (cost=0.00..198183643.06 rows=85230539 width=51)
>>   ->  Seq Scan on ucsc_ref_seq_ali_hg12  (cost=0.00..817.65 rows=15165 width=29)
>>   ->  Index Scan using ucsc_snp_tsc_hg12_chrom_start on ucsc_snp_tsc_hg12  (cost=0.00..12962.39 rows=4713 width=22)
>
> What are the actual numbers of rows involved?  (EXPLAIN ANALYZE output
> would be far more useful than plain EXPLAIN.)

The two tables have 15485 and 1095497 rows, respectively.  The join returns 11135953 records, which is humongous, but is still only 0.066% of the number of pairs of rows.  (Which is why I stopped EXPLAIN ANALYZE after 45 minutes.)

> Do you have a feeling for the number of rows that would be produced by
> just the JOIN/ON condition (no constraint on snp_start)?  How does that
> compare to EXPLAIN's estimate of that number of rows?

The JOIN/ON condition is on a column with 24 possible values.  I'm not quite sure how close the planner would be in that case, but it would be a lot closer.

>> The planner is assuming that it's a cross join: that we need all
>> pairs of records.  It's not taking into account the WHERE clause which
>> restricts to a tiny fraction of the records.
>
> No it isn't, and yes it is, but it evidently is making a bad estimate of
> the fraction of rows eliminated by those clauses.  I'd like to find out
> just what its estimate of that fraction is and what the correct value
> would be.

The planner's original estimates of how many rows are needed from each table (15165 and 1145280) are quite close; pretty much all rows are needed from each table.  It estimates that the merge join will return 85230539 rows, and 85230539 / (15165 * 1145280) = 0.49% (as opposed to the actual 0.066%).
For what it's worth, if you just multiply the numbers of rows together for the original query, you get 15165 * 1145280 = 17368171200, while for the version using the index scan you get 15165 * 4713 = 71472645, which is a lot less, and within a factor of 10 of the actual number of rows returned.  So perhaps that number could be factored into the cost estimate better somehow.

>> Perhaps the planner should assume that a nested loop over an index
>> scan only looks at 1% of its records?
>
> Arbitrary assumptions designed to fix one example tend to break other
> examples ...

Fair enough.  I think the current hack is good enough: if you really need to override the planner, you can.  And the eventual fix should be to make the planner smarter.

> 			regards, tom lane

Hope this helps,
Josh

--
Josh Burdick
jburdick@gradient.cis.upenn.edu
http://www.cis.upenn.edu/~jburdick
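[For reference, the selectivity arithmetic quoted in this exchange can be checked directly in SQL.  All of the numbers are taken from the thread; the ::numeric casts avoid integer division and 32-bit overflow on the products.]

```sql
-- Selectivity implied by the planner's merge-join estimate (~0.49%)
select 85230539::numeric / (15165::numeric * 1145280);

-- Actual selectivity, from the real table sizes and join output (~0.066%)
select 11135953::numeric / (15485::numeric * 1095497);

-- Product-of-rows for the index-scan plan (15165 outer rows,
-- estimated 4713 inner rows each): 71472645, within a factor
-- of about 7 of the 11135953 rows actually returned
select 15165::numeric * 4713;
```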