Home > mailing lists

Re: question about executing JOINs - Mailing list pgsql-general

From	Tom Lane
Subject	Re: question about executing JOINs
Date	October 12, 2002 20:30:53
Msg-id	24975.1034469048@sss.pgh.pa.us Whole thread Raw
In response to	question about executing JOINs (Josh Burdick <jburdick@gradient.cis.upenn.edu>)
List	pgsql-general

Tree view

Josh Burdick <jburdick@gradient.cis.upenn.edu> writes:
> Nested Loop  (cost=0.00..198183643.06 rows=85230539 width=51)
>   ->  Seq Scan on ucsc_ref_seq_ali_hg12  (cost=0.00..817.65 rows=15165
> width=29)
>   ->  Index Scan using ucsc_snp_tsc_hg12_chrom_start on
> ucsc_snp_tsc_hg12  (cost=0.00..12962.39 rows=4713 width=22)

What's the actual numbers of rows involved?  (EXPLAIN ANALYZE output
would be far more useful than plain EXPLAIN.)

Do you have a feeling for the number of rows that would be produced by
just the JOIN/ON condition (no constraint on snp_start)?  How does that
compare to EXPLAIN's estimate of that number of rows?

>     The planner is assuming that it's a cross join: that we need all
> pairs of records.  It's not taking into account the WHERE clause which
> restricts to a tiny fraction of the records.

No it isn't, and yes it is, but it evidently is making a bad estimate of
the fraction of rows eliminated by those clauses.  I'd like to find out
just what its estimate of that fraction is and what the correct value
would be.

> Perhaps the planner should assume that a nested loop over an index
> scan only looks at 1% of its records?

Arbitrary assumptions designed to fix one example tend to break other
examples ...

            regards, tom lane

pgsql-general by date:

From: "Robert Fitzpatrick"
Date: 12 October 2002, 15:28:14
Subject: Re: Removing {"="} privledges

From: Martijn van Oosterhout
Date: 12 October 2002, 22:58:07
Subject: Re: Removing {"="} privledges

Re: question about executing JOINs - Mailing list pgsql-general

Previous

Next