Re: Problem with mergejoin performance - Mailing list pgsql-performance

From Tom Lane
Subject Re: Problem with mergejoin performance
Date
Msg-id 11777.1284387988@sss.pgh.pa.us
Whole thread Raw
In response to Problem with mergejoin performance  (<azajac@ang.com.pl>)
List pgsql-performance
<azajac@ang.com.pl> writes:
> Merge Join  (cost=18.90..20.85 rows=1 width=8) (actual time=614.912..614.912
> rows=0 loops=1)
>    Merge Cond: (rr.id = ze.id)
>    ->  Index Scan using bug_t2_i1 on bug_t2 rr  (cost=0.00..17893.49
> rows=278417 width=4) (actual time=0.023..351.945 rows=278417 loops=1)
>    ->  Sort  (cost=18.88..18.89 rows=4 width=4) (actual time=0.164..0.164
> rows=1 loops=1)
>          Sort Key: ze.id
>          Sort Method:  quicksort  Memory: 17kB
>          ->  Index Scan using bug_t1_i1 on bug_t1 ze  (cost=0.00..18.84
> rows=4 width=4) (actual time=0.059..0.141 rows=4 loops=1)
>                Index Cond: (ids = 94543)
>  Total runtime: 615.003 ms

> I think that problem is with estimation of total mergejoin time, why is it
> so small (18.90..20.85) while estimates of subqueries (especially first) is
> high (0..17893). Merging time should be high, because it needs to scan
> almost all bug t2 table. Am I right?

Actually, a mergejoin can stop short of processing all of either input,
if it exhausts the keys from the other input first; and the planner
knows that.  In this case it evidently thinks that the maximum key from
bug_t1 is much less than the maximum key from bug_t2, so that most of
the indexscan on bug_t2 won't have to be executed.  With only 4 rows
in bug_t1 it doesn't seem very likely that it would get this wrong.
What exactly are those join key values, and what are the min/max values
in bug_t2?

            regards, tom lane

pgsql-performance by date:

Previous
From: Gaetano Mendola
Date:
Subject: Useless sort by
Next
From: Tom Lane
Date:
Subject: Re: Useless sort by