Re: Unnecessary repeat condition for a self inner join - Mailing list pgsql-sql

From Robins Tharakan
Subject Re: Unnecessary repeat condition for a self inner join
Date
Msg-id 36af4bed0807120105l5545d6e3s27428f1f2841c0f6@mail.gmail.com
Whole thread Raw
In response to Re: Unnecessary repeat condition for a self inner join  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Unnecessary repeat condition for a self inner join  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-sql

While the optimizer theoretically could deduce the extra restriction
condition, it doesn't attempt to.  It's extremely unclear that the extra
cycles to look for such cases would be repaid on average, because cases
like this aren't that common.  The current state of affairs is that
the system will deduce implied equality conditions, but not implied
inequality conditions.

One good thing is that the equality conditions are taken care of. But I fail to understand why do you believe that the second case is rare. I think the optimizer would (in all self-join inequality conditions) tend towards a table scan, which for a large table is a disaster. (Of course, the index scan would help only if the result-set is small)

Besides, I did a simple test and although you are right about the optimizer deducing implied equality conditions, this holds true only for a direct join. In the second query, the optimizer recommends a table scan even for a simple IN() condition.

Is that normal ?


Regards,
Robins Tharakan

Query 1:

SELECT n1.scheme_code
FROM nav n1
    INNER JOIN nav n2 ON n1.scheme_code = n2.scheme_code
WHERE n1.scheme_code = 290

"Nested Loop  (cost=0.00..16147232.47 rows=4796100 width=4)"
"  ->  Index Scan using nav__schemecode_date_lookup3b on nav n1  (cost=0.00..7347.91 rows=2190 width=4)"
"        Index Cond: (scheme_code = 290)"
"  ->  Index Scan using nav__schemecode_date_lookup3b on nav n2  (cost=0.00..7347.91 rows=2190 width=4)"
"        Index Cond: (290 = scheme_code)"


Query 2:

SELECT n1.scheme_code
FROM nav n1
    INNER JOIN nav n2 ON n1.scheme_code = n2.scheme_code
WHERE n1.scheme_code IN (1, 2)

"Hash Join  (cost=206004.00..431864.83 rows=10720451 width=4)"
"  Hash Cond: (n1.scheme_code = n2.scheme_code)"
"  ->  Bitmap Heap Scan on nav n1  (cost=139.62..13663.13 rows=4378 width=4)"
"        Recheck Cond: (scheme_code = ANY ('{1,2}'::integer[]))"
"        ->  Bitmap Index Scan on nav__schemecode_date_lookup3b  (cost=0.00..138.53 rows=4378 width=0)"
"              Index Cond: (scheme_code = ANY ('{1,2}'::integer[]))"
"  ->  Hash  (cost=112078.06..112078.06 rows=5395306 width=4)"
"        ->  Seq Scan on nav n2  (cost=0.00..112078.06 rows=5395306 width=4)"


pgsql-sql by date:

Previous
From: Simon Riggs
Date:
Subject: Re: Rollback in Postgres
Next
From: "Dave Page"
Date:
Subject: Re: Rollback in Postgres