PG Bug reporting form <noreply@postgresql.org> writes:
> I guess there can be differences how IN vs ANY input is handled for prepared
> statements but for plain SQL input I would expect them to produce identical
> plans.
If you can find any such promise in the documentation, I will be glad
to remove it ;-)
> EXPLAIN
> SELECT
> a.*
> FROM
> a
> JOIN b USING (node_id)
> WHERE
> a.project_id = ANY (ARRAY[1]::int8[])
> ORDER BY
> a.project_id,
> a.node_id
> LIMIT 10;
> EXPLAIN
> SELECT
> a.*
> FROM
> a
> JOIN b USING (node_id)
> WHERE
> a.project_id IN (1) -- Also OK with a plain "=1" witout IN
> ORDER BY
> a.project_id,
> a.node_id
> LIMIT 10;
The thing about "project_id IN (1)", which is reduced to "project_id = 1"
by the parser, is that the planner can see that that renders the "ORDER
BY a.project_id" clause a no-op. So it only has to order by node_id,
which is what makes this plan valid:
> Limit (cost=0.72..38.07 rows=10 width=16)
> -> Nested Loop (cost=0.72..55440.79 rows=14842 width=16)
> -> Index Only Scan using b_node_id_key on b (cost=0.29..397.29 rows=15000 width=8)
> -> Index Only Scan using a_uq on a (cost=0.43..3.67 rows=1 width=16)
> Index Cond: ((project_id = 1) AND (node_id = b.node_id))
If you don't recognize that then you're forced into sorting the indexscan
output, which is going to look pretty bad for a small-LIMIT situation.
We don't make any attempt to make a similar deduction from =ANY clauses,
mainly because it's usually not possible to be sure that the array has
only one member. I'm not excited about the cost/benefit ratio of adding
code to check for that.
regards, tom lane