I've read through all of the previous discussions related to stable
subexpression caching, and I'm planning to send a summary email with
all of those links in one place.
But I also happened to stumble upon a mention in the TODO list of an
email discussion way back in 2007 where Tom suggested [1] that we should
really try planning scalar array ops (particularly those with large IN
lists) as `IN (VALUES ...)`.
That actually would solve the specific case I'd had this problem with
(seq scan on a large constant array IN expression). Ideally any query
with forms like:

    select * from t where a in (1, 2, ...)
    select * from t where a in ((select i from x))

would always be isomorphic in planning. But thinking about this
overnight and scanning through things quickly this morning, I have a
feeling that'd be 1.) a pretty significant undertaking, and 2.) likely
to explode the number of plans considered.
Also I don't know if there's a good place to slot that into planning.
Do either of you happen to have any pointers to places that do
similar kinds of rewrites that I could look at? And in those cases, do
we always rewrite, or do we consider both forms independently?
I suppose _only_ handling the case where an `IN (VALUES ...)` replaces
a seq scan with a scalar array op might be somewhat easier... but it
feels like it leaves a lot of holes.
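
For concreteness, here's a minimal sketch of the comparison I have in
mind. The table `t`, its row count, and the literal values are made up
for illustration, and I've left out plan output since that depends on
version, statistics, and settings:

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE t (a int);
INSERT INTO t SELECT g FROM generate_series(1, 1000000) AS g;
ANALYZE t;

-- Form 1: a constant IN list, which is parsed as a scalar array op,
-- i.e. a = ANY ('{1,2,3}'::int[]).
EXPLAIN SELECT * FROM t WHERE a IN (1, 2, 3);

-- Form 2: the same predicate written as a sublink over a VALUES list.
EXPLAIN SELECT * FROM t WHERE a IN (VALUES (1), (2), (3));
```

The second form is a sublink that the planner can pull up into a
semi-join, so it is eligible for join-based plans that the scalar array
op form is not.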
I'm still at the point where I'm trying to determine whether any of the
above (subexpression caching, saop optimization only on constants,
re-planning as `IN (VALUES ...)`) is reasonable enough, relative to the
amount of effort involved, to be worth working on.
James
[1]: https://www.postgresql.org/message-id/19001.1178823208%40sss.pgh.pa.us