On 23/05/18 19:25, Tom Lane wrote:
> Heikki Linnakangas <hlinnaka@iki.fi> writes:
>> I've been working on a patch to add a little cache to SubPlans, to speed
>> up queries with correlated subqueries, where the same subquery is
>> currently executed multiple times with the same parameters. The idea is
>> to cache the result of the subplan, with the correlation vars as the
>> cache key.
>
> I find this pretty bogus as it stands, because it assumes without proof
> that the subquery will deliver identical results for any two parameter
> values that are considered equal by the datatype's default equality
> operator. An easy counterexample is a subquery whose result depends on
> the text conversion of a float8 parameter: zero and minus zero have
> different text forms, but are equal according to float8eq.
Good point.
> To make this
> patch safe, I think you'd need to grovel through the subquery and make
> sure that the parameters are only used as inputs to operators that belong
> to the type's default btree or hash opfamily. (Many other cases would
> work in practice, but we have no semantic knowledge that would let us be
> sure of that.)
Hmm. First thing that comes to mind is to use the raw bytes as cache
key, only treating Datums as equal if their binary representation is
identical.
> That's doable no doubt, but I wonder whether that leaves you in a place
> that's any better than the plan-time-decorrelation approach you proposed
> in the earlier thread. I liked that better TBH; this one seems like
> a very ad-hoc reinvention of a hash join. I don't especially like the
> unpredictable number of executions of the subquery that it results in,
> either.
I'd certainly prefer the plan-time-decorrelation, too. But I suspect
that there's always going to be some cases that cannot be decorrelated,
where a cache like this would help.
- Heikki