Re: Subplan result caching - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: Subplan result caching
Date
Msg-id ec033922-579a-647e-f042-2ee4614c70ec@iki.fi
Whole thread Raw
In response to Re: Subplan result caching  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Subplan result caching  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On 23/05/18 19:25, Tom Lane wrote:
> Heikki Linnakangas <hlinnaka@iki.fi> writes:
>> I've been working on a patch to add a little cache to SubPlans, to speed
>> up queries with correlated subqueries, where the same subquery is
>> currently executed multiple times with the same parameters. The idea is
>> to cache the result of the subplan, with the correlation vars as the
>> cache key.
> 
> I find this pretty bogus as it stands, because it assumes without proof
> that the subquery will deliver identical results for any two parameter
> values that are considered equal by the datatype's default equality
> operator.  An easy counterexample is a subquery whose result depends on
> the text conversion of a float8 parameter: zero and minus zero have
> different text forms, but are equal according to float8eq.

Good point.

> To make this
> patch safe, I think you'd need to grovel through the subquery and make
> sure that the parameters are only used as inputs to operators that belong
> to the type's default btree or hash opfamily.  (Many other cases would
> work in practice, but we have no semantic knowledge that would let us be
> sure of that.)

Hmm. First thing that comes to mind is to use the raw bytes as cache 
key, only treating Datums as equal if their binary representation is 
identical.

> That's doable no doubt, but I wonder whether that leaves you in a place
> that's any better than the plan-time-decorrelation approach you proposed
> in the earlier thread.  I liked that better TBH; this one seems like
> a very ad-hoc reinvention of a hash join.  I don't especially like the
> unpredictable number of executions of the subquery that it results in,
> either.

I'd certainly prefer the plan-time-decorrelation, too. But I suspect 
that there's always going to be some cases that cannot be decorrelated, 
where a cache like this would help.

- Heikki


pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: PG11 jit failing on ppc64el
Next
From: Tom Lane
Date:
Subject: Re: Subplan result caching