On 23 May 2018 at 21:31, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> I've been working on a patch to add a little cache to SubPlans, to speed up
> queries with correlated subqueries, where the same subquery is currently
> executed multiple times with the same parameters. The idea is to cache the
> result of the subplan, with the correlation vars as the cache key.
...
> Thoughts?
G'day Sir,
I'm in favour of making improvements here. I had a think about this
and just glanced at the patch to check if you'd done it the way I'd
thought..
I'd thought this might be done with some sort of "LazyMaterialize"
node that could take params and store multiple rows per param set.
That node type could contain all the LRU logic to get rid of the
lesser used items when work_mem begin to fill. If you did it this way
then nodeSubplan.c would need to know nothing about this. The planner
would simply just inject one of these when it thinks some caching
would be wise, similar to how it does with Materialize.
LazyMaterialize would simply check the cache and return those rows, if
they exist, otherwise consult its only subplan to get the rows and
then cache them. If you did it this way, as a followup we could go
plug it into parameterised nested loops to speed up repeated lookups
of the inner side plan. The gains there are probably similar to what
you've mentioned.
What do you think?
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services