On Fri, Mar 23, 2012 at 12:29 PM, Greg Stark <stark@mit.edu> wrote:
> On Fri, Mar 23, 2012 at 3:49 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> The complication, opportunities for bugs, and general slowdown
>> associated with that would outweigh any possible gain, in the opinion
>> of most hackers who have thought about this.
>
> I wouldn't be quite so pessimistic. I think the problem is that the
> hard part in doing this for real is all the parts the proposal glosses
> over. How much memory is it worth dedicating to the cache before the
> cost of that memory outweighs what the cache saves? How do you
> invalidate cache entries efficiently enough that it doesn't become a
> bottleneck?

I think the question of how you would invalidate cache entries is a
very good one.
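
Just to make the bookkeeping concrete, here is a minimal sketch (plain
C, entirely hypothetical, nothing to do with our actual sinval
machinery) of what an exact-match result cache would have to track:
each entry remembers which relations its result was computed from, and
every write to a relation has to find and drop the entries that depend
on it. Whether that scan can be made cheap enough not to slow down
every insert, update, and delete is exactly the problem.

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define MAX_ENTRIES 1024
#define MAX_DEPS    16

/* One cached query result and the relations it was computed from. */
typedef struct CacheEntry
{
    uint64_t    query_hash;             /* fingerprint of the query */
    uint32_t    dep_relids[MAX_DEPS];   /* OIDs of relations read */
    int         ndeps;
    bool        valid;
    /* ... the cached result itself would live here ... */
} CacheEntry;

static CacheEntry cache[MAX_ENTRIES];

/*
 * Hypothetically called on every insert/update/delete against "relid":
 * drop any cached result that was computed from that relation.
 */
static void
invalidate_relation(uint32_t relid)
{
    for (int i = 0; i < MAX_ENTRIES; i++)
    {
        if (!cache[i].valid)
            continue;
        for (int j = 0; j < cache[i].ndeps; j++)
        {
            if (cache[i].dep_relids[j] == relid)
            {
                cache[i].valid = false;
                break;
            }
        }
    }
}

int
main(void)
{
    /* One cached result that depends on (made-up) relation OID 16384. */
    cache[0] = (CacheEntry){ .query_hash = 42,
                             .dep_relids = {16384},
                             .ndeps = 1,
                             .valid = true };

    invalidate_relation(16384);         /* a write to that table arrives */
    printf("entry still valid after write? %d\n", cache[0].valid);
    return 0;
}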

The other thing that makes me skeptical of this proposal is that I am
not sure that executing absolutely identical queries is a very common
use case for a relational database. I suppose there might be a few
queries that run over and over again (e.g. whatever you need to render
your home page), but I think those will be the exception rather than
the rule. It therefore seems likely that the overhead of such a cache
would, in most cases, be greater than the benefit of having it in the
first place.

What I think is more common is the repeated submission of queries that
are *nearly* identical, but with different parameter bindings or
different constants. It would be nice to have some kind of cache that
would allow us to avoid the overhead of parsing and planning nearly
identical statements over and over again, but the trick is that you
have to fingerprint the query to notice that this is happening in the
first place, and the fingerprinting has to cost less than what the
cache saves you. I don't know whether that's possible, but I suspect
it's far from easy.
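
To illustrate the kind of thing I mean, here is a minimal sketch
(plain C, purely illustrative, not proposed code) of string-level
fingerprinting: literals are collapsed to a placeholder before
hashing, so statements that differ only in their constants produce the
same cache key. A real implementation would presumably fingerprint the
parse tree rather than the text, and the hashing has to stay cheaper
than the parse-and-plan work it is trying to avoid.

#include <stdio.h>
#include <ctype.h>
#include <stdint.h>

/*
 * Collapse numeric and single-quoted string literals to '?' and hash
 * the rest with 64-bit FNV-1a.  Deliberately naive: it also mangles
 * digits inside identifiers (so "t1" and "t2" collide), which is one
 * reason you'd really want to work from the parse tree.
 */
static uint64_t
fingerprint(const char *sql)
{
    uint64_t    h = 0xcbf29ce484222325ULL;      /* FNV-1a offset basis */
    const char *p = sql;

    while (*p)
    {
        if (*p == '\'')
        {
            /* skip a single-quoted string literal */
            p++;
            while (*p && *p != '\'')
                p++;
            if (*p)
                p++;
            h = (h ^ '?') * 0x100000001b3ULL;
        }
        else if (isdigit((unsigned char) *p))
        {
            /* skip a run of digits (numeric literal) */
            while (isdigit((unsigned char) *p))
                p++;
            h = (h ^ '?') * 0x100000001b3ULL;
        }
        else
        {
            h = (h ^ (unsigned char) tolower((unsigned char) *p));
            h *= 0x100000001b3ULL;
            p++;
        }
    }
    return h;
}

int
main(void)
{
    /* These two differ only in the constant, so the keys match. */
    printf("%016llx\n",
           (unsigned long long) fingerprint("SELECT * FROM t WHERE id = 42"));
    printf("%016llx\n",
           (unsigned long long) fingerprint("SELECT * FROM t WHERE id = 7"));
    return 0;
}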
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company