Re: invalidating cached plans - Mailing list pgsql-hackers

From Neil Conway
Subject Re: invalidating cached plans
Date
Msg-id 4236736A.7030302@samurai.com
In response to Re: invalidating cached plans  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: invalidating cached plans  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: invalidating cached plans  (Neil Conway <neilc@samurai.com>)
List pgsql-hackers
Tom Lane wrote:
> I hadn't really gotten as far as working out a reasonable API for the
> module.  The $64 question seems to be what is the input: a textual query
> string, a raw parse analysis tree, or what?

It should be easy enough to accept either, converting a query string
into a raw parse tree when that is what the caller supplies. The storage
of the parse tree should probably be owned by the cache module, which
might introduce some slight complications (like exposing a MemoryContext
for callers to allocate inside), but it should be doable anyway.

> And what sort of key does the caller want to use to re-find a
> previously cached plan?

Do we want to share plans between call sites? If so, an API like this
comes to mind:

struct CachedPlan
{
    List *query_list;
    List *plan_list;
    char *query_str;
    int nargs;
    Oid *argtypes;
    int refcnt;
    /* various other info -- perhaps memory context? */
};

struct CachedPlan *cache_get_plan(const char *query_str,
                                  int nargs, Oid *argtypes);
void cache_destroy_plan(struct CachedPlan *plan);

Where cache_get_plan() would look up the query string in a hash table 
(mapping strings => CachedPlans). If found, it would check if the plan 
had been invalidated, and would replan it if necessary, then bump its 
reference count and return it. If not found, it would create a new 
CachedPlan, parse, rewrite and plan the query string, and return it. 
This would mean that within a backend we could share planning for 
queries that happened to be byte-for-byte identical.
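
A minimal sketch of the lookup path described above, written as standalone C rather than backend code. Everything beyond the names in the proposal is an assumption made for illustration: the `valid` and `plan_count` fields, the `build_plan()` placeholder standing in for parse/rewrite/plan, the fixed-size chained hash table, matching on argument types as well as the string, and `cache_release_plan()` as the way to drop a reference.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define CACHE_BUCKETS 64

typedef unsigned int Oid;      /* stand-in for the backend's Oid type */

typedef struct CachedPlan
{
    char       *query_str;     /* lookup key, owned by the cache */
    int         nargs;
    Oid        *argtypes;
    int         refcnt;
    bool        valid;         /* hypothetical: cleared when invalidated */
    int         plan_count;    /* hypothetical: how often we (re)planned */
    struct CachedPlan *next;   /* hash bucket chain */
} CachedPlan;

static CachedPlan *buckets[CACHE_BUCKETS];

static unsigned int
hash_string(const char *s)
{
    unsigned int h = 5381;

    while (*s)
        h = h * 33 + (unsigned char) *s++;
    return h % CACHE_BUCKETS;
}

/* Placeholder for the real parse, rewrite and plan steps. */
static void
build_plan(CachedPlan *plan)
{
    plan->plan_count++;
    plan->valid = true;
}

/* Byte-for-byte match on the query string, plus matching argument types. */
static bool
plan_matches(const CachedPlan *plan, const char *query_str,
             int nargs, const Oid *argtypes)
{
    if (plan->nargs != nargs || strcmp(plan->query_str, query_str) != 0)
        return false;
    return nargs == 0 ||
        memcmp(plan->argtypes, argtypes, (size_t) nargs * sizeof(Oid)) == 0;
}

CachedPlan *
cache_get_plan(const char *query_str, int nargs, Oid *argtypes)
{
    unsigned int h = hash_string(query_str);
    CachedPlan *plan;

    for (plan = buckets[h]; plan != NULL; plan = plan->next)
        if (plan_matches(plan, query_str, nargs, argtypes))
            break;

    if (plan == NULL)
    {
        /* Not found: create a new entry; calloc leaves valid == false. */
        size_t len = strlen(query_str) + 1;

        plan = calloc(1, sizeof(CachedPlan));
        if (plan == NULL)
            return NULL;
        plan->query_str = malloc(len);
        plan->argtypes = nargs > 0 ? malloc((size_t) nargs * sizeof(Oid)) : NULL;
        if (plan->query_str == NULL || (nargs > 0 && plan->argtypes == NULL))
        {
            free(plan->query_str);
            free(plan->argtypes);
            free(plan);
            return NULL;
        }
        memcpy(plan->query_str, query_str, len);
        if (nargs > 0)
            memcpy(plan->argtypes, argtypes, (size_t) nargs * sizeof(Oid));
        plan->nargs = nargs;
        plan->next = buckets[h];
        buckets[h] = plan;
    }

    /* New or invalidated entries are (re)planned before being handed out. */
    if (!plan->valid)
        build_plan(plan);
    plan->refcnt++;
    return plan;
}

/* Hypothetical counterpart that drops one reference. */
void
cache_release_plan(CachedPlan *plan)
{
    if (plan->refcnt > 0)
        plan->refcnt--;
}
```

Two calls with the same string return the same entry and plan it only once. Clearing `valid` between calls forces a replan on the next lookup. A query that differs only in letter case gets its own entry, which is the limitation the first point below is about.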

- it would be nice to do the hash lookup on the result of raw_parser() 
rather than the query string itself, since we would be able to share 
more plans that way. Not sure if that's worth doing, though.

- how do we manage storage? The reference counting above is 
off-the-cuff. Perhaps there's a better way to do this... (Of course, if 
a plan has refcnt > 0, we can still remove it from memory if needed, 
since any call site should provide sufficient information to reconstruct it)

This would also make it somewhat more plausible to share the query cache 
among backends, but I'm not interested in pursuing that right now.

(BTW, another thing to consider is how the rewriter will affect a plan's 
dependencies: I think we should probably invalidate a plan when a 
modification is made to a view or rule that affected the plan. This 
should also be doable, though: we could either modify the rewriter to 
report these dependencies, or trawl the system catalogs looking for 
rules that apply to any of the relations in the query. The latter method 
would result in spurious invalidations, in the case of rules with a 
WHERE clause.)
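
A rough sketch of the first option, where each plan carries the list of objects (relations, views, rules) that the rewriter reported while building it. This is standalone C, not backend code, and every name in it (`PlanDeps`, `plan_register()`, `plan_invalidate_object()`, the fixed-size arrays) is hypothetical and chosen only for illustration.

```c
#include <stdbool.h>

#define MAX_DEPS  8
#define MAX_PLANS 16

typedef unsigned int Oid;      /* stand-in for the backend's Oid type */

typedef struct PlanDeps
{
    bool valid;                /* false once any dependency has changed */
    int  ndeps;
    Oid  deps[MAX_DEPS];       /* objects reported while building the plan */
} PlanDeps;

static PlanDeps plans[MAX_PLANS];
static int nplans = 0;

/* Record a freshly built plan and its dependencies; returns its slot or -1. */
int
plan_register(const Oid *deps, int ndeps)
{
    int i;

    if (nplans >= MAX_PLANS || ndeps < 0 || ndeps > MAX_DEPS)
        return -1;
    for (i = 0; i < ndeps; i++)
        plans[nplans].deps[i] = deps[i];
    plans[nplans].ndeps = ndeps;
    plans[nplans].valid = true;
    return nplans++;
}

/*
 * Called when a relation, view or rule is modified: mark every still-valid
 * plan that depends on it as invalid. Returns how many plans were marked.
 */
int
plan_invalidate_object(Oid object)
{
    int invalidated = 0;
    int p, i;

    for (p = 0; p < nplans; p++)
    {
        if (!plans[p].valid)
            continue;
        for (i = 0; i < plans[p].ndeps; i++)
        {
            if (plans[p].deps[i] == object)
            {
                plans[p].valid = false;
                invalidated++;
                break;
            }
        }
    }
    return invalidated;
}
```

With exact dependency lists, a change to a rule invalidates only the plans that actually used it. The catalog-trawling alternative would register every rule on each relation, so it would also invalidate plans that a rule's WHERE clause had excluded.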

-Neil

