Re: caching query results - Mailing list pgsql-hackers
From: Karel Zak
Subject: Re: caching query results
Msg-id: Pine.LNX.3.96.1000405122918.31506A-100000@ara.zf.jcu.cz
In response to: Re: caching query results (wieck@debis.com (Jan Wieck))
List: pgsql-hackers
On Tue, 4 Apr 2000, Jan Wieck wrote:

> Right. Imagine a querytree (after overhaul) that looks like
> this:
>
>                +------+
>                | SORT |
>                +------+
>                    ^
>                    |
>     +-----------------------------+
>     | JOIN                        |
>     | atts: rel1.att1, rel2.att2  |
>     | qual: rel1.att2 = rel2.att1 |
>     +-----------------------------+
>          ^                  ^
>          |                  |
> +------------------+  +------------------+
> | SCAN             |  | SCAN             |
> | rel:  rel1       |  | rel:  rel2       |
> | atts: att1, att2 |  | atts: att1, att2 |
> +------------------+  +------------------+
>
> which is a node structure describing a query of:
>
>     SELECT rel1.att1, rel2.att2 FROM rel1, rel2
>         WHERE rel1.att2 = rel2.att1;
>
> The "key" identifying this querytree now could look like
>
>     SORT(JOIN(1.1,2.2;SCAN(78991;1,2),SCAN(78995;1,2);))

Nice picture, thanks; now I understand you. The question is where to
create this key: write a specific function that walks the querytree and
returns the key, or compute it during statement transformation
(analyze.c etc.)? Or is there another idea? (A sketch of the first
variant is at the end of this mail.)

> > > These keys could be managed in a shared LRU table, and if the
> >
> > My current code is based on a HASH table with keys, and each query &
> > plan is saved in a MemoryContext created specially for that plan
> > (which is good, for example, for SPI_freeplan()).

I thought about it: why shouldn't the SPI and PREPARE/EXECUTE query
caches use shared memory too? I vote for one query cache in PostgreSQL.
IMHO it is not good to create one specific cache for
SPI_saveplan()+PREPARE and a second one for your suggested query cache.
If plans saved via SPI (under a defined key, the 'by_key' interface) are
shared among all backends, a lot of features will be faster (FK, PL
languages etc.), and shared plans cached via PREPARE will persist across
connections. (Some web developers will be happy :-) But I am not sure
about this...

> IIRC our hash table code insists on using global, per backend
> memory. I thought about managing the entire querycache with
> a new type of memory context, using different routines for
> palloc()/pfree(), working in a shared memory area only and
> eventually freeing longest unused plans until allocation
> fits. Let's see if using hash tables here would be easy or
> not.

I looked at the current shmem routines: creating a specific space and a
hash table for a query cache is not a problem, since the hash routines
are prepared for use in shmem; the current lock management code does
something very similar (see the second sketch below). The problem is how
to store (copy) the query & plan trees into this (shared) memory. The
current copyObject() is based on palloc()/pfree(), and as you said, we
have no memory management routines (like palloc()) that work in shmem.
It would be nice to have MemoryContext routines for shmem, for example a
CreateGlobalMemory_in_shmem() and a palloc() that knows how to work with
such a specific context. Is that a dream?

One solution is to convert the query & plan trees to strings (like
pg_rewrite does for views) and save those strings in the cache; but how
fast is parsing them back? For this solution we would not need a hash
table at all, we could use a standard system table and a syscache. But
the variant with full plan tree structs (not strings) in shmem is nicer.
(The third sketch below shows the string route.)

> > Good. It is a solution for 'known queries' and allows skipping steps
> > in the query path. But we still have no idea about cached plan
> > validity. What if a user changes the oid of an operator, drops a
> > column, etc.?
>
> That's why the key is only good to find "candidates". The
> caching has to look very close to the nodes in the tree and
> maybe compare down to pg_attribute oid's etc. to decide if
> it's really the same query or not.

OK.

                        Karel
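Here is a minimal sketch of the "specific function" variant: a recursive
walk over the plan tree that emits one token per node, as in Jan's
SORT(JOIN(...)) example. build_query_key() is hypothetical; StringInfo,
nodeTag() and rt_fetch() follow the sources, but a real implementation
would have to cover every node type and include the join quals too.

    #include "postgres.h"
    #include "nodes/plannodes.h"
    #include "parser/parsetree.h"   /* rt_fetch() */
    #include "lib/stringinfo.h"

    /* Append a canonical token for "node" (and its children) to "buf". */
    static void
    build_query_key(Plan *node, List *rtable, StringInfo buf)
    {
        if (node == NULL)
            return;

        switch (nodeTag(node))
        {
            case T_Sort:
                appendStringInfo(buf, "SORT(");
                build_query_key(node->lefttree, rtable, buf);
                appendStringInfo(buf, ")");
                break;

            case T_SeqScan:
                {
                    Scan          *scan = (Scan *) node;
                    RangeTblEntry *rte = rt_fetch(scan->scanrelid, rtable);

                    /* 78991 in the example above is the relation's OID */
                    appendStringInfo(buf, "SCAN(%u)", rte->relid);
                    break;
                }

            default:
                /* every remaining node type needs its own token */
                appendStringInfo(buf, "NODE%d(", (int) nodeTag(node));
                build_query_key(node->lefttree, rtable, buf);
                appendStringInfo(buf, ",");
                build_query_key(node->righttree, rtable, buf);
                appendStringInfo(buf, ")");
                break;
        }
    }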
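For the lock manager route, a sketch of creating the cache table with
ShmemInitHash(), which is how the lock code sets up its shared hash
tables. The entry layout and the names QueryCacheEntry, QUERY_KEY_LEN
and QC_NELEM are invented here, and the HASHCTL fields are spelled as in
the current tree:

    #include "postgres.h"
    #include "storage/shmem.h"
    #include "utils/hsearch.h"

    #define QUERY_KEY_LEN 256   /* canonical key, e.g. SORT(JOIN(...)) */
    #define QC_NELEM      128   /* expected number of cached plans */

    typedef struct QueryCacheEntry
    {
        char  key[QUERY_KEY_LEN];  /* hash key: the querytree signature */
        void *plan;                /* plan tree, stored in shared memory */
    } QueryCacheEntry;

    static HTAB *QueryCache = NULL;

    void
    InitQueryCache(void)
    {
        HASHCTL info;

        MemSet(&info, 0, sizeof(info));
        info.keysize   = QUERY_KEY_LEN;
        info.entrysize = sizeof(QueryCacheEntry);

        /* creates the table in shmem, or attaches if it already exists */
        QueryCache = ShmemInitHash("Query Plan Cache",
                                   QC_NELEM, QC_NELEM,
                                   &info, HASH_ELEM);
    }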
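And the string alternative, leaning on the routines pg_rewrite uses for
rule trees: nodeToString() and stringToNode() are the real functions,
while store_in_cache()/fetch_from_cache() only stand in for whatever
system table or syscache access would hold the text. (Whether
readfuncs.c can read back every plan node type would need checking; the
rule system only stores querytrees.)

    #include "postgres.h"
    #include "nodes/plannodes.h"
    #include "nodes/readfuncs.h"    /* stringToNode() */

    /* placeholders for the system table / syscache access */
    extern void  store_in_cache(const char *key, char *flat);
    extern char *fetch_from_cache(const char *key);

    void
    cache_plan_as_string(const char *key, Plan *plan)
    {
        char *flat = nodeToString(plan);    /* palloc'd text form */

        store_in_cache(key, flat);
    }

    Plan *
    plan_from_cached_string(const char *key)
    {
        char *flat = fetch_from_cache(key);

        if (flat == NULL)
            return NULL;

        /* the re-parse here is the speed question above */
        return (Plan *) stringToNode(flat);
    }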