Re: caching query results - Mailing list pgsql-hackers

From Karel Zak
Subject Re: caching query results
Date
Msg-id Pine.LNX.3.96.1000405122918.31506A-100000@ara.zf.jcu.cz
Whole thread Raw
In response to Re: caching query results  (wieck@debis.com (Jan Wieck))
List pgsql-hackers
On Tue, 4 Apr 2000, Jan Wieck wrote:

>     Right.  Imagine  a querytree (after overhaul) that looks like
>     this:
> 
>                        +------+
>                        | SORT |
>                        +------+
>                           ^
>                           |
>            +-----------------------------+
>            | JOIN                        |
>            | atts: rel1.att1, rel2.att2  |
>            | qual: rel1.att2 = rel2.att1 |
>            +-----------------------------+
>                 ^                     ^
>                 |                     |
>       +------------------+  +------------------+
>       | SCAN             |  | SCAN             |
>       | rel:  rel1       |  | rel:  rel2       |
>       | atts: att1, att2 |  | atts: att1, att2 |
>       +------------------+  +------------------+
> 
>     which is a node structure describing a query of:
> 
>     SELECT rel1.att1, rel2.att2 FROM rel1, rel2
>         WHERE rel1.att2 = rel2.att1;
> 
>     The "key" identifying this querytree now could look like
> 
>     SORT(JOIN(1.1,2.2;SCAN(78991;1,2),SCAN(78995;1,2);))
The nice picture. Thanks, I undestend you now. A question is where create 
this key - create a specific function that look at to querytree and return 
key or calculate it during statement transformation (analyze.c ..etc.). 
Or is any other idea?

> > >     These keys could be managed in a shared LRU table, and if the
> >
> > My current code is based on HASH table with keys and query&plan is
> > saved in special for a plan created MemoryContext (it is good for
> > a example SPI_freeplan()).

I thought about it, and what for SPI and PREPARE/EXECUTE query cache 
use shared memory too? I'm vote for one query cache in postgresql. IMHO
is not good create a specific cache for SPI_saveplan()+PREPARE and second
for your suggested query cache.

If plans saved via SPI (under defined key - 'by_key' interface) will shared
under all backends a lot of features will faster (FK, PLangs ..etc) and
shared plans cached via PREPARE will persistent across more connetions.
(Some web developers will happy :-) 

But I not sure with this...
>     IIRC our hash table code insists on using global, per backend
>     memory.   I thought about managing the entire querycache with
>     a new type of memory context, using  different  routines  for
>     palloc()/pfree(),  working  in  a shared memory area only and
>     eventually freeing  longest  unused  plans  until  allocation
>     fits.  Let's  see  if using hash tables here would be easy or
>     not.

I look at the current shmem routines - create specific space and hash 
table for a query cache is not a problem, hash routines are prepared 
for usage under shmem. The current lock management code is very simular.   
With hash is not a problem here. 

A problem is how store (copy) query & plan tree to this (shared) memory.
The current copyObject() is based on palloc()/pfree() and as you said
we haven't memory management routines (like palloc()) that working in 
shmem. 

Would be nice have MemoryContext routines for shmem - example 
CreateGlobalMemory_in_shmem() and palloc() that knows work with this
specific context. It is a dream?

A solution is convert query & plan tree to string (like pg_rewrite (views))
and save to cache this string, (and what a speed during (vice versa) parsing?). 
IMHO for this solution we not need a hash table, we can use a standard system 
table and a syscache.   

But more nice is variant with non-string and full plan-tree-structs in a 
shmem.

> > Good. It is solution for 'known-query' and allow it skip any steps in the
> > query path. But we still not have any idea for cached plans validity. What
> > if user changes oid for any operator, drop column (etc)?
> 
>     That's  why  the  key  is only good to find "candidates". The
>     cacheing has to look very close to the nodes in the tree  and
>     maybe  compare  down  to pg_attribute oid's etc. to decide if
>     it's really the same query or not.

OK.
                    Karel



pgsql-hackers by date:

Previous
From: "Hiroshi Inoue"
Date:
Subject: RE: postgres crash on CURSORS
Next
From: wieck@debis.com (Jan Wieck)
Date:
Subject: Re: 7.0 FK trigger question