Thread: Any advice about function caching?

Any advice about function caching?

From
"Mark Cave-Ayland"
Date:
Hi everyone,

I'm currently in the middle of trying to implement a cache for a PostGIS
server-side function that uses an external library with an expensive startup
cost, and was hoping to find some advice on the best way to implement it.

The function currently takes 2 parameters - a customised C geometry datatype
and a destination SRID which is the primary key to the library parameters in
the spatial_ref_sys table. Currently for each invocation of the function it
is necessary to lookup the parameters for both the source and destinations
SRIDs in spatial_ref_sys, create a handle for each SRID in the external
library, perform the calculation, and then return the result.

Obviously this process is quite expensive when calling the function on large
tables of data. I've currently implemented a basic cache for both the
spatial_ref_sys lookups and the external library functions which gives a
dramatic performance improvement, dropping my test query from over 2 minutes
to 6s(!), but the part I'm experiencing difficulty with is working out when
the start and end of a query occurs.

My current code works by using MemoryContextCreate() to create a child
context to MessageContext and using the Init()/Delete() functions to
initialise and destroy a cache in the local backend. However, this doesn't
really work particularly well when using cursors and prepared queries since
it appears what I should be doing is using a cache per portal rather than a
cache per backend. The other complication is that the external library
handles cannot be free()d directly but instead a customised free function
must be called.

So my questions are:

1) What is the best way of keeping track of whether the cache has been
initalised for a given portal? Looking in contrib/dblink.c it looks as if
the best way to go would be to create a hash table per backend based upon
the portal name in ActivePortal, attach the child context to PortalContext,
and use the Delete() function to call the customised free function and
remove the hash table entry. However, which memory context would be the best
one to use in order to store the hash table entries?

2) Is there a better way of doing this?

3) Would anyone consider a patch to the source tree to make implementing
something like this easier in future?


Many thanks,

Mark.

------------------------
WebBased Ltd
17 Research Way
Plymouth
PL6 8BT

T: +44 (0)1752 797131
F: +44 (0)1752 791023

http://www.webbased.co.uk
http://www.infomapper.com
http://www.swtc.co.uk

This email and any attachments are confidential to the intended recipient
and may also be privileged. If you are not the intended recipient please
delete it from your system and notify the sender. You should not copy it or
use it for any purpose nor disclose or distribute its contents to any other
person.




Re: Any advice about function caching?

From
Tom Lane
Date:
"Mark Cave-Ayland" <m.cave-ayland@webbased.co.uk> writes:
> My current code works by using MemoryContextCreate() to create a child
> context to MessageContext and using the Init()/Delete() functions to
> initialise and destroy a cache in the local backend. However, this doesn't
> really work particularly well when using cursors and prepared queries since
> it appears what I should be doing is using a cache per portal rather than a
> cache per backend.

If you want per-query state, keep it in a data structure linked from the
fcinfo->flinfo->fn_extra field (physically, store it in
fcinfo->flinfo->fn_mcxt, or create a subcontext of that if you wish).

If you need to get control at query shutdown to free non-palloc'd
resources, RegisterExprContextCallback may help.  (I think such
callbacks are only called during *successful* query shutdown, though,
so if you have external library state you need to clean up anyway,
you'll need some other approach to keeping track of it ... maybe a
permanent data structure instead of a per-query one.)

src/backend/utils/fmgr/funcapi.c and src/backend/executor/functions.c
might be useful examples.
        regards, tom lane


Re: Any advice about function caching?

From
"Mark Cave-Ayland"
Date:
> -----Original Message-----
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us] 
> Sent: 07 November 2005 23:06
> To: Mark Cave-Ayland (External)
> Cc: pgsql-hackers@postgresql.org
> Subject: Re: [HACKERS] Any advice about function caching?

(cut)

> If you want per-query state, keep it in a data structure 
> linked from the
> fcinfo->flinfo->fn_extra field (physically, store it in fn_mcxt, or 
> fcinfo->flinfo->create a subcontext of that if you wish).
> 
> If you need to get control at query shutdown to free 
> non-palloc'd resources, RegisterExprContextCallback may help. 
>  (I think such callbacks are only called during *successful* 
> query shutdown, though, so if you have external library state 
> you need to clean up anyway, you'll need some other approach 
> to keeping track of it ... maybe a permanent data structure 
> instead of a per-query one.)
> 
> src/backend/utils/fmgr/funcapi.c and 
> src/backend/executor/functions.c might be useful examples.


Hi Tom,

Thanks for the advice about state - this is definitely pointing me towards
looking at the existing code for aggregates and SRFs. Incidentally I've
found that attaching my "cleanup" memory context to PortalContext with some
elogs() shows that it appears to be called correctly just before the portal
is destroyed - so whatever I finally come up with is likely to be a
combination of the two methods. I will dig further into the function  code
and see how I manage.


Many thanks,

Mark.

------------------------
WebBased Ltd
17 Research Way
Plymouth
PL6 8BT

T: +44 (0)1752 797131
F: +44 (0)1752 791023

http://www.webbased.co.uk   
http://www.infomapper.com 
http://www.swtc.co.uk  

This email and any attachments are confidential to the intended recipient
and may also be privileged. If you are not the intended recipient please
delete it from your system and notify the sender. You should not copy it or
use it for any purpose nor disclose or distribute its contents to any other
person.