Thread: Any advice about function caching?
Hi everyone, I'm currently in the middle of trying to implement a cache for a PostGIS server-side function that uses an external library with an expensive startup cost, and was hoping to find some advice on the best way to implement it. The function currently takes 2 parameters - a customised C geometry datatype and a destination SRID which is the primary key to the library parameters in the spatial_ref_sys table. Currently for each invocation of the function it is necessary to lookup the parameters for both the source and destinations SRIDs in spatial_ref_sys, create a handle for each SRID in the external library, perform the calculation, and then return the result. Obviously this process is quite expensive when calling the function on large tables of data. I've currently implemented a basic cache for both the spatial_ref_sys lookups and the external library functions which gives a dramatic performance improvement, dropping my test query from over 2 minutes to 6s(!), but the part I'm experiencing difficulty with is working out when the start and end of a query occurs. My current code works by using MemoryContextCreate() to create a child context to MessageContext and using the Init()/Delete() functions to initialise and destroy a cache in the local backend. However, this doesn't really work particularly well when using cursors and prepared queries since it appears what I should be doing is using a cache per portal rather than a cache per backend. The other complication is that the external library handles cannot be free()d directly but instead a customised free function must be called. So my questions are: 1) What is the best way of keeping track of whether the cache has been initalised for a given portal? Looking in contrib/dblink.c it looks as if the best way to go would be to create a hash table per backend based upon the portal name in ActivePortal, attach the child context to PortalContext, and use the Delete() function to call the customised free function and remove the hash table entry. However, which memory context would be the best one to use in order to store the hash table entries? 2) Is there a better way of doing this? 3) Would anyone consider a patch to the source tree to make implementing something like this easier in future? Many thanks, Mark. ------------------------ WebBased Ltd 17 Research Way Plymouth PL6 8BT T: +44 (0)1752 797131 F: +44 (0)1752 791023 http://www.webbased.co.uk http://www.infomapper.com http://www.swtc.co.uk This email and any attachments are confidential to the intended recipient and may also be privileged. If you are not the intended recipient please delete it from your system and notify the sender. You should not copy it or use it for any purpose nor disclose or distribute its contents to any other person.
"Mark Cave-Ayland" <m.cave-ayland@webbased.co.uk> writes: > My current code works by using MemoryContextCreate() to create a child > context to MessageContext and using the Init()/Delete() functions to > initialise and destroy a cache in the local backend. However, this doesn't > really work particularly well when using cursors and prepared queries since > it appears what I should be doing is using a cache per portal rather than a > cache per backend. If you want per-query state, keep it in a data structure linked from the fcinfo->flinfo->fn_extra field (physically, store it in fcinfo->flinfo->fn_mcxt, or create a subcontext of that if you wish). If you need to get control at query shutdown to free non-palloc'd resources, RegisterExprContextCallback may help. (I think such callbacks are only called during *successful* query shutdown, though, so if you have external library state you need to clean up anyway, you'll need some other approach to keeping track of it ... maybe a permanent data structure instead of a per-query one.) src/backend/utils/fmgr/funcapi.c and src/backend/executor/functions.c might be useful examples. regards, tom lane
> -----Original Message----- > From: Tom Lane [mailto:tgl@sss.pgh.pa.us] > Sent: 07 November 2005 23:06 > To: Mark Cave-Ayland (External) > Cc: pgsql-hackers@postgresql.org > Subject: Re: [HACKERS] Any advice about function caching? (cut) > If you want per-query state, keep it in a data structure > linked from the > fcinfo->flinfo->fn_extra field (physically, store it in fn_mcxt, or > fcinfo->flinfo->create a subcontext of that if you wish). > > If you need to get control at query shutdown to free > non-palloc'd resources, RegisterExprContextCallback may help. > (I think such callbacks are only called during *successful* > query shutdown, though, so if you have external library state > you need to clean up anyway, you'll need some other approach > to keeping track of it ... maybe a permanent data structure > instead of a per-query one.) > > src/backend/utils/fmgr/funcapi.c and > src/backend/executor/functions.c might be useful examples. Hi Tom, Thanks for the advice about state - this is definitely pointing me towards looking at the existing code for aggregates and SRFs. Incidentally I've found that attaching my "cleanup" memory context to PortalContext with some elogs() shows that it appears to be called correctly just before the portal is destroyed - so whatever I finally come up with is likely to be a combination of the two methods. I will dig further into the function code and see how I manage. Many thanks, Mark. ------------------------ WebBased Ltd 17 Research Way Plymouth PL6 8BT T: +44 (0)1752 797131 F: +44 (0)1752 791023 http://www.webbased.co.uk http://www.infomapper.com http://www.swtc.co.uk This email and any attachments are confidential to the intended recipient and may also be privileged. If you are not the intended recipient please delete it from your system and notify the sender. You should not copy it or use it for any purpose nor disclose or distribute its contents to any other person.