Any advice about function caching? - Mailing list pgsql-hackers
From: Mark Cave-Ayland
Subject: Any advice about function caching?
Date:
Msg-id: 9EB50F1A91413F4FA63019487FCD251D23931B@WEBBASEDDC.webbasedltd.local
Responses: Re: Any advice about function caching?
List: pgsql-hackers
Hi everyone,

I'm currently in the middle of trying to implement a cache for a PostGIS server-side function that uses an external library with an expensive startup cost, and I was hoping to find some advice on the best way to implement it. The function currently takes 2 parameters: a customised C geometry datatype and a destination SRID, which is the primary key to the library parameters in the spatial_ref_sys table.

Currently, for each invocation of the function it is necessary to look up the parameters for both the source and destination SRIDs in spatial_ref_sys, create a handle for each SRID in the external library, perform the calculation, and then return the result. Obviously this process is quite expensive when calling the function on large tables of data.

I've currently implemented a basic cache for both the spatial_ref_sys lookups and the external library handles, which gives a dramatic performance improvement, dropping my test query from over 2 minutes to 6s(!). The part I'm having difficulty with, however, is working out when the start and end of a query occur. My current code works by using MemoryContextCreate() to create a child context of MessageContext and using the Init()/Delete() functions to initialise and destroy a cache in the local backend. However, this doesn't work particularly well with cursors and prepared queries, since it appears that what I should really be doing is keeping a cache per portal rather than a cache per backend. The other complication is that the external library handles cannot be free()d directly; instead a customised free function must be called.

So my questions are:

1) What is the best way of keeping track of whether the cache has been initialised for a given portal? Looking at contrib/dblink.c, it looks as if the best way to go would be to create a hash table per backend keyed on the portal name in ActivePortal, attach the child context to PortalContext, and use the Delete() function to call the customised free function and remove the hash table entry. However, which memory context would be the best one to use in order to store the hash table entries?

2) Is there a better way of doing this?

3) Would anyone consider a patch to the source tree to make implementing something like this easier in future?

Many thanks,

Mark.

------------------------
WebBased Ltd
17 Research Way
Plymouth
PL6 8BT

T: +44 (0)1752 797131
F: +44 (0)1752 791023

http://www.webbased.co.uk
http://www.infomapper.com
http://www.swtc.co.uk

This email and any attachments are confidential to the intended recipient and may also be privileged. If you are not the intended recipient please delete it from your system and notify the sender. You should not copy it or use it for any purpose nor disclose or distribute its contents to any other person.
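[Editor's note: below is a minimal sketch of the per-portal cache idea raised in question 1 above, not the poster's actual code. It is written against present-day PostgreSQL APIs (a MemoryContextRegisterResetCallback registered on PortalContext, rather than the Init()/Delete() MemoryContextMethods hooks mentioned in the post), and external_handle_create()/external_handle_free() are hypothetical stand-ins for the spatial_ref_sys lookup plus the external library's own init and free routines.]

/*
 * Sketch: per-portal cache of external-library handles, keyed on the
 * active portal's name, with cleanup hooked to the portal's memory
 * context so the library's customised free function gets called.
 */
#include "postgres.h"

#include "tcop/pquery.h"        /* ActivePortal */
#include "utils/hsearch.h"
#include "utils/memutils.h"     /* TopMemoryContext, PortalContext */
#include "utils/portal.h"

/* Hypothetical wrappers around the external library (not a real API). */
extern void *external_handle_create(int srid);
extern void  external_handle_free(void *handle);

typedef struct ProjCacheEntry
{
    char        portal_name[NAMEDATALEN];   /* hash key */
    int         src_srid;
    int         dst_srid;
    void       *src_handle;                 /* handles from the external library */
    void       *dst_handle;
    MemoryContextCallback cb;               /* fires when the portal context goes away */
} ProjCacheEntry;

static HTAB *proj_cache = NULL;             /* one table per backend */

/*
 * Reset callback: runs when the portal's memory context is reset or
 * deleted.  This is where the customised free function is called,
 * since the handles cannot simply be pfree()d.
 */
static void
proj_cache_cleanup(void *arg)
{
    ProjCacheEntry *entry = (ProjCacheEntry *) arg;

    external_handle_free(entry->src_handle);
    external_handle_free(entry->dst_handle);
    (void) hash_search(proj_cache, entry->portal_name, HASH_REMOVE, NULL);
}

/*
 * Return the cache entry for the currently active portal, creating the
 * handles on first use within that portal.
 */
static ProjCacheEntry *
get_proj_cache_entry(int src_srid, int dst_srid)
{
    ProjCacheEntry *entry;
    bool        found;

    Assert(ActivePortal != NULL);

    if (proj_cache == NULL)
    {
        HASHCTL     ctl;

        memset(&ctl, 0, sizeof(ctl));
        ctl.keysize = NAMEDATALEN;
        ctl.entrysize = sizeof(ProjCacheEntry);
        ctl.hcxt = TopMemoryContext;        /* table must outlive any one portal */
        /* HASH_STRINGS is the PG 14+ spelling; older releases hash C-string keys by default */
        proj_cache = hash_create("projection handle cache", 16, &ctl,
                                 HASH_ELEM | HASH_STRINGS | HASH_CONTEXT);
    }

    entry = (ProjCacheEntry *) hash_search(proj_cache, ActivePortal->name,
                                           HASH_ENTER, &found);
    if (!found || entry->src_srid != src_srid || entry->dst_srid != dst_srid)
    {
        if (found)
        {
            /* SRIDs changed within the same portal: rebuild the handles. */
            external_handle_free(entry->src_handle);
            external_handle_free(entry->dst_handle);
        }
        entry->src_srid = src_srid;
        entry->dst_srid = dst_srid;
        entry->src_handle = external_handle_create(src_srid);
        entry->dst_handle = external_handle_create(dst_srid);

        if (!found)
        {
            /* Arrange for cleanup when this portal's context is destroyed. */
            entry->cb.func = proj_cache_cleanup;
            entry->cb.arg = entry;
            MemoryContextRegisterResetCallback(PortalContext, &entry->cb);
        }
    }
    return entry;
}

The hash table itself lives in TopMemoryContext so it survives across queries, while the lifetime of each pair of handles is tied to the owning portal through the reset callback; a production version would presumably also want to key on the SRID pair rather than rebuilding when it changes.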