Home > mailing lists

Any advice about function caching? - Mailing list pgsql-hackers

From	Mark Cave-Ayland
Subject	Any advice about function caching?
Date	November 7, 2005 18:52:05
Msg-id	9EB50F1A91413F4FA63019487FCD251D23931B@WEBBASEDDC.webbasedltd.local Whole thread Raw
Responses	Re: Any advice about function caching?
List	pgsql-hackers

Tree view

Hi everyone,

I'm currently in the middle of trying to implement a cache for a PostGIS
server-side function that uses an external library with an expensive startup
cost, and was hoping to find some advice on the best way to implement it.

The function currently takes 2 parameters - a customised C geometry datatype
and a destination SRID which is the primary key to the library parameters in
the spatial_ref_sys table. Currently for each invocation of the function it
is necessary to lookup the parameters for both the source and destinations
SRIDs in spatial_ref_sys, create a handle for each SRID in the external
library, perform the calculation, and then return the result.

Obviously this process is quite expensive when calling the function on large
tables of data. I've currently implemented a basic cache for both the
spatial_ref_sys lookups and the external library functions which gives a
dramatic performance improvement, dropping my test query from over 2 minutes
to 6s(!), but the part I'm experiencing difficulty with is working out when
the start and end of a query occurs.

My current code works by using MemoryContextCreate() to create a child
context to MessageContext and using the Init()/Delete() functions to
initialise and destroy a cache in the local backend. However, this doesn't
really work particularly well when using cursors and prepared queries since
it appears what I should be doing is using a cache per portal rather than a
cache per backend. The other complication is that the external library
handles cannot be free()d directly but instead a customised free function
must be called.

So my questions are:

1) What is the best way of keeping track of whether the cache has been
initalised for a given portal? Looking in contrib/dblink.c it looks as if
the best way to go would be to create a hash table per backend based upon
the portal name in ActivePortal, attach the child context to PortalContext,
and use the Delete() function to call the customised free function and
remove the hash table entry. However, which memory context would be the best
one to use in order to store the hash table entries?

2) Is there a better way of doing this?

3) Would anyone consider a patch to the source tree to make implementing
something like this easier in future?


Many thanks,

Mark.

------------------------
WebBased Ltd
17 Research Way
Plymouth
PL6 8BT

T: +44 (0)1752 797131
F: +44 (0)1752 791023

http://www.webbased.co.uk
http://www.infomapper.com
http://www.swtc.co.uk

This email and any attachments are confidential to the intended recipient
and may also be privileged. If you are not the intended recipient please
delete it from your system and notify the sender. You should not copy it or
use it for any purpose nor disclose or distribute its contents to any other
person.

pgsql-hackers by date:

From: Greg Stark
Date: 07 November 2005, 18:26:21
Subject: Re: Interval aggregate regression failure (expected seems

From: "Jim C. Nasby"
Date: 07 November 2005, 18:58:17
Subject: Re: Crash during elog.c...

Any advice about function caching? - Mailing list pgsql-hackers

Previous

Next