Thread: hooks for supporting third-party blobs?
A recent project of ours involved storing and fetching some reasonably large datasets in a home-brew datatype. The datasets tended to range from a few megabytes to several gigabytes. We were seeing nonlinear slowdowns when using native large objects with the larger datasets, presumably due to the increasing depth of the B-tree index used to track all the little pieces of the blobs.
After some careful consideration, we implemented an alternative to large objects: a system that stores the files in a particular directory and keeps a reference to each file in the database. It worked and gave us good, consistent performance. However, it doesn't support transactions (no isolation, no rollback). We could probably implement some backend code to support that functionality, but the trick is getting the Postgres server to keep our code in the loop (so to speak) about when a rollback should be done (and to what point).
Is anyone aware of any hooks to support schemes such as ours, or has anyone solved a similar problem?
Thank you.
**********************************************
Eric Davies, M.Sc.
Barrodale Computing Services Ltd.
Tel: (250) 472-4372 Fax: (250) 472-4373
Web: http://www.barrodale.com
Email: eric@barrodale.com
**********************************************
Mailing Address:
P.O. Box 3075 STN CSC
Victoria BC Canada V8W 3W2
Shipping Address:
Hut R, McKenzie Avenue
University of Victoria
Victoria BC Canada V8W 3W2
**********************************************
On Mon, Dec 06, 2004 at 05:11:21PM -0800, Eric Davies wrote:

> Is anyone aware of any hooks to support schemes such as ours, or has
> solved a similar problem?

There are RegisterXactCallback() and RegisterSubXactCallback() functions that may be what you want. They are called whenever a transaction or subtransaction starts, commits, or aborts. You could probably keep a list of things modified during the transaction, so you can clean up at transaction end. (Much like the storage manager does: it only unlinks files for dropped tables at transaction commit.) Make sure to react appropriately at subtransaction abort ...

--
Alvaro Herrera (<alvherre[@]dcc.uchile.cl>)
"Si quieres ser creativo, aprende el arte de perder el tiempo"
("If you want to be creative, learn the art of wasting time")
Eric Davies <Eric@barrodale.com> writes:

> A recent project of ours involved storing/fetching some reasonably large
> datasets in a home-brew datatype. The datasets tended to range from a few
> megabytes, to several gigabytes. We were seeing some nonlinear slowness
> with using native large objects with larger datasets, presumably due to the
> increasing depth of the btree index used to track all the little pieces of
> the blobs.

Did you do any profiling to back up that "presumably"? It seems at least as likely to me that this was caused by some easily-fixed inefficiency somewhere. There are still a lot of O(N^2) algorithms in the backend that no one has run up against yet ...

regards, tom lane