Thread: hooks for supporting third party blobs?

hooks for supporting third party blobs?

From

Eric Davies

Date:

07 December 2004, 01:17:25

A recent project of ours involved storing/fetching some reasonably large datasets in a home-brew datatype. The datasets tended to range from a few megabytes, to several gigabytes. We were seeing some nonlinear slowness with using native large objects with larger datasets, presumably due to the increasing depth of the btree index used to track all the little pieces of the blobs.

After some careful consideration, we implemented an alternative to large objects: a system that stores files in a particular directory and keeps a reference to each file in the database. It worked and gave us good, consistent performance. However, it doesn't support transactions (no isolation, no rollback). We can probably implement some backend code to support such functionality, but the trick is getting the postgres server to keep our code in the loop (so to speak) about when a rollback should be done, and to what point it should roll back.

Is anyone aware of any hooks to support schemes such as ours, or has solved a similar problem?

Thank you.


Re: hooks for supporting third party blobs?

From

Alvaro Herrera

Date:

07 December 2004, 01:31:14

On Mon, Dec 06, 2004 at 05:11:21PM -0800, Eric Davies wrote:

> Is anyone aware of any hooks to support schemes such as ours, or has solved
> a similar problem?

There are the RegisterXactCallback() and RegisterSubXactCallback()
functions, which may be what you want.  They are called whenever a
transaction or subtransaction starts, commits, or aborts.  You could
probably keep a list of things modified during the transaction, so you
can clean up at transaction end.

(Much like the storage manager does: it only unlinks files for dropped
tables at transaction commit.)

Make sure to react appropriately at subtransaction abort ...
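
As a rough illustration of the "keep a list of things modified" idea, here is a minimal standalone sketch in C. It does not link against the backend: the TxEvent codes, blob_tx_callback() and the note_op() bookkeeping are all hypothetical names made up for this example, and the events are fed in by hand. In a real extension the handler would presumably be installed with RegisterXactCallback() and RegisterSubXactCallback() and would translate the backend's own event codes. The sketch assumes that files created in an aborted (sub)transaction should be removed, and that files marked for deletion are only removed at top-level commit.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical event codes for this sketch; not the backend's enums. */
typedef enum { EV_COMMIT, EV_ABORT, EV_SUB_START, EV_SUB_COMMIT, EV_SUB_ABORT } TxEvent;

typedef enum { OP_CREATED, OP_DELETED } PendingKind;

typedef struct PendingOp {
    char path[256];
    PendingKind kind;
    int nest_level;             /* 1 = top-level transaction */
    struct PendingOp *next;
} PendingOp;

static PendingOp *pending = NULL;   /* ops not yet resolved */
static int cur_level = 1;           /* current nesting depth */
static int removed_count = 0;       /* files this sketch tried to remove */

int blobs_removed(void) { return removed_count; }

int pending_count(void)
{
    int n = 0;
    for (PendingOp *op = pending; op != NULL; op = op->next)
        n++;
    return n;
}

/* Remember that a blob file was created or marked for deletion. */
void note_op(const char *path, PendingKind kind)
{
    assert(path != NULL);
    PendingOp *op = malloc(sizeof(PendingOp));
    if (op == NULL)
        abort();
    strncpy(op->path, path, sizeof(op->path) - 1);
    op->path[sizeof(op->path) - 1] = '\0';
    op->kind = kind;
    op->nest_level = cur_level;
    op->next = pending;
    pending = op;
}

/* Resolve and drop every pending op at nesting level >= level. */
static void resolve_ops(int level, int committed)
{
    PendingOp **link = &pending;
    while (*link != NULL) {
        PendingOp *op = *link;
        if (op->nest_level < level) {
            link = &op->next;
            continue;
        }
        /* Commit carries out deletions; abort undoes creations. */
        if ((committed && op->kind == OP_DELETED) ||
            (!committed && op->kind == OP_CREATED)) {
            (void) remove(op->path);    /* errors ignored in this sketch */
            removed_count++;
        }
        *link = op->next;
        free(op);
    }
}

/* Hypothetical handler; a real one would be called by the backend. */
void blob_tx_callback(TxEvent ev)
{
    switch (ev) {
    case EV_SUB_START:
        cur_level++;
        break;
    case EV_SUB_COMMIT:
        /* The parent inherits the subtransaction's pending ops. */
        for (PendingOp *op = pending; op != NULL; op = op->next)
            if (op->nest_level == cur_level)
                op->nest_level--;
        cur_level--;
        break;
    case EV_SUB_ABORT:
        resolve_ops(cur_level, 0);
        cur_level--;
        break;
    case EV_COMMIT:
        resolve_ops(1, 1);
        cur_level = 1;
        break;
    case EV_ABORT:
        resolve_ops(1, 0);
        cur_level = 1;
        break;
    }
}
```

The nesting level stored with each entry is what makes subtransaction abort manageable: only entries recorded at or below the aborted level are undone, while a subtransaction commit merely hands its entries to the parent, so they can still be undone if the parent aborts later.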

--
Alvaro Herrera (<alvherre[@]dcc.uchile.cl>)
"Si quieres ser creativo, aprende el arte de perder el tiempo"

Re: hooks for supporting third party blobs?

From

Tom Lane

Date:

07 December 2004, 05:01:37

Eric Davies <Eric@barrodale.com> writes:
> A recent project of ours involved storing/fetching some reasonably large
> datasets in a home-brew datatype.  The datasets tended to range from a few
> megabytes, to several gigabytes. We were seeing some nonlinear slowness
> with using native large objects with larger datasets, presumably due to the
> increasing depth of the btree index used to track all the little pieces of
> the blobs.

Did you do any profiling to back up that "presumably"?  It seems at
least as likely to me that this was caused by some easily-fixed
inefficiency somewhere.  There are still a lot of O(N^2) algorithms
in the backend that no one has run up against yet ...

            regards, tom lane