Thread: Bytea/Large Objects/external files best practices

Bytea/Large Objects/external files best practices

From
David Helgason
Date:
I think those best practices threads are a treat to follow (might even
consider archiving some of them in a sort of best-practices faq), so
here's one more.

In coding an game asset server I want to keep a large number of file
revisions of varying sizes (1Kb-50Mb) inside the database.

Naturally I want to avoid having to allocate whole buffers of 50Mb too
often.

I am also throwing around rsync-style deltas (with librsync), that I
run through functions to generate new objects. The librsync functions
have an elegant buffer oriented approach to reading&writing data, so I
need a way to read through my objects inside the database, again
without having them allocated in RAM too often.

I tried browsing around, but couldn't find any good discussion of this
(or just plain good advice), but of course that may be due to
google-impairment on my part.

I see several options:


1) Bytea column. Seems the cleanest solution, but
    *) it includes lots of escaping-unescaping with associated allocation
of huge buffers.
    *) I seem to remember reading in a discussion in the [hackers] list
that the TOAST column substring operation had been optimized so that
partial fetches might be viable.
    *) Incremental inserts (using a series of "update file set data = data
|| moredata where id=666") probably won't scale nicely.
    *) However it's fully ACID,
    *) and recommended by the docs, if that's a good reason.
    *) TOAST compression. Can't quite figure out if this happens, but in
many cases would store a lot of disk-space & -access.
    *) passing them as arguments to the librsync functions may be a hassle.
    open questions: I wonder what timeframe for this to be viable. It
seems that two things are being worked on that would fix this. The
wire-protocol to avoid having to escape all communications, and methods
for fast incremental update. Both seem to be at least a year away (Pg
7.5, and someone even suggested jumping to 8.0 with the wire-protocol
changes).

2) Bytea column with many rows per file.
    *) Also some escaping-unescaping, but with small buffers
    *) Simple and fast to do partial inserts/selects.
    *) Full ACID (and with a couple of triggers, also easy to keep
completely correct)
    *) Butt ugly solution, but what we used to do with ye olde versions of
MSQL.
    *) Compression, but probably lesser space savings with smaller chunks.
    open questions: What would the chunk-size be? I would imagine 5-20Kb
per row.

3) Large Objects.
    *) Requires special interface to talk to, not that its so bad.
    *) Need a trigger to garbage collect them (simple to do though)
    *) Fully ACID.
    *) I've actually implemented the librsync functions using this
allready, and it seems to work nicely with large files.

4) External files. I really don't like the idea of this.
    *) As high performance as anything.
    *) Need an alternate mechanism to access the server, with security
implications etc.
    *) Complications trying to maintain more-or-less full ACIDity
    open questions: What tricks to use to keep this ACID-ish? I won't have
any updating of content once it's committed, so just keeping the files
read-only most of the time would do most of the trick. Also, I imagine

I'm going with 3) for now, but even tempted by 2) when looking this
over.

At one point I was playing around with 4), and made a plperlu server to
run inside the database to ease this. The security was implemented by
the client receiving a ticket from the database when requesting to
create a file, and then having to send this ticket before being allowed
to send any data to the perl-server. Not an un-clever system, I think,
that I'd be happy to share.

I wonder what other people are doing and if anyone has other arguments.



David Helgason,
Over the Edge Entertainments


Re: Bytea/Large Objects/external files best practices

From
Karsten Hilbert
Date:
> I think those best practices threads are a treat to follow (might even
> consider archiving some of them in a sort of best-practices faq), so
Please do.

> 1) Bytea column. Seems the cleanest solution, but
>     *) I seem to remember reading in a discussion in the [hackers] list
> that the TOAST column substring operation had been optimized so that
> partial fetches might be viable.
Works. Not sure what happens on the wire, though.

Karsten
--
GPG key ID E4071346 @ wwwkeys.pgp.net
E167 67FD A291 2BEA 73BD  4537 78B9 A9F9 E407 1346

Re: Bytea/Large Objects/external files best practices

From
Tom Lane
Date:
David Helgason <david@uti.is> writes:
>     *) I seem to remember reading in a discussion in the [hackers] list
> that the TOAST column substring operation had been optimized so that
> partial fetches might be viable.

I believe that works now for both text and bytea columns.  However the
cost is that you have to store the data uncompressed (by disabling the
normal automatic compression, see ALTER COLUMN SET STORAGE).  Pick
your poison...

>     open questions: I wonder what timeframe for this to be viable. It
> seems that two things are being worked on that would fix this. The
> wire-protocol to avoid having to escape all communications, and methods
> for fast incremental update.

Binary wire protocol is done in 7.4 (at least if you are using libpq;
not sure about the state of binary protocol support in other client
libraries).  Incremental update does remain an unsolved problem.

> 2) Bytea column with many rows per file.
>     open questions: What would the chunk-size be? I would imagine 5-20Kb
> per row.

You'd want to keep it under 2Kb per row to avoid triggering TOASTing of
any single row.  What you're doing here amounts to hand TOASTing, and
you don't want to pay the penalty of another layer of the same mechanism
underneath.  (Alternatively you might be able to set the storage option
so that compression is allowed but out-of-line storage is not, even for
rows up to 8K.  I forget what the options are exactly.)

> 4) External files. I really don't like the idea of this.
>     open questions: What tricks to use to keep this ACID-ish?

AFAICS, you can't guarantee much of anything if you do this.

            regards, tom lane