Re: use BLOBS or use pointers to files on the filesystem? - Mailing list pgsql-general

From Jason Earl
Subject Re: use BLOBS or use pointers to files on the filesystem?
Date
Msg-id 87g06jc6ta.fsf@npa01zz001.simplot.com
Whole thread Raw
In response to use BLOBS or use pointers to files on the filesystem?  ("Travis Bear" <ec9098@no.spam.go.com>)
Responses Re: use BLOBS or use pointers to files on the filesystem?
List pgsql-general


"Travis Bear" <ec9098@www2.us.postgresql.org> writes:

> Hi
>
> I'm kind of a database newbie in general, so be nice to me if this
> is a bonehead question.  I am building a java servlet based web
> application to manage photos and images.  My first thougth has been
> to store the images as BLOBs in the database, but I read one user
> say that you're better off not using BLOBs.  Instead, he says, store
> your large data objects as files on the filesystem, and just use the
> database to store the path to the file and any file-related meta
> data.

Like most programming questions this involves a series of tradeoffs.
It used to be that PostgreSQL had an 8K limit per row (32K if you
editted the source and compiled your own).  PostgreSQL has a large
object interface, but it is even clunkier to use than the filesystem
and almost certainly slower.

Thanks to TOAST the 8K limit is now essentially gone.  You can store
fairly large files in PostgreSQL quite easily (their still is a hard
limit, but it is much much larger).  However, the BYTEA type still
needs some work (some characters must be escaped before they can be
inserted), and there is a significant overhead involved (large fields
are compressed and then split across several tuples).

Filesystems are *really* good at sharing files.  PostgreSQL, being
filesystem based itself, isn't likely to ever really be competitive
with a plain filesystem.

> I wasn't sure what to think of this, so I was hoping to get some
> other ideas and opinions.  I expect to accumulate 1-2 gigs of new
> data per year, max, with average image size being 700k.

It depends on what you plan to do with these images.  For example, I
am actually storing images (about that size too) directly in
PostgreSQL.  In fact, I have even bypassed the BYTEA type and I am
simply base64 encoding the images and inserting them into a big fat
text field.  However, I am doing this because these images should get
fetched quite rarely, and when one of them does get fetched it is very
likely that it will get fetched again in short order.  So I can easily
afford to base64 encode/decode.  PostgreSQL's transactions are useful
to me as well in that it is easier to insure that file changes go as
planned.  If an error occurs I simply roll back, and all is well.  If
something goes wrong while writing to a filesystem based image I am
left with a corrupted image.

Most importantly, however, I don't have to change my existing
PostgreSQL backup scripts :).

> Also, if I've missed any obvious documentation on this issue please
> feel free to send it my way.

Take a look at the large object material and the TOAST documentation
as they represent the two alternatives that you have when saving
binary (or merely large) fields in Postgresql.

> Thanks!
>
>
> Travis

I hope this was helpful

Jason

pgsql-general by date:

Previous
From: "Campano, Troy"
Date:
Subject: Re: Remote Access to pgsql DB ???
Next
From: "Robert B. Easter"
Date:
Subject: Re: problem with select after updating