Thread: When much of a DB is large objects - PG 8.4

When much of a DB is large objects - PG 8.4

From
David Wall
Date:
We have a system in which a large amount of the data is stored as large
objects now in PG 8.4.  The relational aspects of these objects are
stored using traditional tables, but we store a lot of binary data
(files, images) and XML objects that are compressed and then encrypted.
The ACID properties of PG are great for this since we can ensure
multiple updates are committed/rolled-back together.

But, even if I have 30 tables that themselves all have OID types for the
encrypted or binary data, most ends up in the pg_largeobject table,
which makes partitioning and disk access complicated, and there's always
some concern we'll end up maxing out the OIDs in that system table and
what the performance of banging against one table invites.

So my question is there a way to load balance/partition pg_largeobject
across disks or otherwise ensure I'm getting the best performance from
PG?  I know I can use BYTEA (forget its max size off hand, but note that
it's not exactly JDBC friendly because of all the escaping to be done
moving a large byte array).  Would I do well, perhaps to clone the idea
of pg_largeobject for those objects that are say 100KB or less (many
will be I suspect) and store them in a sequence of BYTEA rows in my own
tables as way to allow the storage of these blobs across many tables
instead of just pg_largeobject?  It probably wouldn't be as efficient as
the large object code, but I'm sure it could be made to work.

Thanks for any hints or ideas....

David

Re: When much of a DB is large objects - PG 8.4

From
Merlin Moncure
Date:
On Mon, Oct 19, 2009 at 9:11 PM, David Wall <d.wall@computer.org> wrote:
> We have a system in which a large amount of the data is stored as large
> objects now in PG 8.4.  The relational aspects of these objects are stored
> using traditional tables, but we store a lot of binary data (files, images)
> and XML objects that are compressed and then encrypted.  The ACID properties
> of PG are great for this since we can ensure multiple updates are
> committed/rolled-back together.
>
> But, even if I have 30 tables that themselves all have OID types for the
> encrypted or binary data, most ends up in the pg_largeobject table, which
> makes partitioning and disk access complicated, and there's always some
> concern we'll end up maxing out the OIDs in that system table and what the
> performance of banging against one table invites.
>
> So my question is there a way to load balance/partition pg_largeobject
> across disks or otherwise ensure I'm getting the best performance from PG?
>  I know I can use BYTEA (forget its max size off hand, but note that it's
> not exactly JDBC friendly because of all the escaping to be done moving a
> large byte array).  Would I do well, perhaps to clone the idea of
> pg_largeobject for those objects that are say 100KB or less (many will be I
> suspect) and store them in a sequence of BYTEA rows in my own tables as way
> to allow the storage of these blobs across many tables instead of just
> pg_largeobject?  It probably wouldn't be as efficient as the large object
> code, but I'm sure it could be made to work.

the bytea limit is 1gb (as are all datums in postgres).
pg_largeobject can go up to 2gb, but in either case you'd likely run
into severe performance/scalability issues long before objects began
approaching those size because of memory usage and other issues.  With
100kb objects though, you should be all right.

libpq supports a binary protocol mode which allows you to execute
queries sending bytea without escaping.  (I'm not familiar with the
jdbc driver, but I'd imagine it should support it in some fashion).  l
would start researching there: find out if the jdbc driver supports
binary queries and use them if possible.  If you can't or won't be use
jdbc in this way, your options are to stick with large objects or try
and figure out another way to get data into the database.

merlin

Re: When much of a DB is large objects - PG 8.4

From
David Wall
Date:
> the bytea limit is 1gb (as are all datums in postgres).
> pg_largeobject can go up to 2gb, but in either case you'd likely run
> into severe performance/scalability issues long before objects began
> approaching those size because of memory usage and other issues.  With
> 100kb objects though, you should be all right.
>

Thanks, Merlin.  Yes, I am not worried so much about the size limits of
a given field or row (not yet anyway).

But I am concerned that all my data across all my tables really ends up
in a single pg_largeobject table, which seems like it could be a
bottleneck.  Since it's blobs and generally big, I figure repeated
access to the table doesn't really even benefit from caching that much
like if I had a small table that was being hit a lot.  I am worried
about the overall table size of pg_largeobject as blobs are inserted in
my various tables, but they only get an OID stored, whereas
pg_largeobject gets all of the data across all of those tables.

I am concerned with backups/restores, crash recovery, partitioning
options, etc. if most of my data is stored in the single pg_largeobject
table.  Can it be partitioned?  How many blobs can it hold before it
won't perform well?

And is there any performance gain if I had several pg_largeobject-like
tables that I built myself using bytea types as it does?  I mean, does
PG perform any better if my queries are across a series of tables all
with their own byteas rather than using LOs?

> libpq supports a binary protocol mode which allows you to execute
> queries sending bytea without escaping.  (I'm not familiar with the
> jdbc driver, but I'd imagine it should support it in some fashion).  l
> would start researching there: find out if the jdbc driver supports
> binary queries and use them if possible.  If you can't or won't be use
> jdbc in this way, your options are to stick with large objects or try
> and figure out another way to get data into the database.
>
Thanks.  I'll see what I can learn about bytea escaping in JDBC as I
don't see anything obvious in its JDBC-specific PG docs.  Perhaps it no
longer suffers from each byte being converted into escaped characters,
which really balloons already big enough data for the transfer
protocols, and that would be great.  Of course, it's all moot if there
would not be any real performance gain to be had by having these objects
stored across multiple tables rather than all being in pg_largeobject
(most such data is not deleted, though it is often updated, so vacuumlo
running daily generally isn't a problem for us).

David