Re: Largeobject Access Controls (r2460) - Mailing list pgsql-hackers

From Kevin Grittner
Subject Re: Largeobject Access Controls (r2460)
Date
Msg-id 4B59BA50020000250002EA66@gw.wicourts.gov
In response to Re: Largeobject Access Controls (r2460)  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Largeobject Access Controls (r2460)  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Now the argument against that is that it won't scale terribly well
> to situations with very large numbers of blobs.  However, I'm not
> convinced that the current approach of cramming them all into one
> TOC entry scales so well either.  If your large objects are
> actually large, there's not going to be an enormous number of
> them.  We've heard of people with many tens of thousands of
> tables, and pg_dump speed didn't seem to be a huge bottleneck for
> them (at least not in recent versions).  So I'm feeling we should
> not dismiss the idea of one TOC entry per blob.
> 
> Thoughts?
We've got a "DocImage" table with about 7 million rows storing PDF
documents in a bytea column, approaching 1 TB of data.  (We don't
want to give up ACID guarantees, replication, etc. by storing them
on the file system with filenames in the database.)  This works
pretty well, except that client software occasionally has a tendency
to run out of RAM.  The interface could arguably be cleaner if we
used BLOBs, but the security issues have precluded that in
PostgreSQL.

I suspect that 7 million BLOBs (and growing fast) would be a problem
for this approach.  Of course, if we're atypical, we could stay with
bytea if this changed.  Just a data point.

-Kevin
cir=> select count(*) from "DocImage";
  count
---------
 6891626
(1 row)

cir=> select pg_size_pretty(pg_total_relation_size('"DocImage"'));
 pg_size_pretty
----------------
 956 GB
(1 row)
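(Aside: the client-RAM problem with bytea comes from fetching each
document in one piece.  A chunked read with substring() on the bytea
column works around it.  This is only a sketch -- the column name
"doc_image" and key "doc_id" are hypothetical, not our actual schema:)

-- Fetch a large bytea in 1 MB slices instead of all at once;
-- the client advances the offset (1, 1048577, ...) until the
-- returned slice is shorter than the requested length.
SELECT substring("doc_image" FROM 1 FOR 1048576)
  FROM "DocImage"
 WHERE "doc_id" = 12345;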


