Re: Largeobject Access Controls (r2460) - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Largeobject Access Controls (r2460)
Date
Msg-id 7999.1264193724@sss.pgh.pa.us
In response to Re: Largeobject Access Controls (r2460)  ("Kevin Grittner" <Kevin.Grittner@wicourts.gov>)
Responses Re: Largeobject Access Controls (r2460)
List pgsql-hackers
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
> Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> We've heard of people with many tens of thousands of
>> tables, and pg_dump speed didn't seem to be a huge bottleneck for
>> them (at least not in recent versions).  So I'm feeling we should
>> not dismiss the idea of one TOC entry per blob.
>> 
>> Thoughts?

> I suspect that 7 million BLOBs (and growing fast) would be a problem
> for this approach.  Of course, if we're atypical, we could stay with
> bytea if this changed.  Just a data point.

Do you have the opportunity to try an experiment on hardware similar to
what you're running that on?  Create a database with 7 million tables
and see what the dump/restore times are like, and whether
pg_dump/pg_restore appear to be CPU-bound or memory-limited when doing
it.  If they aren't, we could conclude that millions of TOC entries
aren't a problem.
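The setup for that experiment can be sketched in a few lines. This is only an illustrative helper, not anything from pg_dump itself: it writes a SQL file full of trivial CREATE TABLE statements that can be fed to psql, after which pg_dump/pg_restore would be timed and profiled. The file name and per-table schema here are made up; the real test would scale the count up to the millions.

```python
# Hypothetical helper for the experiment described above: generate DDL
# for a large number of tiny tables, load it with psql, then time
# pg_dump/pg_restore against the resulting database.

def write_create_tables(path, n_tables):
    """Write n_tables trivial CREATE TABLE statements to a SQL file."""
    with open(path, "w") as f:
        for i in range(n_tables):
            f.write(f"CREATE TABLE t{i} (id integer);\n")

# Scaled down here; the real run would use millions of tables:
write_create_tables("many_tables.sql", 1000)
```

Loading is then just `psql -f many_tables.sql`, and the interesting numbers come from timing `pg_dump` on the result while watching CPU and memory use.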

A compromise we could consider is some sort of sub-TOC-entry scheme that
gets the per-BLOB entries out of the main speed bottlenecks, while still
letting us share most of the logic.  For instance, I suspect that the
first bottleneck in pg_dump would be the dependency sorting, but we
don't really need to sort all the blobs individually for that.
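The sub-TOC-entry idea can be pictured roughly as follows. This is purely a sketch under assumed names (pg_dump's real TOC lives in C with different structures): collapse the per-blob entries into one placeholder before the dependency sort sees them, and keep the individual blobs aside to be emitted afterwards.

```python
# Illustrative sketch only -- not pg_dump's actual data structures.
# A "TOC entry" here is just a (name, kind) tuple.

def collapse_blobs(entries):
    """Replace individual blob entries with a single group entry so the
    dependency sort only has to handle one BLOBS placeholder; the
    deferred per-blob entries are emitted after sorting."""
    blobs = [e for e in entries if e[1] == "BLOB"]
    others = [e for e in entries if e[1] != "BLOB"]
    if not blobs:
        return others, []
    return others + [("BLOBS", "BLOB GROUP")], blobs

toc = [("t1", "TABLE"), ("lo_1", "BLOB"), ("lo_2", "BLOB"), ("idx1", "INDEX")]
sorted_view, deferred = collapse_blobs(toc)
# sorted_view now holds 3 entries (one BLOBS placeholder); the 2 blob
# entries sit in 'deferred', out of the sort's way.
```

The point of the shape: the dependency sort's cost stays proportional to the number of non-blob objects, while the archive format still records each blob individually.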
        regards, tom lane

