"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
> Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> We've heard of people with many tens of thousands of
>> tables, and pg_dump speed didn't seem to be a huge bottleneck for
>> them (at least not in recent versions). So I'm feeling we should
>> not dismiss the idea of one TOC entry per blob.
>>
>> Thoughts?
> I suspect that 7 million BLOBs (and growing fast) would be a problem
> for this approach. Of course, if we're atypical, we could stay with
> bytea if this changed. Just a data point.
Do you have the opportunity to try an experiment on hardware similar to
what you're running that on? Create a database with 7 million tables
and see what the dump/restore times are like, and whether
pg_dump/pg_restore appear to be CPU-bound or memory-limited when doing
it. If they aren't, we could conclude that having millions of TOC entries
isn't a problem.
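
Something along these lines would do for the setup step (just a rough
sketch using libpq; the connection string, table names, and the batch
size are arbitrary, and a psql script driven by generate_series would
work just as well):

/*
 * Populate a scratch database with N trivial tables so pg_dump and
 * pg_restore can be timed against a very large TOC.
 *
 * Build:  cc mktables.c -I$(pg_config --includedir) \
 *             -L$(pg_config --libdir) -lpq -o mktables
 * Run:    ./mktables "dbname=toctest" 7000000
 */
#include <stdio.h>
#include <stdlib.h>
#include <libpq-fe.h>

int
main(int argc, char **argv)
{
    const char *conninfo = (argc > 1) ? argv[1] : "dbname=toctest";
    long        ntables = (argc > 2) ? atol(argv[2]) : 7000000L;
    PGconn     *conn = PQconnectdb(conninfo);

    if (PQstatus(conn) != CONNECTION_OK)
    {
        fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
        return 1;
    }

    for (long i = 0; i < ntables; i++)
    {
        char        sql[128];
        PGresult   *res;

        /* batch the DDL into transactions of 1000 to keep this tolerable */
        if (i % 1000 == 0)
            PQclear(PQexec(conn, "BEGIN"));

        snprintf(sql, sizeof(sql), "CREATE TABLE t_%ld (id int)", i);
        res = PQexec(conn, sql);
        if (PQresultStatus(res) != PGRES_COMMAND_OK)
        {
            fprintf(stderr, "CREATE TABLE failed at %ld: %s",
                    i, PQerrorMessage(conn));
            PQclear(res);
            return 1;
        }
        PQclear(res);

        if (i % 1000 == 999)
            PQclear(PQexec(conn, "COMMIT"));
    }
    if (ntables % 1000 != 0)
        PQclear(PQexec(conn, "COMMIT"));

    PQfinish(conn);
    return 0;
}

Then time "pg_dump -Fc" on the result (and a pg_restore of that archive)
while watching top/vmstat, which should make it fairly obvious whether
the processes are CPU-bound, swamped by memory, or just waiting on I/O.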
A compromise we could consider is some sort of sub-TOC-entry scheme that
gets the per-BLOB entries out of the main speed bottlenecks, while still
letting us share most of the logic. For instance, I suspect that the
first bottleneck in pg_dump would be the dependency sorting, but we
don't really need to sort all the blobs individually for that.
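
To illustrate the shape of what I have in mind (these structs are pure
strawmen for this email, not pg_dump's real TOC representation): keep a
single ordinary "BLOBS" TOC entry and hang the individual large objects
off it as sub-entries, so the dependency sort only ever sees one node no
matter how many blobs exist.

/*
 * Toy illustration only -- all names here are invented for this sketch.
 * The point: blobs become sub-entries of one parent TOC entry, so the
 * number of nodes the dependency sort has to handle stays small.
 */
#include <stdio.h>
#include <stdlib.h>

typedef struct BlobSubEntry
{
    unsigned int oid;               /* the individual large object */
    struct BlobSubEntry *next;
} BlobSubEntry;

typedef struct TocEntryStub
{
    const char *desc;               /* "TABLE", "BLOBS", ... */
    BlobSubEntry *subentries;       /* non-NULL only for the BLOBS entry */
    int         nsubentries;
} TocEntryStub;

/* Attach one blob to the shared BLOBS entry instead of giving it a
 * top-level TOC entry of its own. */
static void
add_blob_subentry(TocEntryStub *blobs, unsigned int oid)
{
    BlobSubEntry *se = malloc(sizeof(BlobSubEntry));

    se->oid = oid;
    se->next = blobs->subentries;
    blobs->subentries = se;
    blobs->nsubentries++;
}

int
main(void)
{
    TocEntryStub toc[] = {
        {"TABLE", NULL, 0},
        {"BLOBS", NULL, 0}
    };

    /* only a handful here, but it stays two top-level entries for millions */
    for (unsigned int oid = 100000; oid < 100005; oid++)
        add_blob_subentry(&toc[1], oid);

    printf("top-level TOC entries to sort: %d\n",
           (int) (sizeof(toc) / sizeof(toc[0])));
    printf("blob sub-entries under \"%s\": %d\n",
           toc[1].desc, toc[1].nsubentries);
    return 0;
}

The restore side could walk the sub-entry list inside that one parent
entry, which is how we'd still get to share most of the per-object logic.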
regards, tom lane