Thread: Large objects oids

From: David Wall
Since large objects use OIDs, does PG 8.3 have a limit of 4 billion
large objects across all of my various tables (actually, I presume OIDs
are used elsewhere besides just large objects)?

Is there any plan on allowing large objects to support more than 2GB?
As data gets larger and larger, I can see this being a problem if you'd
like the ACID properties of a DB to work with video and other large data
sets.
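(For context: the 2 GB ceiling traces to the large-object client interface, whose read/seek offsets are signed 32-bit integers, so a single object tops out at:)

```python
# PG 8.3's lo_lseek()/lo_tell() take a signed 32-bit byte offset, so one
# large object can hold at most 2^31 - 1 bytes (just under 2 GiB).
MAX_LO_BYTES = 2**31 - 1
print(MAX_LO_BYTES)  # 2147483647
```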

Since all large objects are stored in the pg_catalog.pg_largeobject
table, will running 'vacuum full' on it reduce the FSM issues I often
see with errors like:

WARNING:  relation "pg_catalog.pg_largeobject" contains more than
"max_fsm_pages" pages with useful free space
HINT:  Consider using VACUUM FULL on this relation or increasing the
configuration parameter "max_fsm_pages".
NOTICE:  number of page slots needed (1045968) exceeds max_fsm_pages
(300000)
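(Per the HINT, one workaround on 8.x, with illustrative values only, is to raise max_fsm_pages above the slot count the NOTICE reports and restart, after a one-time VACUUM FULL to compact the already-bloated table:)

```
# postgresql.conf -- 8.x only; the manual FSM parameters were removed in 8.4
max_fsm_pages = 1200000    # above the 1045968 page slots reported above
```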

Thanks,
David


Re: Large objects oids

From: Tom Lane
David Wall <d.wall@computer.org> writes:
> Since large objects use OIDs, does PG 8.3 have a limit of 4 billion
> large objects across all of my various tables

Yup, and in practice you'd better have a lot less than that or assigning
a new OID might take a long time.
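(A toy sketch of the failure mode, in hypothetical Python greatly simplified from PostgreSQL's real OID-assignment loop: once the 32-bit counter wraps, each candidate OID must be probed for uniqueness, so a long run of in-use OIDs sitting exactly at the counter position costs one probe per OID in the run:)

```python
# Sketch: OID assignment after wraparound (simplified; the real server
# probes a unique index once per candidate OID until it finds a free one).
OID_MAX = 2**32  # OIDs are unsigned 32-bit integers

def assign_oid(counter, in_use):
    """Advance the counter until an unused OID turns up.
    Returns (new_oid, new_counter, probes)."""
    probes = 0
    while True:
        candidate = counter % OID_MAX
        counter += 1
        probes += 1
        if candidate not in in_use:
            in_use.add(candidate)
            return candidate, counter, probes

# A long run of consecutive in-use OIDs right at the counter position
# forces one probe per OID in the run before assignment succeeds:
used = set(range(1000, 501000))          # 500,000 consecutive OIDs in use
oid, _, probes = assign_oid(1000, used)
print(oid, probes)  # 501000 500001 -- half a million probes for one OID
```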

> (actually, I presume OIDs
> are used elsewhere besides just large objects)?

They are, but this isn't relevant to large objects.  The uniqueness
requirement is only per-catalog.

> Is there any plan on allowing large objects to support more than 2GB?

No, it's not on the radar screen really.

            regards, tom lane

Re: Large objects oids

From: David Wall

Tom Lane wrote:
> David Wall <d.wall@computer.org> writes:
>> Since large objects use OIDs, does PG 8.3 have a limit of 4 billion
>> large objects across all of my various tables
> Yup, and in practice you'd better have a lot less than that or assigning
> a new OID might take a long time.

What's a rough estimate of "a lot less"?  Are we talking 2 billion, 3 billion, 1 billion?


>> (actually, I presume OIDs
>> are used elsewhere besides just large objects)?
> They are, but this isn't relevant to large objects.  The uniqueness
> requirement is only per-catalog.

Isn't there just one catalog per postmaster instance (pg_catalog)?  The issue we have is that one postmaster runs a large number of databases (say 100, for easy calculation), so even with the maximum 4 billion potential OIDs, each database would be limited to about 40 million.

Part of this is architectural for us.  We do heavy encryption/compression of data (in particular, digitally signed XML text) and use large objects to store it, but we may need to switch to bytea, since bytea values don't use up OIDs, the data tends not to be large (perhaps 10KB of compressed, encrypted binary), and each value can be written in one block.  All that character escaping of binary data does make the JDBC-to-postmaster interface a bit ugly, though.
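(That escaping overhead is visible in the pre-9.0 bytea "escape" input format, roughly sketched below; simplified, since quoting inside SQL literals adds yet another layer of backslash doubling:)

```python
# Sketch of PostgreSQL's (pre-9.0) bytea "escape" text format: printable
# ASCII passes through, backslash doubles, and every other byte becomes
# a three-digit octal escape -- up to a 4x size blow-up for binary data.
def bytea_escape(data: bytes) -> str:
    out = []
    for b in data:
        if b == 0x5C:                 # backslash -> \\
            out.append("\\\\")
        elif 0x20 <= b <= 0x7E:       # printable ASCII passes through
            out.append(chr(b))
        else:                         # everything else -> \ooo (octal)
            out.append("\\%03o" % b)
    return "".join(out)

print(bytea_escape(b"\x00ab\\\xff"))  # \000ab\\\377
```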


>> Is there any plan on allowing large objects to support more than 2GB?
> No, it's not on the radar screen really.

Too bad, but we can always work around it, even if that means a layer that bundles large objects much as large objects bundle bytea rows.  We prefer not to store data outside the database, since external files can get out of sync with it (losing the ACID properties) and of course must be backed up separately from the database backups and WAL copying used for replication.

David

Re: Large objects oids

From: Tom Lane
David Wall <d.wall@computer.org> writes:
> Tom Lane wrote:
>> Yup, and in practice you'd better have a lot less than that or assigning
>> a new OID might take a long time.

> What's a rough estimate of "a lot less"?  Are we talking 2 billion, 3
> billion, 1 billion?

It's difficult to say --- the assignment problem occurs if there's a
very long run of consecutive OIDs in use in the table beginning right
at the spot where the OID counter currently is (having wrapped around
one or more times since those objects were created).  So the question is
what is the probability of that.  The fact that the same OID counter
feeds everything (system objects, TOAST objects, and LOs) means that any
given catalog probably hasn't got tremendously long runs, but it could
happen.

Personally I'd feel uncomfortable designing a system that expected to
have more than a few million LOs per database, but that's just a guess.

>> They are, but this isn't relevant to large objects.  The uniqueness
>> requirement is only per-catalog.
>>
> Isn't there just one catalog per postmaster instance (pg_catalog)?

Sorry, by "catalog" I meant "any table with a unique OID column".
For purposes of this discussion you could consider each database's
pg_largeobject table to be one such catalog.

> The
> issue we have is that one postmaster runs a large number of databases
> (let's say 100 for easy calculations), so even with the max 4 billion
> potential OIDs, that would drop each DB to 40 million each.

No, because the OIDs needn't be unique across databases.  The situation
you describe would actually be best-case as long as the LO creation
activity is well-intermixed across the databases, because that would
minimize the probability of a long run of consecutive OIDs being
assigned to LOs within any one database.  If it was really well-mixed
you could probably approach a billion LOs per DB without noticing any
problems.
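(Tom's point about well-mixed creation can be illustrated with a toy simulation, in hypothetical Python rather than anything from PostgreSQL itself: a single global counter handing OIDs to 100 databases at random never leaves a long consecutive run inside any one database:)

```python
# Sketch: one global OID counter feeding large objects in many databases.
# When creation is interleaved, consecutive OIDs rarely land in the same
# database, so no single pg_largeobject accumulates a long consecutive run.
import random

random.seed(42)
N_DBS = 100
per_db = {d: [] for d in range(N_DBS)}
oid_counter = 0
for _ in range(100_000):
    db = random.randrange(N_DBS)   # creation well-mixed across databases
    per_db[db].append(oid_counter)
    oid_counter += 1

def longest_run(oids):
    """Length of the longest run of consecutive OIDs in a sorted list."""
    best = run = 1
    for a, b in zip(oids, oids[1:]):
        run = run + 1 if b == a + 1 else 1
        best = max(best, run)
    return best

worst = max(longest_run(v) for v in per_db.values() if v)
print(worst)  # with 100 databases, runs within any one DB stay tiny
```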

            regards, tom lane