Thread: 4B row limit for CLOB tables

4B row limit for CLOB tables

From: Roger Pack
Hello.  I see on this page a mention of basically a 4B row limit for
tables that have BLOB's
https://wiki.postgresql.org/wiki/BinaryFilesInDB

Is this fact mentioned in the documentation anywhere? Is there an
official source for this? (If not, maybe consider this a feature
request to mention it in the documentation on BLOB).
Cheers and thanks.
-roger-


Re: 4B row limit for CLOB tables

From: Roger Pack
On 1/29/15, Roger Pack <rogerdpack2@gmail.com> wrote:
> Hello.  I see on this page a mention of basically a 4B row limit for
> tables that have BLOB's

Oops I meant for BYTEA or TEXT columns, but it's possible the
reasoning is the same...

> https://wiki.postgresql.org/wiki/BinaryFilesInDB
>
> Is this fact mentioned in the documentation anywhere? Is there an
> official source for this? (If not, maybe consider this a feature
> request to mention it in the documentation on BLOB).
> Cheers and thanks.
> -roger-
>


Re: 4B row limit for CLOB tables

From: Adrian Klaver
On 01/29/2015 09:51 AM, Roger Pack wrote:
> Hello.  I see on this page a mention of basically a 4B row limit for
> tables that have BLOB's
> https://wiki.postgresql.org/wiki/BinaryFilesInDB
>
> Is this fact mentioned in the documentation anywhere? Is there an
> official source for this? (If not, maybe consider this a feature
> request to mention it in the documentation on BLOB).

Take a look at:

http://www.postgresql.org/about/

Pretty sure it has to do with this:

Maximum Table Size    32 TB
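
That 32 TB figure falls out of the on-disk addressing: block numbers are 32-bit and the default page size is 8 kB. A quick sanity check in plain Python (assuming the default BLCKSZ of 8192; nothing here is PostgreSQL code):

```python
# Max table size = 32-bit block numbers x the default 8 kB page size.
BLCKSZ = 8192          # default PostgreSQL page size, in bytes
MAX_BLOCKS = 2 ** 32   # block numbers are 32-bit unsigned integers

max_table_bytes = BLCKSZ * MAX_BLOCKS
print(max_table_bytes // 2 ** 40)   # 32 (TiB)
```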

> Cheers and thanks.
> -roger-
>
>


--
Adrian Klaver
adrian.klaver@aklaver.com


Re: 4B row limit for CLOB tables

From: Steve Atkins
On Jan 29, 2015, at 9:53 AM, Roger Pack <rogerdpack2@gmail.com> wrote:

> On 1/29/15, Roger Pack <rogerdpack2@gmail.com> wrote:
>> Hello.  I see on this page a mention of basically a 4B row limit for
>> tables that have BLOB's
>
> Oops I meant for BYTEA or TEXT columns, but it's possible the
> reasoning is the same...

It only applies to large objects, not bytea or text.

>> https://wiki.postgresql.org/wiki/BinaryFilesInDB

Some of that looks incorrect or out of date. (e.g. large objects can be a lot
bigger than 2GB in 9.3+).


>>
>> Is this fact mentioned in the documentation anywhere? Is there an
>> official source for this? (If not, maybe consider this a feature
>> request to mention it in the documentation on BLOB).
>> Cheers and thanks.
>> -roger

I'm not sure whether it's mentioned explicitly, but large objects are
referenced by an OID, which is a 32 bit value (and a global resource).

If you had 4B BLOBs, though, running out of OIDs would probably be
the least of your worries.

Cheers,
  Steve


Re: 4B row limit for CLOB tables

From: Bill Moran
On Thu, 29 Jan 2015 10:41:58 -0800
Steve Atkins <steve@blighty.com> wrote:
>
> >> Is this fact mentioned in the documentation anywhere? Is there an
> >> official source for this? (If not, maybe consider this a feature
> >> request to mention it in the documentation on BLOB).
> >> Cheers and thanks.
> >> -roger
>
> I'm not sure whether it's mentioned explicitly, but large objects are
> referenced by an OID, which is a 32 bit value (and a global resource).
>
> If you had 4B BLOBs, though, running out of OIDs would probably be
> the least of your worries.

Because of how other RDBMSes use BLOB-ish types, I think a lot of
people get confused about when to use bytea vs. a large object in
PostgreSQL ... and as a result, end up using large objects more often
than is really necessary.

Large objects are for LARGE data ... keep in mind that a bytea column
can store up to 1GB of data. While that seems excessive, it's perfectly
reasonable to use it to store images and other data that's frequently
in the "several megabytes" range. In general, if you can transfer the
entirety of the data in a single shot, then bytea will work fine for
you.

Large objects are for something more like streaming media, where it's
impractical to store the entire file in memory, even for a short time.
I.e. you'd read ~100k from the DB into application memory, do
processing on that data, then discard it and read another 100k.  While
large objects certainly fill a niche, and for some uses are the only
way to make things work, I have never actually seen an implementation
where large objects were the right solution to the problem (though, of
course, I've never worked in the movie or music industry). I've found
that bytea is usually the correct storage method for things like PDF
files, word-processor files, images, etc. ... anything where the entire
file needs to be delivered before it can be used.
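
That rule of thumb could be sketched as a small helper; the threshold below is an illustrative assumption, not a server constant:

```python
# Hypothetical helper encoding the advice above: if the whole value can
# be moved in a single shot, bytea is fine; otherwise stream it as a
# large object. The 50 MB cutoff is an assumption for illustration only.
SINGLE_SHOT_LIMIT = 50 * 1024 * 1024

def suggest_storage(payload_size: int) -> str:
    """Suggest a storage strategy for a binary payload of the given size."""
    if payload_size <= SINGLE_SHOT_LIMIT:
        return "bytea"          # read/written whole, in one round trip
    return "large object"       # needs lo_read()-style streaming access

print(suggest_storage(3 * 1024 * 1024))   # bytea
print(suggest_storage(2 * 1024 ** 3))     # large object
```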

--
Bill Moran


Re: 4B row limit for CLOB tables

From: Roger Pack
Forgot to reply-all on this one; many thanks to Steve, Adrian, and Bill
for their answers.

On Jan 29, 2015, at 12:32 PM, Roger Pack <rogerdpack2@gmail.com> wrote:

> On 1/29/15, Steve Atkins <steve@blighty.com> wrote:
>>
>> On Jan 29, 2015, at 9:53 AM, Roger Pack <rogerdpack2@gmail.com> wrote:
>>
>>> On 1/29/15, Roger Pack <rogerdpack2@gmail.com> wrote:
>>>> Hello.  I see on this page a mention of basically a 4B row limit for
>>>> tables that have BLOB's
>>>
>>> Oops I meant for BYTEA or TEXT columns, but it's possible the
>>> reasoning is the same...
>>
>> It only applies to large objects, not bytea or text.
>>
>>>> https://wiki.postgresql.org/wiki/BinaryFilesInDB
>>
>> Some of that looks incorrect or out of date. (e.g. large objects can be a
>> lot
>> bigger than 2GB in 9.3+).
>>
>>
>>>>
>>>> Is this fact mentioned in the documentation anywhere? Is there an
>>>> official source for this? (If not, maybe consider this a feature
>>>> request to mention it in the documentation on BLOB).
>>>> Cheers and thanks.
>>>> -roger
>>
>> I'm not sure whether it's mentioned explicitly, but large objects are
>> referenced by an OID, which is a 32 bit value (and a global resource).
>
> Thanks for the info, precisely what I was looking for.
> As a follow-up, could you elaborate on what you mean by "global
> resource"? I believe OID's are generated from a global counter.  Does
> this mean the maximum number of large objects in the database is 4B?

Well, OIDs are generated from a global counter, but there's nothing to
stop that wrapping around.

OIDs are used in a lot of places in the system, to identify tables,
functions, loaded modules, and suchlike, and duplicates are prevented
by unique indexes or similar. But that means that if the OID counter
were to wrap around and return an OID that was already in use in the
same context, the attempt to use it would fail. For instance, you
might try to create a table, and it would fail because the "next" OID
was already used to specify another table.

Wrapping the OID counter around will cause all sorts of things to break.

Use of OIDs by the user (as opposed to by PostgreSQL itself for internal
bookkeeping) has been deprecated for years.

That's one reason, but not the only reason, that I don't believe anyone
should ever use the PostgreSQL large object infrastructure. For small
(megabytes rather than gigabytes) chunks of data that might be processed
in the database or might not, bytea or text types are the right thing. For
anything larger, or anything that's not actually processed within the database
(e.g. images or PDFs handled by a webapp) then leaving the file on the
filesystem and just storing metadata in the database is usually the right
thing.
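
A minimal sketch of that pattern, with sqlite3 standing in for PostgreSQL purely so the example is self-contained and runnable; the schema and the `register_file` helper are hypothetical:

```python
import hashlib
import os
import sqlite3
import tempfile

# Sketch: keep the file on the filesystem, store only metadata
# (path, size, content hash) in the database. sqlite3 is a stand-in
# for PostgreSQL here so the example runs anywhere.
def register_file(conn, path):
    with open(path, "rb") as f:
        digest = hashlib.sha256(f.read()).hexdigest()
    size = os.path.getsize(path)
    conn.execute(
        "INSERT INTO files (path, size, sha256) VALUES (?, ?, ?)",
        (path, size, digest),
    )
    return digest

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE files (path TEXT, size INTEGER, sha256 TEXT)")

with tempfile.NamedTemporaryFile(delete=False, suffix=".pdf") as tmp:
    tmp.write(b"%PDF-1.4 fake content")
    path = tmp.name

register_file(conn, path)
row = conn.execute("SELECT path, size, sha256 FROM files").fetchone()
print(row[1])   # 21, the byte length of the fake content
os.unlink(path)
```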


> If you actually had that many BLOB's (and the counter wrapped) I
> assume that lo_create would start failing [i.e. it has some kind of
> uniqueness constraint on the oid]?  Or something like that?

lo_create() would fail, but so would the rest of the database. Nothing
would work.

A billion large objects is too many. (Exactly a billion too many, in fact).

Cheers,
  Steve


Re: 4B row limit for CLOB tables

From: Roger Pack
On 1/29/15, Steve Atkins <steve@blighty.com> wrote:
>
> On Jan 29, 2015, at 9:53 AM, Roger Pack <rogerdpack2@gmail.com> wrote:
>
>> On 1/29/15, Roger Pack <rogerdpack2@gmail.com> wrote:
>>> Hello.  I see on this page a mention of basically a 4B row limit for
>>> tables that have BLOB's
>>
>> Oops I meant for BYTEA or TEXT columns, but it's possible the
>> reasoning is the same...
>
> It only applies to large objects, not bytea or text.

OK, I think I figured out possibly why the wiki says this. I guess
BYTEA entries > 8KB will be autostored via TOAST, which uses an OID in
its backend. So BYTEA columns have the same limitation. It appears
that disabling TOAST is not an option [1]?
So I guess if the number of BYTEA entries (summed over all tables?
partitioning doesn't help?) with size > 8KB is > 4 billion, then there
is actually no option there? (I'm not running into that case, just
conjecturing.)
Thanks!
-roger-

[1] http://www.postgresql.org/message-id/20130405140348.GC4326@awork2.anarazel.de


Re: 4B row limit for CLOB tables

From: Tatsuo Ishii
> I'm not sure whether it's mentioned explicitly, but large objects are
> referenced by an OID, which is a 32 bit value (and a global resource).

Large objects are not necessarily referenced by a system-generated OID
since 8.1. You can assign arbitrary 32-bit integers as long as they
are unique in the pg_largeobject table.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp