Re: Fwd: [GENERAL] 4B row limit for CLOB tables - Mailing list pgsql-hackers

From Roger Pack
Subject Re: Fwd: [GENERAL] 4B row limit for CLOB tables
Date
Msg-id CAL1QdWdJMz-KBgx=zKgsxKxOh8zQ3qjj5Wd_EzLzyaTvbCEu5A@mail.gmail.com
In response to Fwd: [GENERAL] 4B row limit for CLOB tables  (Roger Pack <rogerdpack2@gmail.com>)
List pgsql-hackers
On 1/30/15, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
> On 1/30/15 11:54 AM, Roger Pack wrote:
>>>> On 1/29/15, Roger Pack <rogerdpack2@gmail.com> wrote:
>>>>> Hello.  I see on this page a mention of basically a 4B row limit for
>>>>> tables that have BLOBs
>>>>
>>>> Oops I meant for BYTEA or TEXT columns, but it's possible the
>>>> reasoning is the same...
>>>
>>> It only applies to large objects, not bytea or text.
>>
>> OK, I think I figured out why the wiki might say this.  I guess
>> BYTEA entries > 2KB will be automatically stored via TOAST, which uses
>> an OID in its backend.  So BYTEA has the same limitation.  It appears
>> that disabling TOAST is not an option [1].
>> So I guess if the number of BYTEA entries (summed over all tables?
>> partitioning doesn't help?) with size > 2KB is > 4 billion then there
>> is actually no option there?  If this occurred it might cause "all
>> sorts of things to break"? [2]
>
> It's a bit more complex than that. First, toast isn't limited to bytea;
> it holds for ALL varlena fields in a table that are allowed to store
> externally. Second, the limit is actually per-table: every table gets
> its own toast table, and each toast table is limited to 4B unique OIDs.
> Third, the OID counter is actually global, but the code should handle
> conflicts by trying to get another OID. See toast_save_datum(), which
> calls GetNewOidWithIndex().
>
> Now, the reality is that GetNewOidWithIndex() is going to keep
> incrementing the global OID counter until it finds an OID that isn't in
> the toast table. That means that if you actually get anywhere close to
> using 4B OIDs you're going to become extremely unhappy with the
> performance of toasting new data.

OK, so "system stability" doesn't degrade per se when the counter wraps;
it is the performance of toasting new data that suffers. Good to know.

So basically, once a TOAST table gets near 4B entries, the counter may
wrap repeatedly, and for each new entry it has to check candidate OIDs
one by one until it finds one that isn't already in use.
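
To make the cost of that probing concrete, here is a rough Python sketch of a wrapping counter that retries until it finds a free value. It is not the actual GetNewOidWithIndex() code: the function names, the tiny 2^16 "OID space", and the assumption that used OIDs are scattered uniformly at random are all hypothetical simplifications for illustration.

```python
import random


def next_free_oid(counter, used, space):
    """Advance a wrapping counter until it lands on an unused value.

    Returns (oid, new_counter, probes), where probes is the number of
    candidate values that had to be checked.
    """
    probes = 0
    while True:
        oid = counter
        counter = (counter + 1) % space  # wrap around at the end of the space
        probes += 1
        if oid not in used:
            return oid, counter, probes
        if probes >= space:
            raise RuntimeError("no free OIDs left")


def average_probes(fill, space=1 << 16, trials=2000, seed=0):
    """Average probes per allocation when a fraction `fill` is already used.

    Assumption: used values are uniformly scattered, and the shared counter
    is at an arbitrary position each time (other tables advance it too).
    """
    rng = random.Random(seed)
    used = set(rng.sample(range(space), int(fill * space)))
    total = 0
    for _ in range(trials):
        start = rng.randrange(space)
        _, _, probes = next_free_oid(start, used, space)
        total += probes
    return total / trials


if __name__ == "__main__":
    for fill in (0.0, 0.5, 0.9, 0.99):
        print(f"fill={fill:.2f}  avg probes={average_probes(fill):.1f}")
```

Under these assumptions the average number of probes grows roughly like 1 / (1 - fill), so it stays small until the table is nearly full and then climbs steeply. In the real system each probe is an index lookup on the TOAST table, not a set membership test.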

So I guess partitioning the table is an acceptable workaround for now,
since each partition is its own table and so gets its own TOAST table.
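
As a back-of-the-envelope check on the partitioning idea, the following Python sketch estimates how full each partition's TOAST table would be. It assumes the worst case where every row has a value large enough to be toasted and rows are spread evenly across partitions; the function names and the 50% target are hypothetical, chosen only for illustration.

```python
import math

OID_SPACE = 2 ** 32  # OIDs are 32-bit, so about 4.29 billion distinct values


def toast_fill(toasted_values, partitions):
    """Fraction of the OID space used per partition's TOAST table,
    assuming values are spread evenly across partitions."""
    return (toasted_values / partitions) / OID_SPACE


def min_partitions(toasted_values, max_fill):
    """Smallest partition count that keeps each TOAST table at or below
    `max_fill` (a fraction between 0 and 1) of the OID space."""
    return math.ceil(toasted_values / (max_fill * OID_SPACE))


if __name__ == "__main__":
    rows = 8_000_000_000  # the 8B-row table, worst case: every row toasted
    for parts in (1, 2, 4, 16):
        print(parts, round(toast_fill(rows, parts), 3))
    print("partitions for <= 50% fill:", min_partitions(rows, 0.5))
```

With these numbers a single table cannot hold 8B toasted values at all (the fill fraction exceeds 1), two partitions would each sit at roughly 93% of the OID space, and four or more keep each TOAST table under half full.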

Thanks much for your response. It's good to know the details before we
dive into Postgres with our 8B-row table with BYTEAs in it :)
