Re: Fwd: [GENERAL] 4B row limit for CLOB tables - Mailing list pgsql-hackers

From: Roger Pack
Subject: Re: Fwd: [GENERAL] 4B row limit for CLOB tables
Msg-id: CAL1QdWdd8WC=i8TniGddFqGKZ3gY0Ngn-Cira4A6O7BQNcOD=g@mail.gmail.com
In response to: Fwd: [GENERAL] 4B row limit for CLOB tables (Roger Pack <rogerdpack2@gmail.com>)
Responses: Re: Fwd: [GENERAL] 4B row limit for CLOB tables
List: pgsql-hackers
Oops, forgot to forward to the list (suggestion/feature request to the
admins of the various pg lists: make the default "reply-to" go to the
list, not the sender, if at all possible).

Response below:

On 1/30/15, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
> On 1/30/15 11:54 AM, Roger Pack wrote:
>>>> On 1/29/15, Roger Pack <rogerdpack2@gmail.com> wrote:
>>>>> Hello.  I see on this page a mention of basically a 4B row limit for
>>>>> tables that have BLOBs
>>>>
>>>> Oops I meant for BYTEA or TEXT columns, but it's possible the
>>>> reasoning is the same...
>>>
>>> It only applies to large objects, not bytea or text.
>>
>> OK, I think I figured out why the wiki possibly says this.  I guess
>> BYTEA entries > 2KB will be auto-stored via TOAST, which uses an OID in
>> its backend.  So BYTEA has the same limitation.  It appears that
>> disabling TOAST is not an option [1].
>> So I guess if the number of BYTEA entries (summed across all tables?
>> partitioning doesn't help?) with size > 2KB is > 4 billion, then there
>> is actually no option there?  If this occurred, it might cause "all
>> sorts of things to break"? [2]
>
> It's a bit more complex than that. First, toast isn't limited to bytea;
> it holds for ALL varlena fields in a table that are allowed to store
> externally. Second, the limit is actually per-table: every table gets
> its own toast table, and each toast table is limited to 4B unique OIDs.
> Third, the OID counter is actually global, but the code should handle
> conflicts by trying to get another OID. See toast_save_datum(), which
> calls GetNewOidWithIndex().
>
> Now, the reality is that GetNewOidWithIndex() is going to keep
> incrementing the global OID counter until it finds an OID that isn't in
> the toast table. That means that if you actually get anywhere close to
> using 4B OIDs you're going to become extremely unhappy with the
> performance of toasting new data.

OK so "system stability" doesn't degrade per se when it wraps [since
they all use that GetNewOid method or similar [?] good to know.

So basically when it gets near 4B TOASTed rows it may have to wrap that
counter and search for an "unused" number, and for each candidate number
it queries the TOAST table to see whether it's already used, which
degrades performance.

So I guess partitioning tables is an acceptable workaround for now,
good to know.
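
Related: here's a quick way to keep an eye on how many toast OIDs a given
table is already using (Python/psycopg2 sketch; the db and table names
are made up, and counting distinct chunk_ids on a huge toast table won't
be cheap):

import psycopg2

conn = psycopg2.connect("dbname=mydb")   # made-up connection string
cur = conn.cursor()

# find the toast table that backs the given table
cur.execute("""
    SELECT t.relname
      FROM pg_class c
      JOIN pg_class t ON t.oid = c.reltoastrelid
     WHERE c.relname = %s
""", ("my_bytea_table",))
toast_rel = cur.fetchone()[0]   # assumes the table has a toast table

# each out-of-line datum gets one chunk_id (an OID), possibly many chunks
cur.execute("SELECT count(DISTINCT chunk_id) FROM pg_toast." + toast_rel)
print(cur.fetchone()[0], "toast OIDs in use")

conn.close()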

Thanks much for your response, good to know the details before we dive
into postgres with our 8B-row table with BYTEAs in it :)
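
For anyone curious, the back-of-envelope that's pushing us toward
partitioning (a rough Python sketch; it assumes the worst case of one
toasted datum, and hence one toast OID, per row):

rows = 8_000_000_000             # our table
oids_per_toast_table = 2 ** 32   # ~4.29 billion
toasted_per_row = 1              # worst case: every row's BYTEA is > ~2KB

# ceiling division: partitions needed just to stay under the hard limit
print(-(-rows * toasted_per_row // oids_per_toast_table))   # -> 2

So at least 2 partitions just to fit at all, and realistically a lot more
than that if we want each partition's toast table to stay far away from
the free-OID-search slowdown above.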


