Thread: Fwd: [GENERAL] 4B row limit for CLOB tables

Fwd: [GENERAL] 4B row limit for CLOB tables

From
Roger Pack
Date:
>> On 1/29/15, Roger Pack <rogerdpack2@gmail.com> wrote:
>>> Hello.  I see on this page a mention of basically a 4B row limit for
>>> tables that have BLOB's
>>
>> Oops I meant for BYTEA or TEXT columns, but it's possible the
>> reasoning is the same...
>
> It only applies to large objects, not bytea or text.

OK, I think I figured out why the wiki says this.  BYTEA entries > 2KB
are automatically stored via TOAST, which uses an OID in its backend,
so BYTEA has the same limitation.  It appears that disabling TOAST is
not an option [1].
So if the number of BYTEA entries with size > 2KB (summed across all
tables? partitioning doesn't help?) exceeds 4 billion, then there is
actually no option there?  If this occurred it might cause "all sorts
of things to break"? [2]
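
For the record, a quick way to see which toast table backs a given
table and how many toasted values it currently holds; the table name
and the pg_toast name here are just examples:

  SELECT reltoastrelid::regclass FROM pg_class WHERE relname = 'mytable';
  -- then, substituting the toast table name reported above:
  SELECT count(*) FROM pg_toast.pg_toast_16385 WHERE chunk_seq = 0;

Each row counted there is one OID consumed out of the 4B space.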
Thanks!
-roger-

[1] http://www.postgresql.org/message-id/20130405140348.GC4326@awork2.anarazel.de
[2] http://www.postgresql.org/message-id/CAL1QdWfb-p5kE9DT2pMqBxohaKG=vxmDREmSBjc+7TkbOeKBBw@mail.gmail.com



Re: Fwd: [GENERAL] 4B row limit for CLOB tables

From
Jim Nasby
Date:
On 1/30/15 11:54 AM, Roger Pack wrote:
>>> On 1/29/15, Roger Pack <rogerdpack2@gmail.com> wrote:
>>>> Hello.  I see on this page a mention of basically a 4B row limit for
>>>> tables that have BLOB's
>>>
>>> Oops I meant for BYTEA or TEXT columns, but it's possible the
>>> reasoning is the same...
>>
>> It only applies to large objects, not bytea or text.
>
> OK, I think I figured out why the wiki says this.  BYTEA entries > 2KB
> are automatically stored via TOAST, which uses an OID in its backend,
> so BYTEA has the same limitation.  It appears that disabling TOAST is
> not an option [1].
> So if the number of BYTEA entries with size > 2KB (summed across all
> tables? partitioning doesn't help?) exceeds 4 billion, then there is
> actually no option there?  If this occurred it might cause "all sorts
> of things to break"? [2]

It's a bit more complex than that. First, toast isn't limited to bytea; 
it holds for ALL varlena fields in a table that are allowed to store 
externally. Second, the limit is actually per-table: every table gets 
its own toast table, and each toast table is limited to 4B unique OIDs. 
Third, the OID counter is actually global, but the code should handle 
conflicts by trying to get another OID. See toast_save_datum(), which 
calls GetNewOidWithIndex().

Now, the reality is that GetNewOidWithIndex() is going to keep 
incrementing the global OID counter until it finds an OID that isn't in 
the toast table. That means that if you actually get anywhere close to 
using 4B OIDs you're going to become extremely unhappy with the 
performance of toasting new data.
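
For example, a rough way to gauge how much of a given toast table's OID
space is already populated (the toast table name is illustrative):

  SELECT count(*)::float8 / 4294967296 AS fraction_of_oid_space
  FROM pg_toast.pg_toast_16385
  WHERE chunk_seq = 0;

The closer that gets to 1.0, the longer each new toast insert will
spend probing for a free OID.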

I don't think it would be horrifically hard to change the way toast OIDs 
are assigned (I'm thinking we'd basically switch to creating a sequence 
for every toast table), but I don't think anyone's ever tried to push 
toast hard enough to hit this kind of limit.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com



Re: Fwd: [GENERAL] 4B row limit for CLOB tables

From
Roger Pack
Date:
Oops forgot to forward to the list (suggestion/feature request to the
list admin for the various pg lists: make the default "reply to" go to
the list, not the sender, if at all possible).

Response below:

On 1/30/15, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
> On 1/30/15 11:54 AM, Roger Pack wrote:
>>>> On 1/29/15, Roger Pack <rogerdpack2@gmail.com> wrote:
>>>>> Hello.  I see on this page a mention of basically a 4B row limit for
>>>>> tables that have BLOB's
>>>>
>>>> Oops I meant for BYTEA or TEXT columns, but it's possible the
>>>> reasoning is the same...
>>>
>>> It only applies to large objects, not bytea or text.
>>
>> OK, I think I figured out why the wiki says this.  BYTEA entries > 2KB
>> are automatically stored via TOAST, which uses an OID in its backend,
>> so BYTEA has the same limitation.  It appears that disabling TOAST is
>> not an option [1].
>> So if the number of BYTEA entries with size > 2KB (summed across all
>> tables? partitioning doesn't help?) exceeds 4 billion, then there is
>> actually no option there?  If this occurred it might cause "all sorts
>> of things to break"? [2]
>
> It's a bit more complex than that. First, toast isn't limited to bytea;
> it holds for ALL varlena fields in a table that are allowed to store
> externally. Second, the limit is actually per-table: every table gets
> its own toast table, and each toast table is limited to 4B unique OIDs.
> Third, the OID counter is actually global, but the code should handle
> conflicts by trying to get another OID. See toast_save_datum(), which
> calls GetNewOidWithIndex().
>
> Now, the reality is that GetNewOidWithIndex() is going to keep
> incrementing the global OID counter until it finds an OID that isn't in
> the toast table. That means that if you actually get anywhere close to
> using 4B OIDs you're going to become extremely unhappy with the
> performance of toasting new data.

OK so "system stability" doesn't degrade per se when it wraps [since
they all use that GetNewOid method or similar [?] good to know.

So basically when it gets near 4B TOAST'ed rows it may have to wrap that
counter and search for "unused" number, and for each number it's
querying the TOAST table to see if it's already used, degrading
performance.

So I guess partitioning tables for now is an acceptable workaround,
good to know.
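
Something like this is what I have in mind (made-up names; with
inheritance-based partitioning, each child table automatically gets its
own toast table, and therefore its own 4B OID space):

  CREATE TABLE sessions (id bigint, data bytea);
  CREATE TABLE sessions_p0 (CHECK (id % 2 = 0)) INHERITS (sessions);
  CREATE TABLE sessions_p1 (CHECK (id % 2 = 1)) INHERITS (sessions);
  -- each relation gets its own pg_toast table:
  SELECT relname, reltoastrelid::regclass
  FROM pg_class WHERE relname LIKE 'sessions%';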

Thanks much for your response, good to know the details before we dive
into postgres with our 8B row table with BYTEAs in it :)



Re: Fwd: [GENERAL] 4B row limit for CLOB tables

From
Álvaro Hernández Tortosa
Date:
On 31/01/15 14:55, Roger Pack wrote:
> [...]
>
> Now, the reality is that GetNewOidWithIndex() is going to keep
> incrementing the global OID counter until it finds an OID that isn't in
> the toast table. That means that if you actually get anywhere close to
> using 4B OIDs you're going to become extremely unhappy with the
> performance of toasting new data.
> OK so "system stability" doesn't degrade per se when it wraps [since
> they all use that GetNewOid method or similar [?] good to know.
>
> So basically when it gets near 4B TOAST'ed rows it may have to wrap that
> counter and search for "unused" number, and for each number it's
> querying the TOAST table to see if it's already used, degrading
> performance.

    The problem here is that performance degrades exponentially, or
worse. Speaking from experience: we already tested this for a very
similar case (table creation, where two oids are consumed from a global
sequence when inserting into pg_class). Have a look at
http://www.slideshare.net/nosys/billion-tables-project-nycpug-2013,
slides 43-45. We tested this scenario there and showed that table
creations per second dropped from 10K to a few per second, and then to a
few per day. In the graphs you can't even tell that more tables were
still being created. At around 8K tables short of the theoretical limit
of 4B oids consumed, the process basically stopped (no further
insertions succeeded).
    Hope that this information helps.
    Best regards,
    Álvaro


-- 
Álvaro Hernández Tortosa


-----------
8Kdata




Re: Fwd: [GENERAL] 4B row limit for CLOB tables

From
José Luis Tallón
Date:
On 01/31/2015 12:25 AM, Jim Nasby wrote:
> [snip]
> It's a bit more complex than that. First, toast isn't limited to 
> bytea; it holds for ALL varlena fields in a table that are allowed to 
> store externally. Second, the limit is actually per-table: every table 
> gets its own toast table, and each toast table is limited to 4B 
> unique OIDs. Third, the OID counter is actually global, but the code 
> should handle conflicts by trying to get another OID. See 
> toast_save_datum(), which calls GetNewOidWithIndex().
>
> Now, the reality is that GetNewOidWithIndex() is going to keep 
> incrementing the global OID counter until it finds an OID that isn't 
> in the toast table. That means that if you actually get anywhere close 
> to using 4B OIDs you're going to become extremely unhappy with the 
> performance of toasting new data.

Indeed ......

> I don't think it would be horrifically hard to change the way toast 
> OIDs are assigned (I'm thinking we'd basically switch to creating a 
> sequence for every toast table), but I don't think anyone's ever tried 
> to push toast hard enough to hit this kind of limit.

We did. The Billion Table Project, part 2 (a.k.a. "when does Postgres' 
OID allocator become a bottleneck").... The allocator becomes 
essentially unusable at about 2.1B OIDs, whereas it performed very well 
at "quite empty" (< 100M objects) levels.

So yes, using one sequence per TOAST table should help.
Combined with the new SequenceAMs / sequence implementation being 
proposed (specifically: one file for all sequences in a certain 
tablespace), this should scale much better.


My 2c.


Regards,
    / J.L.





Re: Fwd: [GENERAL] 4B row limit for CLOB tables

From
Roger Pack
Date:
On 1/30/15, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
> On 1/30/15 11:54 AM, Roger Pack wrote:
>>>> On 1/29/15, Roger Pack <rogerdpack2@gmail.com> wrote:
>>>>> Hello.  I see on this page a mention of basically a 4B row limit for
>>>>> tables that have BLOB's
>>>>
>>>> Oops I meant for BYTEA or TEXT columns, but it's possible the
>>>> reasoning is the same...
>>>
>>> It only applies to large objects, not bytea or text.
>>
>> OK, I think I figured out why the wiki says this.  BYTEA entries > 2KB
>> are automatically stored via TOAST, which uses an OID in its backend,
>> so BYTEA has the same limitation.  It appears that disabling TOAST is
>> not an option [1].
>> So if the number of BYTEA entries with size > 2KB (summed across all
>> tables? partitioning doesn't help?) exceeds 4 billion, then there is
>> actually no option there?  If this occurred it might cause "all sorts
>> of things to break"? [2]
>
> It's a bit more complex than that. First, toast isn't limited to bytea;
> it holds for ALL varlena fields in a table that are allowed to store
> externally. Second, the limit is actually per-table: every table gets
> its own toast table, and each toast table is limited to 4B unique OIDs.
> Third, the OID counter is actually global, but the code should handle
> conflicts by trying to get another OID. See toast_save_datum(), which
> calls GetNewOidWithIndex().
>
> Now, the reality is that GetNewOidWithIndex() is going to keep
> incrementing the global OID counter until it finds an OID that isn't in
> the toast table. That means that if you actually get anywhere close to
> using 4B OIDs you're going to become extremely unhappy with the
> performance of toasting new data.

OK so "system stability" doesn't degrade per se when it wraps, good to know.

So basically when it gets near 4B rows it may have to wrap that
counter multiple times, and for each "entry" it's searching if it's
already used, etc.

So I guess partitioning tables for now is an acceptable workaround,
good to know.

Thanks much for your response, good to know the details before we dive
into postgres with our 8B row table with BYTEAs in it :)



Re: Fwd: [GENERAL] 4B row limit for CLOB tables

From
Roger Pack
Date:
On 2/2/15, José Luis Tallón <jltallon@adv-solutions.net> wrote:
> On 01/31/2015 12:25 AM, Jim Nasby wrote:
>> [snip]
>> It's a bit more complex than that. First, toast isn't limited to
>> bytea; it holds for ALL varlena fields in a table that are allowed to
>> store externally. Second, the limit is actually per-table: every table
>> gets its own toast table, and each toast table is limited to 4B
>> unique OIDs. Third, the OID counter is actually global, but the code
>> should handle conflicts by trying to get another OID. See
>> toast_save_datum(), which calls GetNewOidWithIndex().
>>
>> Now, the reality is that GetNewOidWithIndex() is going to keep
>> incrementing the global OID counter until it finds an OID that isn't
>> in the toast table. That means that if you actually get anywhere close
>> to using 4B OIDs you're going to become extremely unhappy with the
>> performance of toasting new data.
>
> Indeed ......
>
>> I don't think it would be horrifically hard to change the way toast
>> OIDs are assigned (I'm thinking we'd basically switch to creating a
>> sequence for every toast table), but I don't think anyone's ever tried
>> to push toast hard enough to hit this kind of limit.
>
> We did. The Billion Table Project, part 2 (a.k.a. "when does Postgres'
> OID allocator become a bottleneck").... The allocator becomes
> essentially unusable at about 2.1B OIDs, whereas it performed very well
> at "quite empty" (< 100M objects) levels.
>
> So yes, using one sequence per TOAST table should help.
> Combined with the new SequenceAMs / sequence implementation being
> proposed (specifically: one file for all sequences in a certain
> tablespace), this should scale much better.

But it wouldn't be perfect, right? I mean, if you had multiple
deletions/insertions and passed 4B, then the "one sequence per TOAST
table" would still wrap [albeit more slowly], and performance would
start degrading the same way.  And there would still be the hard 4B
limit.  Perhaps the foreign key to the TOAST table could be changed
from oid (32 bits) to something else (64 bits) [as well as the
sequence] so that it never wraps?  What do you think? And would a more
aggressive change like this have a chance of being accepted into the
code base?
Thanks.
-roger-



Re: Fwd: [GENERAL] 4B row limit for CLOB tables

From
Tom Lane
Date:
Roger Pack <rogerdpack2@gmail.com> writes:
> On 2/2/15, José Luis Tallón <jltallon@adv-solutions.net> wrote:
>> So yes, using one sequence per TOAST table should help.
>> Combined with the new SequenceAMs / sequence implementation being
>> proposed (specifically: one file for all sequences in a certain
>> tablespace) this should scale much better.

> But it wouldn't be perfect, right? I mean, if you had multiple
> deletions/insertions and passed 4B, then the "one sequence per TOAST
> table" would still wrap [albeit more slowly], and performance would
> start degrading the same way.  And there would still be the hard 4B
> limit.  Perhaps the foreign key to the TOAST table could be changed
> from oid (32 bits) to something else (64 bits) [as well as the
> sequence] so that it never wraps?  What do you think? And would a more
> aggressive change like this have a chance of being accepted into the
> code base?

There has been some thought about this, but I have seen no, zero, reports
of anyone actually running into problems *in practice* (as opposed to
contrived cases like "can we create a billion tables").  So we probably
aren't going to want to address it until it starts being a real problem.

The reason it's not as significant as you might think is that small field
values (less than a couple KB *after compression*) don't get pushed out
to the TOAST table, so they don't consume OIDs.  And large field values,
like megabytes worth, aren't a problem either because you just aren't
gonna have that many of them.  (Simple arithmetic.)  You could potentially
get into trouble if you had a whole lot of entries that were just a little
over the toasting threshold, because then you'd have a lot of OIDs
consumed but still a manageable total amount of disk space.  But that
doesn't seem to be a very common usage pattern.
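
To see the compression effect concretely, a quick sketch (names made
up): a large but highly compressible value shrinks to well under the
threshold and stays in-line, while incompressible data of similar
length gets pushed out and consumes a toast OID.

  CREATE TABLE t (v text);
  INSERT INTO t SELECT repeat('x', 100000);      -- compresses, stays in-line
  INSERT INTO t SELECT string_agg(md5(g::text), '')
                FROM generate_series(1, 3200) g; -- ~100KB of hex, toasted
  SELECT pg_column_size(v) FROM t;               -- stored size of each value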

Also, partitioning the table largely eliminates the problem because each
partition will have its own TOAST table.  I'm on record as saying that
many people are far too quick to decide that they need partitioning; but
once you get into the volume of data where 4B toast entries starts to
look like a limitation, you will probably have other reasons to think
that you need to partition.

In short, this is something that's theoretically interesting but doesn't
seem worth doing in practice --- yet anyway.
        regards, tom lane



Re: Fwd: [GENERAL] 4B row limit for CLOB tables

From
José Luis Tallón
Date:
On 02/02/2015 09:36 PM, Roger Pack wrote:
> On 2/2/15, José Luis Tallón <jltallon@adv-solutions.net> wrote:
>> On 01/31/2015 12:25 AM, Jim Nasby wrote:
>>> [snip]
>>> It's a bit more complex than that. First, toast isn't limited to
>>> bytea; it holds for ALL varlena fields in a table that are allowed to
>>> store externally. Second, the limit is actually per-table: every table
>>> gets its own toast table, and each toast table is limited to 4B
>>> unique OIDs. Third, the OID counter is actually global, but the code
>>> should handle conflicts by trying to get another OID. See
>>> toast_save_datum(), which calls GetNewOidWithIndex().
>>>
>>> Now, the reality is that GetNewOidWithIndex() is going to keep
>>> incrementing the global OID counter until it finds an OID that isn't
>>> in the toast table. That means that if you actually get anywhere close
>>> to using 4B OIDs you're going to become extremely unhappy with the
>>> performance of toasting new data.
>> Indeed ......
>>
>>> I don't think it would be horrifically hard to change the way toast
>>> OIDs are assigned (I'm thinking we'd basically switch to creating a
>>> sequence for every toast table), but I don't think anyone's ever tried
>>> to push toast hard enough to hit this kind of limit.
>> We did. The Billion Table Project, part 2 (a.k.a. "when does Postgres'
>> OID allocator become a bottleneck").... The allocator becomes
>> essentially unusable at about 2.1B OIDs, whereas it performed very well
>> at "quite empty" (< 100M objects) levels.
>>
>> So yes, using one sequence per TOAST table should help.
>> Combined with the new SequenceAMs / sequence implementation being
>> proposed (specifically: one file for all sequences in a certain
>> tablespace), this should scale much better.
> But it wouldn't be perfect, right? I mean, if you had multiple
> deletions/insertions and passed 4B, then the "one sequence per TOAST
> table" would still wrap [albeit more slowly], and performance would
> start degrading the same way.  And there would still be the hard 4B
> limit.  Perhaps the foreign key to the TOAST table could be changed
> from oid (32 bits) to something else (64 bits) [as well as the
> sequence] so that it never wraps?

Hmm.... 2^32 times approx. 2kB (as per usual heuristics, ~4 rows per heap 
page) is 8796093022208 (~9e13) bytes ... which results in 8192 1GB segments :O
Looks like partitioning might be needed much sooner than that (if only 
for index efficiency reasons)... unless access is purely sequential.
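
Checking the arithmetic in SQL:

  SELECT 4294967296::numeric * 2048 AS bytes,
         4294967296::numeric * 2048 / (1024*1024*1024) AS gigabytes;
  -- 8796093022208 bytes, i.e. 8192 one-GB segments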

The problem with changing the id from 32 to 64 bits is that the storage 
*for everybody else* doubles, making the implementation slower for 
most... though this might actually not be that important.
The alternative could be some "long LOB" ("HugeOBject"?) using the 
equivalent of "serial8", whereas regular LOBs would use "serial4".


Anybody actually reaching this limit out there?



Regards,
    / J .L.




Re: Fwd: [GENERAL] 4B row limit for CLOB tables

From
Jim Nasby
Date:
On 2/2/15 3:50 PM, José Luis Tallón wrote:
> Hmm.... 2^32 times approx. 2kB (as per usual heuristics, ~4 rows per heap
> page) is 8796093022208 (~9e13) bytes
>   ... which results in 8192 1GB segments :O
> Looks like partitioning might be needed much sooner than that (if only
> for index efficiency reasons)... unless access is purely sequential.
>
> The problem with changing the id from 32 to 64 bits is that the storage
> *for everybody else* doubles, making the implementation slower for
> most... though this might actually not be that important.
> The alternative could be some "long LOB" ("HugeOBject"?) using the
> equivalent of "serial8", whereas regular LOBs would use "serial4".

Well, it depends on how we did this. We could (for example) add a field 
to pg_class that determines what type to use for toast pointers; OID, 
int, or bigint. That could then be taken into account in the *toast* 
functions.

But as others have pointed out, we haven't even had any real complaints 
about toast using OIDs as being an issue until now, so I think it's 
premature to start messing with this. At most it's just something to 
keep in mind so we don't preclude doing this in the future.

BTW, regarding the size of what gets toasted; I've often thought it 
would be useful to allow a custom size limit on columns so that you 
could easily force data to be toasted if you knew you were very unlikely 
to access it. Basically, a cheap form of vertical partitioning.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com



Re: [GENERAL] 4B row limit for CLOB tables

From
Matthew Kelly
Date:
> Hmm.... 2^32 times approx. 2kB (as per usual heuristics, ~4 rows per heap page) is 8796093022208 (~9e13) bytes
> ... which results in 8192 1GB segments :O

8192 1GB segments is just 8TB; it's not _that_ large.  At TripAdvisor
we’ve been using a NoSQL solution to do session storage.  We are looking
to probably swap that out to be Postgres (every other db backing the
site is Postgres).  Essentially, what I’m building is a system with 1
logical table that maps session id to a 2KB+ grab bag of ever-changing
session attributes which is partially normalized, partially json.  315
million uniques a month multiplied by the retention policy means I need
to hold 2-4 billion session objects (and somehow expire old ones).
Additionally, most http calls can update the session, so between
maintenance windows I expect to take around 20 billion 'upserts’.
Obviously, I will have to shard and partition the table in practice, but
this weekend I ran a test that demonstrated that a single table on a 9.4
server + logical replication + Dell 730xd can handle 4x that workload.
Well, it can for 38 hours… until you wrap xids on the toast table.  :P
I’ll be the first to admit that isn’t the normal use case though.  I’m
happy to have found this thread, however, because I’m going to have to
build around the global oid counter, explicitly to prevent the problem I
explain below regarding clustering.

> Anybody actually reaching this limit out there?

Well, it's not the 4 billion row limit that concerns me; it's the global
shared counter in conjunction with pg_restore/clustering that is
actually pretty concerning.

Just checked through all of TripAdvisor’s normal databases and the max
tuples I see in a single toast table is 17,000,000, so that is still a
couple of orders of magnitude too small.  (However, close enough that
it’ll be a concern in a few years.)

However, I do have active databases where the current oid is between 1
billion and 2 billion.  They were last dump-restored for a hardware
upgrade a couple years ago and were a bit more than half the size.  I
therefore can imagine that I have tables which are keyed by ~8,000,000
consecutive oids.

I would argue that when it wraps there will be a single insert that will
probably block for 2-5 minutes while it tries to accomplish ~8,000,000
index scans inside of GetNewOidWithIndex.  Even partitioning doesn’t
protect you from this potential problem.

What’s even weirder is that this issue can be triggered by consuming too
many oids in a different database in the same cluster (i.e. creating
large amounts of temp tables).

> The problem with changing the id from 32 to 64 bits is that the
> storage *for everybody else* doubles, making the implementation slower
> for most... though this might actually not be that important.

Well, you aren’t doubling the storage.  Even if you have to store the
key in 4 places, you are adding 16 bytes per TOAST tuple.  If we work
off the 2KB estimate for each TOAST tuple, then you are only increasing
the storage by 0.7%.  I’m sure there are more hidden costs, but we are
really only talking about a low single digit percent increase.  In
exchange, you get to drop one index scan per toast insert; an index scan
looking in the only hot part of the index.

That being said, I’d be perfectly happy merely giving each TOAST table
its own sequence, as that almost entirely mitigates the risk of an
unexpected lock-up on reasonably sized tables/partitions, and provides a
functional workaround for those of us with larger than average installs.

- Matt K

Re: Fwd: [GENERAL] 4B row limit for CLOB tables

From
José Luis Tallón
Date:
On 02/03/2015 03:44 AM, Jim Nasby wrote:
> [snip]
>> The alternative could be some "long LOB" ("HugeOBject"?) using the
>> equivalent of "serial8", whereas regular LOBs would use "serial4".
>
> Well, it depends on how we did this. We could (for example) add a 
> field to pg_class that determines what type to use for toast pointers; 
> OID, int, or bigint. That could then be taken into account in the 
> *toast* functions.
>
> But as others have pointed out, we haven't even had any real 
> complaints about toast using OIDs as being an issue until now, so I 
> think it's premature to start messing with this. At most it's just 
> something to keep in mind so we don't preclude doing this in the future.

A patch creating those HOBs (Huge Objects) might well make sense *after* 
the sequence refactoring gets merged.
Removing the bottleneck due to the OID allocator for this use case would 
be definitely welcome.
    (I don't dare to code that just yet, but here's hoping someone will 
step in O:-)

> BTW, regarding the size of what gets toasted; I've often thought it 
> would be useful to allow a custom size limit on columns so that you 
> could easily force data to be toasted if you knew you were very 
> unlikely to access it. Basically, a cheap form of vertical partitioning.

Hmmm.... alter column set storage external / set storage extended ?

From http://www.postgresql.org/docs/9.4/static/sql-altertable.html :

    ALTER [ COLUMN ] column_name SET STORAGE { PLAIN | EXTERNAL | EXTENDED | MAIN }

    This would do what you described, right?
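
For instance (hypothetical table/column names), something like:

    ALTER TABLE session_store ALTER COLUMN attrs SET STORAGE EXTERNAL;

would change the column's strategy to out-of-line, uncompressed storage
(though values still only move out of line once the tuple exceeds the
toast threshold, as noted below).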


HTH,
    / J.L.




Re: [GENERAL] 4B row limit for CLOB tables

From
Tom Lane
Date:
Matthew Kelly <mkelly@tripadvisor.com> writes:
> However, I do have active databases where the current oid is between 1
> billion and 2 billion.  They were last dump-restored for a hardware
> upgrade a couple years ago and were a bit more than half the size.  I
> therefore can imagine that I have tables which are keyed by ~8,000,000
> consecutive oids.

> I would argue that when it wraps there will be a single insert that
> will probably block for 2-5 minutes while it tries to accomplish
> ~8,000,000 index scans inside of GetNewOidWithIndex.  Even partitioning
> doesn’t protect you from this potential problem.

That may be a hazard, but ...

> That being said, I’d be perfectly happy merely giving each TOAST table
> its own sequence, as that almost entirely mitigates the risk of an
> unexpected lock-up on reasonably sized tables/partitions, and provides
> a functional workaround for those of us with larger than average
> installs.

... this "fix" would actually make things enormously worse.  With the
single counter feeding all tables, you at least have a reasonable
probability that there are not enormously long runs of consecutive OIDs in
any one toast table.  With a sequence per table, you are nearly guaranteed
that there are such runs, because inserts into other tables don't create a
break.

(This effect is also why you're wrong to claim that partitioning can't fix
it.)
        regards, tom lane



Re: Fwd: [GENERAL] 4B row limit for CLOB tables

From
David Steele
Date:
On 2/3/15 10:01 AM, José Luis Tallón wrote:
> Hmmm.... alter column set storage external / set storage extended ?
>
> From http://www.postgresql.org/docs/9.4/static/sql-altertable.html :
>     ALTER [ COLUMN ] column_name SET STORAGE { PLAIN | EXTERNAL | EXTENDED | MAIN }
>
>     This would do what you described, right?

EXTENDED is the default for most TOAST-able types and is still subject
to TOAST_TUPLE_THRESHOLD, which is normally 2K.  EXTERNAL is the same
but with no compression.

See: http://www.postgresql.org/docs/9.4/static/storage-toast.html

-- 
- David Steele
david@pgmasters.net

Re: Fwd: [GENERAL] 4B row limit for CLOB tables

From
Jim Nasby
Date:
On 2/3/15 9:50 AM, David Steele wrote:
> On 2/3/15 10:01 AM, José Luis Tallón wrote:
>
>> Hmmm.... alter column set storage external / set storage extended ?
>>
>> From http://www.postgresql.org/docs/9.4/static/sql-altertable.html :
>>     ALTER [ COLUMN ] column_name SET STORAGE { PLAIN | EXTERNAL |
>> EXTENDED | MAIN }
>>
>>     This would do what you described, right?
>
> EXTENDED is the default for most TOAST-able types and is still subject
> to TOAST_TUPLE_THRESHOLD which is normally 2K. EXTERNAL is the same but
> with no compression.
>
> See: http://www.postgresql.org/docs/9.4/static/storage-toast.html

Right. I'd like to be able to set per-column TOAST_TUPLE_THRESHOLD.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com



Re: [GENERAL] 4B row limit for CLOB tables

From
Jim Nasby
Date:
On 2/3/15 9:01 AM, Tom Lane wrote:
> Matthew Kelly <mkelly@tripadvisor.com> writes:
>> However, I do have active databases where the current oid is between 1
>> billion and 2 billion.  They were last dump-restored for a hardware
>> upgrade a couple years ago and were a bit more than half the size.  I
>> therefore can imagine that I have tables which are keyed by ~8,000,000
>> consecutive oids.
>
>> I would argue that when it wraps there will be a single insert that
>> will probably block for 2-5 minutes while it tries to accomplish
>> ~8,000,000 index scans inside of GetNewOidWithIndex.  Even
>> partitioning doesn’t protect you from this potential problem.
>
> That may be a hazard, but ...
>
>> That being said, I’d be perfectly happy merely giving each TOAST table
>> its own sequence, as that almost entirely mitigates the risk of an
>> unexpected lock-up on reasonably sized tables/partitions, and provides
>> a functional workaround for those of us with larger than average
>> installs.
>
> ... this "fix" would actually make things enormously worse.  With the
> single counter feeding all tables, you at least have a reasonable
> probability that there are not enormously long runs of consecutive OIDs in
> any one toast table.  With a sequence per table, you are nearly guaranteed
> that there are such runs, because inserts into other tables don't create a
> break.
>
> (This effect is also why you're wrong to claim that partitioning can't fix
> it.)

That's assuming that toasting is evenly spread between tables. In my 
experience, that's not a great bet...
-- 
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com



Re: Fwd: [GENERAL] 4B row limit for CLOB tables

From
David Steele
Date:
On 2/3/15 5:27 PM, Jim Nasby wrote:
> On 2/3/15 9:50 AM, David Steele wrote:
>> EXTENDED is the default for most TOAST-able types and is still subject
>> to TOAST_TUPLE_THRESHOLD which is normally 2K. EXTERNAL is the same but
>> with no compression.
>>
>> See: http://www.postgresql.org/docs/9.4/static/storage-toast.html
>
> Right. I'd like to be able to set per-column TOAST_TUPLE_THRESHOLD.

No argument there.  There are some columns that I would prefer to always
TOAST because even 2K can be very big for some use cases.

--
- David Steele
david@pgmasters.net



Re: [GENERAL] 4B row limit for CLOB tables

From
Matthew Kelly
Date:
> That's assuming that toasting is evenly spread between tables. In my experience, that's not a great bet...

Time to create a test:
SELECT chunk_id::bigint/100000 AS id_range,
       count(*),
       count(*)/(100000::float) AS density
FROM (SELECT chunk_id
      FROM pg_toast.pg_toast_39000165
      WHERE chunk_id < 100000000 AND chunk_seq = 0) f
GROUP BY id_range
ORDER BY id_range;

The machine in question was restored in parallel in Sept 2013 as part of
an upgrade from 8.4.  It has about 2000 tables, so it's definitely not
dominated by a couple of tables.  Progress towards oid wraparound is
about 25.6%.

With minimal effort, I found 2 bad examples, and I’m sure I can easily find more. I attached the results for those two.

There were runs of 1,100,000+ and 600,000+ chunk_ids where more than 99%
of the chunk_ids are taken.  After restore completion, oid densities
averaged less than 20 per 100,000 and 400 per 100,000 respectively.  The
only reason those runs are so short is that the tables were much smaller
back then.  I expect that next time I dump-restore (necessary for
upgrading OS versions due to the collation issue), I’m going to have
runs closer to 20,000,000.

> ... this "fix" would actually make things enormously worse.  With the
> single counter feeding all tables, you at least have a reasonable
> probability that there are not enormously long runs of consecutive OIDs in
> any one toast table.  With a sequence per table, you are nearly guaranteed
> that there are such runs, because inserts into other tables don't create a
> break.

It makes each toast table independent (and far less likely to wrap).  It
would wrap when sum(mods on THIS toast table) > 2^32.  Right now the
condition looks like:

sum(mods on ALL toast tables in cluster)
  + sum(normal tables created in cluster * k)
  + sum(temp tables created in cluster * k) + [...] > 2^32,
WHERE k = average number of oids consumed in pg_class, pg_type, etc.

In the case of an insert-only table (which is a common use case for
partitions), the id would only wrap when the TOAST table was “full”.  On
the other hand, currently it would wrap into its pg_restored section
once the combined oid-consuming operations on the cluster surpassed 4
billion.

That being said, I’m certainly not attached to that solution.  My real
argument is that although it's not a problem today, we are only about 5
years from it being a problem for large installs, and the first time
you’ll hear about it is after someone has a 5 minute production outage
on a database that's been taking traffic for 2 years.

- Matt K.


Attachment

Re: Fwd: [GENERAL] 4B row limit for CLOB tables

From
Bruce Momjian
Date:
On Sun, Feb  1, 2015 at 03:54:03PM +0100, Álvaro Hernández Tortosa wrote:
> 
> On 31/01/15 14:55, Roger Pack wrote:
> >[...]
> >
> >Now, the reality is that GetNewOidWithIndex() is going to keep
> >incrementing the global OID counter until it finds an OID that isn't in
> >the toast table. That means that if you actually get anywhere close to
> >using 4B OIDs you're going to become extremely unhappy with the
> >performance of toasting new data.
> >OK so "system stability" doesn't degrade per se when it wraps (since
> >they all use that GetNewOid method or similar?), good to know.
> >
> >So basically when it gets near 4B TOAST'ed rows it may have to wrap the
> >counter and search for an "unused" number, and for each candidate
> >number it queries the TOAST table to see if it's already used,
> >degrading performance.
> 
> 
>     The problem here is that performance degrades exponentially, or
> worse. Speaking from experience: we already tested this for a very
> similar case (table creation, where two oids are consumed from a global
> sequence when inserting into pg_class). Have a look at
> http://www.slideshare.net/nosys/billion-tables-project-nycpug-2013,
> slides 43-45. We tested this scenario there and showed that table
> creations per second dropped from 10K to a few per second, and then to
> a few per day. In the graphs you can't even tell that more tables were
> still being created. At around 8K tables short of the theoretical limit
> of 4B oids consumed, the process basically stopped (no further
> insertions succeeded).

I had a look at our FAQ about Postgres limitations and I don't see
anything that needs changing:
https://wiki.postgresql.org/wiki/FAQ#What_is_the_maximum_size_for_a_row.2C_a_table.2C_and_a_database.3F
    Maximum size for a database? unlimited (32 TB databases exist)
    Maximum size for a table? 32 TB
    Maximum size for a row? 400 GB
    Maximum size for a field? 1 GB
    Maximum number of rows in a table? unlimited
    Maximum number of columns in a table? 250-1600 depending on column types
    Maximum number of indexes on a table? unlimited

We don't report the maximum number of tables per database, or the
maximum number of TOAST values.  Agreed?

-- 
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + Everyone has their own god. +



Re: Fwd: [GENERAL] 4B row limit for CLOB tables

From
Stephen Frost
Date:
* Bruce Momjian (bruce@momjian.us) wrote:
> On Sun, Feb  1, 2015 at 03:54:03PM +0100, Álvaro Hernández Tortosa wrote:
> >     The problem here is that performance degrades exponentially, or
> > worse. Speaking from experience: we already tested this for a very
> > similar case (table creation, where two oids are consumed from a
> > global sequence when inserting into pg_class). Have a look at
> > http://www.slideshare.net/nosys/billion-tables-project-nycpug-2013,
> > slides 43-45. We tested this scenario there and showed that table
> > creations per second dropped from 10K to a few per second, and then
> > to a few per day. In the graphs you can't even tell that more tables
> > were still being created. At around 8K tables short of the theoretical
> > limit of 4B oids consumed, the process basically stopped (no further
> > insertions succeeded).
>
> I had a look at our FAQ about Postgres limitations and I don't see
> anything that needs changing:
>
>     https://wiki.postgresql.org/wiki/FAQ#What_is_the_maximum_size_for_a_row.2C_a_table.2C_and_a_database.3F
>
>     Maximum size for a database? unlimited (32 TB databases exist)
>     Maximum size for a table? 32 TB
>     Maximum size for a row? 400 GB
>     Maximum size for a field? 1 GB
>     Maximum number of rows in a table? unlimited
>     Maximum number of columns in a table? 250-1600 depending on column types
>     Maximum number of indexes on a table? unlimited
>
> We don't report the maximum number of tables per database, or the
> maximum number of TOAST values.  Agreed?

For my 2c, this limitation is a surprise to users and therefore we
should add documentation to point out that it exists, unless we're going
to actually fix it (which is certainly what I'd prefer to see...).

As for the other discussion on the thread, having a per-table sequence
would be far better as it'd reduce the wrap-around risk quite a bit and,
further, we should be able to provide that without breaking the on-disk
format.  What would be really nice is a way to expand the size of the
key when needed- in other words, instead of wrapping around, if we
actually hit 4B toasted values in a table then stick a flag somewhere
for the next toasted value that says "this value is in the second toast
table/fork" and then go up to 4B on that one, etc.  That allows us to
support more than 4B toasted values and doesn't require searching for
holes in the values assigned.

If we end up with empty toast tables eventually, then allow reusing
them.  Perhaps vacuum can even be used to make a note somewhere saying
"this toast table is now empty and can be reused".

In the end, I'd like to think we can do better here than having a hard
limit at 4B when it comes to how many values over a few KB we can store.
As mentioned, that isn't all that much these days.  I'm not saying that
my proposal or what's been proposed upthread is an answer, but I've
certainly built PG systems which store over 4B rows and it's not hard to
imagine cases where I might have wanted a toasted value for each of
those rows.
Thanks,
    Stephen

Re: Fwd: [GENERAL] 4B row limit for CLOB tables

From
Tom Lane
Date:
Stephen Frost <sfrost@snowman.net> writes:
> * Bruce Momjian (bruce@momjian.us) wrote:
>> On Sun, Feb  1, 2015 at 03:54:03PM +0100, Álvaro Hernández Tortosa wrote:
>>> The problem here is that performance degrades exponentially, or
>>> worse. Speaking from experience: we already tested this for a very
>>> similar case (table creation, where two oids are consumed from a
>>> global sequence when inserting into pg_class). Have a look at
>>> http://www.slideshare.net/nosys/billion-tables-project-nycpug-2013,
>>> slides 43-45. We tested this scenario there and showed that table
>>> creations per second dropped from 10K to a few per second, and then
>>> to a few per day. In the graphs you can't even tell that more tables
>>> were still being created. At around 8K tables short of the theoretical
>>> limit of 4B oids consumed, the process basically stopped (no further
>>> insertions succeeded).

>> We don't report the maximum number of tables per database, or the
>> maximum number of TOAST values.  Agreed?

> For my 2c, this limitation is a surprise to users and therefore we
> should add documentation to point out that it exists, unless we're going
> to actually fix it (which is certainly what I'd prefer to see...).

TBH, I've got very little enthusiasm for fixing this given the number
of reports of trouble from the field, which so far as I recall is zero.
Álvaro's case came up through intentionally trying to create an
unreasonable number of tables, not from real usage.  This thread likewise
appears to contain lots of speculation and no reports of anyone hitting
a problem in practice.

Certainly this is likely to become an issue at some point in the future,
but I'm not finding it very compelling to worry about now.  By the time
it does become an issue, we may have additional considerations or use
cases that should inform a solution; which seems to me to be a good
argument not to try to fix it in advance of real problems.  Perhaps,
for example, we'd find that at the same time we ought to relax the 1GB
limit on individual-value size; or perhaps not.

Having said all that, if we did try to fix it today, I'd imagine changing
TOAST value identifiers to int64 and inventing a new TOAST pointer format
for use when 32 bits isn't wide enough for the ID.  But I think we're best
advised to hold off doing that until the need becomes pressing.
        regards, tom lane



Re: Fwd: [GENERAL] 4B row limit for CLOB tables

From
Robert Haas
Date:
On Thu, Apr 23, 2015 at 11:24 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Having said all that, if we did try to fix it today, I'd imagine changing
> TOAST value identifiers to int64 and inventing a new TOAST pointer format
> for use when 32 bits isn't wide enough for the ID.  But I think we're best
> advised to hold off doing that until the need becomes pressing.

Just out of curiosity, has anyone thought about inventing a new TOAST
pointer format on the grounds that our TOAST pointers are unreasonably
large? IIUC, a TOAST pointer right now is 18 bytes: 16 for a
varatt_external, and then that gets embedded in a varattrib_1b_e with
a va_header byte and a va_tag byte.  Eliminating one or both of
va_rawsize and va_extsize from the TOAST pointer itself seems like it
could save quite a bit of space on disk.  Maybe you could even find a
way to get rid of va_toastrelid; after all, at the point when you
first acquire a pointer to the tuple, you surely know what relation
it's a part of.  You'd probably want to force de-TOASTing (or
converting to a more expressive form of TOAST pointer, anyway) when
you extracted the column from the tuple, which might be hard to
arrange.

But the benefits could be pretty significant.  Suppose you have a
table where each tuple is 4K untoasted, with all but 100 bytes of that
in a single column. So, as stored, you've got 100 bytes of regular
stuff plus an 18-byte TOAST header.  If you could trim 2 of the
above-mentioned 4-byte fields out of the TOAST header, that would
reduce the size of the main relation fork by almost 7%.  If you could
trim all 3 of them out, you'd save more than 10%.  That's not nothing,
and the benefits could be even larger for rows that contain multiple
TOAST pointers.
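
The arithmetic, for the example above:

  SELECT round(100 - 100.0 * (100 + 10) / (100 + 18), 1) AS pct_saved_2_fields,
         round(100 - 100.0 * (100 + 6)  / (100 + 18), 1) AS pct_saved_3_fields;
  -- roughly 6.8 and 10.2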

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Fwd: [GENERAL] 4B row limit for CLOB tables

From
Stephen Frost
Date:
* Robert Haas (robertmhaas@gmail.com) wrote:
> On Thu, Apr 23, 2015 at 11:24 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Having said all that, if we did try to fix it today, I'd imagine changing
> > TOAST value identifiers to int64 and inventing a new TOAST pointer format
> > for use when 32 bits isn't wide enough for the ID.  But I think we're best
> > advised to hold off doing that until the need becomes pressing.
>
> Just out of curiosity, has anyone thought about inventing a new TOAST
> pointer format on the grounds that our TOAST pointers are unreasonably
> large?

I'd not thought about it, but sure sounds like a good idea from here.

Would be particularly great if we were able to do this and increase the
number of supported toast pointers and avoid having to go hunting for
unused identifiers due to wrapping.
Thanks!
    Stephen

Re: Fwd: [GENERAL] 4B row limit for CLOB tables

From
Álvaro Hernández Tortosa
Date:
On 24/04/15 05:24, Tom Lane wrote:
> Stephen Frost <sfrost@snowman.net> writes:
>> * Bruce Momjian (bruce@momjian.us) wrote:
>>> On Sun, Feb  1, 2015 at 03:54:03PM +0100, Álvaro Hernández Tortosa wrote:
>>>> The problem here is that performance degrades exponentially, or
>>>> worse. Speaking from experience: we already tested this for a very
>>>> similar case (table creation, where two oids are consumed from a
>>>> global sequence when inserting into pg_class). Have a look at
>>>> http://www.slideshare.net/nosys/billion-tables-project-nycpug-2013,
>>>> slides 43-45. We tested this scenario there and showed that table
>>>> creations per second dropped from 10K to a few per second, and then
>>>> to a few per day. In the graphs you can't even tell that more tables
>>>> were still being created. At around 8K tables short of the theoretical
>>>> limit of 4B oids consumed, the process basically stopped (no further
>>>> insertions succeeded).
>>> We don't report the maximum number of tables per database, or the
>>> maximum number of TOAST values.  Agreed?
>> For my 2c, this limitation is a surprise to users and therefore we
>> should add documentation to point out that it exists, unless we're going
>> to actually fix it (which is certainly what I'd prefer to see...).
> TBH, I've got very little enthusiasm for fixing this given the number
> of reports of trouble from the field, which so far as I recall is zero.
> Álvaro's case came up through intentionally trying to create an
> unreasonable number of tables, not from real usage.  This thread likewise
> appears to contain lots of speculation and no reports of anyone hitting
> a problem in practice.
    It is certainly true that this was a very synthetic case. I 
envision, however, certain use cases where we may hit a very large 
number of tables:

- Massive multitenancy
- Aggressive partitioning
- Massive multitenancy with aggressive partitioning
- Tables generated dynamically by software, like those created by ToroDB 
(https://github.com/torodb/torodb). In ToroDB we generate tables 
depending only on the input data, so we may end up having as many as 
required by the datasource. For example, a "general purpose" json 
datastore may generate several tables per document inserted.

>
> Certainly this is likely to become an issue at some point in the future,
> but I'm not finding it very compelling to worry about now.  By the time
> it does become an issue, we may have additional considerations or use
> cases that should inform a solution; which seems to me to be a good
> argument not to try to fix it in advance of real problems.  Perhaps,
    I understand this argument, and it makes sense. However, on the
other hand, given the long time it may take from patch to commit, and
then from released version to companies finally using it in production,
I'd rather try to fix it soon, as there are already reports and use
cases that may hit it, rather than wait three years until it explodes
in our faces. After all, 640KB of RAM is enough, right? So maybe 2B
tables is not that far over the horizon. Who knows.
    Regards,
    Álvaro


-- 
Álvaro Hernández Tortosa


-----------
8Kdata




Re: Fwd: [GENERAL] 4B row limit for CLOB tables

From
Jim Nasby
Date:
On 4/24/15 7:11 PM, Álvaro Hernández Tortosa wrote:
> On 24/04/15 05:24, Tom Lane wrote:
...
>> TBH, I've got very little enthusiasm for fixing this given the number
>> of reports of trouble from the field, which so far as I recall is zero.
>> Álvaro's case came up through intentionally trying to create an
>> unreasonable number of tables, not from real usage.  This thread likewise
>> appears to contain lots of speculation and no reports of anyone hitting
>> a problem in practice.
>
>      It is certainly true that this was a very synthetic case. I
> envision, however, certain use cases where we may hit a very large
> number of tables:

The original case has NOTHING to do with the number of tables and 
everything to do with the number of toasted values a table can have. If 
you have to toast 4B attributes in a single relation it will fail. In 
reality, if you get anywhere close to that things will fall apart due to 
OID conflicts.

This case isn't nearly as insane as 4B tables. A table storing 10 text 
fields each of which is 2K would hit this limit with only 400M rows. If 
my math is right that's only 8TB; certainly not anything insane 
space-wise or rowcount-wise.
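
Checking that math:

  SELECT 4294967296 / 10 AS max_rows,                      -- ~430M rows
         4294967296::numeric * 2048 / power(2, 40) AS tib; -- ~8 TiB toasted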

Perhaps it's still not worth fixing, but I think it's definitely worth 
documenting.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com



Re: Fwd: [GENERAL] 4B row limit for CLOB tables

From
Bruce Momjian
Date:
On Fri, Apr 24, 2015 at 11:39:04PM -0500, Jim Nasby wrote:
> On 4/24/15 7:11 PM, Álvaro Hernández Tortosa wrote:
> >On 24/04/15 05:24, Tom Lane wrote:
> ...
> >>TBH, I've got very little enthusiasm for fixing this given the number
> >>of reports of trouble from the field, which so far as I recall is zero.
> >>Álvaro's case came up through intentionally trying to create an
> >>unreasonable number of tables, not from real usage.  This thread likewise
> >>appears to contain lots of speculation and no reports of anyone hitting
> >>a problem in practice.
> >
> >     It is certainly true that this was a very synthetic case. I
> >envision, however, certain use cases where we may hit a very large
> >number of tables:
> 
> The original case has NOTHING to do with the number of tables and
> everything to do with the number of toasted values a table can have.
> If you have to toast 4B attributes in a single relation it will
> fail. In reality, if you get anywhere close to that things will fall
> apart due to OID conflicts.
> 
> This case isn't nearly as insane as 4B tables. A table storing 10
> text fields each of which is 2K would hit this limit with only 400M
> rows. If my math is right that's only 8TB; certainly not anything
> insane space-wise or rowcount-wise.
> 
> Perhaps it's still not worth fixing, but I think it's definitely worth
> documenting.

And it is now documented in the Postgres FAQ thanks to 'Rogerdpack',
which is where that "maximum" table came from:
https://wiki.postgresql.org/wiki/FAQ#What_is_the_maximum_size_for_a_row.2C_a_table.2C_and_a_database.3F
    Note if you are storing a table with rows that exceed 2KB in size
    (aggregate size of each row) then the "Maximum number of rows in a
    table" may be limited to 4 Billion; see TOAST.

-- 
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + Everyone has their own god. +



Re: Fwd: [GENERAL] 4B row limit for CLOB tables

From
Álvaro Hernández Tortosa
Date:
On 25/04/15 06:39, Jim Nasby wrote:
> On 4/24/15 7:11 PM, Álvaro Hernández Tortosa wrote:
>> On 24/04/15 05:24, Tom Lane wrote:
> ...
>>> TBH, I've got very little enthusiasm for fixing this given the number
>>> of reports of trouble from the field, which so far as I recall is zero.
>>> Álvaro's case came up through intentionally trying to create an
>>> unreasonable number of tables, not from real usage.  This thread 
>>> likewise
>>> appears to contain lots of speculation and no reports of anyone hitting
>>> a problem in practice.
>>
>>      It is certainly true that this was a very synthetic case. I
>> envision, however, certain use cases where we may hit a very large
>> number of tables:
>
> The original case has NOTHING to do with the number of tables and 
> everything to do with the number of toasted values a table can have. 
> If you have to toast 4B attributes in a single relation it will fail. 
> In reality, if you get anywhere close to that things will fall apart 
> due to OID conflicts.
>
> This case isn't nearly as insane as 4B tables. A table storing 10 text 
> fields each of which is 2K would hit this limit with only 400M rows. 
> If my math is right that's only 8TB; certainly not anything insane 
> space-wise or rowcount-wise.
>
> Perhaps it's still not fixing, but I think it's definitely worth 
> documenting.
    They are definitely different problems, but with similar causes: an 
oid wrapping around, or not even that: just trying to find an unused 
one. If fixed, we should probably look at both at the same time.
    It's worth documenting, but also, as I said, maybe worth fixing, so 
that if three years from now they really show up, the solution is 
already in production (rather than still being patched).
    Regards,
    Álvaro


-- 
Álvaro Hernández Tortosa


-----------
8Kdata




Re: Fwd: [GENERAL] 4B row limit for CLOB tables

From
Stephen Frost
Date:
* Álvaro Hernández Tortosa (aht@8Kdata.com) wrote:
>     It's worth document but also, as I said, maybe also fixing them,
> so that if three years from now they really show up, solution is
> already in production (rather than in patching state).

With the proliferation of JSON usage in PG thanks to jsonb, I'd count us
lucky if we don't get complaints about this in the next three years.

I don't expect to have time to work on it in the near future,
unfortunately, but Robert's thoughts on supporting a new TOAST pointer
structure (with a way to support what's currently there, to avoid an
on-disk break) seems like a good starting point to me.
Thanks!
    Stephen

Re: Fwd: [GENERAL] 4B row limit for CLOB tables

From
Jim Nasby
Date:
On 4/25/15 1:19 PM, Bruce Momjian wrote:
>     Note if you are storing a table with rows that exceed 2KB in size
>     (aggregate size of each row) then the "Maximum number of rows in a
>     table" may be limited to 4 Billion, see TOAST.

That's not accurate though; you could be limited to far less than 4B 
rows. If each row has 10 fields that toast, you'd be limited to just 
400M rows.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com



Re: Fwd: [GENERAL] 4B row limit for CLOB tables

From
Roger Pack
Date:
On 4/27/15, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
> On 4/25/15 1:19 PM, Bruce Momjian wrote:
>>     Note if you are storing a table with rows that exceed 2KB in size
>>     (aggregate size of each row) then the "Maximum number of rows in a
>>     table" may be limited to 4 Billion, see TOAST.
>
> That's not accurate though; you could be limited to far less than 4B
> rows. If each row has 10 fields that toast, you'd be limited to just
> 400M rows.

Good point.  I noted that on the TOAST wiki page now, at least (and
also mentioned that using partitioning is a workaround for now).



Re: Fwd: [GENERAL] 4B row limit for CLOB tables

From
José Luis Tallón
Date:
On 04/27/2015 08:49 AM, Jim Nasby wrote:
> On 4/25/15 1:19 PM, Bruce Momjian wrote:
>>     Note if you are storing a table with rows that exceed 2KB in size
>>     (aggregate size of each row) then the "Maximum number of rows in a
>>     table" may be limited to 4 Billion, see TOAST.
>
> That's not accurate though; you could be limited to far less than 4B 
> rows. If each row has 10 fields that toast, you'd be limited to just 
> 400M rows.

ISTM the solution is almost here, and could be done without too much 
(additional) work:
* We have already discussed having a page-per-sequence with the new 
SeqAMs being introduced and how that would improve scalability.
* We have commented on having a sequence per TOAST table
    (hence, 4B toasted values per table, each up to 4B chunks in size...
vs. just 4B toasted values per cluster)

    I'm not sure that I can do it all by myself just yet, but I sure 
can try if there is interest.
    (just after I'm done with another patch that is independent from 
this, though)
    This would be material for 9.6, of course :)

Thanks,
    J.L.




Re: Fwd: [GENERAL] 4B row limit for CLOB tables

From
Jim Nasby
Date:
On 4/28/15 5:41 AM, José Luis Tallón wrote:
> On 04/27/2015 08:49 AM, Jim Nasby wrote:
>> On 4/25/15 1:19 PM, Bruce Momjian wrote:
>>>     Note if you are storing a table with rows that exceed 2KB in size
>>>     (aggregate size of each row) then the "Maximum number of rows in a
>>>     table" may be limited to 4 Billion, see TOAST.
>>
>> That's not accurate though; you could be limited to far less than 4B
>> rows. If each row has 10 fields that toast, you'd be limited to just
>> 400M rows.
>
> ISTM the solution is almost here, and could be done without too much
> (additional) work:
> * We have already discussed having a page-per-sequence with the new
> SeqAMs being introduced and how that would improve scalability.
> * We have commented on having a sequence per TOAST table
>      (hence, 4B toasted values per table, each up to 4B chunks in size...
> vs. just 4B toasted values per cluster)
>
>      I'm not sure that I can do it all by myself just yet, but I sure
> can try if there is interest.

I don't think it would be hard at all to switch toast pointers to being 
sequence-generated instead of OIDs. The only potential downside I see is 
the extra space required for all the sequences... but that would only 
matter on the tiniest of clusters (think embedded), which probably don't 
have that many tables to begin with.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com