Re: Large files for relations - Mailing list pgsql-hackers

From Jim Mlodgenski
Subject Re: Large files for relations
Date
Msg-id CAB_5SReGK4FhMkb+wjY0umy8AUDfYQ7UUwFjGN9-M+aGsm+E-w@mail.gmail.com
In response to Re: Large files for relations  (Thomas Munro <thomas.munro@gmail.com>)
List pgsql-hackers


On Thu, May 11, 2023 at 7:38 PM Thomas Munro <thomas.munro@gmail.com> wrote:
> On Fri, May 12, 2023 at 8:16 AM Jim Mlodgenski <jimmy76@gmail.com> wrote:
> > On Mon, May 1, 2023 at 9:29 PM Thomas Munro <thomas.munro@gmail.com> wrote:
> >> I am not aware of any modern/non-historic filesystem[2] that can't do
> >> large files with ease.  Anyone know of anything to worry about on that
> >> front?
> >
> > There is some trouble in the ambiguity of what we mean by "modern" and "large files". There are still a large number of users of ext4 where the max file size is 16TB. Switching to a single large file per relation would effectively cut the max table size in half for those users. How would a user with say a 20TB table running on ext4 be impacted by this change?
>
> Hrmph.  Yeah, that might be a bit of a problem.  I see it discussed in
> various places that MySQL/InnoDB can't have tables bigger than 16TB on
> ext4 because of this, when it's in its default one-file-per-object
> mode (as opposed to its big-tablespace-files-to-hold-all-the-objects
> mode like DB2, Oracle etc, in which case I think you can have multiple
> 16TB segment files and get past that ext4 limit).  It's frustrating
> because 16TB is still really, really big and you probably should be
> using partitions, or more partitions, to avoid all kinds of other
> scalability problems at that size.  But however hypothetical the
> scenario might be, it should work,

Agreed, it is frustrating, but it is not hypothetical. I have seen a number of
users who have single tables larger than 16TB and don't use partitioning because
of the limitations we have today. The most common reason is needing multiple
unique constraints on the table that don't include the partition key, for
example on user_id and on email. There are workarounds for those cases, but
usually it's easier to deal with a single large table than to deal with the
sharp edges those workarounds introduce.
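
For reference, here is a rough sketch of the arithmetic behind the "cut in half" point quoted above. It assumes a default build (8kB blocks, 1GB segment files, 32-bit block numbers) and ext4 with 4kB blocks, and it reads "20TB" as 20 TiB. The constant and function names are made up for illustration and are not taken from the PostgreSQL source.

```python
# Back-of-the-envelope limits. Every constant here is an assumption about a
# default build or a typical ext4 setup, not something read from a live system.
KIB = 1024
GIB = 1024 ** 3
TIB = 1024 ** 4

PG_BLOCK_SIZE = 8 * KIB        # assumed default block size
PG_MAX_BLOCKS = 2 ** 32 - 1    # 32-bit block numbers, one value reserved
PG_SEGMENT_SIZE = 1 * GIB      # assumed default segment file size

EXT4_BLOCK_SIZE = 4 * KIB      # assumed ext4 block size
EXT4_MAX_FILE = 2 ** 32 * EXT4_BLOCK_SIZE  # roughly the 16TB per-file limit


def max_relation_bytes(block_size=PG_BLOCK_SIZE):
    """Largest relation addressable with 32-bit block numbers."""
    return PG_MAX_BLOCKS * block_size


def segments_needed(table_bytes, segment_size=PG_SEGMENT_SIZE):
    """Number of segment files a table of this size is split into today."""
    return -(-table_bytes // segment_size)  # ceiling division


def fits_in_single_file(table_bytes, max_file=EXT4_MAX_FILE):
    """Whether the table could live in one file under the assumed limit."""
    return table_bytes <= max_file


if __name__ == "__main__":
    table = 20 * TIB
    print("max relation size (TiB):", max_relation_bytes() / TIB)
    print("ext4 max file size (TiB):", EXT4_MAX_FILE / TIB)
    print("1GB segments for a 20 TiB table:", segments_needed(table))
    print("fits in one ext4 file:", fits_in_single_file(table))
```

Under those assumptions the relation limit is just under 32 TiB while a single ext4 file tops out around 16 TiB, so a 20 TiB table works today as many 1GB segments but would not fit in one file.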
 
 
