Re: Large files for relations - Mailing list pgsql-hackers
| From | Robert Haas |
| --- | --- |
| Subject | Re: Large files for relations |
| Date | |
| Msg-id | CA+TgmoY=PBJaMsV4FuWYSMmNc7EmLWN53eSS4p8SbjF6z2fdgQ@mail.gmail.com |
| In response to | Re: Large files for relations (Stephen Frost <sfrost@snowman.net>) |
| Responses | Re: Large files for relations |
| List | pgsql-hackers |
On Fri, May 12, 2023 at 9:53 AM Stephen Frost <sfrost@snowman.net> wrote:
> While I tend to agree that 1GB is too small, 1TB seems like it's
> possibly going to end up on the too big side of things, or at least,
> if we aren't getting rid of the segment code then it's possibly throwing
> away the benefits we have from the smaller segments without really
> giving us all that much. Going from 1G to 10G would reduce the number
> of open file descriptors by quite a lot without having much of a net
> change on other things. 50G or 100G would reduce the FD handles further
> but starts to make us lose out a bit more on some of the nice parts of
> having multiple segments.

This is my view as well, more or less. I don't really like our current handling of relation segments; we know it has bugs, and making it non-buggy feels difficult. And there are performance issues as well -- file descriptor consumption, for sure, but also, probably, that crossing a file boundary breaks the operating system's ability to do readahead to some degree. However, I think we're going to find that moving to a system where we have just one file per relation fork, and that file can be arbitrarily large, is not fantastic either.

Jim's point about running into filesystem limits is a good one (hi Jim, long time no see!) and the problem he points out with ext4 is almost certainly not the only one. It doesn't have to be just filesystems, either. It could be a limitation of an archiving tool (tar, zip, cpio) or a file copy utility or whatever as well. A quick Google search suggests that most such tools have been updated to use 64-bit sizes, but my point is that the set of things that can potentially cause problems is broader than just the filesystem. Furthermore, even when there's no hard limit at play, a smaller file size can occasionally be *convenient*, as in Pavel's example of using hard links to share storage between backups. From that point of view, a 16GB or 64GB or 256GB file size limit seems more convenient than no limit, and more convenient than a large limit like 1TB.

However, the bugs are the flies in the ointment (ahem). If we just make the segment size bigger but don't get rid of segments altogether, then we still have to fix the bugs that can occur when you do have multiple segments. I think part of Thomas's motivation is to dodge that whole category of problems. If we gradually deprecate multi-segment mode in favor of single-file-per-relation-fork, then the fact that the segment-handling code has bugs becomes progressively less relevant. While that does make some sense, I'm not sure I really agree with the approach. The problem is that we're trading problems we can at least theoretically fix by hitting our code with a big enough hammer for an unknown set of problems that stem from limitations of software we don't control, and maybe don't even know about.

--
Robert Haas
EDB: http://www.enterprisedb.com
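For a rough feel of the file-descriptor arithmetic in the quoted paragraph, here is a minimal standalone sketch (not PostgreSQL code; the 4 TB relation size used below is an assumed example, while the candidate segment sizes come from the discussion):

```c
/*
 * Illustrative only: how many segment files (and hence potential open
 * file descriptors) one relation fork needs at various segment sizes.
 * The 4 TB relation size is an assumption for the sake of example.
 */
#include <stdio.h>
#include <stdint.h>

int
main(void)
{
    const uint64_t rel_size = 4ULL * 1024 * 1024 * 1024 * 1024; /* assumed 4 TB relation */
    const uint64_t seg_sizes[] = {
        1ULL << 30,    /* 1 GB, the current default */
        10ULL << 30,   /* 10 GB */
        100ULL << 30,  /* 100 GB */
        1ULL << 40,    /* 1 TB */
    };

    for (int i = 0; i < 4; i++)
    {
        /* number of segment files, rounding up for a partial last segment */
        uint64_t nsegs = (rel_size + seg_sizes[i] - 1) / seg_sizes[i];

        printf("segment size %5llu GB -> %4llu segment files per fork\n",
               (unsigned long long) (seg_sizes[i] >> 30),
               (unsigned long long) nsegs);
    }
    return 0;
}
```

Under that assumed relation size, going from 1 GB to 10 GB segments drops the per-fork count from 4096 files to 410, while 1 TB segments leave only 4, which is the shape of the trade-off being discussed: fewer descriptors versus losing the conveniences of smaller files.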