Re: Large files for relations - Mailing list pgsql-hackers

From Thomas Munro
Subject Re: Large files for relations
Date
Msg-id CA+hUKGJxtfM3hLrmjP910_nm3DFCFZaUzbHG5Rhw6FpdLwpc6A@mail.gmail.com
In response to Re: Large files for relations  (Pavel Stehule <pavel.stehule@gmail.com>)
Responses Re: Large files for relations
List pgsql-hackers
On Tue, May 2, 2023 at 3:28 PM Pavel Stehule <pavel.stehule@gmail.com> wrote:
> I like this patch - it can save some system sources - I am not sure how much, because bigger tables usually use
> partitioning usually.

Yeah, if you only use partitions of < 1GB it won't make a difference.
Larger partitions are not uncommon, though.

> Important note - this feature breaks sharing files on the backup side - so before disabling 1GB sized files, this
> issue should be solved.

Hmm, right, so there is a backup granularity continuum with "whole
database cluster" at one end, "only files whose size, mtime [or
optionally also checksum] changed since last backup" in the middle,
and "only blocks that changed since LSN of last backup" at the other
end.  Getting closer to the right end of that continuum can make
backups require less reading, less network transfer, less writing
and/or less storage space depending on details.  But this proposal
moves the middle thing further to the left by changing the granularity
from 1GB to whole relation, which can be gargantuan with this patch.
Ultimately we need to be all the way at the right on that continuum,
and there are clearly several people working on that goal.
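
To put rough numbers on that shift, here is a back-of-envelope sketch in
Python.  It is not taken from the patch or from any backup tool: the
function name and constants are made up for illustration, and it assumes
the default 8kB block size and the 1GB segment size discussed above.  It
only computes how many bytes a backup would have to re-copy for a given
set of changed offsets at each granularity.

```python
SEGMENT_BYTES = 1 << 30   # 1GB segment files (assumed default)
BLOCK_BYTES = 8192        # 8kB blocks (assumed default)


def bytes_to_copy(relation_bytes, changed_offsets, granularity):
    """Bytes a backup must re-copy if the given byte offsets changed.

    granularity is the size of the unit the backup tracks, in bytes,
    or None when the unit is the whole relation file.
    """
    if not changed_offsets:
        return 0
    if granularity is None:
        # One big file: any change means copying all of it again.
        return relation_bytes
    # Each changed offset dirties the unit that contains it.
    dirty_units = {offset // granularity for offset in changed_offsets}
    total = 0
    for unit in dirty_units:
        start = unit * granularity
        end = min(start + granularity, relation_bytes)  # last unit may be short
        total += end - start
    return total
```

With a 100GB relation and a single modified block, this gives 100GB for
whole-file granularity, 1GB for segment granularity, and 8kB for block
granularity.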

I'm not involved in any of those projects, but it's fun to think about
an alien technology that produces complete standalone backups like
rsync --link-dest (as opposed to "full" backups followed by a chain of
"incremental" backups that depend on it so you need to retain them
carefully) while still sharing disk blocks with older backups, and
doing so with block granularity.  TL;DW something something WAL
something something copy_file_range().
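
For anyone who hasn't used rsync --link-dest, the file-granularity
version of that idea can be sketched roughly as below.  This is a toy
written for this message, not how rsync or any PostgreSQL backup tool is
implemented: link_dest_backup and its arguments are hypothetical names,
"unchanged" is assumed to mean same size and mtime, and it shares whole
files via hard links.  The block-granularity variant would need
something like copy_file_range() on a filesystem that can share blocks,
which this sketch does not attempt.

```python
import os
import shutil


def link_dest_backup(source_dir, previous_backup, new_backup):
    """Build a complete backup tree in new_backup.

    Files whose size and mtime match the copy in previous_backup are
    hard-linked to it; everything else is copied.  Pass None as
    previous_backup for the first backup.  Returns (linked, copied),
    two lists of paths relative to source_dir.
    """
    linked, copied = [], []
    for root, _dirs, files in os.walk(source_dir):
        rel_root = os.path.relpath(root, source_dir)
        os.makedirs(os.path.normpath(os.path.join(new_backup, rel_root)),
                    exist_ok=True)
        for name in files:
            rel = os.path.normpath(os.path.join(rel_root, name))
            src = os.path.join(source_dir, rel)
            dst = os.path.join(new_backup, rel)
            old = os.path.join(previous_backup, rel) if previous_backup else None

            unchanged = False
            if old and os.path.isfile(old):
                new_st, old_st = os.stat(src), os.stat(old)
                unchanged = (new_st.st_size == old_st.st_size
                             and new_st.st_mtime_ns == old_st.st_mtime_ns)

            if unchanged:
                os.link(old, dst)        # share storage with the older backup
                linked.append(rel)
            else:
                shutil.copy2(src, dst)   # copy2 also carries over the mtime
                copied.append(rel)
    return linked, copied
```

Each backup directory is complete on its own, so an older one can be
deleted without breaking a newer one.  The catch is the one discussed
above: a single changed block forces a full copy of whatever file
contains it.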
