Re: Potential stack overflow in incremental base backup - Mailing list pgsql-hackers

From: Thomas Munro
Subject: Re: Potential stack overflow in incremental base backup
Msg-id: CA+hUKGLJBT4Nto1kUp2Q4hLSw5c30a=+ai5Uk2iYn5gD3LgYVQ@mail.gmail.com
In response to: Re: Potential stack overflow in incremental base backup (Robert Haas <robertmhaas@gmail.com>)
List: pgsql-hackers

On Tue, Apr 16, 2024 at 4:10 AM Robert Haas <robertmhaas@gmail.com> wrote:
> On Wed, Apr 10, 2024 at 9:55 PM Thomas Munro <thomas.munro@gmail.com> wrote:
> > To rescue my initdb --rel-segsize project[1] for v18, I will have a go
> > at making that dynamic.  It looks like we don't actually need to
> > allocate that until we get to the GetFileBackupMethod() call, and at
> > that point we have the file size.  If I understand correctly,
> > statbuf.st_size / BLCKSZ would be enough, so we could embiggen our
> > block number buffer there if required, right?
>
> Yes.

Here is a first attempt at that.  Better factoring welcome.  New
observations made along the way: the current coding can exceed
MaxAllocSize and error out, or overflow a 32-bit size_t and allocate
nonsense.  Do you think it'd be better to raise an error explaining
that, to fall back silently to a full backup (tried that way in the
attached), or to do that plus log messages?  Another option would be
to use huge allocations, so we'd only have to deal with that sort of
question on 32-bit systems (i.e. effectively hypothetical/non-relevant
scenarios), but I don't love that idea.
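
In outline, the sizing logic looks something like the following
standalone sketch (BlockNumber, BLCKSZ and MaxAllocSize are restated
here with their usual values, and alloc_block_number_buffer is just an
illustrative name, not the factoring used in the attached patch; the
NULL return implements the silent fall-back-to-full policy tried
there):

#include <stdint.h>
#include <stdlib.h>
#include <sys/types.h>

/* Stand-ins for the usual PostgreSQL definitions, for illustration only. */
typedef uint32_t BlockNumber;
#define BLCKSZ       8192
#define MaxAllocSize ((size_t) 0x3fffffff)  /* ~1GB */

/*
 * Size the relative block number buffer from the file size available at
 * GetFileBackupMethod() time, instead of using a fixed RELSEG_SIZE
 * array.  Returns NULL, meaning "back this file up in full", if the
 * allocation would exceed MaxAllocSize or overflow a 32-bit size_t.
 */
static BlockNumber *
alloc_block_number_buffer(off_t file_size, size_t *nblocks)
{
    /* Compute in 64 bits so a huge file can't truncate on 32-bit size_t. */
    uint64_t    max_blocks = (uint64_t) file_size / BLCKSZ;

    /* Guard the multiplication before performing it. */
    if (max_blocks > MaxAllocSize / sizeof(BlockNumber))
        return NULL;            /* caller falls back to a full copy */

    *nblocks = (size_t) max_blocks;
    return malloc(max_blocks * sizeof(BlockNumber));
}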

> ...
> I do understand that a 1GB segment size is not that big in 2024, and
> that filesystems with a 2GB limit are thought to have died out a long
> time ago, and I'm not against using larger segments. I do think,
> though, that increasing the segment size by 32768x in one shot is
> likely to be overdoing it.

My intuition is that the primary interesting lines to cross are at 2GB
and 4GB, due to data type limits.  I defend against the most basic
problem in my proposal: I don't let you exceed your off_t type, but
that doesn't mean we don't have off_t/ssize_t/size_t/long snafus
lurking in our code that could bite a 32-bit system with large files.
If you make it past those magic numbers and your tools are all happy,
I think you should be home free until you hit file system limits,
which are effectively unhittable on most systems except ext4's
already-bemoaned 16TB limit, AFAIK.  All the same, I'm contemplating
limiting the range to 1TB in the first version, not out of a general
fear of unknown unknowns, but just because it means we don't need to
use "huge" allocations for this known place, maybe until we can
address that.
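
As a sanity check on the 1TB cap, assuming the default 8KB BLCKSZ,
a 4-byte BlockNumber and the usual ~1GB MaxAllocSize (all restated
locally here rather than taken from the tree):

#include <stdint.h>

/*
 * With segments capped at 1TB, the per-file block number array stays
 * comfortably under MaxAllocSize, so ordinary allocations suffice and
 * "huge" allocations aren't needed:
 *
 *   1TB / 8KB = 134,217,728 blocks
 *   134,217,728 * sizeof(BlockNumber) = 512MB < ~1GB MaxAllocSize
 */
_Static_assert(((uint64_t) 1 << 40) / 8192 * sizeof(uint32_t)
               <= 0x3fffffff,
               "1TB segments keep the block number array palloc-able");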
