Re: pg_receivewal fail to streams when the partial file to write is not fully initialized present in the wal receiver directory - Mailing list pgsql-hackers

From Bharath Rupireddy
Subject Re: pg_receivewal fail to streams when the partial file to write is not fully initialized present in the wal receiver directory
Date
Msg-id CALj2ACV=+KP6jpL_NpTzRnTRahL_DmKdNftrezyTwQEZxnN_BA@mail.gmail.com
In response to Re: pg_receivewal fail to streams when the partial file to write is not fully initialized present in the wal receiver directory  (Michael Paquier <michael@paquier.xyz>)
Responses Re: pg_receivewal fail to streams when the partial file to write is not fully initialized present in the wal receiver directory
List pgsql-hackers
On Mon, Apr 25, 2022 at 6:38 AM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Fri, Apr 22, 2022 at 07:17:37PM +0530, Bharath Rupireddy wrote:
> > Right. We find enough disk space and go to write and suddenly the
> > write operations fail for some reason or the VM crashes because of a
> > reason other than disk space. I think the foolproof solution is to
> > figure out the available disk space before prepadding or compressing
> > and also use the
> > write-first-to-temp-file-and-then-rename-it-to-original-file as
> > proposed in the earlier patches in this thread.
>
> Yes, what would count here is only the amount of free space in a
> partition.  The total amount of space available becomes handy once you
> begin introducing things like percentage-based quota policies for the
> disk when archiving.  The free amount of space could be used to define
> a policy based on the maximum number of bytes you need to leave
> around, as well, but this is not perfect science as this depends of
> what FSes decide to do underneath.  There are a couple of designs
> possible here.  When I had to deal with my upthread case I have chosen
> one as I had no need to worry only about Linux, it does not mean that
> this is the best choice that would fit with the long-term community
> picture.  This comes down to how much pg_receivewal should handle
> automatically, and how it should handle it.

Thanks. I'm not sure why we are thinking only of crashes due to running
out of disk space. Figuring out the free disk space before writing a
large file (say, a WAL file) is a problem for core postgres as well,
not just for pg_receivewal.

I think we are a bit off-track here. Let me illustrate what the whole
problem is and the idea:

If the node/VM on which pg_receivewal runs goes down, crashes, or fails
during a write operation while padding the target WAL file (the
.partial file) with zeros, the unfilled target WAL file (let me call
this a partially padded .partial file) is left over. Subsequent
reads/writes to it fail with the error "write-ahead log file \"%s\"
has %zd bytes, should be 0 or %d", and manual intervention is required
to remove the file. In a service, this manual intervention is what we
would like to avoid. Let's not bother with compressed file writes for
now, as they don't have a prepadding phase.
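
To make the failure mode concrete, here is a minimal sketch of the rule
implied by that error message: an existing .partial file is usable only
if it is empty or exactly one WAL segment long. The names below
(PartialState, classify_partial_size) are hypothetical and used only
for illustration; this is not the actual pg_receivewal code.

```c
/* Hypothetical states for an existing .partial file, for illustration only. */
typedef enum
{
    PARTIAL_NEEDS_PADDING,  /* 0 bytes: can be prepadded from scratch */
    PARTIAL_READY,          /* exactly one segment: fully prepadded */
    PARTIAL_UNUSABLE        /* anything else: partially padded leftover */
} PartialState;

/*
 * Classify a .partial file by its size, following the "should be 0 or %d"
 * rule from the error message. wal_seg_size is the WAL segment size in
 * bytes (16 MB by default).
 */
PartialState
classify_partial_size(long long size, long long wal_seg_size)
{
    if (size == 0)
        return PARTIAL_NEEDS_PADDING;
    if (size == wal_seg_size)
        return PARTIAL_READY;
    return PARTIAL_UNUSABLE;
}
```

A crash in the middle of prepadding leaves a size strictly between 0
and the segment size, which lands in the last branch and is what needs
manual cleanup today.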

The proposed solution is to make the prepadding atomic - prepad the
file under the name XXXX.partial.tmp and, once the prepadding is done,
rename it (durably if the sync option is chosen for pg_receivewal) to
XXXX.partial. Before prepadding XXXX.partial.tmp, delete any existing
XXXX.partial.tmp.
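
The steps above could be sketched roughly as follows, using plain POSIX
calls. This is only an illustration under stated assumptions, not the
patch itself: prepad_partial_atomically and file_size are hypothetical
names, the 8 kB write chunk is arbitrary, and a fully durable rename
would also need to fsync the containing directory, which is left out
here for brevity.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: file size in bytes, or -1 if it cannot be stat'ed. */
long long
file_size(const char *path)
{
    struct stat st;

    return (stat(path, &st) == 0) ? (long long) st.st_size : -1;
}

/*
 * Hypothetical sketch: zero-pad "<partial_path>.tmp" to seg_size bytes, then
 * rename it to partial_path. Returns 0 on success, -1 on failure.
 */
int
prepad_partial_atomically(const char *partial_path, size_t seg_size, bool do_sync)
{
    char    tmp_path[1024];
    char    zeros[8192];
    size_t  written = 0;
    int     len;
    int     fd;

    len = snprintf(tmp_path, sizeof(tmp_path), "%s.tmp", partial_path);
    if (len < 0 || (size_t) len >= sizeof(tmp_path))
        return -1;

    /* Delete a temporary file left over from an earlier crash, if any. */
    if (unlink(tmp_path) != 0 && errno != ENOENT)
        return -1;

    fd = open(tmp_path, O_WRONLY | O_CREAT | O_EXCL, 0600);
    if (fd < 0)
        return -1;

    /* Prepad the temporary file with zeros, one chunk at a time. */
    memset(zeros, 0, sizeof(zeros));
    while (written < seg_size)
    {
        size_t  chunk = seg_size - written;
        ssize_t rc;

        if (chunk > sizeof(zeros))
            chunk = sizeof(zeros);
        rc = write(fd, zeros, chunk);
        if (rc < 0 && errno == EINTR)
            continue;           /* interrupted, retry the same chunk */
        if (rc <= 0)
        {
            close(fd);
            unlink(tmp_path);
            return -1;
        }
        written += (size_t) rc;
    }

    /* Flush the padded contents before exposing the file, if requested. */
    if (do_sync && fsync(fd) != 0)
    {
        close(fd);
        unlink(tmp_path);
        return -1;
    }
    if (close(fd) != 0)
    {
        unlink(tmp_path);
        return -1;
    }

    /* The fully padded file becomes visible under its final name in one step. */
    if (rename(tmp_path, partial_path) != 0)
    {
        unlink(tmp_path);
        return -1;
    }
    return 0;
}
```

With this ordering, a crash during prepadding leaves behind at most a
stale XXXX.partial.tmp, which the next attempt deletes, so XXXX.partial
is never observed in a partially padded state.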

The above problem isn't unique to pg_receivewal; pg_basebackup too
uses CreateWalDirectoryMethod and dir_open_for_write via
ReceiveXlogStream.

IMHO, whether pg_receivewal should check for available disk space
before writing any file would be better discussed separately.

Regards,
Bharath Rupireddy.


