On Mon, Apr 25, 2022 at 6:38 AM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Fri, Apr 22, 2022 at 07:17:37PM +0530, Bharath Rupireddy wrote:
> > Right. We find enough disk space and go to write and suddenly the
> > write operations fail for some reason or the VM crashes because of a
> > reason other than disk space. I think the foolproof solution is to
> > figure out the available disk space before prepadding or compressing
> > and also use the
> > write-first-to-temp-file-and-then-rename-it-to-original-file as
> > proposed in the earlier patches in this thread.
>
> Yes, what would count here is only the amount of free space in a
> partition. The total amount of space available becomes handy once you
> begin introducing things like percentage-based quota policies for the
> disk when archiving. The free amount of space could be used to define
> a policy based on the maximum number of bytes you need to leave
> around, as well, but this is not perfect science as this depends of
> what FSes decide to do underneath. There are a couple of designs
> possible here. When I had to deal with my upthread case I have chosen
> one as I had no need to worry only about Linux, it does not mean that
> this is the best choice that would fit with the long-term community
> picture. This comes down to how much pg_receivewal should handle
> automatically, and how it should handle it.
Thanks. I'm not sure why we are just thinking of crashes due to
out-of-disk space. Figuring out free disk space before writing a huge
file (say a WAL file) is a problem in itself to the core postgres as
well, not just pg_receivewal.
I think we are off-track a bit here. Let me illustrate what's the
whole problem is and the idea:
If the node/VM on which pg_receivewal runs, goes down/crashes or fails
during write operation while padding the target WAL file (the .partial
file) with zeros, the unfilled target WAL file ((let me call this file
a partially padded .partial file) will be left over and subsequent
reads/writes to that it will fail with "write-ahead log file \"%s\"
has %zd bytes, should be 0 or %d" error which requires manual
intervention to remove it. In a service, this manual intervention is
what we would like to avoid. Let's not much bother right now for
compressed file writes (for now at least) as they don't have a
prepadding phase.
The proposed solution is to make the prepadding atomic - prepad the
XXXX.partial file as XXXX.partial.tmp name and after the prepadding
rename (durably if sync option is chosen for pg_receivewal) to
XXXX.partial. Before prepadding XXXX.partial.tmp, delete the
XXXX.partial.tmp if it exists.
The above problem isn't unique to pg_receivewal alone, pg_basebackup
too uses CreateWalDirectoryMethod and dir_open_for_write via
ReceiveXlogStream.
IMHO, pg_receivewal checking for available disk space before writing
any file should better be discussed separately?
Regards,
Bharath Rupireddy.