Re: Use pg_pwritev_with_retry() instead of write() in dir_open_for_write() to avoid partial writes? - Mailing list pgsql-hackers

From Bharath Rupireddy
Subject Re: Use pg_pwritev_with_retry() instead of write() in dir_open_for_write() to avoid partial writes?
Date
Msg-id CALj2ACUmvGb78zd49OXcRWvTX1WYHnOcF6oikQiR2i2yH4zvgw@mail.gmail.com
Whole thread Raw
In response to Re: Use pg_pwritev_with_retry() instead of write() in dir_open_for_write() to avoid partial writes?  (Andres Freund <andres@anarazel.de>)
Responses Re: Use pg_pwritev_with_retry() instead of write() in dir_open_for_write() to avoid partial writes?  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
On Mon, Feb 13, 2023 at 11:09 PM Andres Freund <andres@anarazel.de> wrote:
>
> > Later code assigns iov[0].iov_len thus we need to provide a separate
> > iov non-const variable, or can we use pwrite instead there?  (I didn't
> > find pg_pwrite_with_retry(), though)
>
> Given that we need to do that, and given that we already need to loop to
> handle writes that are longer than PG_IOV_MAX * BLCKSZ, it's probably not
> worth avoiding iov initialization.
>
> But I think it's worth limiting the initialization to blocks.

We can still optimize away the for loop by using a single iovec for
remaining size, like the attached v2 patch.

> I'd also try to combine the first pg_writev_* with the second one.

Done, PSA v2 patch.

On Tue, Feb 14, 2023 at 6:40 AM Andres Freund <andres@anarazel.de> wrote:
>
> Hi,
>
> On 2023-02-12 09:31:36 -0800, Andres Freund wrote:
> > Another thing: I think we should either avoid iterating over all the IOVs if
> > we don't need them, or, even better, initialize the array as a constant, once.
>
> I just tried to use pg_pwrite_zeros - and couldn't because it doesn't have an
> offset parameter.  Huh, what lead to the function being so constrained?

Done, PSA v2 patch.

We could do few more things, but honestly I feel they're unnecessary:
1) An assert-only code that checks if the asked file contents are
zeroed at the end of pg_pwrite_zeros (to be more defensive, but
reading 16MB files and checking if it's zero-filled will surely
slowdown the Assert builds).
2) A small test module passing in a file with the size to write isn't
multiple of block size, meaning, the code we have in the function to
write last remaining bytes (less than BLCKSZ) gets covered which isn't
covered right now -
https://coverage.postgresql.org/src/common/file_utils.c.gcov.html.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachment

pgsql-hackers by date:

Previous
From: John Naylor
Date:
Subject: Re: [PoC] Improve dead tuple storage for lazy vacuum
Next
From: John Naylor
Date:
Subject: Re: [PoC] Improve dead tuple storage for lazy vacuum