Thread: File* argument order, argument types

File* argument order, argument types

From
Andres Freund
Date:
Hi,

While trying to add additional File* functions (FileZero, FileFallocat) I went
back and forth about the argument order between "amount" and "offset".

We have:

extern int FilePrefetch(File file, off_t offset, off_t amount, uint32 wait_event_info);
extern int FileRead(File file, void *buffer, size_t amount, off_t offset, uint32 wait_event_info);
extern int FileWrite(File file, const void *buffer, size_t amount, off_t offset, uint32 wait_event_info);
extern int FileTruncate(File file, off_t offset, uint32 wait_event_info);
extern void FileWriteback(File file, off_t offset, off_t nbytes, uint32 wait_event_info);

and I want to add (for [1])
extern int FileZero(File file, off_t amount, off_t offset, uint32 wait_event_info);
extern int FileFallocate(File file, off_t amount, off_t offset, uint32 wait_event_info);

The differences originate in trying to mirror the underlying function's
signatures:

int posix_fadvise(int fd, off_t offset, off_t len, int advice);
ssize_t pread(int fd, void buf[.count], size_t count, off_t offset);
ssize_t pwrite(int fd, const void buf[.count], size_t count, off_t offset);
int ftruncate(int fd, off_t length);
int posix_fallocate(int fd, off_t offset, off_t len);
int sync_file_range(int fd, off64_t offset, off64_t nbytes, unsigned int flags);


It seems quite confusing to be this inconsistent about argument order and
argument types in the File* functions. For one, the relation to the underlying
posix functions isn't always obvious. For another, we're not actually
mirroring the signatures all that well, our argument and return types don't
actually match.


It'd be easy enough to decide on a set of types for the arguments, that'd be
API (but not necessarily ABI compatible, but we don't care) compatible. But
changing the argument order would commonly lead to silent breakage, which
obviously would be bad. Or maybe it's unlikely enough that there are external
callers?

I don't know what to actually propose. I guess the least bad I can see is to
pick one type & argument order that we document to be the default, with a
caveat placed above the functions not following the argument order.

Order wise, I think we should choose amount, offset.  For the return type we
probably should pick ssize_t?  I don't know what we should standardize on for
'amount', I'd probably be inclined to go for size_t.

Greetings,

Andres Freund

[1] https://postgr.es/m/20221029025420.eplyow6k7tgu6he3@awork3.anarazel.de



Re: File* argument order, argument types

From
Ashutosh Bapat
Date:
On Sat, Feb 18, 2023 at 6:23 AM Andres Freund <andres@anarazel.de> wrote:
>
> Hi,
>
> While trying to add additional File* functions (FileZero, FileFallocat) I went
> back and forth about the argument order between "amount" and "offset".
>
> We have:
>
> extern int FilePrefetch(File file, off_t offset, off_t amount, uint32 wait_event_info);
> extern int FileRead(File file, void *buffer, size_t amount, off_t offset, uint32 wait_event_info);
> extern int FileWrite(File file, const void *buffer, size_t amount, off_t offset, uint32 wait_event_info);
> extern int FileTruncate(File file, off_t offset, uint32 wait_event_info);
> extern void FileWriteback(File file, off_t offset, off_t nbytes, uint32 wait_event_info);
>
> and I want to add (for [1])
> extern int FileZero(File file, off_t amount, off_t offset, uint32 wait_event_info);
> extern int FileFallocate(File file, off_t amount, off_t offset, uint32 wait_event_info);
>
> The differences originate in trying to mirror the underlying function's
> signatures:
>
> int posix_fadvise(int fd, off_t offset, off_t len, int advice);
> ssize_t pread(int fd, void buf[.count], size_t count, off_t offset);
> ssize_t pwrite(int fd, const void buf[.count], size_t count, off_t offset);
> int ftruncate(int fd, off_t length);
> int posix_fallocate(int fd, off_t offset, off_t len);
> int sync_file_range(int fd, off64_t offset, off64_t nbytes, unsigned int flags);
>
>
> It seems quite confusing to be this inconsistent about argument order and
> argument types in the File* functions. For one, the relation to the underlying
> posix functions isn't always obvious. For another, we're not actually
> mirroring the signatures all that well, our argument and return types don't
> actually match.
>
>
> It'd be easy enough to decide on a set of types for the arguments, that'd be
> API (but not necessarily ABI compatible, but we don't care) compatible. But
> changing the argument order would commonly lead to silent breakage, which
> obviously would be bad. Or maybe it's unlikely enough that there are external
> callers?

I am sure there are extensions and forks which use these APIs and they
will be surprised to see this change OR will face silent breakage.
Do you consider those as external callers?

-- 
Best Wishes,
Ashutosh Bapat



Re: File* argument order, argument types

From
Peter Eisentraut
Date:
On 18.02.23 01:52, Andres Freund wrote:
> I don't know what to actually propose. I guess the least bad I can see is to
> pick one type & argument order that we document to be the default, with a
> caveat placed above the functions not following the argument order.
> 
> Order wise, I think we should choose amount, offset.  For the return type we
> probably should pick ssize_t?  I don't know what we should standardize on for
> 'amount', I'd probably be inclined to go for size_t.

This reminds me that most people talk about LIMIT X OFFSET Y, even 
though what actually happens is that the offset is handled first and 
then the limit is applied.  This is also reflected in the standard SQL 
spelling OFFSET Y FETCH FIRST Z ROWS, in that order.  So, just saying 
that there is universal disagreement on the order of these things.

I think the correct order is offset, then amount.  And I think the OS C 
APIs mostly agree with that, if you look at the newer ones.  The 
exceptions are the likes of pread() and pwrite(); I think they just kept 
the signature of read() and write() and added the additional offset 
argument at the end, which I think is a sensible compromise.

For the proposed FileFallocate() I would therefore also keep the order 
of posix_fallocate(), so it would be

     FileFallocate(File file, off_t offset, off_t len, ...)