Re: pg_read_file() with virtual files returns empty string - Mailing list pgsql-hackers

From Joe Conway
Subject Re: pg_read_file() with virtual files returns empty string
Date
Msg-id 239cf821-5f47-ff68-677c-61d15e84e770@joeconway.com
Whole thread Raw
In response to Re: pg_read_file() with virtual files returns empty string  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: pg_read_file() with virtual files returns empty string
List pgsql-hackers
On 6/27/20 3:43 PM, Tom Lane wrote:
> Joe Conway <mail@joeconway.com> writes:
>> The attached patch fixes this for me. I think it ought to be backpatched through
>> pg11.
>
>> Comments?
>
> 1. Doesn't seem to be accounting for the possibility of an error in fread().
>
> 2. Don't we want to remove the stat() call altogether, if we're not
> going to believe its length?
>
> 3. This bit might need to cast the RHS to int64:
>     if (bytes_to_read > (MaxAllocSize - VARHDRSZ))
> otherwise it might be treated as an unsigned comparison.
> Or you could check for bytes_to_read < 0 separately.
>
> 4. appendStringInfoString seems like quite the wrong thing to use
> when the input is binary data.
>
> 5. Don't like the comment.  Whether the file is virtual or not isn't
> very relevant here.
>
> 6. If the file size exceeds 1GB, I fear we'll get some rather opaque
> failure from the stringinfo infrastructure.  It'd be better to
> check for that here and give a file-too-large error.


All good stuff -- I believe the attached checks all the boxes.

I noted while at this, that the current code can never hit this case:

!     if (bytes_to_read < 0)
!     {
!         if (seek_offset < 0)
!             bytes_to_read = -seek_offset;

The intention here seems to be that if you pass bytes_to_read = -1 with a
negative offset, it will give you the last offset bytes of the file.

But all of the SQL exposed paths disallow an explicit negative value for
bytes_to_read. This was also not documented as far as I can tell so I eliminated
that case in the attached. Is that actually a case I should fix/support instead?

Separately, it seems to me that a two argument version of pg_read_file() would
be useful:

  pg_read_file('filename', offset)

In that case bytes_to_read = -1 could be passed down in order to read the entire
file after the offset. In fact I think that would nicely handle the negative
offset case as well.

Joe

--
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development

Attachment

pgsql-hackers by date:

Previous
From: Felix Lechner
Date:
Subject: Re: Fwd: PostgreSQL: WolfSSL support
Next
From: Tom Lane
Date:
Subject: Re: bugfix: invalid bit/varbit input causes the log file to be unreadable