Re: pg_read_file() with virtual files returns empty string - Mailing list pgsql-hackers

From Joe Conway
Subject Re: pg_read_file() with virtual files returns empty string
Msg-id cc993759-5360-c61f-a345-02fc99b9fb78@joeconway.com
In response to Re: pg_read_file() with virtual files returns empty string  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: pg_read_file() with virtual files returns empty string
List pgsql-hackers
On 7/2/20 4:27 PM, Tom Lane wrote:
> Joe Conway <mail@joeconway.com> writes:
>> When I saw originally MaxAllocSize - 5 fail I skipped to something smaller by
>> 4096 and it worked. But here I see that the actual max size is MaxAllocSize - 6.
>
> Huh, I wonder why it's not max - 5.  Probably not worth worrying about,
> though.

Well this part:

+    rbytes = fread(sbuf.data + sbuf.len, 1,
+       (size_t) (sbuf.maxlen - sbuf.len - 1), file);

could actually be:

+    rbytes = fread(sbuf.data + sbuf.len, 1,
+       (size_t) (sbuf.maxlen - sbuf.len), file);

because there is no actual need to reserve a byte for the trailing null, since
we are no longer using appendBinaryStringInfo(), and that is where the
trailing null gets written.
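
To make the one-byte difference concrete, here is a minimal standalone sketch. DemoBuf, demo_fill() and demo_run() are hypothetical names invented for illustration; this is not PostgreSQL's StringInfoData or the patch itself, only the same fread() call shape with and without the reserved byte.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for a growable buffer; NOT PostgreSQL's StringInfoData. */
typedef struct
{
    char   *data;
    size_t  len;        /* bytes currently stored */
    size_t  maxlen;     /* bytes allocated */
} DemoBuf;

/*
 * Read into the free tail of buf.  With reserve_nul set, one byte is held
 * back for a terminator; otherwise all remaining space is offered to fread().
 */
static size_t
demo_fill(DemoBuf *buf, FILE *file, int reserve_nul)
{
    size_t  room;
    size_t  rbytes;

    assert(buf->len <= buf->maxlen);
    room = buf->maxlen - buf->len;
    if (reserve_nul && room > 0)
        room--;

    rbytes = fread(buf->data + buf->len, 1, room, file);
    buf->len += rbytes;
    return rbytes;
}

/* Fill a 16-byte buffer (4 bytes already used) from a 32-byte temp file. */
static size_t
demo_run(int reserve_nul)
{
    char    storage[16];
    DemoBuf buf = {storage, 4, sizeof(storage)};
    FILE   *file = tmpfile();
    size_t  got;
    int     i;

    if (file == NULL)
        return 0;               /* could not create a temp file */
    for (i = 0; i < 32; i++)
        fputc('x', file);
    rewind(file);

    got = demo_fill(&buf, file, reserve_nul);
    fclose(file);
    return got;
}
```

With 12 bytes of free space, demo_run(1) reads 11 bytes and demo_run(0) reads 12, which is the byte the change above gives back.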

With that change (and some elog(NOTICE,...) calls) we have:

select length(pg_read_binary_file('/tmp/rbftest2.bin'));
NOTICE:  loop start - buf max len: 1024; buf len 4
NOTICE:  loop end - buf max len: 8192; buf len 8192
NOTICE:  loop start - buf max len: 8192; buf len 8192
NOTICE:  loop end - buf max len: 16384; buf len 16384
NOTICE:  loop start - buf max len: 16384; buf len 16384
[...]
NOTICE:  loop end - buf max len: 536870912; buf len 536870912
NOTICE:  loop start - buf max len: 536870912; buf len 536870912
NOTICE:  loop end - buf max len: 1073741823; buf len 1073741822
   length
------------
 1073741818
(1 row)

Or max - 5, so we got our byte back :-)
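
One way to read those numbers, as a sketch: the buffer starts with a 4-byte header (the "buf len 4" in the first NOTICE), and the loop only ends when fread() comes back short, which costs one more byte. The function name and the DEMO_ macros below are hypothetical; the constants mirror PostgreSQL's MaxAllocSize (0x3fffffff) and VARHDRSZ (4), and the accounting itself is an interpretation of the output above rather than something taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_ALLOC  ((size_t) 0x3fffffff)   /* same value as MaxAllocSize */
#define DEMO_VARHDRSZ   ((size_t) 4)            /* 4-byte varlena length header */

/*
 * Largest file payload that fits, given `reserved` bytes held back in the
 * buffer.  If EOF can only be noticed through a short read, the payload must
 * be one byte smaller than the remaining room.
 */
static size_t
demo_max_payload(size_t reserved, int needs_short_read)
{
    size_t  room;

    assert(reserved <= DEMO_MAX_ALLOC - DEMO_VARHDRSZ);
    room = DEMO_MAX_ALLOC - DEMO_VARHDRSZ - reserved;
    return needs_short_read ? room - 1 : room;
}
```

Under these assumptions, demo_max_payload(1, 1) gives max - 6, demo_max_payload(0, 1) gives max - 5 (1073741818, the length shown above), and demo_max_payload(0, 0) gives max - 4.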

In fact, in principle there is no reason we can't get to max - 4 with this code,
except that when the file size is exactly 1073741819, we need to try to read one
more byte to find the EOF, the way I did in my patch. I.e.:

-- use 1073741819 byte file
select length(pg_read_binary_file('/tmp/rbftest1.bin'));
NOTICE:  loop start - buf max len: 1024; buf len 4
NOTICE:  loop end - buf max len: 8192; buf len 8192
NOTICE:  loop start - buf max len: 8192; buf len 8192
NOTICE:  loop end - buf max len: 16384; buf len 16384
NOTICE:  loop start - buf max len: 16384; buf len 16384
[...]
NOTICE:  loop end - buf max len: 536870912; buf len 536870912
NOTICE:  loop start - buf max len: 536870912; buf len 536870912
NOTICE:  loop end - buf max len: 1073741823; buf len 1073741823
NOTICE:  loop start - buf max len: 1073741823; buf len 1073741823
ERROR:  requested length too large

Because we read the last byte, but not beyond, EOF is not reached, so on the
next loop iteration we continue and fail on max size rather than exit the loop.
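
The stdio behavior behind this can be reproduced in isolation. The sketch below uses a hypothetical helper, demo_eof_known(), and a temporary file; it is not the code from the patch. It reads exactly as many bytes as the file holds and then checks feof(), optionally after probing for one more byte. On common C libraries the end-of-file indicator is only set once a read actually tries to go past the end.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Create a temp file of file_size bytes, read exactly `want` of them, and
 * report whether the stream's EOF indicator is set.  With probe set, try to
 * read one extra byte first.  Returns 1 (EOF seen), 0 (not seen), or -1 on
 * setup failure.
 */
static int
demo_eof_known(size_t file_size, size_t want, int probe)
{
    char    buf[64];
    char    extra;
    FILE   *file;
    size_t  i;
    int     result;

    assert(want <= sizeof(buf));
    file = tmpfile();
    if (file == NULL)
        return -1;
    for (i = 0; i < file_size; i++)
        fputc('x', file);
    rewind(file);

    if (fread(buf, 1, want, file) != want)
    {
        fclose(file);
        return -1;
    }
    if (probe)
    {
        /* Expected to return 0 when the file is exhausted. */
        size_t  n = fread(&extra, 1, 1, file);

        (void) n;
    }

    result = feof(file) ? 1 : 0;
    fclose(file);
    return result;
}
```

For an 8-byte file, demo_eof_known(8, 8, 0) reports that EOF has not been seen, while demo_eof_known(8, 8, 1) reports that it has, which is why the loop needs the extra one-byte read to stop cleanly at exactly max - 4.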

But I am guessing that test in particular was what you thought too complicated
for what it accomplishes?

Joe
--
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development

