Re: pg_read_file() with virtual files returns empty string - Mailing list pgsql-hackers

From Joe Conway
Subject Re: pg_read_file() with virtual files returns empty string
Date
Msg-id 79fec3f2-7635-d936-596a-12518d5fecfe@joeconway.com
Whole thread Raw
In response to Re: pg_read_file() with virtual files returns empty string  (Joe Conway <mail@joeconway.com>)
Responses Re: pg_read_file() with virtual files returns empty string
List pgsql-hackers
On 7/1/20 5:17 PM, Joe Conway wrote:
> On 7/1/20 4:12 PM, Tom Lane wrote:
>> Joe Conway <mail@joeconway.com> writes:
>>> I did some performance testing of the worst case/largest possible file and found
>>> that skipping the stat and bulk read does cause a significant regression.
>>
>> Yeah, I was wondering a little bit if that'd be an issue.
>>
>>> In the attached patch I was able to get most of the performance degradation back
>>> -- ~600ms. Hopefully you don't think what I did was "too cute by half" :-). Do
>>> you think this is good enough or should we go back to using the stat file size
>>> when it is > 0?
>>
>> I don't think it's unreasonable to "get in bed" with the innards of the
>> StringInfo; plenty of other places do already, such as pqformat.h or
>> pgp_armor_decode, just to name the first couple that I came across in a
>> quick grep.
>>
>> However, if we're going to get in bed with it, let's get all the way in
>> and just read directly into the StringInfo's buffer, as per attached.
>> This saves all the extra memcpy'ing and reduces the number of fread calls
>> to at most log(N).
>
> Works for me. I'll retest to see how well it does performance-wise and report back.

A quick test shows that this gets performance back on par with HEAD.

The only downside is that the max filesize is reduced to (MaxAllocSize -
MIN_READ_SIZE - 1) compared to MaxAllocSize with the old method.

But anyone pushing that size limit is going to run into other issues anyway. I.e
(on pg11):

8<---------------
select length(pg_read_binary_file('/tmp/rbftest4.bin'));
   length

------------
 1073737726
(1 row)

select pg_read_binary_file('/tmp/rbftest4.bin');
ERROR:  invalid memory alloc request size 2147475455
8<---------------

So probably not worth worrying about.

Joe

--
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development


Attachment

pgsql-hackers by date:

Previous
From: "Sanaba, Bilva"
Date:
Subject: Adding Support for Copy callback functionality on COPY TO api
Next
From: Magnus Hagander
Date:
Subject: Re: Remove Deprecated Exclusive Backup Mode