On 7/1/20 4:12 PM, Tom Lane wrote:
> Joe Conway <mail@joeconway.com> writes:
>> I did some performance testing of the worst case/largest possible file and found
>> that skipping the stat and bulk read does cause a significant regression.
>
> Yeah, I was wondering a little bit if that'd be an issue.
>
>> In the attached patch I was able to get most of the performance degradation back
>> -- ~600ms. Hopefully you don't think what I did was "too cute by half" :-). Do
>> you think this is good enough or should we go back to using the stat file size
>> when it is > 0?
>
> I don't think it's unreasonable to "get in bed" with the innards of the
> StringInfo; plenty of other places do already, such as pqformat.h or
> pgp_armor_decode, just to name the first couple that I came across in a
> quick grep.
>
> However, if we're going to get in bed with it, let's get all the way in
> and just read directly into the StringInfo's buffer, as per attached.
> This saves all the extra memcpy'ing and reduces the number of fread calls
> to at most log(N).

Works for me. I'll retest to see how well it does performance-wise and report back.

> (This also fixes a bug in your version, which is that it captured
> the buf.data pointer before any repalloc that might happen.)

Yeah, I saw that after sending this.

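
For anyone following along, the "read directly into the buffer" idea can be sketched standalone roughly as below. This is not the attached patch, only an illustration under stated assumptions: GrowBuf and read_all are hypothetical names standing in for StringInfoData and the real function, and plain malloc/realloc stand in for palloc/repalloc. The buffer doubles whenever it fills, so the number of fread calls grows with the logarithm of the file size, and the destination pointer is computed only after any reallocation.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for a StringInfo-like buffer (not the real type). */
typedef struct
{
    char   *data;       /* buffer, NUL-terminated on success */
    size_t  len;        /* bytes used, excluding the trailing NUL */
    size_t  maxlen;     /* bytes allocated */
} GrowBuf;

/*
 * Read the remainder of stream f into buf, reading straight into the
 * buffer's free space (no intermediate chunk, no memcpy).
 * Returns the number of fread calls made, or -1 on failure.
 */
static int
read_all(FILE *f, GrowBuf *buf, size_t initial)
{
    int     ncalls = 0;

    assert(f != NULL && initial >= 2);

    buf->data = malloc(initial);
    if (buf->data == NULL)
        return -1;
    buf->len = 0;
    buf->maxlen = initial;

    for (;;)
    {
        size_t  room;
        size_t  got;

        /* Double the buffer when only the byte reserved for NUL is left. */
        if (buf->maxlen - buf->len <= 1)
        {
            char   *tmp = realloc(buf->data, buf->maxlen * 2);

            if (tmp == NULL)
            {
                free(buf->data);
                buf->data = NULL;
                return -1;
            }
            buf->data = tmp;
            buf->maxlen *= 2;
        }

        /* Compute the destination only after any realloc above. */
        room = buf->maxlen - buf->len - 1;
        got = fread(buf->data + buf->len, 1, room, f);
        ncalls++;
        buf->len += got;

        if (got < room)
            break;              /* short read: EOF or error */
    }

    if (ferror(f))
    {
        free(buf->data);
        buf->data = NULL;
        return -1;
    }

    buf->data[buf->len] = '\0';
    return ncalls;
}
```

With a 1 kB starting size, a file of about 100 kB is read in fewer than ten fread calls, since each call fills all the space that the previous doubling added.
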
Thanks,
Joe
--
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development