Re: Improvements in Copy From - Mailing list pgsql-hackers

From Kyotaro Horiguchi
Subject Re: Improvements in Copy From
Date
Msg-id 20200911.180401.1250008268606505036.horikyota.ntt@gmail.com
In response to Improvements in Copy From  (vignesh C <vignesh21@gmail.com>)
List pgsql-hackers
At Fri, 11 Sep 2020 18:44:13 +1000, Peter Smith <smithpb2250@gmail.com> wrote in 
> On Thu, Sep 10, 2020 at 9:21 PM vignesh C <vignesh21@gmail.com> wrote:
> > > Whether such a micro-optimisation is worth doing is another question.
> > Yes, what you suggested can also be done, but even I have the same
> > question as you. Because we would save just one function call, and the
> > eof check is done immediately inside that function, should we include
> > this or not?
> 
> I expect the difference from my suggestion is too small to be measured.
> 
> Probably it is not worth changing the already complicated code unless
> those changes can achieve something observable.
> 
> ~~
> 
> FYI, I ran a few performance tests BEFORE/AFTER applying your patch.
> 
> Perf results for \COPY 5GB CSV file to UNLOGGED table.
> 
> perf -a -g <pid>
> psql -d test -c "\copy tbl from '/my/path/data_5GB.csv' with (format csv);"
> perf report -g
> 
> BEFORE
> #1 CopyReadLineText = 12.70%, CopyLoadRawBuf = 0.81%
> #2 CopyReadLineText = 12.54%, CopyLoadRawBuf = 0.81%
> #3 CopyReadLineText = 12.52%, CopyLoadRawBuf = 0.81%
> 
> AFTER
> #1 CopyReadLineText = 12.55%, CopyLoadRawBuf = 1.20%
> #2 CopyReadLineText = 12.15%, CopyLoadRawBuf = 1.10%
> #3 CopyReadLineText = 13.11%, CopyLoadRawBuf = 1.24%
> #4 CopyReadLineText = 12.86%, CopyLoadRawBuf = 1.18%
> 
> I didn't quite know how to interpret those results. They were the opposite
> of what I expected. Perhaps the original excessive CopyLoadRawBuf calls
> were so brief they could often avoid being sampled? Anyway, I hope you
> have a better understanding of perf than I do and can explain it.
> 
> I then repeated and timed the same tests, but without perf.
> 
> BEFORE
> #1 4min.36s
> #2 4min.45s
> #3 4min.43s
> #4 4min.34s
> 
> AFTER
> #1 4min.41s
> #2 4min.37s
> #3 4min.34s
> 
> As you can see, unfortunately, the patch gave no observable benefit
> for my test case.

That observation agrees with my assumption.
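
As a rough check on the quoted timings, the following sketch averages the
BEFORE and AFTER runs. It only re-uses the numbers posted above; the helper
name to_seconds is made up for illustration, and seven runs are far too few
for a real statistical test.

```python
from statistics import mean

def to_seconds(stamp):
    """Convert a string such as '4min.36s' into whole seconds."""
    minutes, seconds = stamp.rstrip("s").split("min.")
    return int(minutes) * 60 + int(seconds)

# Timings copied from the quoted message (runs without perf).
before = [to_seconds(s) for s in ["4min.36s", "4min.45s", "4min.43s", "4min.34s"]]
after = [to_seconds(s) for s in ["4min.41s", "4min.37s", "4min.34s"]]

mean_before = mean(before)
mean_after = mean(after)
spread_before = max(before) - min(before)  # run-to-run variation, unpatched

print(f"BEFORE mean: {mean_before:.1f}s, spread: {spread_before}s")
print(f"AFTER  mean: {mean_after:.1f}s")
print(f"difference of means: {mean_before - mean_after:.1f}s")
```

The means differ by roughly 2 seconds, while the unpatched runs alone vary
by 11 seconds, so the difference is well inside the noise.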

At Fri, 11 Sep 2020 15:58:04 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in 
me> we should do that. On the contrary, if incoming data were
me> intermittently delayed for some reasons (heavy load of client or
me> in-between network), this patch would make things worse by waiting for
me> delayed bits before processing already received bits.

It seems that a slow network alone is enough to cause that behavior,
even without any such trouble.
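
To illustrate the concern, here is a toy model, not PostgreSQL code. It
assumes data arrives in equal chunks at known times and that each chunk
costs a fixed time to process; the function names and numbers are invented
for the example. It compares processing each chunk as soon as it is
available with waiting until a full batch has arrived.

```python
def finish_time_eager(arrivals, cost):
    """Process each chunk as soon as it has arrived (arrivals sorted)."""
    t = 0.0
    for arrival in arrivals:
        # Start when the chunk is there and the previous one is done.
        t = max(t, arrival) + cost
    return t

def finish_time_batched(arrivals, cost, batch):
    """Wait until `batch` chunks have arrived, then process them together."""
    t = 0.0
    for i in range(0, len(arrivals), batch):
        group = arrivals[i:i + batch]
        # Processing cannot start before the last chunk of the group arrives.
        t = max(t, group[-1]) + cost * len(group)
    return t

# Hypothetical slow link: one chunk per second, 0.5s of work per chunk.
arrivals = [0.0, 1.0, 2.0, 3.0]
cost = 0.5

eager = finish_time_eager(arrivals, cost)
batched = finish_time_batched(arrivals, cost, batch=4)
print(f"eager: {eager}s, batched: {batched}s")
```

In this model the eager variant overlaps processing with waiting for the
next chunk, whereas the batched variant sits idle until the slowest chunk
of the batch shows up. Whether the effect is noticeable in a real COPY
depends on how the arrival rate compares with the processing cost.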

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
