At Thu, 10 Sep 2020 21:55:27 +0300, Surafel Temesgen <surafel3000@gmail.com> wrote in
> On Thu, Sep 10, 2020 at 1:17 PM vignesh C <vignesh21@gmail.com> wrote:
>
> >
> > >
> > > We have a patch for column matching feature [1] that may need a header
> > line to be further processed. Even without that I think it is preferable to
> > process the header line for nothing than adding those checks to the loop,
> > performance-wise.
> >
> > I had seen that patch, I feel that change to match the header if the
> > header is specified can be addressed in this patch if that patch gets
> > committed first or vice versa. We are doing a lot of processing for
> > the data which we need not do anything. Shouldn't this be skipped if
> > not required. Similar check is present in NextCopyFromRawFields also
> > to skip header.
> >
>
> The existing check is unavoidable but we can live better without the checks
> added by the patch. For very large files the loop may iterate millions of
> times if it is not in billion and I am sure doing the check that many times
> will incur noticeable performance degradation than further processing a
> single line.
FWIW, I thought the same thing seeing the additional if-conditions. It
gives more loss than gain.
For the first part, the patch reveals COPY_NEW_FE, which I don't think
to be a knowledge for the function, to CopyGetData. Considering that
that doesn't seem to offer noticeable performance gain, I don't think
we should do that. On the contrary, if incoming data were
intermittently delayed for some reasons (heavy load of client or
in-between network), this patch would make things worse by waiting for
delayed bits before processing already received bits.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center