Re: Force lookahead in COPY FROM parsing - Mailing list pgsql-hackers

From John Naylor
Subject Re: Force lookahead in COPY FROM parsing
Date
Msg-id CAFBsxsExdB8krHZftosniQwv6Jr7QfRVw=sckHTpQ3SKwYgJZw@mail.gmail.com
In response to Re: Force lookahead in COPY FROM parsing  (Heikki Linnakangas <hlinnaka@iki.fi>)
Responses Re: Force lookahead in COPY FROM parsing  (Heikki Linnakangas <hlinnaka@iki.fi>)
List pgsql-hackers

On Thu, Apr 1, 2021 at 4:47 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> Ok, I wouldn't expect to see much difference in that test, it gets
> drowned in all the other parsing overhead. I tested this now with this:
>
> copy (select repeat('x', 10000) from generate_series(1, 100000)) to
> '/tmp/copydata-x.txt'
> create table blackhole_tab (a text);
>
> copy blackhole_tab from '/tmp/copydata-x.txt' where false;
>
> I took the min of 5 runs of that COPY FROM statement:
>
> master:
> 4107 ms
>
> v3-0001-Simplify-COPY-FROM-parsing-by-forcing-lookahead.patch:
> 3172 ms
>
> I was actually surprised it was so effective on that test, I expected a
> small but noticeable gain. But I'll take it.

Nice! With this test on my laptop I only get a 7-8% speedup, but that's much better than what I saw before.

I have nothing further, so it's RFC. The patch is pretty simple compared to the earlier ones, but is it worth running the fuzzer again as added insurance?

As an aside, I noticed the URL near the top of copyfromparse.c that explains a detail of macros has moved from

http://www.cit.gu.edu.au/~anthony/info/C/C.macros

to

https://antofthy.gitlab.io/info/C/C_macros.txt
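
For anyone who hasn't read that comment: as far as I recall, the detail in question is why the macros in copyfromparse.c are wrapped in "if (1) { ... } else ((void) 0)" rather than the usual "do { ... } while (0)", namely so that a break or continue inside the macro still applies to the caller's loop. A minimal sketch of the idiom follows. STOP_AT_NEWLINE and line_length are made-up names for illustration, not the actual PostgreSQL macros.

```c
#include <stddef.h>

/*
 * Hypothetical macro, for illustration only (not the real PostgreSQL code).
 *
 * The "if (1) { ... } else ((void) 0)" wrapper makes the macro behave like
 * a single statement that needs a trailing semicolon, while a "break"
 * inside it still targets the caller's loop.  With a
 * "do { ... } while (0)" wrapper, the break would only leave the do-while.
 */
#define STOP_AT_NEWLINE(c) \
	if (1) \
	{ \
		if ((c) == '\n') \
			break; \
	} else ((void) 0)

/* Count the bytes before the first newline or the terminating NUL. */
static size_t
line_length(const char *s)
{
	size_t		n = 0;

	for (;;)
	{
		char		c = s[n];

		if (c == '\0')
			break;
		STOP_AT_NEWLINE(c);		/* the break here exits the for loop */
		n++;
	}
	return n;
}
```

With this sketch, line_length("ab\ncd") stops at the newline and returns 2. If the macro used do/while(0) instead, the break would be swallowed by the wrapper and the loop would keep counting past the newline.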

--
John Naylor
EDB: http://www.enterprisedb.com
