Re: Speed up COPY TO text/CSV parsing using SIMD - Mailing list pgsql-hackers
| From | KAZAR Ayoub |
|---|---|
| Subject | Re: Speed up COPY TO text/CSV parsing using SIMD |
| Date | |
| Msg-id | CA+K2Rum-TB_iNzDWoXOJspf=jq0gd-wees8+9tBTJNyhy9cK5g@mail.gmail.com |
| In response to | Re: Speed up COPY TO text/CSV parsing using SIMD (Nathan Bossart <nathandbossart@gmail.com>) |
| Responses |
Re: Speed up COPY TO text/CSV parsing using SIMD
Re: Speed up COPY TO text/CSV parsing using SIMD |
| List | pgsql-hackers |
On Tue, Mar 17, 2026 at 7:49 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
On Sat, Mar 14, 2026 at 11:43:38PM +0100, KAZAR Ayoub wrote:
> Just a small concern about where some varlenas have a larger binary size
> than its text representation ex:
> SELECT pg_column_size(to_tsvector('SIMD is GOOD'));
> pg_column_size
> ----------------
> 32
>
> its text representation is less than sizeof(Vector8) so currently v3 would
> enter SIMD path and exit out just from the beginning (two extra branches)
> because it does this:
> + if (TupleDescAttr(tup_desc, attnum - 1)->attlen == -1 &&
> + VARSIZE_ANY_EXHDR(DatumGetPointer(value)) > sizeof(Vector8))
>
> I thought maybe we could do * 2 or * 4 its binary size, depends on the type
> really but this is just a proposition if this case is something concerning.
Can we measure the impact of this? How likely is this case?
I'll respond to this separately in a different email.
> +static pg_attribute_always_inline void CopyAttributeOutText(CopyToState cstate, const char *string,
> + bool use_simd, size_t len);
> +static pg_attribute_always_inline void CopyAttributeOutCSV(CopyToState cstate, const char *string,
> + bool use_quote, bool use_simd, size_t len);
Can you test this on its own, too? We might be able to separate this and
the change below into a prerequisite patch, assuming they show benefits.
I tested inlining alone and found an improvement of about 1% to 4% across all configurations.
The inlining is only meaningful in combination with the SIMD work, for the reason described below.
> if (is_csv)
> - CopyAttributeOutCSV(cstate, string,
> - cstate->opts.force_quote_flags[attnum - 1]);
> + {
> + if (use_simd)
> + CopyAttributeOutCSV(cstate, string,
> + cstate->opts.force_quote_flags[attnum - 1],
> + true, len);
> + else
> + CopyAttributeOutCSV(cstate, string,
> + cstate->opts.force_quote_flags[attnum - 1],
> + false, len);
There isn't a terrible amount of branching on use_simd in these functions,
so I'm a little skeptical this makes much difference. As above, it would
be good to measure it
I compiled three variants:
v3: use_simd passed as a compile-time constant, CopyAttribute functions inlined.
v3_variable: use_simd passed as a runtime variable, CopyAttribute functions inlined.
v3_variable_noinline: use_simd passed as a runtime variable, CopyAttribute functions not inlined.
None of the helpers are explicitly inlined by us.
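As a rough illustration of what separates v3 from v3_variable, here is a standalone sketch, not the patch code. Every name in it (CHUNK_SIZE, skip_clean_prefix, count_escapes and the two wrappers) is hypothetical, and the "SIMD" helper is a plain C stand-in that only mimics scanning in Vector8-sized chunks.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical chunk width standing in for sizeof(Vector8). */
#define CHUNK_SIZE 16

#if defined(__GNUC__) || defined(__clang__)
#define sketch_always_inline static inline __attribute__((always_inline))
#else
#define sketch_always_inline static inline
#endif

/* Assumed set of bytes that need escaping in this sketch. */
static inline bool
is_special(char c)
{
    return c == '\\' || c == '\n' || c == '\r' || c == '\t';
}

/*
 * Stand-in for a SIMD skip helper: returns how many leading bytes can be
 * skipped, advancing one whole chunk at a time while the chunk is clean.
 */
static size_t
skip_clean_prefix(const char *s, size_t len)
{
    size_t      i = 0;

    while (i + CHUNK_SIZE <= len)
    {
        bool        found = false;

        for (size_t j = 0; j < CHUNK_SIZE; j++)
            found |= is_special(s[i + j]);
        if (found)
            break;
        i += CHUNK_SIZE;
    }
    return i;
}

/* Counts bytes that would need escaping; use_simd selects the fast path. */
sketch_always_inline size_t
count_escapes(const char *s, size_t len, bool use_simd)
{
    size_t      i = 0;
    size_t      n = 0;

    if (use_simd && len > CHUNK_SIZE)
        i = skip_clean_prefix(s, len);
    for (; i < len; i++)
        n += is_special(s[i]) ? 1 : 0;
    return n;
}

/* v3 style: each call site passes a literal, so each inlined copy is specialized. */
static size_t
count_escapes_const(const char *s, size_t len, bool use_simd)
{
    if (use_simd)
        return count_escapes(s, len, true);
    else
        return count_escapes(s, len, false);
}

/* v3_variable style: one inlined copy that still tests use_simd at run time. */
static size_t
count_escapes_var(const char *s, size_t len, bool use_simd)
{
    return count_escapes(s, len, use_simd);
}
```

Both wrappers return the same results. The only intended difference is that in the constant version the compiler can fold the use_simd test away inside each inlined copy, at the cost of duplicating the body at the call site.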
The assembly reveals two things:
1) The CSV SIMD helpers (CopyCheckCSVQuoteNeedSIMD, CopySkipCSVEscapeSIMD) are inlined by the compiler on its own in all
three variants, whereas CopySkipTextSIMD is never inlined by the compiler in any variant.
2) The constant-passing approach (v3) does matter, though apparently only a little, and specifically for CopySkipTextSIMD.
It's the same story as the first commit of the COPY FROM patch: the compiler just emits code without the use_simd branch:
jbe ... ; len > sizeof(Vector8)
je ... ; need_transcoding
call CopySkipTextSIMD
Whether the extra branching needed to pass use_simd as a constant is worth it is shown by the benchmark:
Test                  Master   v3      v3_var  v3_var_noinl
TEXT clean            1504ms   -24.1%  -23.0%  -21.5%
CSV clean             1760ms   -34.9%  -32.7%  -33.0%
TEXT 1/3 backslashes  3763ms   +4.6%   +6.9%   +4.1%
CSV 1/3 quotes        3885ms   +3.1%   +2.7%   -0.8%

Wide table TEXT (integer columns):
Cols  Master   v3      v3_var  v3_var_noinl
50    2083ms   -0.7%   -0.6%   +3.5%
100   4094ms   -0.1%   -0.5%   +4.5%
200   1560ms   +0.6%   -2.3%   +3.2%
500   1905ms   -1.0%   -1.3%   +4.7%
1000  1455ms   +1.8%   +0.4%   +4.3%

Wide table CSV:
Cols  Master   v3      v3_var  v3_var_noinl
50    2421ms   +4.0%   +6.7%   +5.8%
100   4980ms   +0.1%   +2.0%   +0.1%
200   1901ms   +1.4%   +3.5%   +1.4%
500   2328ms   +1.8%   +2.7%   +2.2%
1000  1815ms   +2.0%   +2.8%   +2.5%
I'm not sure whether there's a practical difference between v3 and v3_var. What do you think?
Regards,
Ayoub