Re: Speed up COPY FROM text/CSV parsing using SIMD - Mailing list pgsql-hackers

From Manni Wood
Subject Re: Speed up COPY FROM text/CSV parsing using SIMD
Msg-id CAKWEB6pJ-5b7QUmVtG12hC0bQ82OvDv4XsidAcnngN36q28qTQ@mail.gmail.com
In response to Re: Speed up COPY FROM text/CSV parsing using SIMD  (Nazir Bilal Yavuz <byavuz81@gmail.com>)
Responses Re: Speed up COPY FROM text/CSV parsing using SIMD
List pgsql-hackers


On Sun, Mar 8, 2026 at 5:31 AM Nazir Bilal Yavuz <byavuz81@gmail.com> wrote:
> Hi,
>
> On Sat, 7 Mar 2026 at 02:31, KAZAR Ayoub <ma_kazar@esi.dz> wrote:
> >
> > On Sat, Mar 7, 2026 at 12:13 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
> >>
> >> On Fri, Mar 06, 2026 at 03:25:46PM -0600, Manni Wood wrote:
> >> > Well, golly! Look at these numbers. Old master with no lz4, your v11 patch
> >> > with no lz4, and then your v11 patch with lz4 compiled in.
> >>
> >> I'm appreciative of all the benchmarking that you and others are doing, but
> >> wouldn't we be more interested in the difference between "old master with
> >> lz4" and "v11 with lz4"?  Else, we have multiple variables in play.
> >
> > Yes I agree because the lz4 effect doesn't prove anything for the SIMD patch itself right ? So basically a comparison for the SIMD effect should be "master with/out lz4 vs patched with/out lz4, respectively and nothing more!", is this correct ?
>
> Yes, I think 'master with/out lz4 vs patched with/out lz4,
> respectively' is enough to determine the effect of the SIMD patch.
>
> --
> Regards,
> Nazir Bilal Yavuz
> Microsoft

Hello!

As requested, here are some numbers based on the latest master but with the copy code inlining excised (`git revert dc592a41557b072178f1798700bf9c69cd8e4235`), compared to master with copy code inlining left in place and the v11 patch applied.
Both sets of results come from builds with lz4 compiled in.

I have not run numbers without lz4. I assume I could reuse the two postgres instances that I have compiled with lz4 and just set `default_toast_compression = pglz` in postgresql.conf for both instances. Let me know if that is a mistaken assumption on my part.

arm NARROW master without inline with lz4
TXT :                 10362.799500 ms
CSV :                 10288.791000 ms
TXT with 1/3 escapes: 10411.416250 ms
CSV with 1/3 quotes:  12318.385750 ms

arm NARROW master with inline with lz4 with v11patch
TXT :                 10317.125750 ms  0.440747% improvement
CSV :                 10418.020250 ms -1.256020% regression
TXT with 1/3 escapes: 10188.319500 ms  2.142809% improvement
CSV with 1/3 quotes:  12032.964500 ms  2.317035% improvement


arm WIDE master without inline with lz4
TXT :                  5608.834500 ms
CSV :                  8115.155000 ms
TXT with 1/3 escapes:  7037.290500 ms
CSV with 1/3 quotes:  10894.615750 ms

arm WIDE master with inline with lz4 with v11patch
TXT :                  3190.268750 ms  43.120647% improvement
CSV :                  3135.177000 ms  61.366394% improvement
TXT with 1/3 escapes:  6373.746750 ms   9.428966% improvement
CSV with 1/3 quotes:  10336.763500 ms   5.120440% improvement



x86 NARROW-master-without-inline-with-lz4.log
TXT :                 26701.079250 ms
CSV :                 26492.235500 ms
TXT with 1/3 escapes: 28590.508250 ms
CSV with 1/3 quotes:  34876.742750 ms

x86 NARROW-master-with-inline-with-lz4-with-v11patch.log
TXT :                 26511.747750 ms  0.709078% improvement
CSV :                 26261.269750 ms  0.871824% improvement
TXT with 1/3 escapes: 27702.964750 ms  3.104329% improvement
CSV with 1/3 quotes:  32339.393000 ms  7.275191% improvement


x86 WIDE-master-without-inline-with-lz4.log
TXT :                 14485.563250 ms
CSV :                 21392.582000 ms
TXT with 1/3 escapes: 18081.514750 ms
CSV with 1/3 quotes:  32547.086250 ms

x86 WIDE-master-with-inline-with-lz4-with-v11patch.log
TXT :                  8080.378250 ms  44.217714% improvement
CSV :                  8283.723000 ms  61.277591% improvement
TXT with 1/3 escapes: 15054.111000 ms  16.743087% improvement
CSV with 1/3 quotes:  25668.009750 ms  21.135768% improvement
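
For anyone who wants to re-derive the percentages, here is a minimal sketch of how they appear to be computed: (baseline - patched) / baseline * 100, so a negative value means the patched build was slower. The function name `percent_change` and the dictionary layout are my own for illustration; they are not taken from the benchmark scripts. The timings are copied from the arm WIDE results above.

```python
def percent_change(baseline_ms: float, patched_ms: float) -> float:
    """Relative improvement of patched over baseline, in percent.

    Positive means the patched build was faster; negative means slower.
    """
    return (baseline_ms - patched_ms) / baseline_ms * 100.0


# (baseline, patched) timings in ms, copied from the arm WIDE results.
arm_wide = {
    "TXT": (5608.834500, 3190.268750),
    "CSV": (8115.155000, 3135.177000),
    "TXT with 1/3 escapes": (7037.290500, 6373.746750),
    "CSV with 1/3 quotes": (10894.615750, 10336.763500),
}

for name, (baseline, patched) in arm_wide.items():
    print(f"{name:<22} {percent_change(baseline, patched):10.6f}%")
```

The same function can be pointed at any other pair of rows, for example the arm NARROW CSV case, where it returns a negative number.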
--
-- Manni Wood EDB: https://www.enterprisedb.com
