Re: Speed up COPY FROM text/CSV parsing using SIMD - Mailing list pgsql-hackers

From Manni Wood
Subject Re: Speed up COPY FROM text/CSV parsing using SIMD
Date
Msg-id CAKWEB6qx9mEd8a-QqDe1xqqyuoR=NzUPwJvyc59sUbLc18RHUQ@mail.gmail.com
In response to Re: Speed up COPY FROM text/CSV parsing using SIMD  (Nathan Bossart <nathandbossart@gmail.com>)
List pgsql-hackers
Hello.

I tried Ayoub Kazar's test files again, using Nazir Bilal Yavuz's v3 patches, but with one difference since my last attempt: this time, I used 5 million lines per file. For each 5-million-line file, I ran the import 5 times and averaged the results.

(I found that even with 1 million lines, a run could sometimes show a surprising speedup in cases where the newer algorithm should be at least a tiny bit slower than the non-SIMD version.)

The text file with no special characters is 30% faster. The CSV file with no special characters is 39% faster. The text file with roughly 1/3rd special characters is 0.5% slower. The CSV file with roughly 1/3rd special characters is 2.7% slower.

I also tried files that alternated lines with no special characters and lines with 1/3rd special characters, thinking I could force the algorithm to continually re-check whether it should use SIMD, and so spend more time in the try-SIMD/don't-try-SIMD housekeeping code. The text file was still 50% faster. The CSV file was still 13% faster.
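
For anyone who wants to build a similar alternating input, here is a minimal sketch of a generator. It is not the script used for the numbers above: the helper names (make_line, write_alternating), the filler byte 'a', the fixed line length, and the choice of which byte counts as "special" are all assumptions made for illustration, and the real test files may be laid out differently.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: fill buf with `len` data bytes, then '\n' and NUL.
 * buf must hold at least len + 2 bytes. When with_special is nonzero,
 * roughly every third byte is `special`; the last data byte is never
 * special, so a backslash cannot end up escaping the newline.
 */
static void
make_line(char *buf, size_t len, int with_special, char special)
{
    for (size_t i = 0; i < len; i++)
    {
        int is_special = with_special && (i % 3 == 1) && (i + 1 < len);

        buf[i] = is_special ? special : 'a';
    }
    buf[len] = '\n';
    buf[len + 1] = '\0';
}

/*
 * Hypothetical helper: write nlines lines to out, alternating plain lines
 * (even line numbers) with lines containing special bytes (odd line
 * numbers). Returns 0 on success, -1 on a write error or oversized len.
 */
static int
write_alternating(FILE *out, long nlines, size_t len, char special)
{
    char buf[1024];

    if (len + 2 > sizeof(buf))
        return -1;

    for (long n = 0; n < nlines; n++)
    {
        make_line(buf, len, (int) (n % 2), special);
        if (fputs(buf, out) == EOF)
            return -1;
    }
    return 0;
}
```

Calling write_alternating(fp, 5000000, 60, '\\') would produce a single-column file along these lines for the text format; whether a given special byte loads cleanly depends on the COPY format and options used.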



On Mon, Nov 24, 2025 at 3:59 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
On Thu, Nov 20, 2025 at 03:55:43PM +0300, Nazir Bilal Yavuz wrote:
> On Thu, 20 Nov 2025 at 00:01, Nathan Bossart <nathandbossart@gmail.com> wrote:
>> +            /* Load a chunk of data into a vector register */
>> +            vector8_load(&chunk, (const uint8 *) &copy_input_buf[input_buf_ptr]);
>>
>> In other places, processing 2 or 4 vectors of data at a time has proven
>> faster.  Have you tried that here?
>
> Sorry, I could not find the related code piece. I only saw the
> vector8_load() inside of hex_decode_safe() function and its comment
> says:
>
> /*
>  * We must process 2 vectors at a time since the output will be half the
>  * length of the input.
>  */
>
> But this does not mention any speedup from using 2 vectors at a time.
> Could you please show the related code?

See pg_lfind32().

--
nathan
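
For readers following the "2 or 4 vectors at a time" suggestion, below is a minimal sketch of the general idea, not code from the patch or from pg_lfind32(). It assumes a plain 64-bit word as a stand-in for a vector register, so it compiles anywhere, and the helper names (word_has_byte, buf_has_byte) are hypothetical. The point it illustrates is the loop shape: load several chunks, combine their results, and branch once per iteration instead of once per chunk.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ONES  UINT64_C(0x0101010101010101)
#define HIGHS UINT64_C(0x8080808080808080)

/*
 * Nonzero if any byte of v equals c. XOR with c replicated into every
 * byte turns matching bytes into zero, then the classic "has a zero
 * byte" bit trick detects them.
 */
static inline uint64_t
word_has_byte(uint64_t v, uint8_t c)
{
    uint64_t x = v ^ (ONES * c);

    return (x - ONES) & ~x & HIGHS;
}

/* Returns 1 if buf[0..len) contains byte c, else 0. */
static int
buf_has_byte(const uint8_t *buf, size_t len, uint8_t c)
{
    size_t i = 0;

    /* Main loop: four 8-byte chunks per iteration, one branch at the end. */
    for (; i + 32 <= len; i += 32)
    {
        uint64_t v0, v1, v2, v3;

        memcpy(&v0, buf + i, 8);
        memcpy(&v1, buf + i + 8, 8);
        memcpy(&v2, buf + i + 16, 8);
        memcpy(&v3, buf + i + 24, 8);

        if (word_has_byte(v0, c) | word_has_byte(v1, c) |
            word_has_byte(v2, c) | word_has_byte(v3, c))
            return 1;
    }

    /* Tail: fewer than 32 bytes left, check them one at a time. */
    for (; i < len; i++)
    {
        if (buf[i] == c)
            return 1;
    }
    return 0;
}
```

Whether this shape helps COPY FROM parsing is exactly the open question in the thread; the sketch only shows what "more than one vector per iteration" looks like, and it would need to be measured against the single-vector loop.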


--
-- Manni Wood EDB: https://www.enterprisedb.com
