Re: Speed up COPY TO text/CSV parsing using SIMD - Mailing list pgsql-hackers

From: Nathan Bossart
Subject: Re: Speed up COPY TO text/CSV parsing using SIMD
Msg-id: acv2vu8miagnHG1B@nathan
In response to: Re: Speed up COPY TO text/CSV parsing using SIMD (KAZAR Ayoub <ma_kazar@esi.dz>)
List: pgsql-hackers

On Fri, Mar 27, 2026 at 07:48:38PM +0100, KAZAR Ayoub wrote:
> I added a prescan loop inside the SIMD helpers that tries to catch special
> chars within sizeof(Vector8) characters. I measured how well this
> reduces the overhead of starting SIMD and exiting at the first vector:
> the scalar loop is better than SIMD for one vector if it finds a special
> character before the 6th character; the worst case is not a clean vector,
> where the scalar loop needs 20 more cycles compared to SIMD.
> This helps mitigate the case of JSON(B) in CSV format, which is why I
> added this for the CSV case only.

Interesting.

> In a benchmark with 10M early SIMD exit like the JSONB case, the previous
> 3% regression is gone.

While these are nice results, I think it's best that we target v20 for this
patch so that we have more time to benchmark and explore edge cases.

-- 
nathan
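
For readers following the thread, the prescan idea described in the quoted text could look roughly like the sketch below. This is not the patch's code. It is a minimal, portable illustration that makes several assumptions: CHUNK_SIZE stands in for sizeof(Vector8), chunk_has_special() is a plain-C stand-in for the vector comparison, and the set of "special" characters (delimiter, quote, newline, carriage return) is assumed for the CSV case. All function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical chunk width, standing in for sizeof(Vector8). */
#define CHUNK_SIZE 16

/* Assumption: the characters that need special handling in CSV output. */
static bool
is_special(char c, char delim, char quote)
{
	return c == delim || c == quote || c == '\n' || c == '\r';
}

/*
 * Plain-C stand-in for a SIMD comparison over one full chunk.  A real
 * implementation would load the chunk into a vector register and compare
 * all bytes at once; this version only mimics the "any match?" result.
 */
static bool
chunk_has_special(const char *p, char delim, char quote)
{
	bool		found = false;

	for (size_t i = 0; i < CHUNK_SIZE; i++)
		found |= is_special(p[i], delim, quote);
	return found;
}

/*
 * Return the offset of the first special character in buf[0..len),
 * or len if there is none.
 */
static size_t
find_first_special(const char *buf, size_t len, char delim, char quote)
{
	size_t		i = 0;
	size_t		prescan_end = (len < CHUNK_SIZE) ? len : CHUNK_SIZE;

	assert(buf != NULL || len == 0);

	/*
	 * Scalar prescan of the first chunk: input with an early special
	 * character (e.g. JSON text starting with a quote) returns here and
	 * never pays the cost of setting up the chunked path.
	 */
	for (; i < prescan_end; i++)
		if (is_special(buf[i], delim, quote))
			return i;

	/* Chunked path: skip whole chunks that contain no special character. */
	while (len - i >= CHUNK_SIZE && !chunk_has_special(buf + i, delim, quote))
		i += CHUNK_SIZE;

	/*
	 * Scalar tail: pinpoints the match inside a flagged chunk, or scans
	 * the leftover bytes that do not fill a whole chunk.
	 */
	for (; i < len; i++)
		if (is_special(buf[i], delim, quote))
			return i;

	return len;
}
```

With this shape, a value such as {"a":1} exits in the prescan at offset 1, while a long value with no special characters is consumed one chunk at a time. Whether the prescan pays off depends on where the first special character tends to appear, which is what the cycle measurements quoted above are about.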


