Home > mailing lists

Re: Speed up COPY FROM text/CSV parsing using SIMD - Mailing list pgsql-hackers

From	Nathan Bossart
Subject	Re: Speed up COPY FROM text/CSV parsing using SIMD
Date	October 21, 2025 21:55:07
Msg-id	aPfXC3G30qpivfD9@nathan Whole thread Raw
In response to	Re: Speed up COPY FROM text/CSV parsing using SIMD (KAZAR Ayoub <ma_kazar@esi.dz>)
List	pgsql-hackers

Tree view

On Tue, Oct 21, 2025 at 08:17:01AM +0200, KAZAR Ayoub wrote:
>>> I'm also trying the idea of doing SIMD inside quotes with prefix XOR
>>> using carry less multiplication avoiding the slow path in all cases even
>>> with weird looking input, but it needs to take into consideration the
>>> availability of PCLMULQDQ instruction set with <wmmintrin.h> and here we
>>> go, it quickly starts to become dirty OR we can wait for the decision to
>>> start requiring x86-64-v2 or v3 which has SSE4.2 and AVX2.
>
> [...]
> 
> Currently we are at 200-400Mbps which isn't that terrible compared to
> production and non production grade parsers (of course we don't only parse
> in our case), also we are using SSE2 only so theoretically if we add
> support for avx later on we'll have even better numbers.
> Maybe more micro optimizations to the current heuristic can squeeze it more.

I'd greatly prefer that we stick with SSE2/Neon (i.e., simd.h) unless the
gains are extraordinary.  Beyond the inherent complexity of using
architecture-specific intrinsics, you also have to deal with configure-time
checks, runtime checks, and function pointer overhead juggling.  That tends
to be a lot of work for the amount of gain.

-- 
nathan

pgsql-hackers by date:

From: Nathan Bossart
Date: 21 October 2025, 21:40:09
Subject: Re: Speed up COPY FROM text/CSV parsing using SIMD

From: Alexander Borisov
Date: 21 October 2025, 22:20:21
Subject: Unicode 17

Re: Speed up COPY FROM text/CSV parsing using SIMD - Mailing list pgsql-hackers

Previous

Next