Re: Speed up COPY FROM text/CSV parsing using SIMD - Mailing list pgsql-hackers

From Nathan Bossart
Subject Re: Speed up COPY FROM text/CSV parsing using SIMD
Msg-id aY0FL4rXUl6ykn-a@nathan
In response to Re: Speed up COPY FROM text/CSV parsing using SIMD  (Nazir Bilal Yavuz <byavuz81@gmail.com>)
Responses Re: Speed up COPY FROM text/CSV parsing using SIMD
List pgsql-hackers
On Wed, Feb 11, 2026 at 04:27:50PM +0300, Nazir Bilal Yavuz wrote:
> I am sharing a v6 which implements (1). My benchmark results show
> almost no difference for the special-character cases and a nice
> improvement for the no-special-character cases.

Thanks!

> +    /* Initialize SIMD variables */
> +    cstate->simd_enabled = false;
> +    cstate->simd_initialized = false;

> +    /* Initialize SIMD on the first read */
> +    if (unlikely(!cstate->simd_initialized))
> +    {
> +        cstate->simd_initialized = true;
> +        cstate->simd_enabled = true;
> +    }

Why do we do this initialization in CopyReadLine() as opposed to setting
simd_enabled to true when initializing cstate in BeginCopyFrom()?  If we
can initialize it in BeginCopyFrom(), we could probably remove
simd_initialized.

> +    if (cstate->simd_enabled)
> +        result = CopyReadLineText(cstate, is_csv, true);
> +    else
> +        result = CopyReadLineText(cstate, is_csv, false);

I know we discussed this upthread, but I'd like to take a closer look at
this to see whether/why it makes such a big difference.  It's a bit awkward
that CopyReadLineText() needs to manage both its local simd_enabled and
cstate->simd_enabled.
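
For reference, the dispatch pattern under discussion can be sketched roughly as below. This is a standalone illustration, not code from the patch: count_line_bytes(), read_line(), and the ALWAYS_INLINE macro are hypothetical names, and the "fast path" is a plain four-bytes-at-a-time loop standing in for the SIMD code. The idea is that calling an always-inline function with a literal true or false lets the compiler emit two specialized copies, each with the test on the flag folded away.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an always-inline attribute macro. */
#if defined(__GNUC__)
#define ALWAYS_INLINE static inline __attribute__((always_inline))
#else
#define ALWAYS_INLINE static inline
#endif

/*
 * Return the offset of the first '\n' in buf, or len if there is none.
 * use_fast_path is meant to be a compile-time constant at each call site.
 */
ALWAYS_INLINE size_t
count_line_bytes(const char *buf, size_t len, bool use_fast_path)
{
    size_t      i = 0;

    if (use_fast_path)
    {
        /* Stand-in for the vectorized skip: four bytes per step. */
        while (len - i >= 4 &&
               buf[i] != '\n' && buf[i + 1] != '\n' &&
               buf[i + 2] != '\n' && buf[i + 3] != '\n')
            i += 4;
    }

    /* Scalar loop finishes the job (or does all of it). */
    while (i < len && buf[i] != '\n')
        i++;

    return i;
}

/* The runtime flag is tested once; each branch passes a literal. */
size_t
read_line(const char *buf, size_t len, bool fast_path_enabled)
{
    if (fast_path_enabled)
        return count_line_bytes(buf, len, true);
    else
        return count_line_bytes(buf, len, false);
}
```

Whether the compiler really produces two distinct specialized bodies depends on the compiler and optimization level, so the benefit would need to be confirmed by benchmarks or by looking at the generated code.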

+            /* Load a chunk of data into a vector register */
+            vector8_load(&chunk, (const uint8 *) &copy_input_buf[input_buf_ptr]);

As mentioned upthread [0], I think it's worth testing whether processing
multiple vectors' worth of data in each loop iteration helps.
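
As a rough illustration of what "multiple vectors per iteration" could look like, the standalone sketch below checks two chunks per loop iteration and ORs their match results, so there is a single branch per pair. It is not based on the patch: it uses 8-byte uint64 words with a SWAR byte-match trick as a portable stand-in for Vector8 and the simd.h helpers, the set of special characters (newline, carriage return, backslash) is assumed for the example, and skip_plain_bytes() and the other names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CHUNK sizeof(uint64_t)      /* stand-in for sizeof(Vector8) */
#define ONES  UINT64_C(0x0101010101010101)
#define HIGHS UINT64_C(0x8080808080808080)

/* Nonzero iff at least one byte of v equals c (SWAR zero-byte test). */
static inline uint64_t
chunk_has_byte(uint64_t v, unsigned char c)
{
    uint64_t    x = v ^ (ONES * c);     /* matching bytes become zero */

    return (x - ONES) & ~x & HIGHS;
}

/* Nonzero iff the chunk at p contains '\n', '\r', or '\\'. */
static inline uint64_t
chunk_has_special(const char *p)
{
    uint64_t    v;

    memcpy(&v, p, CHUNK);               /* unaligned-safe load */
    return chunk_has_byte(v, '\n') |
           chunk_has_byte(v, '\r') |
           chunk_has_byte(v, '\\');
}

/*
 * Return an offset i (a multiple of 2 * CHUNK) such that buf[0..i) holds
 * no special characters.  Two chunks are examined per iteration and their
 * results are combined before the single branch.
 */
size_t
skip_plain_bytes(const char *buf, size_t len)
{
    size_t      i = 0;

    assert(buf != NULL);

    while (len - i >= 2 * CHUNK)
    {
        uint64_t    found = chunk_has_special(buf + i) |
                            chunk_has_special(buf + i + CHUNK);

        if (found != 0)
            break;                      /* let the caller's scalar loop take over */
        i += 2 * CHUNK;
    }

    return i;
}
```

The caller would resume its byte-at-a-time scan at the returned offset. Whether the extra unrolling pays off would still have to be measured, particularly for inputs with many special characters.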

[0] https://postgr.es/m/aSTVOe6BIe5f1l3i%40nathan

-- 
nathan
