Re: Speed up COPY FROM text/CSV parsing using SIMD - Mailing list pgsql-hackers

From KAZAR Ayoub
Subject Re: Speed up COPY FROM text/CSV parsing using SIMD
Msg-id CA+K2RumUD+aJ3vuD+05aDWj6geek5DCPYD5peXrRU41QjtORFA@mail.gmail.com
In response to Re: Speed up COPY FROM text/CSV parsing using SIMD  (Manni Wood <manni.wood@enterprisedb.com>)
List pgsql-hackers
Hello,

On Tue, Jan 20, 2026 at 9:49 PM Manni Wood <manni.wood@enterprisedb.com> wrote:
Hello, all. I have more benchmarks.

These benchmarks are from a Raspberry Pi 5 that I bought. It has an Arm Cortex-A76 processor.

(I was so impressed with the stability of the results I got on my standalone Intel tower PC that I figured I needed a standalone Arm-based machine that was not a laptop and not a VM at a cloud service provider. The run-to-run results were indeed more stable, just like with my standalone tower PC.)

COPY FROM

master: (852558b9)

text, no special: 9111
text, 1/3 special: 10302
csv, no special: 11147
csv, 1/3 special: 13375

v3

text, no special: 7351 (19.3% speedup)
text, 1/3 special: 10397 (0.9% regression)
csv, no special: 7272 (34.7% speedup)
csv, 1/3 special: 13472 (0.7% regression)

v4.2

text, no special: 7300 (19.6% speedup)
text, 1/3 special: 10537 (2.3% regression)
csv, no special: 7260 (34.8% speedup)
csv, 1/3 special: 13881 (3.8% regression)

COPY TO

master: (852558b9)

text, no special: 2446
text, 1/3 special: 6988
csv, no special: 2822
csv, 1/3 special: 6967

v4 (copy to)

text, no special: 1533 (37.3% speedup)
text, 1/3 special: 5949 (14.8% speedup)
csv, no special: 1560 (44.7% speedup)
csv, 1/3 special: 6006 (13.8% speedup)
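
For reference, the percentages in these tables are consistent with a plain relative difference against master. A minimal sketch in C, assuming the raw figures are timings where lower is better (the unit is not stated in the thread) and using a hypothetical helper name, speedup_pct:

```c
#include <assert.h>

/*
 * Relative change of a patched run against the baseline, in percent.
 * Positive means the patched run is faster; negative means a regression.
 * Assumes both values are timings in the same (unstated) unit.
 */
static double
speedup_pct(double baseline, double patched)
{
    assert(baseline > 0.0);
    return (baseline - patched) / baseline * 100.0;
}
```

With the v3 COPY FROM numbers above, speedup_pct(9111, 7351) comes out at roughly 19.3, and speedup_pct(10302, 10397) at roughly -0.9, which matches the reported speedup and regression.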

I find these results particularly exciting because with the COPY FROM v3 patch, the worst-case scenarios are just under 1% regression. The v4 COPY TO patch is a win across the board.

Note that I ran these benchmarks with everything on a RAM disk and with the cpupower settings that Nazir suggested.

So on Arm, the v3 COPY FROM patch is almost all upside, and the v4 COPY TO patch is all upside. The same is almost true for Intel, but the CSV COPY FROM regression, even with the v3 COPY FROM patch, is about 5%. The v4.2 COPY FROM patch always performs worse than the v3 COPY FROM patch in worst-case scenarios.

Does it seem reasonable to stop performance testing the v4.2 COPY FROM patch? Have we collected enough benchmark data to be confident that the v3 COPY FROM patch is the one we should be moving forward with?

With the 1/3 specials benchmark, v4.2 will always decide not to use SIMD after sampling. The 3%-4% regression is then the combined cost of the small overhead of counting special characters, the 2-4 extra branches, and their effect on the general code layout, branch prediction, pipeline, etc. While I don't think it is more complex than v3, this is the only explanation I can think of.
And since v4.2 assumes that special characters are distributed uniformly across lines, yes, IMHO v3 is generally better.
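
To make the sampling idea concrete, here is a rough sketch in C of the kind of decision described above: count the special characters in a sample and fall back to the scalar path when they are too dense. This is not code from the v4.2 patch. The function names, the set of special characters, and the 10% cutoff are all assumptions chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical cutoff, not taken from the patch: above this density of
 * special characters the sketch assumes SIMD scanning does not pay off. */
#define SPECIAL_RATIO_CUTOFF_PCT 10

/* Assumed set of characters the parser has to stop on. */
static bool
is_special(char c, char delim, char quote, char escape)
{
    return c == '\n' || c == '\r' || c == delim || c == quote || c == escape;
}

/*
 * Count special characters in a sample and decide whether the SIMD path
 * looks worthwhile. The counting loop and the final comparison are the
 * extra work and extra branches that remain even when SIMD is rejected.
 */
static bool
sample_prefers_simd(const char *sample, size_t len,
                    char delim, char quote, char escape)
{
    size_t specials = 0;

    assert(sample != NULL || len == 0);

    /* Nothing sampled: default to the SIMD path (an assumption). */
    if (len == 0)
        return true;

    for (size_t i = 0; i < len; i++)
    {
        if (is_special(sample[i], delim, quote, escape))
            specials++;
    }

    return specials * 100 < len * SPECIAL_RATIO_CUTOFF_PCT;
}
```

A one-time decision like this also shows the uniformity problem: if the sampled lines have few special characters but later lines have many (or the reverse), the choice made from the sample is wrong for the rest of the input.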

Regards,
Ayoub
