Re: Speed up COPY FROM text/CSV parsing using SIMD - Mailing list pgsql-hackers

From	Nazir Bilal Yavuz
Subject	Re: Speed up COPY FROM text/CSV parsing using SIMD
Date	August 14, 2025 13:29:35
Msg-id	CAN55FZ0houfWHn8_MEEefhprZvc33jr07GrBYo+Bp2yw=TVnKA@mail.gmail.com
In response to	Re: Speed up COPY FROM text/CSV parsing using SIMD (KAZAR Ayoub <ma_kazar@esi.dz>)
Responses	Re: Speed up COPY FROM text/CSV parsing using SIMD
List	pgsql-hackers

Hi,

On Thu, 14 Aug 2025 at 05:25, KAZAR Ayoub <ma_kazar@esi.dz> wrote:
>
> Following Nazir's findings about 4096 bytes being the performant line
> length, I did more benchmarks from my side on both TEXT and CSV formats
> with two different cases: normal data (no special characters) and data
> with many special characters.
>
> Results are good, as expected, and similar to previous benchmarks:
>  ~30.9% faster copy in TEXT format
>  ~32.4% faster copy in CSV format
>  20%-30% fewer cycles per instruction
>
> In the case of lines with a lot of special characters (e.g., tables
> with large numbers of columns, maybe), we obviously expect regressions
> because of the overhead of many fallbacks to scalar processing.
>
> Results with special characters making up 1/3 of the line length:
> ~43.9% slower copy in TEXT format
> ~16.7% slower copy in CSV format
> So with even fewer occurrences of special characters, or wider distances
> between them, there might still be some regressions. This is maybe a
> non-significant case, but it can be treated in other patches if we
> consider not using the SIMD path sometimes.
>
> I hope this helps more and confirms the patch.

Thanks for running that benchmark! Would you mind sharing a reproducer
for the regression you observed?
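
To make the request concrete, here is a minimal sketch of an input generator such a reproducer might use. It is only an illustration, not the script behind the numbers quoted above: the function names (make_line, write_file), the choice of the comma as the special character, and the even spacing of the special characters are all assumptions. Only the 4096-byte line length and the 1/3 ratio come from the quoted message.

```python
import random

LINE_LENGTH = 4096        # line length taken from the quoted benchmark
SPECIAL_FRACTION = 1 / 3  # share of special characters in the regression case


def make_line(length=LINE_LENGTH, special_fraction=SPECIAL_FRACTION,
              special=",", seed=0):
    """Return one line of `length` characters (without a trailing newline).

    Roughly `special_fraction` of the characters are `special`, evenly
    spaced; the rest are random lowercase letters. Hypothetical helper:
    which characters the original benchmark treated as special is not known.
    """
    rng = random.Random(seed)
    # Distance between two special characters, e.g. 3 for a 1/3 ratio.
    step = max(1, round(1 / special_fraction)) if special_fraction > 0 else 0
    chars = []
    for i in range(length):
        if step and i % step == step - 1:
            chars.append(special)
        else:
            chars.append(rng.choice("abcdefghijklmnopqrstuvwxyz"))
    return "".join(chars)


def write_file(path, rows, **kwargs):
    """Write `rows` generated lines to `path`, one per line."""
    with open(path, "w", newline="") as f:
        for r in range(rows):
            f.write(make_line(seed=r, **kwargs) + "\n")
```

Passing special_fraction=0 gives the "normal data" case, and a different `special` argument (for example a backslash for TEXT format) would exercise other characters. The generated lines still need a target table whose column count matches the number of delimiters per line.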

--
Regards,
Nazir Bilal Yavuz
Microsoft
