pgsql: Optimize COPY FROM (FORMAT {text,csv}) using SIMD. - Mailing list pgsql-committers

From Nathan Bossart
Subject pgsql: Optimize COPY FROM (FORMAT {text,csv}) using SIMD.
Msg-id E1w153S-003r6u-18@gemulon.postgresql.org
List pgsql-committers
Optimize COPY FROM (FORMAT {text,csv}) using SIMD.

Presently, such commands scan the input buffer one byte at a time
looking for special characters.  This commit adds a new path that
uses SIMD instructions to skip over chunks of data without any
special characters.  This can be much faster.
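
As a rough illustration of the idea above (not the code from this commit), the sketch below skips 16-byte chunks that contain no special character and falls back to a byte-at-a-time loop for the chunk that does, and for the tail. The function name skip_plain_bytes, the use of SSE2 intrinsics, and the particular set of special characters (newline, carriage return, plus caller-supplied quote and escape bytes) are assumptions made for the example.

```c
#include <stddef.h>

#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/*
 * Hypothetical helper: return the offset of the first "special" byte in
 * buf[0..len), or len if there is none.  The special set used here
 * (\n, \r, quote, escape) is an assumption for illustration only.
 */
static size_t
skip_plain_bytes(const char *buf, size_t len, char quote, char escape)
{
    size_t      i = 0;

#if defined(__SSE2__)
    const __m128i v_nl = _mm_set1_epi8('\n');
    const __m128i v_cr = _mm_set1_epi8('\r');
    const __m128i v_quote = _mm_set1_epi8(quote);
    const __m128i v_escape = _mm_set1_epi8(escape);

    /* Examine 16 bytes per iteration while a full chunk remains. */
    while (i + 16 <= len)
    {
        __m128i     chunk = _mm_loadu_si128((const __m128i *) (buf + i));
        __m128i     hit = _mm_cmpeq_epi8(chunk, v_nl);

        hit = _mm_or_si128(hit, _mm_cmpeq_epi8(chunk, v_cr));
        hit = _mm_or_si128(hit, _mm_cmpeq_epi8(chunk, v_quote));
        hit = _mm_or_si128(hit, _mm_cmpeq_epi8(chunk, v_escape));

        /* Nonzero mask: this chunk holds a special byte; locate it below. */
        if (_mm_movemask_epi8(hit) != 0)
            break;

        i += 16;
    }
#endif

    /* Scalar path: pinpoints the byte in a flagged chunk and scans the tail. */
    for (; i < len; i++)
    {
        char        c = buf[i];

        if (c == '\n' || c == '\r' || c == quote || c == escape)
            return i;
    }

    return len;
}
```

On builds without SSE2 the function degrades to the plain byte loop, so the result is the same either way; only the number of bytes examined per step differs.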

To avoid regressions, SIMD processing is disabled for the remainder
of the COPY FROM command as soon as we encounter a short line or a
special character (except for end-of-line characters, else we'd
always disable it after the first line).  This is perhaps too
conservative, but it could probably be made more lenient in the
future via fine-tuned heuristics.
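
The disable-once rule described above could be modeled roughly as follows. This is only a sketch: the struct CopyScanState, the function note_line, and the threshold SIMD_CHUNK_SIZE are hypothetical names, and treating "short" as "shorter than one 16-byte chunk" is an assumption, since the message does not define it.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed vector width; the commit message does not state a threshold. */
#define SIMD_CHUNK_SIZE 16

/* Hypothetical per-command state: one flag that is only ever cleared. */
typedef struct CopyScanState
{
    bool        simd_enabled;
} CopyScanState;

/*
 * Record what was seen while reading one line.  saw_special means a special
 * character other than an end-of-line character was encountered.  Once the
 * flag is cleared it stays cleared for the rest of the command.
 */
static void
note_line(CopyScanState *state, size_t line_len, bool saw_special)
{
    if (!state->simd_enabled)
        return;

    if (line_len < SIMD_CHUNK_SIZE || saw_special)
        state->simd_enabled = false;
}
```

Because the flag never flips back on, input that looks unfavorable early pays at most a small cost before the command settles on the byte-at-a-time path.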

Author: Nazir Bilal Yavuz <byavuz81@gmail.com>
Co-authored-by: Shinya Kato <shinya11.kato@gmail.com>
Reviewed-by: Ayoub Kazar <ma_kazar@esi.dz>
Reviewed-by: Andrew Dunstan <andrew@dunslane.net>
Reviewed-by: Neil Conway <neil.conway@gmail.com>
Reviewed-by: Greg Burd <greg@burd.me>
Tested-by: Manni Wood <manni.wood@enterprisedb.com>
Tested-by: Mark Wong <markwkm@gmail.com>
Discussion: https://postgr.es/m/CAOzEurSW8cNr6TPKsjrstnPfhf4QyQqB4tnPXGGe8N4e_v7Jig%40mail.gmail.com

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/e0a3a3fd5361913502ff696ecf47770ca55975ae

Modified Files
--------------
src/backend/commands/copyfrom.c          |   1 +
src/backend/commands/copyfromparse.c     | 185 ++++++++++++++++++++++++++++++-
src/include/commands/copyfrom_internal.h |   1 +
3 files changed, 184 insertions(+), 3 deletions(-)

