Re: CopyReadLineText optimization - Mailing list pgsql-patches

From	Andrew Dunstan
Subject	Re: CopyReadLineText optimization
Date	March 6, 2008 14:45:51
Msg-id	47D03BCE.9030909@dunslane.net
In response to	Re: CopyReadLineText optimization ("Heikki Linnakangas" <heikki@enterprisedb.com>)
Responses	Re: CopyReadLineText optimization
List	pgsql-patches

Heikki Linnakangas wrote:
> Heikki Linnakangas wrote:
>> Heikki Linnakangas wrote:
>>> Attached is a patch that modifies CopyReadLineText so that it uses
>>> memchr to speed up the scan. The nice thing about memchr is that we
>>> can take advantage of any clever optimizations that might be in libc
>>> or compiler.
>>
>> Here's an updated version of the patch. The principle is the same,
>> but the same optimization is now used for CSV input as well, and
>> there's more comments.
>
> Another update attached: It occurred to me that the memchr approach is
> only safe for server encodings, where the non-first bytes of a
> multi-byte character always have the hi-bit set.
>
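
For readers following the thread, here is a minimal sketch of the kind of memchr-based scan being discussed. It is not the code from the attached patch: the function name find_next_special and the choice of three special bytes are assumptions made for illustration. The scan is only correct when none of the searched bytes can occur as a non-first byte of a multi-byte character, which is the condition raised in the quoted message.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not from the patch): return a pointer to the first
 * newline, carriage return or backslash in buf[0..len), or NULL if none.
 *
 * Safe only if these bytes never appear as a trailing byte of a
 * multi-byte character in the encoding of buf.
 */
static const char *
find_next_special(const char *buf, size_t len)
{
    static const char specials[] = { '\n', '\r', '\\' };
    const char *best = NULL;
    size_t      i;

    assert(buf != NULL);

    for (i = 0; i < sizeof(specials); i++)
    {
        /* Search only the region before the earliest hit found so far. */
        size_t      limit = best ? (size_t) (best - buf) : len;
        const char *hit = memchr(buf, specials[i], limit);

        if (hit != NULL)
            best = hit;     /* necessarily earlier than the previous best */
    }
    return best;
}
```

Each memchr call can use whatever optimized implementation libc provides, and narrowing the search window after each hit avoids rescanning bytes that cannot matter.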

We currently make the following assumption in the code:

     * These four characters, and the CSV escape and quote characters, are
     * assumed the same in frontend and backend encodings.
     *

The four characters are the carriage return, line feed, backslash and dot.

I think the requirement might well actually be somewhat stronger than
that: i.e. that none of these will appear as a non-first byte in any
multi-byte client encoding character. If that's right, then we should be
able to write CopyReadLineText without bothering about multi-byte chars.
If it's not right then I suspect we have some cases that can fail now
anyway. (I believe some client encodings at least use backslash in
subsequent bytes, and that's a nasty one because the "\." end sequence
is hard coded. I believe all the chars up to 0x2f are safe - that
includes both quote chars and dot.)
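
To make the backslash concern concrete, here is a small sketch assuming Shift_JIS as the client encoding. In Shift_JIS the katakana character "so" is encoded as the two bytes 0x83 0x5c, so its second byte is identical to an ASCII backslash. The helper names below are hypothetical, and the lead-byte ranges are a simplification chosen for illustration, not the backend's own encoding routines.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified test: can c start a two-byte Shift_JIS character? */
static int
sjis_is_lead(unsigned char c)
{
    return (c >= 0x81 && c <= 0x9f) || (c >= 0xe0 && c <= 0xfc);
}

/* Encoding-unaware scan: reports any 0x5c byte, even a trailing byte. */
static const char *
naive_find_backslash(const char *buf, size_t len)
{
    assert(buf != NULL);
    return memchr(buf, '\\', len);
}

/* Encoding-aware scan: steps over two-byte characters as a unit. */
static const char *
sjis_find_backslash(const char *buf, size_t len)
{
    size_t      i = 0;

    assert(buf != NULL);

    while (i < len)
    {
        unsigned char c = (unsigned char) buf[i];

        if (sjis_is_lead(c) && i + 1 < len)
        {
            i += 2;         /* skip lead byte and its trailing byte */
            continue;
        }
        if (c == '\\')
            return buf + i;
        i++;
    }
    return NULL;
}
```

Given the six bytes 0x83 0x5c 'a' '\\' '.' '\n', the naive scan stops at offset 1, inside the katakana character, while the encoding-aware scan finds the real backslash at offset 3. A byte-wise scanner that sees 0x5c followed by '.' in such data would misread it as the "\." end marker.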

cheers

andrew
