Re: Undocumented feature costs a lot of performance in - Mailing list pgsql-hackers

From Bill Studenmund
Subject Re: Undocumented feature costs a lot of performance in
Date
Msg-id Pine.NEB.4.33.0112041208450.1693-100000@vespasia.home-net.internetconnect.net
In response to Undocumented feature costs a lot of performance in COPY IN  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Undocumented feature costs a lot of performance in COPY IN  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Tue, 4 Dec 2001, Tom Lane wrote:

>     By default, a text copy uses a tab ("\t") character as a
>     delimiter between fields. The field delimiter may be changed to
>     any other single character with the keyword phrase USING
>     DELIMITERS. Characters in data fields which happen to match the
>     delimiter character will be backslash quoted. Note that the
>     delimiter is always a single character. If multiple characters
>     are specified in the delimiter string, only the first character
>     is used.
>
> and indeed, only the first character is used by COPY OUT.  But COPY IN
> is presently coded so that if multiple characters are mentioned in
> USING DELIMITERS, any one of them will be taken as a field delimiter.
>
> I would like to change the code to just "if (c == delim[0])",
> which should buy back most of that 20% and make the behavior match the
> documentation.  Question for the list: is this a bad change?  Is anyone
> out there actually using this undocumented behavior?

I think you should make the change. As I understand it, when you give
multiple delimiter characters, COPY OUT will not backslash-quote any of
them other than the first, since it doesn't treat them as special. But
COPY IN does treat them as special, so you will read in more columns than
you wrote out. Thus, as it stands, you can't COPY IN something you COPY OUT'd.
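
To make the mismatch concrete, here is a rough sketch in C. It is not the
actual COPY code; sketch_copy_out_field() and sketch_count_fields() are
names I made up, and the escaping rules are simplified to what the quoted
documentation describes (only the first delimiter character and backslash
get quoted on output).

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for COPY OUT: backslash-quote only delim[0]
 * (and backslash itself), as the documentation describes.
 * Writes at most outsz-1 bytes plus a NUL; returns the length written.
 */
static size_t
sketch_copy_out_field(const char *field, const char *delim,
                      char *out, size_t outsz)
{
    size_t n = 0;

    if (outsz == 0)
        return 0;
    for (; *field != '\0' && n + 2 < outsz; field++)
    {
        if (*field == delim[0] || *field == '\\')
            out[n++] = '\\';
        out[n++] = *field;
    }
    out[n] = '\0';
    return n;
}

/*
 * Hypothetical stand-in for COPY IN: count the fields in one line.
 * With any_of nonzero, every character in delim splits fields (the
 * undocumented behavior); otherwise only delim[0] does.
 */
static int
sketch_count_fields(const char *line, const char *delim, int any_of)
{
    int fields = 1;

    for (; *line != '\0'; line++)
    {
        if (*line == '\\' && line[1] != '\0')
        {
            line++;             /* skip the backslash-quoted character */
            continue;
        }
        if (any_of ? strchr(delim, *line) != NULL : *line == delim[0])
            fields++;
    }
    return fields;
}
```

With delimiters "|," and a single field containing a,b|c, the output side
quotes the '|' but not the ','. Reading that line back with any_of set
yields two fields; with only delim[0] honored it yields the one field that
was written.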

One alternative would be to make the code use different paths for the
single-delimiter and multiple-delimiter cases. But then COPY OUT would
need fixing too, so that it quotes every delimiter character.
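
Roughly what I have in mind for the two paths, as a sketch only.
sketch_is_delim() is a hypothetical helper, not something in the tree, and
I haven't measured whether the fast path actually recovers the lost time.

```c
#include <string.h>

/*
 * Hypothetical helper: return 1 if c is a field delimiter, else 0.
 * The common single-character case is a plain comparison; the string
 * search only runs when more than one delimiter was given.
 */
static int
sketch_is_delim(char c, const char *delim)
{
    if (delim[0] == '\0')
        return 0;                       /* no delimiter specified */
    if (delim[1] == '\0')
        return c == delim[0];           /* fast path: one delimiter */
    /* slow path: strchr would also match the terminating NUL, so exclude it */
    return c != '\0' && strchr(delim, c) != NULL;
}
```

The output side would need a matching test when deciding what to
backslash-quote, otherwise the round-trip problem stays.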

Take care,

Bill



