Re: Undocumented feature costs a lot of performance in - Mailing list pgsql-hackers

From Bill Studenmund
Subject Re: Undocumented feature costs a lot of performance in
Date
Msg-id Pine.NEB.4.33.0112041230020.1693-100000@vespasia.home-net.internetconnect.net
In response to Re: Undocumented feature costs a lot of performance in COPY IN  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Tue, 4 Dec 2001, Tom Lane wrote:

> Bill Studenmund <wrstuden@netbsd.org> writes:
> > One alternative would be to make the code use different paths for the
> > just-one and many delimiter cases. But then COPY OUT would need fixing.
>
> Well, it's not clear what COPY OUT should *do* with multiple
> alternatives, anyway.  Pick one at random?  I guess it does that now,
> if you consider "always use the first one" as a random choice.  The

I think that'd be fine.

> real problem is that it will only backslash the first one, too.  That

Ick. I was thinking that if you gave multiple delimiters, it would escape
each one. Which would be slow, and is why I think separate code paths
would be good. :-)
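
The two-path idea can be sketched roughly as follows. This is a minimal
Python illustration, not the backend's C code: `escape_field` is a
hypothetical name, and it only handles backslashes and delimiters, while the
real COPY also has to deal with newlines, NULL markers and so on.

```python
def escape_field(value: str, delimiters: str) -> str:
    """Backslash-escape backslashes and every delimiter character in value.

    Hypothetical helper for illustration only; not PostgreSQL's code.
    """
    if len(delimiters) == 1:
        # Fast path: with a single delimiter, each character needs only
        # two equality comparisons.
        delim = delimiters
        return "".join(
            "\\" + ch if ch == delim or ch == "\\" else ch for ch in value
        )
    # Slow path: with several delimiters, each character is checked
    # against the whole delimiter set.
    delim_set = set(delimiters)
    return "".join(
        "\\" + ch if ch in delim_set or ch == "\\" else ch for ch in value
    )


if __name__ == "__main__":
    print(escape_field("a|b_c=d", "|"))    # only "|" is escaped
    print(escape_field("a|b_c=d", "|_="))  # "|", "_" and "=" are all escaped
```

The single-delimiter case stays cheap, and only callers that actually pass
several delimiters pay for the set lookup.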

> means that data emitted with DELIMITERS "|_=", say, will fail to be
> reloaded correctly if that same DELIMITERS string is given to COPY IN
> --- because any _ or = characters in the data won't be backslashed,
> but would need to be to keep COPY IN from treating them as delimiters.
>
> For COPY OUT's purposes, a sensible interpretation of a multicharacter
> delimiter string would be that the whole string is emitted as the
> delimiter.  Eg,
>
>     COPY OUT WITH DELIMITERS "<TAB>";
>
>     foo<TAB>bar<TAB>baz
>     ...
>
> But as long as COPY IN considers that delimiter spec to mean "any one of
> these characters", and not a multicharacter string, we couldn't do that.
>
> If we restrict DELIMITERS strings to be exactly one character for a
> release or three, we could think about implementing this idea of
> multicharacter delimiter strings later on.  Not sure if anyone really
> needs it though.  In any case, the current behavior is inconsistent.

I think this restriction sounds fine, and quite practical. :-)
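
To make the inconsistency concrete, here is a toy model of the behavior
described above. It is a sketch under stated assumptions, not the real
implementation: `copy_out` and `copy_in` are hypothetical names, the output
side uses and escapes only the first delimiter, and the input side treats
any character of the string as a delimiter.

```python
def copy_out(row, delimiters):
    """Toy COPY OUT: emits and backslashes only the first delimiter."""
    first = delimiters[0]

    def esc(field):
        return "".join(
            "\\" + ch if ch in (first, "\\") else ch for ch in field
        )

    return first.join(esc(field) for field in row)


def copy_in(line, delimiters):
    """Toy COPY IN: any unescaped character in delimiters ends a field."""
    fields, current = [], []
    chars = iter(line)
    for ch in chars:
        if ch == "\\":
            # A backslash makes the next character literal.
            current.append(next(chars, ""))
        elif ch in delimiters:
            fields.append("".join(current))
            current = []
        else:
            current.append(ch)
    fields.append("".join(current))
    return fields


if __name__ == "__main__":
    row = ["a_b", "c=d"]
    # Multiple delimiters: "_" and "=" are not escaped on output,
    # so the reload splits the row into four fields.
    print(copy_in(copy_out(row, "|_="), "|_="))
    # Exactly one delimiter: the row survives the round trip.
    print(copy_in(copy_out(row, "|"), "|"))
```

With the one-character restriction, the two sides agree on what a delimiter
is and the round trip is lossless in this model.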

Take care,

Bill


