Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
> * we perform possible encoding conversion early, one input block at a
> time, rather than after splitting the input into lines. This allows us
> to assume in the later stages that the data is in server encoding,
> allowing us to search for the '\n' byte without worrying about
> multi-byte characters.

Seems reasonable, although the need to deal with multibyte characters
crossing a block boundary injects some ugliness that wasn't there before.
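The carry-over logic in question can be sketched roughly as below. This
is a minimal standalone illustration assuming UTF-8 as the server
encoding, not the patch's actual code; `utf8_seq_len` and
`incomplete_tail` are hypothetical helper names standing in for the kind
of check pg_encoding_mblen()-style logic would perform:

```c
#include <stddef.h>

/* Hypothetical helper: byte length (1-4) of the UTF-8 sequence whose
 * lead byte is b; invalid lead bytes are treated as single bytes. */
static int utf8_seq_len(unsigned char b)
{
    if (b < 0x80) return 1;
    if ((b & 0xE0) == 0xC0) return 2;
    if ((b & 0xF0) == 0xE0) return 3;
    if ((b & 0xF8) == 0xF0) return 4;
    return 1;
}

/* Return how many bytes at the end of buf[0..len) belong to an
 * incomplete multibyte sequence that must be carried over to the next
 * input block before encoding conversion is run. */
static size_t incomplete_tail(const unsigned char *buf, size_t len)
{
    size_t i = len;
    size_t seq_start;
    int need;

    /* Walk back over at most 3 continuation bytes to the lead byte. */
    while (i > 0 && (buf[i - 1] & 0xC0) == 0x80 && len - i < 3)
        i--;
    if (i == 0)
        return 0;               /* only continuation bytes seen: give up */
    seq_start = i - 1;
    need = utf8_seq_len(buf[seq_start]);
    if (seq_start + need > len)
        return len - seq_start; /* sequence runs past the block end */
    return 0;                   /* block ends on a character boundary */
}
```

The caller would strip that many bytes from the end of the block,
prepend them to the next read, and only then run the conversion.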

> * instead of the byte-at-a-time loop in CopyReadLineText(), use memchr()
> to find the next NL/CR character. This is where the speedup comes from.

It seems like the speedup, if any, would be extremely
platform-dependent. What have you tested on?
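For concreteness, the memchr()-based scan being discussed amounts to
something like the sketch below. This is a simplified standalone
version, not the patch itself, and `next_newline` is a made-up name:

```c
#include <stddef.h>
#include <string.h>

/* Find the offset of the next line terminator ('\n' or '\r') in
 * buf[0..len) using memchr() rather than a byte-at-a-time loop;
 * returns len if neither byte occurs. */
static size_t next_newline(const char *buf, size_t len)
{
    const char *nl = memchr(buf, '\n', len);
    /* Only search for '\r' up to the '\n' we already found. */
    const char *cr = memchr(buf, '\r', nl ? (size_t) (nl - buf) : len);

    if (cr)
        return (size_t) (cr - buf);
    if (nl)
        return (size_t) (nl - buf);
    return len;
}
```

memchr() is typically vectorized by the C library, which is where any
speedup would come from, and also why it would vary by platform.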

> There's a small fly in the ointment: the patch won't recognize backslash
> followed by a linefeed as an escaped linefeed. I think we should simply
> drop support for that.

I think this is likely to break apps that have worked for years. I
can't get excited about doing that in return for a "0-10%" speedup
that might only materialize on some platforms. If the numbers were
better, it'd be worth paying that price, but ...
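To make the point concrete: recognizing an escaped linefeed under a
memchr()-based scan is still possible with a post-check on the bytes
preceding each newline found, roughly as in this sketch (a hypothetical
helper, not code from the patch):

```c
#include <stddef.h>

/* After a scan locates a newline at offset pos in buf, decide whether
 * it is escaped: an odd number of immediately preceding backslashes
 * means the final backslash escapes it, so the newline is data rather
 * than a line terminator. */
static int newline_is_escaped(const char *buf, size_t pos)
{
    size_t nbs = 0;             /* count of trailing backslashes */

    while (nbs < pos && buf[pos - 1 - nbs] == '\\')
        nbs++;
    return (nbs % 2) == 1;
}
```

When this returns true, the scan must continue past that newline
instead of ending the line there.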

> It's not strictly necessary, but how about dropping support for the old
> COPY protocol, and the EOF marker \. while we're at it? It would allow
> us to drop some code, making the remaining code simpler, and reduce the
> testing effort. Thoughts on that?

Again, I think the threshold requirement for breaking compatibility
needs to be a lot higher than this.

			regards, tom lane