Re: Optimizing COPY - Mailing list pgsql-hackers

From	Heikki Linnakangas
Subject	Re: Optimizing COPY
Date	November 12, 2008 15:21:57
Msg-id	491B0297.5080903@enterprisedb.com
In response to	Re: Optimizing COPY (Chuck McDevitt <cmcdevitt@greenplum.com>)
List	pgsql-hackers

Chuck McDevitt wrote:
> What if the block of text is split in the middle of a multibyte character?
> I don't think it is safe to assume raw blocks always end on a character boundary.

Yeah, it's not. I realized that myself after submitting. The generic
approach is to loop with pg_mblen() to find the maximum safe length. For
UTF-8, and probably many other multi-byte encodings as well, we can
detect whether a byte is the first byte of a multi-byte character just
by looking at its few high bits.
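
To illustrate the UTF-8 case, here is a minimal standalone sketch, not
code from the patch. The names utf8_char_len() and utf8_safe_length()
are hypothetical, and the sketch does not call pg_mblen(); it only
inspects the high bits of the last few bytes of a raw block to decide
whether the block ends in the middle of a character. It assumes the
input is otherwise valid UTF-8 and leaves malformed sequences for a
later validation step to reject.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: byte length of a UTF-8 character, derived from the
 * high bits of its first byte. Returns 0 for a byte that cannot start a
 * character (a continuation byte or an invalid lead byte).
 */
static size_t
utf8_char_len(unsigned char lead)
{
    if (lead < 0x80)
        return 1;               /* 0xxxxxxx: single-byte (ASCII) */
    if ((lead & 0xE0) == 0xC0)
        return 2;               /* 110xxxxx */
    if ((lead & 0xF0) == 0xE0)
        return 3;               /* 1110xxxx */
    if ((lead & 0xF8) == 0xF0)
        return 4;               /* 11110xxx */
    return 0;
}

/*
 * Hypothetical helper: largest n <= len such that buf[0..n) does not end
 * in the middle of a UTF-8 character. Only the last four bytes are
 * examined, since no UTF-8 character is longer than that.
 */
size_t
utf8_safe_length(const unsigned char *buf, size_t len)
{
    size_t      back;

    assert(buf != NULL || len == 0);

    for (back = 1; back <= 4 && back <= len; back++)
    {
        unsigned char b = buf[len - back];

        if ((b & 0xC0) == 0x80)
            continue;           /* 10xxxxxx: continuation byte, keep looking */

        /* Found the last lead byte; is its character complete? */
        if (utf8_char_len(b) > back)
            return len - back;  /* truncated: cut before the lead byte */
        return len;             /* complete (or invalid; left to validation) */
    }

    /* Empty input or no lead byte in the tail: nothing to trim here. */
    return len;
}
```

A caller would process only the first utf8_safe_length(buf, len) bytes
of the block and carry the remaining bytes over to the front of the
next block.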

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com
