Re: Bug in UTF8-Validation Code? - Mailing list pgsql-hackers

From	Tom Lane
Subject	Re: Bug in UTF8-Validation Code?
Date	March 17, 2007 17:31:20
Msg-id	12375.1174152588@sss.pgh.pa.us
In response to	Re: Bug in UTF8-Validation Code? (Andrew Dunstan <andrew@dunslane.net>)
Responses	Re: Bug in UTF8-Validation Code? (Tom Lane <tgl@sss.pgh.pa.us>)
List	pgsql-hackers

Andrew Dunstan <andrew@dunslane.net> writes:
> Tom Lane wrote:
>> The problem with that is that it duplicates effort: in many cases
>> (especially COPY IN) the data's already been validated.

> One thought I had was that it might make sense to have a flag that would 
> inhibit the check, that could be set (and reset) by routines that check 
> for themselves, such as COPY IN. Then bulk load performance should not 
> be hit much.

Actually, I have to take back that objection: on closer look, COPY
validates the data only once and does so before applying its own
backslash-escaping rules.  So there is a risk in that path too.
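
The risk in that path can be illustrated with a small sketch. It is written in Python rather than the backend's C, and `is_valid_utf8` and `unescape_octal` are hypothetical helpers, not PostgreSQL functions. The assumption is that the input is checked first and that `\nnn` octal escapes are reduced to raw bytes afterwards, so a pure-ASCII line can pass the check and still end up as invalid UTF-8.

```python
import re


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes as strict UTF-8 (stand-in for the server-side check)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Matches a backslash followed by one to three octal digits.
_OCTAL_ESCAPE = re.compile(rb"\\([0-7]{1,3})")


def unescape_octal(data: bytes) -> bytes:
    """Hypothetical, simplified de-escaping: each \\nnn becomes the single byte it names."""
    return _OCTAL_ESCAPE.sub(lambda m: bytes([int(m.group(1), 8) & 0xFF]), data)


raw = rb"abc\303def"          # ten ASCII bytes as they arrive on the wire
assert is_valid_utf8(raw)     # validation before de-escaping passes

cooked = unescape_octal(raw)  # the escape collapses to the lone byte 0xC3
assert not is_valid_utf8(cooked)  # 0xC3 needs a continuation byte, "d" is not one
```

Validating `cooked` instead of `raw` would catch this case, which is what a check inside the input function amounts to.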

It's still pretty annoying to be validating the data twice in the
common case where no backslash reduction occurred, but I'm not sure
I see any good way to avoid it.  I don't much want to add another
argument to input functions, and the global flag that you suggest
above seems too ugly/risky.

Would someone do some performance checking on the cost of adding
mbverify to textin()?  If it could be shown that it adds only
negligible overhead to COPY on, say, hundred-byte-wide text fields,
then we could decide that this isn't worth worrying about.
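
As a rough sketch of the shape such a measurement could take, the snippet below times one validation pass against two over hundred-byte fields. It is only an illustration: `validate` and `time_fields` are hypothetical names, Python's decoder stands in for the backend's verification routine, and the absolute numbers say nothing about the C code. A real answer would come from timing COPY itself on a patched and an unpatched backend.

```python
import timeit


def validate(field: bytes) -> None:
    """Stand-in for the encoding check; raises UnicodeDecodeError on invalid input."""
    field.decode("utf-8")


def time_fields(fields, validations, repeat=5):
    """Return the best-of-`repeat` wall time for validating every field `validations` times."""
    def run():
        for f in fields:
            for _ in range(validations):
                validate(f)
    return min(timeit.repeat(run, number=1, repeat=repeat))


# 20,000 fields of 100 bytes each, roughly the case described above.
fields = [b"x" * 100] * 20_000

once = time_fields(fields, 1)    # data checked a single time
twice = time_fields(fields, 2)   # data checked again in the input function
print(f"one pass: {once:.4f}s  two passes: {twice:.4f}s")
```

The interesting figure is the difference between the two timings relative to the total cost of loading the same rows, not the ratio between the passes themselves.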

        regards, tom lane
