Re: Bug in UTF8-Validation Code? - Mailing list pgsql-hackers

From Mark Dilger
Subject Re: Bug in UTF8-Validation Code?
Date
Msg-id 46100A8A.5030006@markdilger.com
In response to Re: Bug in UTF8-Validation Code?  (Martijn van Oosterhout <kleptog@svana.org>)
Responses Re: Bug in UTF8-Validation Code?  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Martijn van Oosterhout wrote:
> There's also the performance angle. The current mbverify is very
> inefficient for encodings like UTF-8. You might need to refactor a bit
> there...

There appears to be a lot of function call overhead in the current 
implementation.  In pg_verify_mbstr, the function pointer 
pg_wchar_table.mbverify is called for each multibyte character in a multibyte 
string.
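To make the overhead concrete, here is a rough sketch of the two shapes being compared. It is not the actual PostgreSQL code: enc_table, verify_via_table, verify_utf8_inline and utf8_verify_char are hypothetical names, and the UTF-8 check is deliberately simplified (it looks only at lead and continuation bytes, not at overlong forms or surrogates).

```c
#include <stddef.h>

/* Hypothetical per-character verifier type: returns the byte length of
 * the character at s, or -1 if it is invalid. */
typedef int (*mbverify_fn)(const unsigned char *s, size_t len);

/* Simplified UTF-8 check: lead byte and continuation bytes only. */
static int utf8_verify_char(const unsigned char *s, size_t len)
{
    int n, i;

    if (len == 0) return -1;
    if (s[0] < 0x80) n = 1;
    else if ((s[0] & 0xE0) == 0xC0) n = 2;
    else if ((s[0] & 0xF0) == 0xE0) n = 3;
    else if ((s[0] & 0xF8) == 0xF0) n = 4;
    else return -1;                      /* stray continuation or bad lead byte */

    if ((size_t) n > len) return -1;     /* truncated sequence */
    for (i = 1; i < n; i++)
        if ((s[i] & 0xC0) != 0x80) return -1;
    return n;
}

/* Hypothetical stand-in for a per-encoding dispatch table. */
struct enc_entry { mbverify_fn mbverify; };
static const struct enc_entry enc_table[] = { { utf8_verify_char } };

/* Table-driven shape: one indirect call per multibyte character. */
static int verify_via_table(int enc, const unsigned char *s, size_t len)
{
    while (len > 0) {
        int n;

        if (s[0] < 0x80) { s++; len--; continue; }   /* ASCII needs no call */
        n = enc_table[enc].mbverify(s, len);
        if (n < 0) return 0;
        s += n;
        len -= (size_t) n;
    }
    return 1;
}

/* Alternative shape: the whole string is checked in one loop, with no
 * function call per character. */
static int verify_utf8_inline(const unsigned char *s, size_t len)
{
    while (len > 0) {
        size_t n, i;

        if (s[0] < 0x80) { s++; len--; continue; }
        if ((s[0] & 0xE0) == 0xC0) n = 2;
        else if ((s[0] & 0xF0) == 0xE0) n = 3;
        else if ((s[0] & 0xF8) == 0xF0) n = 4;
        else return 0;

        if (n > len) return 0;
        for (i = 1; i < n; i++)
            if ((s[i] & 0xC0) != 0x80) return 0;
        s += n;
        len -= n;
    }
    return 1;
}
```

Both functions return 1 for a valid string and 0 otherwise. The second form trades the generality of the table for a tight loop, which is the kind of refactoring that would ripple into every caller of the table-driven interface.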

Refactoring the way these table-driven functions work would affect a lot of other
code.  Just grep for all files that #include mb/pg_wchar.h for the list of them.
The list includes interfaces/libpq, and I'm wondering whether software that links
against postgres might rely on these function prototypes.

mark

