On Mon, Jan 04, 2010 at 06:38:03PM -0500, Andrew Dunstan wrote:
> Andrew Dunstan wrote:
> >>
> >>Yes. I believe the test is highlighting an existing problem: that plperl
> >>function in non-PG_UTF8 databases can't use regular expressions that
> >>require unicode character meta-data.
> >>
> >>Either the (GetDatabaseEncoding() == PG_UTF8) test in plperl_safe_init()
> >>should be removed, so the utf8fix function is always called, or the
> >>test should be removed (or hacked to only apply to PG_UTF8 databases).
> >
> >I tried forcing the test, but it doesn't seem to work, possibly
> >because in the case that the db is not utf8 we aren't forcing
> >argument strings to UTF8 :-(
> >
> >I think we might need to remove the test from the patch.
>
> I have not been able to come up with a fix for this - the whole
> thing seems very fragile. I'm going to commit what remains of this
> patch, but not add the extra regression test. I'll add a TODO to
> allow plperl to do utf8 operations in non-utf8 databases.

I see you've not committed it yet, so to help out I've attached
a new diff, against the current CVS head, with two minor changes:

- Removed the test, as noted above.
- Optimized pg_verifymbstr calls to avoid unneeded strlen()s.

This should apply cleanly to CVS, saving you the need to resolve the
conflicts caused by the recent pg_verifymbstr patch.

I'll add it to the commitfest once it reaches the archives.

Tim.