Re: One source of constant annoyance identified - Mailing list pgsql-general

From Markus Wollny
Subject Re: One source of constant annoyance identified
Msg-id 2266D0630E43BB4290742247C8910575014CE2B7@dozer.computec.de
In response to One source of constant annoyance identified  ("Markus Wollny" <Markus.Wollny@computec.de>)
Responses Re: One source of constant annoyance identified
List pgsql-general
Hi!

> -----Original Message-----
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
> Sent: Friday, 28 June 2002 17:03
> To: Markus Wollny
> Cc: pgsql-general@postgresql.org
> Subject: Re: [GENERAL] One source of constant annoyance identified
>
>
> "Markus Wollny" <Markus.Wollny@computec.de> writes:
> >                 lower(MESSAGE.TEXT) like '%ich%'
> >             or    lower(MESSAGE.TEXT) like 'ich%'
> >             or    lower(MESSAGE.TEXT) like '%ich'
>
> Is whoever wrote this under the misimpression that % can't match zero
> characters?  You could reduce the number of LIKE tests by a
> factor of 3,
> because the foo% and %foo tests are completely redundant.

Wasn't me :) I suspect there are still a few places in the code that
waste processing time, simply because we could afford it under Oracle.
We intend to reimplement this particular part using regular expressions,
hoping that this will improve performance a bit, so we may be able to do
without LIKE entirely in this case. We cannot, however, remove LIKE
from every part of the code.
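
To illustrate the redundancy Tom describes, here is a minimal sketch in Python rather than SQL. It assumes a simplified LIKE (only % and _, no ESCAPE clause) and uses Python's lower(), which is not locale-aware the way the database's is. The helper names and sample strings are made up for the illustration and are not taken from our code.

```python
import re

def like_to_regex(pattern):
    """Translate a simplified SQL LIKE pattern (% and _ only) into a regex."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")           # % matches zero or more characters
        elif ch == "_":
            parts.append(".")            # _ matches exactly one character
        else:
            parts.append(re.escape(ch))  # everything else is a literal
    return re.compile("".join(parts), re.DOTALL)

def lower_like(text, pattern):
    """Rough stand-in for: lower(text) LIKE pattern (whole string must match)."""
    return like_to_regex(pattern).fullmatch(text.lower()) is not None

# Hypothetical sample values standing in for MESSAGE.TEXT.
samples = ["ich", "Ich bin da", "das bin ich", "nicht", "du"]

# The original three-way condition ...
three_way = [
    lower_like(s, "%ich%") or lower_like(s, "ich%") or lower_like(s, "%ich")
    for s in samples
]
# ... and the single test that already covers the other two.
one_way = [lower_like(s, "%ich%") for s in samples]

print(three_way == one_way)
```

Because % can match the empty string, anything matched by 'ich%' or '%ich' is also matched by '%ich%', so the single test should be enough.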

> But, back to the problem at hand --- it seems like a fair bet that
> we must have a memory leak in lower() or LIKE or both.  Did you build
> with locale or multibyte (or both) enabled?  If so, what locale and
> what database encoding are you using, respectively?

The compile-time options were:

--enable-locale
--enable-recode
--enable-multibyte
--with-perl
--enable-odbc
--enable-syslog

The environment variables RC_LANG/LC_CTYPE are set to de_DE@euro; the
encoding for the databases is SQL_ASCII. As we are hosting German
websites and communities, we need to sort data containing the characters
ÄÖÜäöüß in the correct order (AÄ, OÖ, UÜ, sß), so I figured we'd need
locale support.

Would it be worth trying a recompile without multibyte enabled? Could I
dump and reimport the current DB afterwards?

Regards,

Markus


