Thread: message string fixes

message string fixes

From
Alvaro Herrera
Date:
Hi,

While translating the backend's message catalog I found several things
that should probably be improved.

For example, in regis.c there are several strings talking about "regis
pattern".  I had never heard of regis patterns.  Turns out they are a
fast regex subset, used AFAICT only by the ispell code.  Searching the
web I don't find any other reference to "regises" (regisen? reges?), so
I think we should avoid using the term.  How about just changing the
messages to just say "regular expression" instead?

Additionally, I would like to apply the attached patch.  Are there
objections?

--
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

Attachment

Re: [pgtranslation-translators] message string fixes

From
Alvaro Herrera
Date:
Further fix attached.  I think "of character type" suggests that the
column must be of type char, which is not the case -- varchar and text
work fine too AFAICS.

--
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Attachment

Re: message string fixes

From
Tom Lane
Date:
Alvaro Herrera <alvherre@commandprompt.com> writes:
> For example, in regis.c there are several strings talking about "regis
> pattern".  I had never heard of regis patterns.  Turns out they are a
> fast regex subset, used AFAICT only by the ispell code.  Searching the
> web I don't find any other reference to "regises" (regisen? reges?), so
> I think we should avoid using the term.  How about just changing the
> messages to just say "regular expression" instead?

Then people would think that full regular expressions were meant.

It seems to me that a proper solution is to say something like "fast
regular expressions" and then document what that means in the ispell
dictionary documentation.

> Additionally, I would like to apply the attached patch.  Are there
> objections?

Let's see, the first change is just so you can fool with the word
ordering, right?  Seems OK to me, but are the other translators
going to complain about changing strings at this late date?
        regards, tom lane


Re: message string fixes

From
Teodor Sigaev
Date:
> For example, in regis.c there are several strings talking about "regis
> pattern".  I had never heard of regis patterns.  Turns out they are a
> fast regex subset, used AFAICT only by the ispell code.  Searching the
> web I don't find any other reference to "regises" (regisen? reges?), so
> I think we should avoid using the term.  How about just changing the
> messages to just say "regular expression" instead?
It's just a combination of  "regular expression for ispell". It implements a
subset of regex. It is much faster than any other implementation, and supports
the subset widely used in ispell.

-- 
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
  WWW: http://www.sigaev.ru/
 


Re: message string fixes

From
Tom Lane
Date:
Teodor Sigaev <teodor@sigaev.ru> writes:
>> web I don't find any other reference to "regises" (regisen? reges?), so
>> I think we should avoid using the term.  How about just changing the
>> messages to just say "regular expression" instead?

> It's just a combination of  "regular expression for ispell".

Maybe the right phrase to use is "ispell regular expression".  In any
case we need to document what the limitations are compared to "regular"
regular expressions (ahem).  Do you know offhand what the rules are?
        regards, tom lane


Re: message string fixes

From
Teodor Sigaev
Date:
> Maybe the right phrase to use is "ispell regular expression".  In any
> case we need to document what the limitations are compared to "regular"
> regular expressions (ahem).  Do you know offhand what the rules are?

There is a  fallback to regex if expression isn't supported by regis (see call 
of RS_isRegis() in spell.c).

Regis supports only literal matches, ranges of characters ( [abc] ), negation of
a character range ( [^abc] ), and can match at the beginning or end of the string.
AFAIK, ispell allows full regex, but in practice I have never seen anything
unsupported by regis.
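
To make the subset above concrete, here is a minimal sketch of a matcher for it.
It is not the regis.c code: the names sketch_pattern_items, sketch_match_here and
sketch_regis_match are hypothetical, it handles single-byte characters only, and
the at_end flag standing in for "begin or end of string" is an assumption made
for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Count the items in a pattern (each literal or [...] class consumes exactly
 * one character of text).  Returns -1 if a '[' has no closing ']'.
 */
static int sketch_pattern_items(const char *pat)
{
    int n = 0;

    while (*pat)
    {
        if (*pat == '[')
        {
            const char *close = strchr(pat, ']');

            if (close == NULL)
                return -1;
            pat = close + 1;
        }
        else
            pat++;
        n++;
    }
    return n;
}

/* Match every pattern item against successive characters of text.
 * The caller guarantees the pattern is well formed and text is long enough. */
static bool sketch_match_here(const char *pat, const char *text)
{
    while (*pat)
    {
        char ch = *text++;

        if (*pat == '[')
        {
            const char *p = pat + 1;
            bool negate = (*p == '^');
            bool found = false;

            if (negate)
                p++;
            for (; *p != ']'; p++)
                if (*p == ch)
                    found = true;
            if (found == negate)    /* [abc] missed, or [^abc] hit */
                return false;
            pat = p + 1;
        }
        else
        {
            if (*pat != ch)         /* literal must match as is */
                return false;
            pat++;
        }
    }
    return true;
}

/* at_end (hypothetical flag): anchor at the end of text instead of the start. */
bool sketch_regis_match(const char *pat, const char *text, bool at_end)
{
    int items;
    size_t tlen;

    assert(pat != NULL && text != NULL);
    items = sketch_pattern_items(pat);
    tlen = strlen(text);
    if (items < 0 || (size_t) items > tlen)
        return false;
    return sketch_match_here(pat, at_end ? text + (tlen - (size_t) items) : text);
}
```

For example, sketch_regis_match("[^aeiou]y", "happy", true) matches, while the
same pattern against "play" does not, because 'a' is excluded by the class.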

-- 
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
  WWW: http://www.sigaev.ru/
 


Re: message string fixes

From
Tom Lane
Date:
Teodor Sigaev <teodor@sigaev.ru> writes:
> There is a  fallback to regex if expression isn't supported by regis (see call 
> of RS_isRegis() in spell.c).

Oh.  So in that case, the messages Alvaro is worried about
               ereport(ERROR,
                       (errcode(ERRCODE_INVALID_REGULAR_EXPRESSION),
                        errmsg("invalid regis pattern: \"%s\"",
                               str)));

aren't user-facing errors at all, and should be demoted to elog's,
correct?
               elog(ERROR, "invalid regis pattern: \"%s\"", str);

        regards, tom lane


Re: message string fixes

From
Teodor Sigaev
Date:
> aren't user-facing errors at all, and should be demoted to elog's,
> correct?
> 
>                 elog(ERROR, "invalid regis pattern: \"%s\"", str);

Hmm. If regis detects an error in expression then it will be an error for regex 
library too. At least, it was supposed to be.

-- 
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
  WWW: http://www.sigaev.ru/
 


Re: message string fixes

From
Alvaro Herrera
Date:
Teodor Sigaev wrote:
>> aren't user-facing errors at all, and should be demoted to elog's,
>> correct?
>>
>>                 elog(ERROR, "invalid regis pattern: \"%s\"", str);
>
> Hmm. If regis detects an error in expression then it will be an error for 
> regex library too. At least, it was supposed to be.

And those that are not, probably are not what the user intends anyway,
with the pattern language being so narrow.

If all invalid regis patterns are indeed invalid regex patterns, then
just changing "regis" for "regex" should be enough.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.


Re: message string fixes

From
Tom Lane
Date:
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Teodor Sigaev wrote:
>> Hmm. If regis detects an error in expression then it will be an error for 
>> regex library too. At least, it was supposed to be.

> And those that are not, probably are not what the user intends anyway,
> with the pattern language being so narrow.

It looks to me like RS_isRegis() needs to be tightened up a bit anyway:
it will accept "^foo" which is valid regex but not valid regis, leading
to an error being thrown which is not what we want.

If we tighten it to exactly match what RS_compile() will take ... say
by using the same state-machine logic ... then indeed the ereports
are internal and can be demoted to elog's.  If we make them elogs then
ISTM they ought to keep saying regis, just so we know where to look
if they ever do fail ;-)
        regards, tom lane


Re: message string fixes

From
Tom Lane
Date:
I wrote:
> It looks to me like RS_isRegis() needs to be tightened up a bit anyway:
> it will accept "^foo" which is valid regex but not valid regis, leading
> to an error being thrown which is not what we want.

I experimented with this and verified that the error could be reached
with a hacked-up affix file.

> If we tighten it to exactly match what RS_compile() will take ... say
> by using the same state-machine logic ... then indeed the ereports
> are internal and can be demoted to elog's.  If we make them elogs then
> ISTM they ought to keep saying regis, just so we know where to look
> if they ever do fail ;-)

Patch committed along these lines.
        regards, tom lane


Re: message string fixes

From
Tom Lane
Date:
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Additionally, I would like to apply the attached patch.  Are there
> objections?

So far I think you only applied one half of that?
        regards, tom lane