Thread: message string fixes
Hi,

While translating the backend's message catalog I found several things that should probably be improved.

For example, in regis.c there are several strings talking about "regis pattern". I had never heard of regis patterns. Turns out they are a fast regex subset, used AFAICT only by the ispell code. Searching the web I don't find any other reference to "regises" (regisen? reges?), so I think we should avoid using the term. How about just changing the messages to just say "regular expression" instead?

Additionally, I would like to apply the attached patch. Are there objections?

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
Attachment
Further fix attached. I think "of character type" suggests that the column must be of type char, which is not the case -- varchar and text work fine too AFAICS.

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Attachment
Alvaro Herrera <alvherre@commandprompt.com> writes:
> For example, in regis.c there are several strings talking about "regis
> pattern". I had never heard of regis patterns. Turns out they are a
> fast regex subset, used AFAICT only by the ispell code. Searching the
> web I don't find any other reference to "regises" (regisen? reges?), so
> I think we should avoid using the term. How about just changing the
> messages to just say "regular expression" instead?

Then people would think that full regular expressions were meant. It seems to me that a proper solution is to say something like "fast regular expressions" and then document what that means in the ispell dictionary documentation.

> Additionally, I would like to apply the attached patch. Are there
> objections?

Let's see, the first change is just so you can fool with the word ordering, right? Seems OK to me, but are the other translators going to complain about changing strings at this late date?

regards, tom lane
> For example, in regis.c there are several strings talking about "regis
> pattern". I had never heard of regis patterns. Turns out they are a
> fast regex subset, used AFAICT only by the ispell code. Searching the
> web I don't find any other reference to "regises" (regisen? reges?), so
> I think we should avoid using the term. How about just changing the
> messages to just say "regular expression" instead?

It's just a combination of "regular expression for ispell". It implements a subset of regex. It is much faster than any other implementation, and it covers the subset widely used in ispell.

--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/
Teodor Sigaev <teodor@sigaev.ru> writes:
>> web I don't find any other reference to "regises" (regisen? reges?), so
>> I think we should avoid using the term. How about just changing the
>> messages to just say "regular expression" instead?

> It's just a combination of "regular expression for ispell".

Maybe the right phrase to use is "ispell regular expression". In any case we need to document what the limitations are compared to "regular" regular expressions (ahem). Do you know offhand what the rules are?

regards, tom lane
> Maybe the right phrase to use is "ispell regular expression". In any
> case we need to document what the limitations are compared to "regular"
> regular expressions (ahem). Do you know offhand what the rules are?

There is a fallback to regex if expression isn't supported by regis (see call of RS_isRegis() in spell.c). Regis supports only literal matches, a range of characters ( [abc] ), negation of a character range ( [^abc] ), and matching at the beginning or end of a string. AFAIK, ispell allows full regex, but in practice I have never seen anything unsupported by regis.

--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/
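To make the subset described above concrete, here is a minimal sketch in C of matching a string against such a pattern. It is not the code in regis.c: the names `sketch_match`, `match_element`, and `pattern_length` are hypothetical, it assumes single-byte characters and a well-formed pattern, and it assumes that anchoring at the beginning or end is selected by a flag rather than written into the pattern.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Match one pattern element (a literal character or a bracketed set)
 * against ch, and advance *pat past that element.
 * Hypothetical helper; assumes the pattern is well-formed.
 */
static bool
match_element(const char **pat, char ch)
{
    const char *p = *pat;
    bool        negate;
    bool        found = false;

    if (*p != '[')
    {
        /* Literal character: must match as is. */
        *pat = p + 1;
        return *p == ch;
    }

    p++;                        /* skip '[' */
    negate = (*p == '^');
    if (negate)
        p++;
    while (*p && *p != ']')
    {
        if (*p == ch)
            found = true;
        p++;
    }
    if (*p == ']')
        p++;                    /* skip ']' */
    *pat = p;
    return negate ? !found : found;
}

/* Count how many input characters the pattern consumes. */
static size_t
pattern_length(const char *pat)
{
    size_t      n = 0;

    while (*pat)
    {
        if (*pat == '[')
        {
            while (*pat && *pat != ']')
                pat++;
            if (*pat == ']')
                pat++;
        }
        else
            pat++;
        n++;
    }
    return n;
}

/*
 * Match pat at the beginning of str, or at its end when at_end is true.
 */
bool
sketch_match(const char *pat, const char *str, bool at_end)
{
    size_t      plen = pattern_length(pat);
    size_t      slen = strlen(str);

    if (slen < plen)
        return false;
    if (at_end)
        str += slen - plen;     /* line the pattern up with the tail */
    while (*pat)
    {
        if (!match_element(&pat, *str++))
            return false;
    }
    return true;
}
```

With this sketch, `sketch_match("[^aeiou]y", "happy", true)` returns true, while the same pattern against "play" returns false because the character before the final "y" is a vowel.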
Teodor Sigaev <teodor@sigaev.ru> writes:
> There is a fallback to regex if expression isn't supported by regis (see call
> of RS_isRegis() in spell.c).

Oh. So in that case, the messages Alvaro is worried about

    ereport(ERROR,
            (errcode(ERRCODE_INVALID_REGULAR_EXPRESSION),
             errmsg("invalid regis pattern: \"%s\"", str)));

aren't user-facing errors at all, and should be demoted to elog's, correct?

    elog(ERROR, "invalid regis pattern: \"%s\"", str);

regards, tom lane
> aren't user-facing errors at all, and should be demoted to elog's,
> correct?
>
> elog(ERROR, "invalid regis pattern: \"%s\"", str);

Hmm. If regis detects an error in expression then it will be an error for regex library too. At least, it was supposed to be.

--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/
Teodor Sigaev wrote:
>> aren't user-facing errors at all, and should be demoted to elog's,
>> correct?
>>
>> elog(ERROR, "invalid regis pattern: \"%s\"", str);
>
> Hmm. If regis detects an error in expression then it will be an error for
> regex library too. At least, it was supposed to be.

And those that are not, probably are not what the user intends anyway, with the pattern language being so narrow.

If all invalid regis patterns are indeed invalid regex patterns, then just changing "regis" for "regex" should be enough.

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Teodor Sigaev wrote:
>> Hmm. If regis detects an error in expression then it will be an error for
>> regex library too. At least, it was supposed to be.

> And those that are not, probably are not what the user intends anyway,
> with the pattern language being so narrow.

It looks to me like RS_isRegis() needs to be tightened up a bit anyway: it will accept "^foo" which is valid regex but not valid regis, leading to an error being thrown which is not what we want.

If we tighten it to exactly match what RS_compile() will take ... say by using the same state-machine logic ... then indeed the ereports are internal and can be demoted to elog's. If we make them elogs then ISTM they ought to keep saying regis, just so we know where to look if they ever do fail ;-)

regards, tom lane
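The tightening idea in the message above can be sketched as a small state machine that accepts only letters and bracketed letter sets, so that anything else (including "^foo") is rejected up front and can be handed to the full regex code instead. This is a minimal sketch, assuming ASCII-only single-byte input; the state names and `sketch_is_simple_pattern` are hypothetical, and this is not the committed RS_isRegis() code.

```c
#include <ctype.h>
#include <stdbool.h>

/* Hypothetical state names for this sketch. */
typedef enum
{
    ST_WAIT,                    /* outside brackets: expect a letter or '[' */
    ST_CLASS_START,             /* just saw '[': expect '^' or a first letter */
    ST_CLASS_BODY               /* inside brackets: expect letters or ']' */
} SketchState;

/*
 * Return true only if str consists of letters and bracketed letter sets
 * such as [abc] or [^abc]. Anything else is rejected, so the caller can
 * fall back to a full regex implementation.
 */
bool
sketch_is_simple_pattern(const char *str)
{
    SketchState state = ST_WAIT;
    const unsigned char *c;

    for (c = (const unsigned char *) str; *c; c++)
    {
        switch (state)
        {
            case ST_WAIT:
                if (*c == '[')
                    state = ST_CLASS_START;
                else if (!isalpha(*c))
                    return false;
                break;

            case ST_CLASS_START:
                if (*c == '^' || isalpha(*c))
                    state = ST_CLASS_BODY;
                else
                    return false;
                break;

            case ST_CLASS_BODY:
                if (*c == ']')
                    state = ST_WAIT;
                else if (!isalpha(*c))
                    return false;
                break;
        }
    }

    /* An unterminated bracket leaves us outside ST_WAIT. */
    return state == ST_WAIT;
}
```

If the pattern compiler walks the same states, a pattern that passes this check should never reach the compiler's error branches, which is the reason those errors could be treated as internal ones.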
I wrote:
> It looks to me like RS_isRegis() needs to be tightened up a bit anyway:
> it will accept "^foo" which is valid regex but not valid regis, leading
> to an error being thrown which is not what we want.

I experimented with this and verified that the error could be reached with a hacked-up affix file.

> If we tighten it to exactly match what RS_compile() will take ... say
> by using the same state-machine logic ... then indeed the ereports
> are internal and can be demoted to elog's. If we make them elogs then
> ISTM they ought to keep saying regis, just so we know where to look
> if they ever do fail ;-)

Patch committed along these lines.

regards, tom lane
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Additionally, I would like to apply the attached patch. Are there
> objections?

So far I think you only applied one half of that?

regards, tom lane