Re: tsearch filenames unlikes special symbols and numbers - Mailing list pgsql-hackers

From Oleg Bartunov
Subject Re: tsearch filenames unlikes special symbols and numbers
Date
Msg-id Pine.LNX.4.64.0709090817530.2767@sn.sai.msu.ru
In response to Re: tsearch filenames unlikes special symbols and numbers  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: tsearch filenames unlikes special symbols and numbers
Re: tsearch filenames unlikes special symbols and numbers
List pgsql-hackers

On Sun, 2 Sep 2007, Tom Lane wrote:

> Gregory Stark <stark@enterprisedb.com> writes:
>> "Tom Lane" <tgl@sss.pgh.pa.us> writes:
>>> I made it reject all but latin letters, which is the same restriction
>>> that's in place for timezone set filenames.  That might be overly
>>> strong, but we definitely have to forbid "." and "/" (and "\" on
>>> Windows).  Do we want to restrict it to letters, digits, underscore?
>>> Or does it need to be weaker than that?
>
>> What's the problem with "."?
>
> ../../../../etc/passwd
>
> Possibly we could allow '.' as long as we forbade /, but the other
> trouble with allowing . is that it encourages people to try to specify
> the filetype suffix (as indeed Oleg was doing).  I'd prefer to keep the
> suffixes out of the SQL object definitions, with an eye to possibly
> someday migrating all the configuration data inside the database.
> There's a reasonable argument for restricting the names used for these
> things in the SQL definitions to be valid SQL identifiers, so that that
> will work nicely...

So, what's the current policy? Still a-z, A-Z? I think we should allow
'.' and forbid '/'. Look how ugly our current ispell setup is: it
depends on three files, a stop word list, a .dict file and an .aff file.

Right now, I can use something like

    CREATE TEXT SEARCH DICTIONARY en_ispell (
        TEMPLATE = ispell,
        DictFile = englishDict,
        AffFile = englishAff,
        StopWords = english
    );
 

I'd rather use english.dict, english.aff, english.stop, which is what any
user would expect, without dictating to the user here. We have already
imposed a lot of restrictions.

I hope we won't require special extensions like .dict and .aff, since it's
unknown in advance which files other dictionaries will use.
If we allow '.' without '/', then we'd be happy.
I'd remove the requirement for an extension on the stop word list, which
looks rather artificial to me.
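
To make the proposal concrete, here is a minimal sketch of a check that allows '.' but forbids '/' (and '\'). It is written in Python for illustration only and is not the PostgreSQL implementation; the function name `is_acceptable_name` and the exact character set (letters, digits, underscore, dot) are assumptions, not a policy anyone in this thread has agreed on.

```python
import re

# Assumed policy for illustration: letters, digits, underscore and '.'.
# '/' and '\' are not in the set, so no path separator can get through.
_ALLOWED = re.compile(r"[A-Za-z0-9_.]+")


def is_acceptable_name(name: str) -> bool:
    """Return True if `name` looks safe to use as a bare file name."""
    # fullmatch rejects the empty string and any character outside the set.
    if not _ALLOWED.fullmatch(name):
        return False
    # "." and ".." contain no separator but still name directories.
    if name in (".", ".."):
        return False
    return True
```

With such a rule, `english.dict` would pass, while `../../../../etc/passwd` would be rejected because it contains '/'.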

Oh my god, I see we dictate extensions!

    STATEMENT:  CREATE TEXT SEARCH DICTIONARY en_ispell (
        TEMPLATE = ispell,
        DictFile = englishDict,
        AffFile = englishAff,
        StopWords = englishStop
    );

    ERROR:  could not open dictionary file "/usr/local/pgsql-dev/share/tsearch_data/englishdict.dict": No such file or directory

Folks, this is too much! Now we dictate the extensions '.dict', '.affix'
and '.stop'. What else?

Is this defined by the ispell template only, or is it a global requirement?
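
For reference, the path in the error message above is consistent with a lookup along the following lines. This is only a Python sketch of the observed behaviour, not the actual server code: the helper name `dictionary_path` is hypothetical, and the lowercasing step is an assumption based on "englishDict" appearing as "englishdict" in the message (presumably the usual folding of unquoted SQL identifiers).

```python
import posixpath


def dictionary_path(share_dir: str, name: str, extension: str) -> str:
    """Build <share_dir>/tsearch_data/<name>.<extension> (illustrative only)."""
    # Assumption: the name is lowercased before use, as the error suggests.
    file_name = f"{name.lower()}.{extension}"
    return posixpath.join(share_dir, "tsearch_data", file_name)
```

Called with the share directory from the error message, the name `englishDict` and the extension `dict`, this reproduces the path the server tried to open.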
    Regards,
        Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

