http://www.sigaev.ru/misc/tsearch_core-0.52.gz
The plan was:
1) rename FULLTEXT to TEXT SEARCH in SQL command
done
2) rework the Snowball stemmers as Tom suggested
done
3) ALTER FULLTEXT CONFIGURATION cfgname ADD/ALTER/DROP MAPPING
done
4) remove support for a default configuration per schema. There will be only one default configuration per locale.
done
5) single-encoding files. This touches the snowball, ispell, synonym, thesaurus and simple dictionaries
done
6) use encoding names instead of locale names in the configuration
Ugh. I missed that knowing the encoding doesn't allow us to determine the exact
language: how many languages use an ISO8859-1 locale? So, it's not done. Tom
pointed out that a locale's name isn't portable, but there aren't many names for
the same locale (ru_RU.UTF-8 and ru_RU.UTF8, for example). So it's possible to
use an array of locales instead of a single name.
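
To illustrate the array-of-locales idea, here is a minimal sketch in Python
(not the patch's code). The configuration names, the locale spellings and the
function default_config_for are all hypothetical; the only point is that each
configuration carries every known spelling of its locale, so a lookup
succeeds whichever spelling the platform reports.

```python
# Hypothetical sketch: each configuration lists every spelling of its locale.
# The configuration names and locale arrays below are illustrative only.
CONFIG_LOCALES = {
    "russian_utf8": ["ru_RU.UTF-8", "ru_RU.UTF8"],
    "english_latin1": ["en_US.ISO8859-1", "en_US.ISO-8859-1"],
}


def default_config_for(locale_name):
    """Return the configuration whose locale array contains locale_name."""
    for config, locales in CONFIG_LOCALES.items():
        if locale_name in locales:
            return config
    return None  # no default configuration is registered for this locale
```

With this layout, both ru_RU.UTF-8 and ru_RU.UTF8 resolve to the same
configuration, and an unknown locale simply has no default.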
I didn't see any comments about the security hole pointed out by Tom, so I repeat:
About the security holes in PARSER/DICTIONARY. I see the following ways to resolve them now:
1) Allow only the superuser to do CREATE/ALTER/DROP PARSER/DICTIONARY
   Disadvantage: hosting users will not be able to change dictionaries
2) Remove CREATE/ALTER/DROP PARSER, split pg_ts_dict into pg_ts_dict_template
   and pg_ts_dict, and change CREATE/ALTER/DROP DICTIONARY accordingly
   Disadvantage: parsers and dictionary templates will not be dumped/restored;
   they would have to be restored manually (just an INSERT into
   pg_ts_parser/pg_ts_dict_template)
3) Similar to the previous point, but:
   * CREATE/ALTER/DROP PARSER - superuser only
   * CREATE/ALTER/DROP DICTIONARY TEMPLATE - superuser only
   * CREATE/ALTER/DROP DICTIONARY - allowed to non-superusers
   Disadvantage: a new command, CREATE/ALTER/DROP DICTIONARY TEMPLATE
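
The privilege split in 3) can be modeled roughly as follows. This is only an
illustrative Python sketch of the rule, not code from the patch; the function
may_run and the two sets are hypothetical names.

```python
# Hypothetical model of option 3; it does not reflect the patch's actual code.
# Object types whose CREATE/ALTER/DROP would be restricted to the superuser.
SUPERUSER_ONLY = {"PARSER", "DICTIONARY TEMPLATE"}
# Object types whose CREATE/ALTER/DROP would be open to ordinary users.
ANY_USER = {"DICTIONARY"}


def may_run(object_type, is_superuser):
    """Return True if CREATE/ALTER/DROP on object_type would be permitted."""
    if object_type in SUPERUSER_ONLY:
        return is_superuser
    if object_type in ANY_USER:
        return True
    raise ValueError("unknown object type: %r" % (object_type,))
```

So a hosting user could still create a dictionary from an existing template,
while parsers and templates stay under superuser control.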
Which way do we choose? Or am I missing some variant?
I would like to go with option 3). Comments?
--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/