Thread: tsearch core path, v0.58

tsearch core path, v0.58

From
Teodor Sigaev
Date:
http://www.sigaev.ru/misc/tsearch_core-0.58.gz

Changes since 0.52 version:

1) Introduce dictionary templates, which contain only the methods of a
dictionary and can be managed only by a superuser.
CREATE TEXT SEARCH DICTIONARY dictname
     TEMPLATE  dicttmplname
     [OPTION  opt_text ]
;

CREATE TEXT SEARCH DICTIONARY TEMPLATE dicttmplname
     LEXIZE  lexize_function
     [INIT  init_function ]
;

DROP  TEXT SEARCH DICTIONARY TEMPLATE [IF EXISTS] dicttmplname  [CASCADE]
ALTER TEXT SEARCH DICTIONARY TEMPLATE dicttmplname RENAME TO newname;

psql has a \dFt command that operates on templates.

2) Parsers and dictionary templates can be managed only by a superuser (for the
security reasons pointed out by Tom). So they have no owner columns, and the
ALTER .. PARSER .. OWNER TO command is removed.

4) As Bruce suggested, the GUC variable tsearch_conf_name is renamed to
default_text_search_config, and the trigger tsearch is renamed to
tsvector_update_trigger.

5) Remove the cfglocale and cfgdefault columns from configuration. So CREATE/ALTER ..
CONFIGURATION no longer has the AS DEFAULT and LOCALE options. Instead, initdb
tries to find a suitable configuration name for the selected locale, or uses the
one given by the -T, --text-search-config=CFG switch.

6) pg_dump and psql are changed accordingly.
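
As a rough illustration of item 5, the configuration selection could work along
these lines. This is a Python sketch, not code from the patch: the mapping
table, the function name pick_text_search_config, and the fallback to a
configuration named "simple" are all assumptions made for the example.

```python
# Hypothetical sketch of choosing a text search configuration at initdb time.
# The table contents, the function name, and the "simple" fallback are
# illustrative assumptions, not taken from the patch.
CONFIG_BY_LANGUAGE = {
    "en": "english",
    "ru": "russian",
    "de": "german",
}


def pick_text_search_config(locale, override=None):
    """Return a configuration name for the given locale string."""
    # An explicit choice (the -T / --text-search-config=CFG switch) wins.
    if override:
        return override
    # Reduce "ru_RU.UTF-8" or "de_DE@euro" to the language part: "ru", "de".
    language = locale.split(".")[0].split("@")[0].split("_")[0].lower()
    # Locales without a known language (for example "C") use the fallback.
    return CONFIG_BY_LANGUAGE.get(language, "simple")
```

With this table, pick_text_search_config("ru_RU.UTF-8") selects "russian",
while an unmatched locale such as "C" falls back to "simple".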


--
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
                                                    WWW: http://www.sigaev.ru/

Re: tsearch core path, v0.58

From
Bruce Momjian
Date:
Your patch has been added to the PostgreSQL unapplied patches list at:

    http://momjian.postgresql.org/cgi-bin/pgpatches

It will be applied as soon as one of the PostgreSQL committers reviews
and approves it.


--
  Bruce Momjian  <bruce@momjian.us>          http://momjian.us
  EnterpriseDB                               http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

Re: tsearch core path, v0.58

From
Bruce Momjian
Date:
I did my first minimal review of this patch.  First, it is massive: a
26k-line diff for the new commands and functionality, and 31k lines for
adding the snowball stemmer.  I am glad Oleg and Teodor wrote this because
they have been around this code for a while and are available to fix any
problems we find.

The patch looks well structured.  The majority is just standard glue to
add new commands, like CREATE/DROP, grammar, system catalogs, pg_dump,
cache entries, regression tests, etc.


--
  Bruce Momjian  <bruce@momjian.us>          http://momjian.us
  EnterpriseDB                               http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

Re: tsearch core path, v0.58

From
Bruce Momjian
Date:
Teodor Sigaev wrote:
> http://www.sigaev.ru/misc/tsearch_core-0.58.gz
>
> Changes since 0.52 version:
>
> 1) Introduce dictionary's template which contains only methods of dictionary and
> can be managed only by superuser.
> CREATE TEXT SEARCH DICTIONARY dictname
>      TEMPLATE  dicttmplname
>      [OPTION  opt_text ]
> ;
>
> CREATE TEXT SEARCH DICTIONARY TEMPLATE dicttmplname
>      LEXIZE  lexize_function
>      [INIT  init_function ]
> ;

I am finding the above syntax confusing.  If TEMPLATE appears before the
dictionary name, it is a template, but after, it is using a template.
Can we use a different word instead of TEMPLATE, and have a USING clause
to reference the template?

--
  Bruce Momjian  <bruce@momjian.us>          http://momjian.us
  EnterpriseDB                               http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

Re: tsearch core path, v0.58

From
Tom Lane
Date:
Teodor Sigaev <teodor@sigaev.ru> writes:
> http://www.sigaev.ru/misc/tsearch_core-0.58.gz

What is src/backend/utils/tsearch/dict_ispell/parse.h ?

Well, I know what it *is*: it's bison output.  The question is what is
it doing here?  It doesn't seem to be used, and if it is used then I
do not see the bison grammar file it's made from.

BTW, I would like to shorten some of the path names in this fileset.
Is there a reason not to combine src/backend/utils/adt/tsearch,
src/backend/utils/tsearch, and src/backend/utils/tsearch/dict_ispell
into one place, perhaps src/backend/tsearch?

            regards, tom lane

Re: tsearch core path, v0.58

From
Gregory Stark
Date:
"Tom Lane" <tgl@sss.pgh.pa.us> writes:

> BTW, I would like to shorten some of the path names in this fileset.
> Is there a reason not to combine src/backend/utils/adt/tsearch,
> src/backend/utils/tsearch, and src/backend/utils/tsearch/dict_ispell
> into one place, perhaps src/backend/tsearch?

I for one would like to see all the data types in a file in adt. Even if it's
just a single file containing mostly just glue functions into code in
src/backend/tsearch.

--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com

Re: tsearch core path, v0.58

From
Tom Lane
Date:
Gregory Stark <stark@enterprisedb.com> writes:
> "Tom Lane" <tgl@sss.pgh.pa.us> writes:
>> BTW, I would like to shorten some of the path names in this fileset.
>> Is there a reason not to combine src/backend/utils/adt/tsearch,
>> src/backend/utils/tsearch, and src/backend/utils/tsearch/dict_ispell
>> into one place, perhaps src/backend/tsearch?

> I for one would like to see all the data types in a file in adt. Even if it's
> just a single file containing mostly just glue functions into code in
> src/backend/tsearch.

The existing file layout doesn't meet that expectation either, since all
the functions were for some reason dumped into a *subdirectory* of adt.
That was where my too-weird, too-much-typing reflex kicked in ...

            regards, tom lane

Re: tsearch core path, v0.58

From
Teodor Sigaev
Date:

Tom Lane wrote:
> Teodor Sigaev <teodor@sigaev.ru> writes:
>> http://www.sigaev.ru/misc/tsearch_core-0.58.gz
>
> What is src/backend/utils/tsearch/dict_ispell/parse.h ?
Oops, that is an unused file left over from some experiments. It will be removed
from the patch.

>
> Well, I know what it *is*: it's bison output.  The question is what is
> it doing here?  It doesn't seem to be used, and if it is used then I
> do not see the bison grammar file it's made from.
>
> BTW, I would like to shorten some of the path names in this fileset.
> Is there a reason not to combine src/backend/utils/adt/tsearch,
> src/backend/utils/tsearch, and src/backend/utils/tsearch/dict_ispell
> into one place, perhaps src/backend/tsearch?

There is no strong reason: src/backend/utils/tsearch contains functions for
processing text, and they are rather complex.


--
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
                                                    WWW: http://www.sigaev.ru/