Thread: Questions about TSearch2 and PG 8.2

Questions about TSearch2 and PG 8.2

From
"Markus Wollny"
Date:
Hello!

I'm in the process of upgrading our PostgreSQL 8.1 databases to PostgreSQL 8.2.4. I have stumbled over a minor issue
withthe upgrade and some helpful suggestions here: 
http://people.planetpostgresql.org/xzilla/index.php?/archives/291-The-pain-that-is-tsearch2-8.1-8.2-upgrading.html

I shall try tonight with an plain SQL dump, but as some of my DBs are quite large, I usually use the custom dump
format.As I would like to move the tsearch2-stuff in ist own schema as suggested, I tried using a restore list. I'd
liketo report that everything works as expected, but I've got a slight problem with the custom schema part. I created
thetarget-db, created a schema tsearch2 and installed the tsearch2-functions, operators, configuration and whatnot into
thisnew schema. Then I edited the restore list so that the tsearch2-bits would not be created from the dump file again.
However,the binary-dump tries to create the textindex-columns with a tsvector-type which explicitly references the
publicschema.  

Instead of

CREATE TABLE someschema.article
(
id integer,
mytext text,
idxfti tsvector
);

it tries to create the table like this

CREATE TABLE someschema.article
(
id integer,
mytext text,
idxfti public.tsvector
);

As the tsvector-type is defined in the tsearch2-schema, this is bound to fail, even with the search_path set to include
thetsearch2-schema. I assume that this happens because the table article is not in the same schema as the original
tsvector-typeand the default search_path is being ignored on the dump in order to be on the safe side. This
"double-checking"breaks the migration in my case, however, so is there some way that would allow me to change the table
definitionon restore from using just tsvector instead of the explicit public.tsvector? I already tried editing the
binarydump, but that just resulted in a corrupted dump-file. I there's no other way, I'll go the plain dump route, of
course,but I'd just like to check this issue. 

My second question concerns the new Gin (Generalized Inverted Index) index type. Is it stable enough for production yet
andwould it yield a high enough performance gain in comparison the GiST? Does it make much sense using a Gin-index
alongsidethe GiST-one? Would we need to change anything in the application code in order to make use of Gin - like
using 

where idxfti @> to_tsquery('default_german', 'Fundstück')

instead of

where idxfti @@ to_tsquery('default_german', 'Fundstück')

? The docs here http://www.sai.msu.su/~megera/wiki/Gin are still a bit thin, so any hint to some further examples would
begreatly appreciated. 

Kind regards

   Markus


Computec Media AG
Sitz der Gesellschaft und Registergericht: Fürth (HRB 8818)
Vorstandsmitglieder: Johannes S. Gözalan (Vorsitzender) und Niels Herrmann
Vorsitzender des Aufsichtsrates: Jürg Marquard
Umsatzsteuer-Identifikationsnummer: DE 812 575 276



Re: Questions about TSearch2 and PG 8.2

From
Oleg Bartunov
Date:
On Mon, 30 Apr 2007, Markus Wollny wrote:

> Hello!
>
> I'm in the process of upgrading our PostgreSQL 8.1 databases to PostgreSQL 8.2.4. I have stumbled over a minor issue
withthe upgrade and some helpful suggestions here: 
> http://people.planetpostgresql.org/xzilla/index.php?/archives/291-The-pain-that-is-tsearch2-8.1-8.2-upgrading.html
>
> I shall try tonight with an plain SQL dump, but as some of my DBs are quite large, I usually use the custom dump
format.As I would like to move the tsearch2-stuff in ist own schema as suggested, I tried using a restore list. I'd
liketo report that everything works as expected, but I've got a slight problem with the custom schema part. I created
thetarget-db, created a schema tsearch2 and installed the tsearch2-functions, operators, configuration and whatnot into
thisnew schema. Then I edited the restore list so that the tsearch2-bits would not be created from the dump file again.
However,the binary-dump tries to create the textindex-columns with a tsvector-type which explicitly references the
publicschema. 
>
> Instead of
>
> CREATE TABLE someschema.article
> (
> id integer,
> mytext text,
> idxfti tsvector
> );
>
> it tries to create the table like this
>
> CREATE TABLE someschema.article
> (
> id integer,
> mytext text,
> idxfti public.tsvector
> );
>
> As the tsvector-type is defined in the tsearch2-schema, this is bound to fail, even with the search_path set to
includethe tsearch2-schema. I assume that this happens because the table article is not in the same schema as the
originaltsvector-type and the default search_path is being ignored on the dump in order to be on the safe side. This
"double-checking"breaks the migration in my case, however, so is there some way that would allow me to change the table
definitionon restore from using just tsvector instead of the explicit public.tsvector? I already tried editing the
binarydump, but that just resulted in a corrupted dump-file. I there's no other way, I'll go the plain dump route, of
course,but I'd just like to check this issue. 

I think you need plain sql dump.

>
> My second question concerns the new Gin (Generalized Inverted Index) index type. Is it stable enough for production
yetand would it yield a high enough performance gain in comparison the GiST? Does it make much sense using a Gin-index
alongsidethe GiST-one? Would we need to change anything in the application code in order to make use of Gin - like
using
>
> where idxfti @> to_tsquery('default_german', 'Fundst?ck')
>
> instead of
>
> where idxfti @@ to_tsquery('default_german', 'Fundst?ck')

No, use @@ operator with Gin as well, you may need @@@ operator for
Gin, if you use weights in tsquery.  btw, read
http://www.sai.msu.su/~megera/postgres/fts/doc for information about
Gin+Gist index. It's true, that the best combination is
GiST for online stuff and Gin for archived. Also, if you read russian
there are some papers available http://www.sai.msu.su/~megera/postgres/talks/

>
> ? The docs here http://www.sai.msu.su/~megera/wiki/Gin are still a bit thin, so any hint to some further examples
wouldbe greatly appreciated. 

See above
>
> Kind regards
>
>   Markus
>
>
> Computec Media AG
> Sitz der Gesellschaft und Registergericht: F?rth (HRB 8818)
> Vorstandsmitglieder: Johannes S. G?zalan (Vorsitzender) und Niels Herrmann
> Vorsitzender des Aufsichtsrates: J?rg Marquard
> Umsatzsteuer-Identifikationsnummer: DE 812 575 276
>
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Have you searched our list archives?
>
>               http://archives.postgresql.org/
>

     Regards,
         Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

Re: Questions about TSearch2 and PG 8.2

From
Robert Treat
Date:
On Monday 30 April 2007 13:20, Markus Wollny wrote:
> Hello!
>
> I'm in the process of upgrading our PostgreSQL 8.1 databases to PostgreSQL
> 8.2.4. I have stumbled over a minor issue with the upgrade and some helpful
> suggestions here:
> http://people.planetpostgresql.org/xzilla/index.php?/archives/291-The-pain-
>that-is-tsearch2-8.1-8.2-upgrading.html
>
> I shall try tonight with an plain SQL dump, but as some of my DBs are quite
> large, I usually use the custom dump format. As I would like to move the
> tsearch2-stuff in ist own schema as suggested, I tried using a restore
> list. I'd like to report that everything works as expected, but I've got a
> slight problem with the custom schema part. I created the target-db,
> created a schema tsearch2 and installed the tsearch2-functions, operators,
> configuration and whatnot into this new schema. Then I edited the restore
> list so that the tsearch2-bits would not be created from the dump file
> again. However, the binary-dump tries to create the textindex-columns with
> a tsvector-type which explicitly references the public schema.
>
> Instead of
>
> CREATE TABLE someschema.article
> (
> id integer,
> mytext text,
> idxfti tsvector
> );
>
> it tries to create the table like this
>
> CREATE TABLE someschema.article
> (
> id integer,
> mytext text,
> idxfti public.tsvector
> );
>
> As the tsvector-type is defined in the tsearch2-schema, this is bound to
> fail, even with the search_path set to include the tsearch2-schema. I
> assume that this happens because the table article is not in the same
> schema as the original tsvector-type and the default search_path is being
> ignored on the dump in order to be on the safe side. This "double-checking"
> breaks the migration in my case, however, so is there some way that would
> allow me to change the table definition on restore from using just tsvector
> instead of the explicit public.tsvector? I already tried editing the binary
> dump, but that just resulted in a corrupted dump-file. I there's no other
> way, I'll go the plain dump route, of course, but I'd just like to check
> this issue.
>

I think you want to split it into 2 parts... do a schema only dump in plain
sql to get the schema loaded, then use the custom format to pull the data
over.

> My second question concerns the new Gin (Generalized Inverted Index) index
> type. Is it stable enough for production yet and would it yield a high
> enough performance gain in comparison the GiST? Does it make much sense
> using a Gin-index alongside the GiST-one? Would we need to change anything
> in the application code in order to make use of Gin - like using
>
> where idxfti @> to_tsquery('default_german', 'Fundstück')
>
> instead of
>
> where idxfti @@ to_tsquery('default_german', 'Fundstück')
>
> ? The docs here http://www.sai.msu.su/~megera/wiki/Gin are still a bit
> thin, so any hint to some further examples would be greatly appreciated.
>

Gin is certainly stable enough for production use, we're using it on some
projects at OmniTI at least.

I've got some gin/gist information up as well at
http://people.planetpostgresql.org/xzilla/index.php?/archives/278-PostgreSQL-full-text-search-testing.html,
also be sure check the trackback links... and also I think magnus had a post
on gin/gist wrt the postgresql.org website.

--
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL