Re: Tsearch2 really slower than ilike ? - Mailing list pgsql-performance

From Hervé Piedvache
Subject Re: Tsearch2 really slower than ilike ?
Date
Msg-id 200411171816.10794.herve@elma.fr
Whole thread Raw
In response to Re: Tsearch2 really slower than ilike ?  (Oleg Bartunov <oleg@sai.msu.su>)
Responses Re: Tsearch2 really slower than ilike ?
List pgsql-performance
Oleg,

Sorry but when I do your request I get :
# select id_site from site where idx_site_name @@  'livejourn';
ERROR:  type " " does not exist

What is this ?

(private: I don't know what happend with my mail, but I do nothing special to
disturb the contains when I'm writting to you ...)

Le Mardi 16 Novembre 2004 22:13, Oleg Bartunov a écrit :
> ok, I downloaded dump of table and here is what I found:
>
> zz=# select count(*) from tt;
>   count
> --------
>   183956
> (1 row)
>
> zz=# select * from stat('select tt from tt') order by ndoc desc, nentry
> desc,wo
> rd limit 10;
>       word     | ndoc  | nentry
> --------------+-------+--------
>   blog         | 12710 |  12835
>   weblog       |  4857 |   4859
>   news         |  4402 |   4594
>   life         |  4136 |   4160
>   world        |  1980 |   1986
>   journal      |  1882 |   1883
>   livejourn    |  1737 |   1737
>   thought      |  1669 |   1677
>   web          |  1154 |   1161
>   scotsman.com |  1138 |   1138
> (10 rows)
>
> zz=# explain analyze select tt from tt where tt @@  'blog';
>                                                        QUERY PLAN
> ---------------------------------------------------------------------------
>------------------------------------------- Index Scan using tt_idx on tt
> (cost=0.00..728.83 rows=184 width=32) (actual time=0.047..141.110
> rows=12710 loops=1) Index Cond: (tt @@ '\'blog\''::tsquery)
>     Filter: (tt @@ '\'blog\''::tsquery)
>   Total runtime: 154.105 ms
> (4 rows)
>
> It's really fast ! So, I don't understand your problem.
> I run query on my desktop machine, nothing special.
>
>
>      Oleg
>
> On Tue, 16 Nov 2004, [iso-8859-15] Herv? Piedvache wrote:
> > Hi,
> >
> > I'm completly dispointed with Tsearch2 ...
> >
> > I have a table like this :
> >                                          Table "public.site"
> >    Column     |            Type             |
> > Modifiers
> > ---------------+-----------------------------+---------------------------
> >------------------------------------ id_site       | integer
> >       | not null default
> > nextval('public.site_id_site_seq'::text)
> > site_name     | text                        |
> > site_url      | text                        |
> > url            | text                        |
> > language      | text                        |
> > datecrea      | date                        | default now()
> > id_category   | integer                     |
> > time_refresh  | integer                     |
> > active        | integer                     |
> > error         | integer                     |
> > description   | text                        |
> > version       | text                        |
> > idx_site_name | tsvector                    |
> > lastcheck     | date                        |
> > lastupdate    | timestamp without time zone |
> > Indexes:
> >    "site_id_site_key" unique, btree (id_site)
> >    "ix_idx_site_name" gist (idx_site_name)
> > Triggers:
> >    tsvectorupdate_site_name BEFORE INSERT OR UPDATE ON site FOR EACH ROW
> > EXECUTE PROCEDURE tsearch2('idx_site_name', 'site_name')
> >
> > I have 183 956 records in the database ...
> >
> > SELECT s.site_name, s.id_site, s.description, s.site_url,
> >                case when exists (select id_user
> >                                                    from user_choice u
> >                                                 where u.id_site=s.id_site
> >                                                    and u.id_user = 1)
> > then 1 else 0 end as bookmarked
> >   FROM site s
> > WHERE s.idx_site_name @@ to_tsquery('atari');
> >
> > Explain Analyze :
> >                                                               QUERY PLAN
> > -------------------------------------------------------------------------
> >----------------------------------------------------------------- Index
> > Scan using ix_idx_site_name on site s  (cost=0.00..1202.12 rows=184
> > width=158) (actual time=4687.674..4698.422 rows=1 loops=1)
> >   Index Cond: (idx_site_name @@ '\'atari\''::tsquery)
> >   Filter: (idx_site_name @@ '\'atari\''::tsquery)
> >   SubPlan
> >     ->  Seq Scan on user_choice u  (cost=0.00..3.46 rows=1 width=4)
> > (actual time=0.232..0.232 rows=0 loops=1)
> >           Filter: ((id_site = $0) AND (id_user = 1))
> > Total runtime: 4698.608 ms
> >
> > First time I run the request I have a result in about 28 seconds.
> >
> > SELECT s.site_name, s.id_site, s.description, s.site_url,
> >                case when exists (select id_user
> >                                                    from user_choice u
> >                                                 where u.id_site=s.id_site
> >                                                    and u.id_user = 1)
> > then 1 else 0 end as bookmarked
> >   FROM site_rss s
> > WHERE s.site_name ilike '%atari%'
> >
> >                                                   QUERY PLAN
> > -------------------------------------------------------------------------
> >--------------------------------------- Seq Scan on site_rss s
> > (cost=0.00..11863.16 rows=295 width=158) (actual time=17.414..791.937
> > rows=12 loops=1)
> >   Filter: (site_name ~~* '%atari%'::text)
> >   SubPlan
> >     ->  Seq Scan on user_choice u  (cost=0.00..3.46 rows=1 width=4)
> > (actual time=0.222..0.222 rows=0 loops=12)
> >           Filter: ((id_site = $0) AND (id_user = 1))
> > Total runtime: 792.099 ms
> >
> > First time I run the request I have a result in about 789 miliseconds
> > !!???
> >
> > I'm using PostgreSQL v7.4.6 with a Bi-Penitum III 933 Mhz and 1 Gb of
> > RAM.
> >
> > Any idea ... ? For the moment I'm going back to use the ilike solution
> > ... but I was really thinking that Tsearch2 could be a better solution
> > ...
> >
> > Regards,
>
>      Regards,
>          Oleg
> _____________________________________________________________
> Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
> Sternberg Astronomical Institute, Moscow University (Russia)
> Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
> phone: +007(095)939-16-83, +007(095)939-23-83
>
> ---------------------------(end of broadcast)---------------------------
> TIP 9: the planner will ignore your desire to choose an index scan if your
>       joining column's datatypes do not match

--
Hervé Piedvache

Elma Ingénierie Informatique
6 rue du Faubourg Saint-Honoré
F-75008 - Paris - France
Pho. 33-144949901
Fax. 33-144949902

pgsql-performance by date:

Previous
From: Darcy Buskermolen
Date:
Subject: Re: memcached and PostgreSQL
Next
From: Oleg Bartunov
Date:
Subject: Re: Tsearch2 really slower than ilike ?