Re: Tsearch2 really slower than ilike ? - Mailing list pgsql-performance
From | Hervé Piedvache |
---|---|
Subject | Re: Tsearch2 really slower than ilike ? |
Date | |
Msg-id | 200411171816.10794.herve@elma.fr Whole thread Raw |
In response to | Re: Tsearch2 really slower than ilike ? (Oleg Bartunov <oleg@sai.msu.su>) |
Responses |
Re: Tsearch2 really slower than ilike ?
|
List | pgsql-performance |
Oleg, Sorry but when I do your request I get : # select id_site from site where idx_site_name @@ 'livejourn'; ERROR: type " " does not exist What is this ? (private: I don't know what happend with my mail, but I do nothing special to disturb the contains when I'm writting to you ...) Le Mardi 16 Novembre 2004 22:13, Oleg Bartunov a écrit : > ok, I downloaded dump of table and here is what I found: > > zz=# select count(*) from tt; > count > -------- > 183956 > (1 row) > > zz=# select * from stat('select tt from tt') order by ndoc desc, nentry > desc,wo > rd limit 10; > word | ndoc | nentry > --------------+-------+-------- > blog | 12710 | 12835 > weblog | 4857 | 4859 > news | 4402 | 4594 > life | 4136 | 4160 > world | 1980 | 1986 > journal | 1882 | 1883 > livejourn | 1737 | 1737 > thought | 1669 | 1677 > web | 1154 | 1161 > scotsman.com | 1138 | 1138 > (10 rows) > > zz=# explain analyze select tt from tt where tt @@ 'blog'; > QUERY PLAN > --------------------------------------------------------------------------- >------------------------------------------- Index Scan using tt_idx on tt > (cost=0.00..728.83 rows=184 width=32) (actual time=0.047..141.110 > rows=12710 loops=1) Index Cond: (tt @@ '\'blog\''::tsquery) > Filter: (tt @@ '\'blog\''::tsquery) > Total runtime: 154.105 ms > (4 rows) > > It's really fast ! So, I don't understand your problem. > I run query on my desktop machine, nothing special. > > > Oleg > > On Tue, 16 Nov 2004, [iso-8859-15] Herv? Piedvache wrote: > > Hi, > > > > I'm completly dispointed with Tsearch2 ... > > > > I have a table like this : > > Table "public.site" > > Column | Type | > > Modifiers > > ---------------+-----------------------------+--------------------------- > >------------------------------------ id_site | integer > > | not null default > > nextval('public.site_id_site_seq'::text) > > site_name | text | > > site_url | text | > > url | text | > > language | text | > > datecrea | date | default now() > > id_category | integer | > > time_refresh | integer | > > active | integer | > > error | integer | > > description | text | > > version | text | > > idx_site_name | tsvector | > > lastcheck | date | > > lastupdate | timestamp without time zone | > > Indexes: > > "site_id_site_key" unique, btree (id_site) > > "ix_idx_site_name" gist (idx_site_name) > > Triggers: > > tsvectorupdate_site_name BEFORE INSERT OR UPDATE ON site FOR EACH ROW > > EXECUTE PROCEDURE tsearch2('idx_site_name', 'site_name') > > > > I have 183 956 records in the database ... > > > > SELECT s.site_name, s.id_site, s.description, s.site_url, > > case when exists (select id_user > > from user_choice u > > where u.id_site=s.id_site > > and u.id_user = 1) > > then 1 else 0 end as bookmarked > > FROM site s > > WHERE s.idx_site_name @@ to_tsquery('atari'); > > > > Explain Analyze : > > QUERY PLAN > > ------------------------------------------------------------------------- > >----------------------------------------------------------------- Index > > Scan using ix_idx_site_name on site s (cost=0.00..1202.12 rows=184 > > width=158) (actual time=4687.674..4698.422 rows=1 loops=1) > > Index Cond: (idx_site_name @@ '\'atari\''::tsquery) > > Filter: (idx_site_name @@ '\'atari\''::tsquery) > > SubPlan > > -> Seq Scan on user_choice u (cost=0.00..3.46 rows=1 width=4) > > (actual time=0.232..0.232 rows=0 loops=1) > > Filter: ((id_site = $0) AND (id_user = 1)) > > Total runtime: 4698.608 ms > > > > First time I run the request I have a result in about 28 seconds. > > > > SELECT s.site_name, s.id_site, s.description, s.site_url, > > case when exists (select id_user > > from user_choice u > > where u.id_site=s.id_site > > and u.id_user = 1) > > then 1 else 0 end as bookmarked > > FROM site_rss s > > WHERE s.site_name ilike '%atari%' > > > > QUERY PLAN > > ------------------------------------------------------------------------- > >--------------------------------------- Seq Scan on site_rss s > > (cost=0.00..11863.16 rows=295 width=158) (actual time=17.414..791.937 > > rows=12 loops=1) > > Filter: (site_name ~~* '%atari%'::text) > > SubPlan > > -> Seq Scan on user_choice u (cost=0.00..3.46 rows=1 width=4) > > (actual time=0.222..0.222 rows=0 loops=12) > > Filter: ((id_site = $0) AND (id_user = 1)) > > Total runtime: 792.099 ms > > > > First time I run the request I have a result in about 789 miliseconds > > !!??? > > > > I'm using PostgreSQL v7.4.6 with a Bi-Penitum III 933 Mhz and 1 Gb of > > RAM. > > > > Any idea ... ? For the moment I'm going back to use the ilike solution > > ... but I was really thinking that Tsearch2 could be a better solution > > ... > > > > Regards, > > Regards, > Oleg > _____________________________________________________________ > Oleg Bartunov, sci.researcher, hostmaster of AstroNet, > Sternberg Astronomical Institute, Moscow University (Russia) > Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ > phone: +007(095)939-16-83, +007(095)939-23-83 > > ---------------------------(end of broadcast)--------------------------- > TIP 9: the planner will ignore your desire to choose an index scan if your > joining column's datatypes do not match -- Hervé Piedvache Elma Ingénierie Informatique 6 rue du Faubourg Saint-Honoré F-75008 - Paris - France Pho. 33-144949901 Fax. 33-144949902
pgsql-performance by date: