Re: Full Text index is not using during OR operation - Mailing list pgsql-performance
From | Oleg Bartunov |
---|---|
Subject | Re: Full Text index is not using during OR operation |
Date | |
Msg-id | Pine.LNX.4.64.1011291801410.12632@sn.sai.msu.ru Whole thread Raw |
In response to | Re: Full Text index is not using during OR operation (AI Rumman <rummandba@gmail.com>) |
List | pgsql-performance |
On Mon, 29 Nov 2010, AI Rumman wrote: > Oh! Actualy it is: > select replace('Dhaka University of Bangladesh:*', ' ',':* & '); > No space at start. So, what are actual problems with full text ? I mostly interesting with server crush. We need test data, test query and error message. > > On Mon, Nov 29, 2010 at 6:37 PM, Oleg Bartunov <oleg@sai.msu.su> wrote: > >> What does replace(' Dhaka University of Bangladesh:*', ' ',':* & ') means ? >> I see it produces something wrong for to_tsquery: >> >> test=# select replace(' Dhaka University of Bangladesh:*', ' ',':* & '); >> >> replace >> --------------------------------------------------- >> :* & Dhaka:* & University:* & of:* & Bangladesh:* >> (1 row) >> >> Oleg >> >> >> On Mon, 29 Nov 2010, AI Rumman wrote: >> >> explain >>> SELECT crmentity.crmid, crmentity.setype, crmentity.modifiedtime, >>> activity.subject,case when ( users.user_name not like '') then >>> users.user_name else groups.groupname end as user_name, >>> activity.date_start >>> FROM crmentity INNER JOIN activity ON crmentity.crmid = >>> activity.activityid >>> and crmentity.deleted = 0 >>> LEFT JOIN activitygrouprelation ON activitygrouprelation.activityid = >>> crmentity.crmid >>> LEFT JOIN groups ON groups.groupname = activitygrouprelation.groupname >>> LEFT join users ON crmentity.smownerid= users.id >>> WHERE >>> to_tsvector(' en', for_fts( activity.subject)) @@ to_tsquery(' en', >>> replace(' Dhaka University of Bangladesh:*', ' ',':* & ')) >>> or >>> to_tsvector(' en', for_fts( crmentity.description)) @@ to_tsquery(' en', >>> replace(' Dhaka University of Bangladesh:*', ' ',':* & ')) >>> ORDER BY crmentity.modifiedtime DESC LIMIT 100 >>> >>> QUERY PLAN >>> >>> >>> ------------------------------------------------------------------------------------------------------------------------------------------ >>> Limit (cost=112724.54..112724.54 rows=1 width=99) >>> -> Sort (cost=112724.54..112724.54 rows=1 width=99) >>> Sort Key: crmentity.modifiedtime >>> -> Nested Loop Left Join (cost=0.00..112724.53 rows=1 width=99) >>> -> Nested Loop Left Join (cost=0.00..112724.24 rows=1 width=82) >>> -> Nested Loop Left Join (cost=0.00..112723.96 rows=1 width=79) >>> -> Nested Loop (cost=0.00..112723.68 rows=1 width=56) >>> Join Filter: ((to_tsvector('en'::regconfig, >>> regexp_replace((activity.subject)::text, >>> '(&[^;]+;)|(<[^>]+>)|([\\s\\r\\n\\t]+)'::text, ''::text, 'gs'::text)) @@ >>> ''' >>> Dhaka'':* & ''univers'':* & ''bangladesh'':*'::tsquery) OR >>> (to_tsvector('en'::regconfig, regexp_replace(crmentity.description, >>> '(&[^;]+;)|(<[^>]+>)|([\\s\\r\\n\\t]+)'::text, ' '::text, 'gs'::text)) @@ >>> ''' Dhaka'':* & ''univers'':* & ''bangladesh'':*'::tsquery)) >>> -> Index Scan using activity_pkey on activity (cost=0.00..10223.89 >>> rows=343070 width=36) >>> -> Index Scan using crmentity_pkey on crmentity (cost=0.00..0.27 rows=1 >>> width=151) >>> Index Cond: (crmentity.crmid = activity.activityid) >>> Filter: (crmentity.deleted = 0) >>> -> Index Scan using activitygrouprelation_activityid_idx on >>> activitygrouprelation (cost=0.00..0.27 rows=1 width=27) >>> Index Cond: (activitygrouprelation.activityid = crmentity.crmid) >>> -> Index Scan using groups_groupname_idx on groups (cost=0.00..0.27 >>> rows=1 >>> width=26) >>> Index Cond: ((groups.groupname)::text = >>> (activitygrouprelation.groupname)::text) >>> -> Index Scan using users_pkey on users (cost=0.00..0.27 rows=1 width=25) >>> Index Cond: (crmentity.smownerid = users.id) >>> >>> >>> The above query are not using fts indexes, even hang the server. >>> >>> But, >>> >>> >>> explain >>> SELECT crmentity.crmid, crmentity.setype, crmentity.modifiedtime, >>> activity.subject,case when ( users.user_name not like '') then >>> users.user_name else groups.groupname end as user_name, >>> activity.date_start >>> FROM crmentity INNER JOIN activity ON crmentity.crmid = >>> activity.activityid >>> and crmentity.deleted = 0 >>> LEFT JOIN activitygrouprelation ON activitygrouprelation.activityid = >>> crmentity.crmid >>> LEFT JOIN groups ON groups.groupname = activitygrouprelation.groupname >>> LEFT join users ON crmentity.smownerid= users.id >>> WHERE >>> to_tsvector(' en', for_fts( activity.subject)) @@ to_tsquery(' en', >>> replace(' Dhaka University of Bangladesh:*', ' ',':* & ')) >>> ORDER BY crmentity.modifiedtime DESC LIMIT 100 >>> >>> >>> ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- >>> >>> Limit (cost=1.46..1.47 rows=1 width=99) (actual time=0.824..0.824 rows=0 >>> loops=1) >>> -> Sort (cost=1.46..1.47 rows=1 width=99) (actual time=0.819..0.819 >>> rows=0 >>> loops=1) >>> Sort Key: crmentity.modifiedtime >>> Sort Method: quicksort Memory: 17kB >>> -> Nested Loop Left Join (cost=0.27..1.45 rows=1 width=99) (actual >>> time=0.752..0.752 rows=0 loops=1) >>> -> Nested Loop Left Join (cost=0.27..1.17 rows=1 width=82) (actual >>> time=0.750..0.750 rows=0 loops=1) >>> -> Nested Loop Left Join (cost=0.27..0.88 rows=1 width=79) (actual >>> time=0.748..0.748 rows=0 loops=1) >>> -> Nested Loop (cost=0.27..0.60 rows=1 width=56) (actual >>> time=0.746..0.746 >>> rows=0 loops=1) >>> -> Bitmap Heap Scan on activity (cost=0.27..0.30 rows=1 width=36) (actual >>> time=0.744..0.744 rows=0 loops=1) >>> Recheck Cond: (to_tsvector('en'::regconfig, >>> regexp_replace((subject)::text, >>> '(&[^;]+;)|(<[^>]+>)|([\\s\\r\\n\\t]+)'::text, ' '::text, >>> 'gs'::text)) @@ ''' Dhaka'':* & ''univers'':* & >>> ''bangladesh'':*'::tsquery) >>> -> Bitmap Index Scan on ftx_en_activity_subject (cost=0.00..0.27 rows=1 >>> width=0) (actual time=0.740..0.740 rows=0 loops=1) >>> Index Cond: (to_tsvector('en'::regconfig, regexp_replace((subject)::text, >>> '(&[^;]+;)|(<[^>]+>)|([\\s\\r\\n\\t]+)'::text, ' '::te >>> xt, 'gs'::text)) @@ ''' Dhaka'':* & ''univers'':* & >>> ''bangladesh'':*'::tsquery) >>> -> Index Scan using crmentity_pkey on crmentity (cost=0.00..0.29 rows=1 >>> width=24) (never executed) >>> Index Cond: (crmentity.crmid = activity.activityid) >>> Filter: (crmentity.deleted = 0) >>> -> Index Scan using activitygrouprelation_activityid_idx on >>> activitygrouprelation (cost=0.00..0.27 rows=1 width=27) (never executed) >>> Index Cond: (activitygrouprelation.activityid = crmentity.crmid) >>> -> Index Scan using groups_groupname_idx on groups (cost=0.00..0.27 >>> rows=1 >>> width=26) (never executed) >>> Index Cond: ((groups.groupname)::text = >>> (activitygrouprelation.groupname)::text) >>> -> Index Scan using users_pkey on users (cost=0.00..0.27 rows=1 width=25) >>> (never executed) >>> Index Cond: (crmentity.smownerid = users.id) >>> Total runtime: 1.188 ms >>> >>> >>> >>> >>> explain >>> SELECT crmentity.crmid, crmentity.setype, crmentity.modifiedtime, >>> activity.subject,case when ( users.user_name not like '') then >>> users.user_name else groups.groupname end as user_name, >>> activity.date_start >>> FROM crmentity INNER JOIN activity ON crmentity.crmid = >>> activity.activityid >>> and crmentity.deleted = 0 >>> LEFT JOIN activitygrouprelation ON activitygrouprelation.activityid = >>> crmentity.crmid >>> LEFT JOIN groups ON groups.groupname = activitygrouprelation.groupname >>> LEFT join users ON crmentity.smownerid= users.id >>> WHERE >>> to_tsvector(' en', for_fts( crmentity.description)) @@ to_tsquery(' en', >>> replace(' Dhaka University of Bangladesh:*', ' ',':* & ')) >>> ORDER BY crmentity.modifiedtime DESC LIMIT 100 >>> >>> QUERY PLAN >>> >>> >>> ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- >>> >>> Limit (cost=1.50..1.51 rows=1 width=99) (actual time=5.044..5.047 rows=1 >>> loops=1) >>> -> Sort (cost=1.50..1.51 rows=1 width=99) (actual time=5.041..5.042 >>> rows=1 >>> loops=1) >>> Sort Key: crmentity.modifiedtime >>> Sort Method: quicksort Memory: 17kB >>> -> Nested Loop Left Join (cost=0.27..1.49 rows=1 width=99) (actual >>> time=4.998..5.012 rows=1 loops=1) >>> -> Nested Loop Left Join (cost=0.27..1.19 rows=1 width=82) (actual >>> time=4.952..4.961 rows=1 loops=1) >>> -> Nested Loop Left Join (cost=0.27..0.90 rows=1 width=79) (actual >>> time=4.949..4.956 rows=1 loops=1) >>> -> Nested Loop (cost=0.27..0.60 rows=1 width=56) (actual >>> time=4.943..4.948 >>> rows=1 loops=1) >>> -> Bitmap Heap Scan on crmentity (cost=0.27..0.30 rows=1 width=24) >>> (actual >>> time=4.727..4.799 rows=3 loops=1) >>> Recheck Cond: (to_tsvector('en'::regconfig, regexp_replace(description, >>> '(&[^;]+;)|(<[^>]+>)|([\\s\\r\\n\\t]+)'::text, ' '::text, 'gs' >>> ::text)) @@ ''' Dhaka'':* & ''univers'':* & ''bangladesh'':*'::tsquery) >>> Filter: (deleted = 0) >>> -> Bitmap Index Scan on ftx_en_crmentity_description (cost=0.00..0.27 >>> rows=1 width=0) (actual time=4.687..4.687 rows=3 loops=1) >>> Index Cond: (to_tsvector('en'::regconfig, regexp_replace(description, >>> '(&[^;]+;)|(<[^>]+>)|([\\s\\r\\n\\t]+)'::text, ' '::text, >>> 'gs'::text)) @@ ''' Dhaka'':* & ''univers'':* & >>> ''bangladesh'':*'::tsquery) >>> -> Index Scan using activity_pkey on activity (cost=0.00..0.29 rows=1 >>> width=36) (actual time=0.043..0.043 rows=0 loops=3) >>> Index Cond: (activity.activityid = crmentity.crmid) >>> -> Index Scan using activitygrouprelation_activityid_idx on >>> activitygrouprelation (cost=0.00..0.29 rows=1 width=27) (actual >>> time=0.003..0.003 >>> rows=0 loops=1) >>> Index Cond: (activitygrouprelation.activityid = crmentity.crmid) >>> -> Index Scan using groups_groupname_idx on groups (cost=0.00..0.27 >>> rows=1 >>> width=26) (actual time=0.001..0.001 rows=0 loops=1) >>> Index Cond: ((groups.groupname)::text = >>> (activitygrouprelation.groupname)::text) >>> -> Index Scan using users_pkey on users (cost=0.00..0.29 rows=1 width=25) >>> (actual time=0.033..0.035 rows=1 loops=1) >>> Index Cond: (crmentity.smownerid = users.id) >>> Total runtime: 5.229 ms >>> (22 rows) >>> >>> >>> >>> \d crmentity >>> Table "public.crmentity" >>> Column | Type | Modifiers >>> --------------+-----------------------------+-------------------- >>> crmid | integer | not null >>> smcreatorid | integer | not null default 0 >>> smownerid | integer | not null default 0 >>> modifiedby | integer | not null default 0 >>> setype | character varying(30) | not null >>> description | text | >>> createdtime | timestamp without time zone | not null >>> modifiedtime | timestamp without time zone | not null >>> viewedtime | timestamp without time zone | >>> status | character varying(50) | >>> version | integer | not null default 0 >>> presence | integer | default 1 >>> deleted | integer | not null default 0 >>> Indexes: >>> "crmentity_pkey" PRIMARY KEY, btree (crmid) >>> "crmentity_createdtime_idx" btree (createdtime) >>> "crmentity_modifiedby_idx" btree (modifiedby) >>> "crmentity_modifiedtime_idx" btree (modifiedtime) >>> "crmentity_smcreatorid_idx" btree (smcreatorid) >>> "crmentity_smownerid_idx" btree (smownerid) >>> "ftx_en_crmentity_description" gin (to_tsvector('vcrm_en'::regconfig, >>> for_fts(description))) >>> "crmentity_deleted_idx" btree (deleted) >>> Referenced by: >>> TABLE "service" CONSTRAINT "fk_1_service" FOREIGN KEY (serviceid) >>> REFERENCES crmentity(crmid) ON DELETE CASCADE >>> TABLE "cc2crmentity" CONSTRAINT "fk_cc2crmentity_crmentity" FOREIGN KEY >>> (crm_id) REFERENCES crmentity(crmid) ON UPDATE CASCADE ON DELETE CASCADE >>> >>> >>> \d activity >>> >>> Table "public.activity" >>> Column | Type | Modifiers >>> >>> ------------------+------------------------+------------------------------------------- >>> activityid | integer | not null default 0 >>> subject | character varying(250) | not null >>> semodule | character varying(20) | >>> activitytype | character varying(200) | not null >>> date_start | date | not null >>> due_date | date | >>> time_start | character varying(50) | >>> time_end | character varying(50) | >>> sendnotification | character varying(3) | not null default '0'::character >>> varying >>> duration_hours | character varying(2) | >>> duration_minutes | character varying(200) | >>> status | character varying(200) | >>> eventstatus | character varying(200) | >>> priority | character varying(200) | >>> location | character varying(150) | >>> notime | character varying(3) | not null default '0'::character varying >>> visibility | character varying(50) | not null default 'all'::character >>> varying >>> recurringtype | character varying(200) | >>> end_date | date | >>> end_time | character varying(50) | >>> Indexes: >>> "activity_pkey" PRIMARY KEY, btree (activityid) >>> "activity_activitytype_idx" btree (activitytype) >>> "activity_date_start_idx" btree (date_start) >>> "activity_due_date_idx" btree (due_date) >>> "activity_eventstatus_idx" btree (eventstatus) >>> "activity_status_idx" btree (status) >>> "activity_subject_idx" btree (subject) >>> "activity_time_start_idx" btree (time_start) >>> "ftx_en_activity_subject" gin (to_tsvector('vcrm_en'::regconfig, >>> for_fts(subject::text))) >>> >>> >> Regards, >> Oleg >> _____________________________________________________________ >> Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru), >> Sternberg Astronomical Institute, Moscow University, Russia >> Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ >> phone: +007(495)939-16-83, +007(495)939-23-83 >> > Regards, Oleg _____________________________________________________________ Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru), Sternberg Astronomical Institute, Moscow University, Russia Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(495)939-16-83, +007(495)939-23-83
pgsql-performance by date: