tsearch2 queries faster than expected - Mailing list pgsql-general
From | Jeffrey Melloy |
---|---|
Subject | tsearch2 queries faster than expected |
Date | |
Msg-id | D459B702-B810-11D8-BB2B-000393C78AC0@visualdistortion.org Whole thread Raw |
Responses |
Re: tsearch2 queries faster than expected
|
List | pgsql-general |
I have been using tsearch2 for quite a while with a fair amount of success. The other day I was playiing around with a query, and randomly changed a few things. I noticed a 10 times speedup and didn't know why. Both queries return identical results. The idea was to do a proximity search, where one word appears within 10 minutes of the other. I'm not sure if this is a bug or something weird. Using postgres 7.4.2, and I think the 7.4.2 version of tsearch2. The two queries: Fast: explain analyze select m1.message_date, m1.message_id from messages m1, messages m2, to_tsquery('haley') q1, to_tsquery('birthday') q2 where m2.message_date between m1.message_date - '5 minutes'::interval and m1.message_date + '5 minutes'::interval and m1.idxfti @@ q1 and m2.idxfti @@ q2 and m1.message_id <> m2.message_id; QUERY PLAN ------------------------------------------------------------------------ ------------------------------------------------------------------------ --------------------------------------------------------------------- Nested Loop (cost=485403.85..549229077.83 rows=1651000057 width=12) (actual time=190.952..221.859 rows=4 loops=1) Join Filter: (("outer".message_date >= ("inner".message_date - '00:05:00'::interval)) AND ("outer".message_date <= ("inner".message_date + '00:05:00'::interval)) AND ("inner".message_id <> "outer".message_id)) -> Nested Loop (cost=0.00..484867.85 rows=121898 width=12) (actual time=0.879..33.273 rows=86 loops=1) -> Function Scan on q2 (cost=0.00..12.50 rows=1000 width=32) (actual time=0.034..0.037 rows=1 loops=1) -> Index Scan using fti_idx on messages m2 (cost=0.00..483.33 rows=122 width=44) (actual time=0.831..32.828 rows=86 loops=1) Index Cond: (m2.idxfti @@ "outer".q2) Filter: (m2.idxfti @@ "outer".q2) -> Materialize (cost=485403.85..487158.83 rows=121898 width=12) (actual time=0.189..1.477 rows=160 loops=86) -> Nested Loop (cost=0.00..484867.85 rows=121898 width=12) (actual time=16.132..110.991 rows=160 loops=1) -> Function Scan on q1 (cost=0.00..12.50 rows=1000 width=32) (actual time=0.061..0.065 rows=1 loops=1) -> Index Scan using fti_idx on messages m1 (cost=0.00..483.33 rows=122 width=44) (actual time=16.048..99.997 rows=160 loops=1) Index Cond: (m1.idxfti @@ "outer".q1) Filter: (m1.idxfti @@ "outer".q1) Total runtime: 223.481 ms (14 rows) Slow: explain analyze select m1.message_date, m1.message, m2.message_date from messages m1, messages m2 where m2.message_date BETWEEN m1.message_date - '5 minutes'::interval and m1.message_date + '5 minutes'::interval and m1.idxfti @@ to_tsquery('haley') and m2.idxfti @@ to_tsquery('birthday') and m1.message_id <> m2.message_id; QUERY PLAN ------------------------------------------------------------------------ ------------------------------------------------------------------------ --------------------------------------------------------------------- Nested Loop (cost=0.00..59784.68 rows=1654 width=56) (actual time=746.830..3132.006 rows=4 loops=1) Join Filter: (("inner".message_date >= ("outer".message_date - '00:05:00'::interval)) AND ("inner".message_date <= ("outer".message_date + '00:05:00'::interval)) AND ("outer".message_id <> "inner".message_id)) -> Index Scan using fti_idx on messages m1 (cost=0.00..483.33 rows=122 width=52) (actual time=8.770..69.013 rows=160 loops=1) Index Cond: (idxfti @@ '\'haley\''::tsquery) Filter: (idxfti @@ '\'haley\''::tsquery) -> Index Scan using fti_idx on messages m2 (cost=0.00..483.33 rows=122 width=12) (actual time=0.112..18.899 rows=86 loops=160) Index Cond: (idxfti @@ '\'birthday\''::tsquery) Filter: (idxfti @@ '\'birthday\''::tsquery) Total runtime: 3132.665 ms (9 rows)
pgsql-general by date: