Thread: Understanding tsearch2 performance
Here's a query and its EXPLAIN ANALYZE output:

cms=> select count(*) from forum;
 count
-------
 90675
(1 row)

cms=> explain analyze select id,title from forum where _fts_ @@ 'fer'::tsquery;
                                                       QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on forum  (cost=29.21..361.21 rows=91 width=35) (actual time=2.946..63.646 rows=8449 loops=1)
   Recheck Cond: (_fts_ @@ '''fer'''::tsquery)
   ->  Bitmap Index Scan on forum_fts  (cost=0.00..29.19 rows=91 width=0) (actual time=2.119..2.119 rows=8449 loops=1)
         Index Cond: (_fts_ @@ '''fer'''::tsquery)
 Total runtime: 113.641 ms
(5 rows)

The problem is that tsearch2 seems too slow. I have nothing to compare it to, but 113 ms for searching this small table of 90,000 records seems too slow. The forum_fts index is of GIN type and the table certainly fits into RAM.

When I issue a dumb query without an index, I get performance within the same order of magnitude:

cms=> explain analyze select id,title from forum where content ilike '%fer%';
                                                 QUERY PLAN
------------------------------------------------------------------------------------------------------------
 Seq Scan on forum  (cost=0.00..7307.44 rows=3395 width=35) (actual time=0.030..798.375 rows=10896 loops=1)
   Filter: (content ~~* '%fer%'::text)
 Total runtime: 864.384 ms
(3 rows)

cms=> explain analyze select id,title from forum where content like '%fer%';
                                                 QUERY PLAN
-----------------------------------------------------------------------------------------------------------
 Seq Scan on forum  (cost=0.00..7307.44 rows=3395 width=35) (actual time=0.024..146.959 rows=7596 loops=1)
   Filter: (content ~~ '%fer%'::text)
 Total runtime: 191.732 ms
(3 rows)

Some peculiarities of the setup which might or might not influence this performance:

1) I'm using an ICU-patched PostgreSQL because I cannot use my UTF-8 locale otherwise - this is why the difference between the two dumb queries is large (but I don't see how this can influence tsearch2, since it pre-builds the tsvector data with lowercase lexemes).

2) My tsearch2 lexer is somewhat slow (but I don't see how it can influence these read-only queries on pre-built, lexed and indexed data).

Any ideas?
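(Editorial note: the thread never shows the schema. A setup along the following lines would produce the plans above - the column name _fts_ and index name forum_fts are taken from the EXPLAIN output, but the text search configuration and source columns are guesses:)

```sql
-- Hypothetical reconstruction of the setup implied by the plans above.
-- '_fts_' and 'forum_fts' come from the EXPLAIN output; the 'simple'
-- configuration and the choice of source columns are assumptions.
ALTER TABLE forum ADD COLUMN _fts_ tsvector;
UPDATE forum SET _fts_ = to_tsvector('simple', title || ' ' || content);
CREATE INDEX forum_fts ON forum USING gin (_fts_);
```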
Something is not good with the statistics: 91 rows estimated vs. 8449 actually returned. Returning 8449 rows could take quite long.

Oleg

On Wed, 14 Jul 2010, Ivan Voras wrote:
> cms=> explain analyze select id,title from forum where _fts_ @@
> 'fer'::tsquery;
>  Bitmap Heap Scan on forum  (cost=29.21..361.21 rows=91 width=35)
>  (actual time=2.946..63.646 rows=8449 loops=1)
> [...]
> The problem is that tsearch2 seems too slow. I have nothing to compare
> it to, but 113 ms for searching this small table of 90,000 records
> seems too slow. The forum_fts index is of GIN type and the table
> certainly fits into RAM.
> [...]
> Any ideas?

_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83
On 07/14/10 14:31, Oleg Bartunov wrote:
> Something is not good with statistics, 91 est. vs 8449 actually returned.

I don't think the statistics difference is significant - it's actually using the index, so it's ok. And I've run vacuum analyze just before starting the query.

> Returning 8449 rows could be quite long.

You are right, I didn't test this. Issuing a query which returns a smaller result set is much faster.

But, off-topic: why would returning 8500 records, each around 100 bytes long - so around 0.85 MB - over local Unix sockets be so slow? The machine in question has a sustained memory bandwidth of nearly 10 GB/s. Does PostgreSQL spend much time marshalling the data through the socket stream?
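(Editorial aside: a quick back-of-the-envelope check of the bandwidth argument above, using the figures from this message:)

```python
# Rough check: how long should ~8500 rows of ~100 bytes take to copy at
# ~10 GB/s sustained memory bandwidth? (Figures from the message above.)
rows = 8500
bytes_per_row = 100
bandwidth_bytes_per_s = 10e9

payload_bytes = rows * bytes_per_row                    # 850,000 bytes
copy_time_ms = payload_bytes / bandwidth_bytes_per_s * 1000

print(f"payload: {payload_bytes} bytes, raw copy: {copy_time_ms:.4f} ms")
```

A raw memory copy of the result set would take well under a millisecond, so whatever dominates the observed 113 ms is per-row work (heap access, tuple formatting, protocol framing), not sheer data volume.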
On Wed, 14 Jul 2010, Ivan Voras wrote:
>> Returning 8449 rows could be quite long.
>
> You are right, I didn't test this. Issuing a query which returns a
> smaller result set is much faster.
>
> But, off-topic: why would returning 8500 records, each around 100 bytes
> long - so around 0.85 MB - over local Unix sockets be so slow? The
> machine in question has a sustained memory bandwidth of nearly 10 GB/s.
> Does PostgreSQL spend much time marshalling the data through the socket
> stream?

It's disk access time. In the very bad case it could take ~5 ms (for a fast drive) to get just one row.

Regards,
Oleg
On 07/14/10 15:25, Oleg Bartunov wrote:
> It's disk access time. In the very bad case it could take ~5 ms (for a
> fast drive) to get just one row.

No, it's not that. The table fits in RAM; I've verified there is no disk I/O involved. Something else is wrong:

cms=> explain analyze select id,title from forum where _fts_ @@ 'fer'::tsquery limit 10;
                                                          QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..43.31 rows=10 width=35) (actual time=0.194..0.373 rows=10 loops=1)
   ->  Index Scan using forum_fts on forum  (cost=0.00..394.10 rows=91 width=35) (actual time=0.182..0.256 rows=10 loops=1)
         Index Cond: (_fts_ @@ '''fer'''::tsquery)
 Total runtime: 0.507 ms
(4 rows)

cms=> explain analyze select id,title from forum where _fts_ @@ 'fer'::tsquery order by id limit 10;
                                                             QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=363.18..363.20 rows=10 width=35) (actual time=118.358..118.516 rows=10 loops=1)
   ->  Sort  (cost=363.18..363.40 rows=91 width=35) (actual time=118.344..118.396 rows=10 loops=1)
         Sort Key: id
         Sort Method:  top-N heapsort  Memory: 25kB
         ->  Bitmap Heap Scan on forum  (cost=29.21..361.21 rows=91 width=35) (actual time=3.066..64.091 rows=8449 loops=1)
               Recheck Cond: (_fts_ @@ '''fer'''::tsquery)
               ->  Bitmap Index Scan on forum_fts  (cost=0.00..29.19 rows=91 width=0) (actual time=2.106..2.106 rows=8449 loops=1)
                     Index Cond: (_fts_ @@ '''fer'''::tsquery)
 Total runtime: 118.689 ms
(9 rows)

See how in the first query, with a simple LIMIT, it fetches 10 rows quickly, but in the second one, as soon as it has to execute and materialize the entire result set before applying the LIMIT, the performance is horrible.
* Ivan Voras (ivoras@freebsd.org) wrote:
> Total runtime: 0.507 ms
[...]
> Total runtime: 118.689 ms
>
> See how in the first query, with a simple LIMIT, it fetches 10 rows
> quickly, but in the second one, as soon as it has to execute and
> materialize the entire result set before applying the LIMIT, the
> performance is horrible.

What you've shown is that it takes 0.5 ms for 10 rows and 118 ms for 8500 rows. Now, maybe I've missed something, but that's 0.05 ms per row for the first query and about 0.014 ms per row for the second - and you've added a sort into the mix. The per-row cost of going through the data actually improves, since you're doing more work in bulk.

Since you're ordering by 'id', PG has to look at every row returned by the index scan before it can apply the LIMIT. That's not free.

Regarding the statistics, it's entirely possible that the index is *not* the fastest way to pull this data (it's nearly 10% of the table); if the stats were better, the planner might use a seq scan instead - though I'm not sure how bad the cost of the filter itself would be.

Thanks,

Stephen
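(Editorial aside: Stephen's per-row arithmetic can be checked directly against the timings quoted earlier in the thread:)

```python
# Per-row cost from the two EXPLAIN ANALYZE runs quoted above:
# plain LIMIT 10 vs. ORDER BY id LIMIT 10 over all 8449 matches.
limit_time_ms, limit_rows = 0.507, 10
full_time_ms, full_rows = 118.689, 8449

per_row_limit_ms = limit_time_ms / limit_rows   # ~0.05 ms per row
per_row_full_ms = full_time_ms / full_rows      # ~0.014 ms per row

print(f"{per_row_limit_ms:.3f} ms/row vs {per_row_full_ms:.3f} ms/row")
```

The sorted query really is cheaper per row; it only looks slow because the LIMIT can no longer stop the scan early.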
On 07/14/10 15:49, Stephen Frost wrote:
> What you've shown is that it takes 0.5 ms for 10 rows and 118 ms for
> 8500 rows.

Yes, but...

> Since you're ordering by 'id', PG has to look at every row returned by
> the index scan before it can apply the LIMIT. That's not free.

This part of the processing goes on in the backend, and the backend only needs to sort through 8500 integers. I don't think the sort is significant here.

> Regarding the statistics, it's entirely possible that the index is *not*
> the fastest way to pull this data (it's nearly 10% of the table)

I think what I'm asking here is: is it reasonable for tsearch2 to extract 8,500 rows from an index over 90,000 rows in 118 ms, given that approximately the same task can be done with an unindexed LIKE operator in nearly the same time?
Ivan Voras <ivoras@freebsd.org> wrote:
> I think what I'm asking here is: is it reasonable for tsearch2 to
> extract 8,500 rows from an index over 90,000 rows in 118 ms, given
> that approximately the same task can be done with an unindexed LIKE
> operator in nearly the same time?

The answer is "yes." When the result is 10% of the table, a sequential scan can be more efficient than an index, as Stephen indicated.

-Kevin
On 07/14/10 16:03, Kevin Grittner wrote:
> The answer is "yes." When the result is 10% of the table, a sequential
> scan can be more efficient than an index, as Stephen indicated.

Ok, to verify this I've tried increasing statistics on the field and running vacuum full analyze, which didn't help. Next, I've tried setting enable_indexscan to off, which also didn't do it:

cms=> set enable_indexscan=off;
SET
cms=> explain analyze select id,title from forum where _fts_ @@ 'fer'::tsquery order by id limit 10;
                                                              QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=363.18..363.20 rows=10 width=35) (actual time=192.243..192.406 rows=10 loops=1)
   ->  Sort  (cost=363.18..363.40 rows=91 width=35) (actual time=192.229..192.283 rows=10 loops=1)
         Sort Key: id
         Sort Method:  top-N heapsort  Memory: 25kB
         ->  Bitmap Heap Scan on forum  (cost=29.21..361.21 rows=91 width=35) (actual time=12.071..136.130 rows=8449 loops=1)
               Recheck Cond: (_fts_ @@ '''fer'''::tsquery)
               ->  Bitmap Index Scan on forum_fts  (cost=0.00..29.19 rows=91 width=0) (actual time=11.169..11.169 rows=8449 loops=1)
                     Index Cond: (_fts_ @@ '''fer'''::tsquery)
 Total runtime: 192.686 ms
(9 rows)

Any ideas on how to verify this?
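(Editorial aside, not from the thread: enable_indexscan only disables plain index scans, and the plan above is still a bitmap scan. To actually force the sequential scan Kevin is describing, the bitmap scan has to be disabled too, along these lines:)

```sql
-- enable_indexscan does not affect bitmap scans; disable those separately
-- to see what a sequential scan costs for this query.
SET enable_bitmapscan = off;
SET enable_indexscan = off;
EXPLAIN ANALYZE
SELECT id, title FROM forum
 WHERE _fts_ @@ 'fer'::tsquery
 ORDER BY id LIMIT 10;
```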
Ivan, here is explain analyze output - 7122 out of 528,155 docs:

tseval=# select count(*) from document;
 count
--------
 528155
(1 row)

Time: 345,562 ms

tseval=# explain analyze select docno, title from document where vector @@ to_tsquery('english','mars');
 Bitmap Heap Scan on document  (cost=1655.97..10518.34 rows=2641 width=13) (actual time=3.127..11.556 rows=7122 loops=1)
   Recheck Cond: (vector @@ '''mar'''::tsquery)
   ->  Bitmap Index Scan on idx_vector  (cost=0.00..1655.31 rows=2641 width=0) (actual time=1.899..1.899 rows=7122 loops=1)
         Index Cond: (vector @@ '''mar'''::tsquery)
 Total runtime: 12.303 ms
(5 rows)

This is PostgreSQL 8.4.4 on an Ubuntu machine.

Oleg

On Wed, 14 Jul 2010, Ivan Voras wrote:
> Ok, to verify this I've tried increasing statistics on the field and
> running vacuum full analyze, which didn't help. Next, I've tried
> setting enable_indexscan to off, which also didn't do it:
> [...]
>  Total runtime: 192.686 ms
> (9 rows)
>
> Any ideas on how to verify this?

Regards,
Oleg
Ivan Voras <ivoras@freebsd.org> wrote:
> which didn't help.

Didn't help what? You're processing each row in 22.8 microseconds. What kind of performance were you expecting?

-Kevin
On 14 July 2010 17:16, Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote:
>> which didn't help.
>
> Didn't help what? You're processing each row in 22.8 microseconds.
> What kind of performance were you expecting?

Well, I guess you're right. What I was expecting was a large bump in speed going from LIKE to tsearch2, but there's also record processing outside the index itself, which is probably where the time goes.