Re: bitmap heap scan way cheaper than seq scan on the same amount of tuples (fts-search). - Mailing list pgsql-performance
From | Craig Ringer |
---|---|
Subject | Re: bitmap heap scan way cheaper than seq scan on the same amount of tuples (fts-search). |
Date | |
Msg-id | 1256624077.1709.70.camel@wallace.localnet Whole thread Raw |
In response to | Re: bitmap heap scan way cheaper than seq scan on the same amount of tuples (fts-search). (Jesper Krogh <jesper@krogh.cc>) |
Responses |
Re: bitmap heap scan way cheaper than seq scan on the same
amount of tuples (fts-search).
|
List | pgsql-performance |
On Tue, 2009-10-27 at 06:44 +0100, Jesper Krogh wrote: > Here you should search for "commonterm" not "commonterm80", commonterm > will go into a seq-scan. You're not testing the same thing as I did. Point taken. I ran the same commands as you, but as the planner picked different plans it wasn't much use. The fact that I didn't notice that is a bit worrying, as it suggests and even worse than normal degree of brain-fade. Sorry for the waste of time. Anyway, testing more usefully: On 8.4 on a different system Pg uses the seq scan by preference, with a runtime of 1148ms. It doesn't seem to want to do a bitmap heap scan when searching for `commonterm' even when enable_seqscan is set to `off'. A search for `commonterm80' also uses a seq scan (1067ms), but if enable_seqscan is set to off it'll use a bitmap heap scan at 237ms. On my 8.3 Pg isn't using a seqscan even for `commonterm', which is ... odd. If I force it not to use a bitmap heap scan it'll use an index scan. Preventing that too results in a seq scan with a runtime of 1500ms vs the 161ms of the bitmap heap scan. I agree that it seems like a pretty strange result on face value. So, on both 8.3 and 8.4 the sequential scan is indeed taking a LOT longer than the bitmap heap scan, though similar numbers of tuples are being read by both. I see the same results when actually reading the results rather than just doing an `explain analyze'. With psql set to send output to /dev/null and with \timing enabled: test=> \o /dev/null test=> set enable_seqscan = on; Time: 0.282 ms test=> select id from ftstest where body_fts @@ to_tsquery('commonterm80'); Time: 988.880 ms test=> set enable_seqscan = off; Time: 0.286 ms test=> select id from ftstest where body_fts @@ to_tsquery('commonterm80'); Time: 159.167 ms so - nearly 1s vs 0.15s is a big difference between what I previously confirmed to be bitmap heap scan and seq scan respectively for the same query. The same number of records are being returned in both cases. If I "select *" rather than just reading the `id' field, the runtimes are much more similar - 4130ms seq scan, and 3285 bitmap heap scan (both reading data to /dev/null), a difference of ~800. `EXPLAIN ANALYZE' results are still quite different, though, at 1020ms seq scan vs 233ms bitmap heap, suggesting that the similarity is created only by the time taken to actually transfer the data to the client. The time difference between the two is much the same. So - for some reason the seq scan takes 800ms or so longer than the bitmap heap scan. I can see why you're puzzled. I can reproduce it on two different machines with two different Pg versions, and using two slightly different methods for loading the data as well. So, I can confirm your test results now that I'm actually testing properly. test=> explain analyze select * from ftstest where body_fts @@ to_tsquery('commonterm80'); QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------- Bitmap Heap Scan on ftstest (cost=25836.66..36432.95 rows=39753 width=54) (actual time=27.452..175.481 rows=39852 loops=1) Recheck Cond: (body_fts @@ to_tsquery('commonterm80'::text)) -> Bitmap Index Scan on ftstest_gin_idx (cost=0.00..25826.72 rows=39753 width=0) (actual time=25.186..25.186 rows=39852 loops=1) Index Cond: (body_fts @@ to_tsquery('commonterm80'::text)) Total runtime: 233.473 ms (5 rows) test=> set enable_seqscan = on; SET test=> explain analyze select * from ftstest where body_fts @@ to_tsquery('commonterm80'); QUERY PLAN ---------------------------------------------------------------------------------------------------------------- Seq Scan on ftstest (cost=0.00..10750.00 rows=39753 width=54) (actual time=0.141..956.496 rows=39852 loops=1) Filter: (body_fts @@ to_tsquery('commonterm80'::text)) Total runtime: 1020.936 ms (3 rows) By the way, for the 8.4 test I modifed the loader script so it wouldn't take quite so painfully long to run second time 'round. I turned autocommit off, wrapped all the inserts up in a single transaction, and moved the fts index creation to after all the data has been inserted. It's a *LOT* faster, and the test results match yours. > they were run repeatedly. Yeah, just saw that in your original mail. Sorry. -- Craig Ringer
pgsql-performance by date: