Thread: understanding bitmap index benefit

understanding bitmap index benefit

From
Oleg Bartunov
Date:
Tom,

I noticed that along with many improvements in join operations bitmap
index speed up execution of first time query. It's known complain about
slow full text searching when query runs for the first time. But in CVS
version I see very nice behaviour I'd like to understand.

Query below is full text search on titles from pgsql mailing list archive:

8.0.3:
mw=# explain analyze select title from titles where fts_index @@ to_tsquery('vacuum&analyze');
                             QUERY PLAN 
 

---------------------------------------------------------------------------------------------------------------------------
IndexScan using fts_idx on titles  (cost=0.00..1124.34 rows=378 width=41) (actual time=20.884..806.073 rows=291
loops=1)  Index Cond: (fts_index @@ '\'vacuum\' & \'analyz\''::tsquery) Total runtime: 808.095 ms
 
(3 rows)


8.1dev:
mw=# explain analyze select title from titles where fts_index @@ to_tsquery('vacuum&analyze');
                            QUERY PLAN 
 

------------------------------------------------------------------------------------------------------------------------
BitmapHeap Scan on titles  (cost=5.32..1349.79 rows=378 width=41) (actual time=207.304..258.964 rows=291 loops=1)
Filter:(fts_index @@ '\'vacuum\' & \'analyz\''::tsquery)   ->  Bitmap Index Scan on fts_idx  (cost=0.00..5.32 rows=378
width=0)(actual time=206.993..206.993 rows=291 loops=1)         Index Cond: (fts_index @@ '\'vacuum\' &
\'analyz\''::tsquery)Total runtime: 260.946 ms
 
(5 rows)

Now, more frequent terms:

8.0.3:
mw=# explain analyze select title from titles where fts_index @@ to_tsquery('create&table');
                            QUERY PLAN 
 

-----------------------------------------------------------------------------------------------------------------------------
IndexScan using fts_idx on titles  (cost=0.00..1124.34 rows=378 width=41) (actual time=117.904..4463.131 rows=551
loops=1)  Index Cond: (fts_index @@ '\'create\' & \'table\''::tsquery) Total runtime: 4467.814 ms
 
(3 rows)

8.1dev:
mw=# explain analyze select title from titles where fts_index @@ to_tsquery('create&table');
                          QUERY PLAN 
 

------------------------------------------------------------------------------------------------------------------------
BitmapHeap Scan on titles  (cost=5.32..1349.79 rows=378 width=41) (actual time=296.310..423.622 rows=551 loops=1)
Filter:(fts_index @@ '\'create\' & \'table\''::tsquery)   ->  Bitmap Index Scan on fts_idx  (cost=0.00..5.32 rows=378
width=0)(actual time=296.001..296.001 rows=551 loops=1)         Index Cond: (fts_index @@ '\'create\' &
\'table\''::tsquery)Total runtime: 426.573 ms
 
(5 rows)

Configurations for two postmasters are the same and they're running on the
same machine (my notebook IBM X40, Linux 2.6.8.1).

my applause, Tom !
    Regards,        Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83


Re: understanding bitmap index benefit

From
Oleg Bartunov
Date:
yOn Thu, 19 May 2005, Oleg Bartunov wrote:

> Tom,
>
> I noticed that along with many improvements in join operations bitmap
> index speed up execution of first time query. It's known complain about
> slow full text searching when query runs for the first time. But in CVS
> version I see very nice behaviour I'd like to understand.
>

Hmm, after restarting postmasters  I don't see any benefit :( 
Sometimes, I see reversed behaviour. Seems, combination of system's and 
postgres cacheing.

I tried to see io statistics, but it was weird in 8.0X and in  8.1dev I still
don't understand it :)
Below is a stats for fully cached query:

heap_blk, idx_blk show stats in form of 'blks_read:blks_hit' and 
seq_tup,idx_tup - in form of 'number of scan:tuples fetched'.

8.0.3:
mw=# explain analyze select title from titles where fts_index @@ to_tsquery('list');
                   QUERY PLAN 
 

---------------------------------------------------------------------------------------------------------------------------
IndexScan using fts_idx on titles  (cost=0.00..1124.34 rows=378 width=42) (actual time=0.088..160.026 rows=4153
loops=1)  Index Cond: (fts_index @@ '\'list\''::tsquery) Total runtime: 184.834 ms
 
(3 rows)

mw=# select * from iostat where relname='titles'; relname | heap_blk | idx_blk | seq_tup | idx_tup 
---------+----------+---------+---------+--------- titles  | 0:3078   | 0:6925  | 0:0     | 1:4154
(1 row)

8.1dev:
mw=# explain analyze select title from titles where fts_index @@ to_tsquery('list');
                  QUERY PLAN 
 

-------------------------------------------------------------------------------------------------------------------------
BitmapHeap Scan on titles  (cost=5.32..1349.79 rows=378 width=41) (actual time=105.663..136.422 rows=4153 loops=1)
Filter:(fts_index @@ '\'list\''::tsquery)   ->  Bitmap Index Scan on fts_idx  (cost=0.00..5.32 rows=378 width=0)
(actualtime=104.012..104.012 rows=4154 loops=1)         Index Cond: (fts_index @@ '\'list\''::tsquery) Total runtime:
158.258ms
 
(5 rows)

mw=# select * from iostat where relname='titles';                                relname | heap_blk | idx_blk | seq_tup
|idx_tup relname | heap_blk | idx_blk | seq_tup | idx_tup 
 
---------+----------+---------+---------+--------- titles  | 0:2704   | 0:2797  | 0:0     | 0:0
(1 row)

I see that in 8.1dev, there is no stats for idx_tup !
If I disable bitmap indices in 8.1dev I got

mw=# select * from iostat where relname='titles';                                relname | heap_blk | idx_blk | seq_tup
|idx_tup relname | heap_blk | idx_blk | seq_tup | idx_tup 
 
---------+----------+---------+---------+--------- titles  | 0:3446   | 0:2797  | 0:0     | 1:4154
(1 row)

notice, stats is different from 8.0.3 as one could expect, especially
in the numbers of idx_blk. 
Since, everything is cached there is no visible difference in timings.


after restarting postmasters (pg_ctl restart):

8.0.3:
mw=# explain analyze select title from titles where fts_index @@ to_tsquery('list');
                    QUERY PLAN 
 

----------------------------------------------------------------------------------------------------------------------------
IndexScan using fts_idx on titles  (cost=0.00..1124.34 rows=378 width=42) (actual time=0.159..5845.181 rows=4153
loops=1)  Index Cond: (fts_index @@ '\'list\''::tsquery) Total runtime: 5870.105 ms
 
(3 rows)

mw=# select * from iostat where relname='titles'; relname | heap_blk  |  idx_blk  | seq_tup | idx_tup 
---------+-----------+-----------+---------+--------- titles  | 2072:1006 | 1386:5539 | 0:0     | 1:4154
(1 row)


8.1dev:

mw=# explain analyze select title from titles where fts_index @@ to_tsquery('list');
                  QUERY PLAN 
 

-------------------------------------------------------------------------------------------------------------------------
BitmapHeap Scan on titles  (cost=5.32..1349.79 rows=378 width=41) (actual time=577.541..2630.110 rows=4153 loops=1)
Filter:(fts_index @@ '\'list\''::tsquery)   ->  Bitmap Index Scan on fts_idx  (cost=0.00..5.32 rows=378 width=0)
(actualtime=575.808..575.808 rows=4154 loops=1)         Index Cond: (fts_index @@ '\'list\''::tsquery) Total runtime:
2654.472ms
 
(5 rows)

mw=# select * from iostat where relname='titles'; relname | heap_blk |  idx_blk  | seq_tup | idx_tup 
---------+----------+-----------+---------+--------- titles  | 2704:0   | 1399:1398 | 0:0     | 0:0
(1 row)





iostat is my view defined as
    View "public.iostat"  Column  | Type | Modifiers 
----------+------+----------- relname  | name | heap_blk | text | idx_blk  | text | seq_tup  | text | idx_tup  | text |

View definition: SELECT blk.relname, (blk.heap_blks_read::text || ':'::text) || blk.heap_blks_hit::text AS heap_blk,
(blk.idx_blks_read::text|| ':'::text) || blk.idx_blks_hit::text AS idx_blk, (tpl.seq_scan::text || ':'::text) ||
tpl.seq_tup_read::textAS seq_tup, (tpl.idx_scan::text || ':'::text) || tpl.idx_tup_fetch::text AS idx_tup   FROM
pg_statio_user_tablesblk   JOIN pg_stat_user_tables tpl USING (relname);
 



>    Regards,        Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83


Re: understanding bitmap index benefit

From
Tom Lane
Date:
Oleg Bartunov <oleg@sai.msu.su> writes:
> I tried to see io statistics, but it was weird in 8.0X and in  8.1dev I still
> don't understand it :)

We aren't yet updating the io statistics for bitmap scans properly.
There was a thread about this but it petered out without any resolution
about what we should do ...
        regards, tom lane


Re: understanding bitmap index benefit

From
Oleg Bartunov
Date:
On Thu, 19 May 2005, Tom Lane wrote:

> Oleg Bartunov <oleg@sai.msu.su> writes:
>> I tried to see io statistics, but it was weird in 8.0X and in  8.1dev I still
>> don't understand it :)
>
> We aren't yet updating the io statistics for bitmap scans properly.
> There was a thread about this but it petered out without any resolution
> about what we should do ...

what's 'Bitmap Heap Scan' ?
 Bitmap Heap Scan on titles  (cost=5.32..1349.79 rows=378 width=41) (actual time=76.601..111.207 rows=4153 loops=1)
Filter:(fts_index @@ '\'list\''::tsquery)   ->  Bitmap Index Scan on fts_idx  (cost=0.00..5.32 rows=378 width=0)
(actualtime=74.943..74.943 rows=4154 loops=1)         Index Cond: (fts_index @@ '\'list\''::tsquery)
 


>
>             regards, tom lane
>
    Regards,        Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83