Re: bitmap scans, btree scans, and tid order - Mailing list pgsql-hackers

From Tom Lane
Subject Re: bitmap scans, btree scans, and tid order
Date
Msg-id 19994.1116268519@sss.pgh.pa.us
Whole thread Raw
In response to Re: bitmap scans, btree scans, and tid order  ("Jeffrey W. Baker" <jwbaker@acm.org>)
Responses Re: bitmap scans, btree scans, and tid order  ("Jeffrey W. Baker" <jwbaker@acm.org>)
Re: bitmap scans, btree scans, and tid order  (Mike Rylander <mrylander@gmail.com>)
Bitmap scan cost model (was Re: bitmap scans, btree scans, and tid order)  ("Jeffrey W. Baker" <jwbaker@acm.org>)
List pgsql-hackers
"Jeffrey W. Baker" <jwbaker@acm.org> writes:
> On Mon, 2005-05-16 at 09:53 -0400, Tom Lane wrote:
>> This is a fallacy, and I think your concern is largely mistaken.  Have
>> you experimented with the cases you are worried about?

> Perhaps I have not stated the problem clearly.  Believe me, I have
> experimented extensively with this problem.

Sorry, perhaps I wasn't clear: have you experimented *with CVS tip*?
The current code is certainly capable of choosing either seqscan,
bitmap scan, or traditional index scan depending on the given query.
For example,

regression=# explain analyze select * from tenk1 where unique1 between 100 and 1000;
                  QUERY PLAN
 

--------------------------------------------------------------------------------------------------------------------------Bitmap
HeapScan on tenk1  (cost=9.58..381.53 rows=930 width=244) (actual time=6.185..18.034 rows=901 loops=1)  Recheck Cond:
((unique1>= 100) AND (unique1 <= 1000))  ->  Bitmap Index Scan on tenk1_unique1  (cost=0.00..9.58 rows=930 width=0)
(actualtime=4.522..4.522 rows=901 loops=1)        Index Cond: ((unique1 >= 100) AND (unique1 <= 1000))Total runtime:
23.784ms
 
(5 rows)

regression=# explain analyze select * from tenk1 where unique2 between 100 and 1000;
                   QUERY PLAN
 

-----------------------------------------------------------------------------------------------------------------------------Index
Scanusing tenk1_unique2 on tenk1  (cost=0.00..45.88 rows=805 width=244) (actual time=0.154..14.856 rows=901 loops=1)
IndexCond: ((unique2 >= 100) AND (unique2 <= 1000))Total runtime: 20.331 ms
 
(3 rows)

(The reason these apparently-identical queries get different plans is
that the unique2 column is physically ordered, so the plain indexscan
gets a very high correlation discount.)  There are enable_foo switches
to let you force selection of any one of the three ways for testing
purposes.

> It's also possible that changing the btree scan to work in
> groups of tuples instead of single tuples would make more sense, which
> is why I ventured two different solution to the problem.

My feeling is that that would add a lot of complexity for dubious win.
The reason it's dubious is that it would penalize some cases, for
instance LIMIT-type queries where you aren't going to fetch many tuples
anyway.  I think that at least for the time being (8.1 time frame) we
should leave traditional indexscans alone and concentrate on being sure
we are getting the most we can out of the new bitmap scan code.  Only
after that dust has settled will we have hard facts with which to argue
about whether compromise behaviors might be wins.

I think the work that's most needed at the moment is to test the
bitmap-scan cost model to see if it has much to do with reality...
        regards, tom lane


pgsql-hackers by date:

Previous
From: Alvaro Herrera
Date:
Subject: Re: Returning the name of a primary key
Next
From: Lamar Owen
Date:
Subject: Re: pgFoundry