Re: bitmap scans, btree scans, and tid order - Mailing list pgsql-hackers

From Jeffrey W. Baker
Subject Re: bitmap scans, btree scans, and tid order
Date
Msg-id 1116265673.9920.18.camel@toonses.gghcwest.com
Whole thread Raw
In response to bitmap scans, btree scans, and tid order  ("Jeffrey W. Baker" <jwbaker@acm.org>)
Responses Re: bitmap scans, btree scans, and tid order  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Mon, 2005-05-16 at 09:53 -0400, Tom Lane wrote:
> Jeffrey Baker <jwbaker@acm.org> writes:
> > Change the planner/executor to use the bitmap scan in all cases where 
> > index order is unimportant.  From my reading of the current code, the 
> > bitmap scan is only used in case of a join.
> 
> This is a fallacy, and I think your concern is largely mistaken.  Have
> you experimented with the cases you are worried about?

Perhaps I have not stated the problem clearly.  Believe me, I have
experimented extensively with this problem.  So here's the statement of
the problem: during an index scan, the heap is visited in index order,
which can be and frequently is random order on the disk.  An index scan
that currently takes 15 minutes on my database takes only 5 when the
tuples are fetched in strict disk order, and takes 8 minutes if the
tuples are fetched in disk order in groups of 1000.  The problem exists
when the scan is expected to return a lot of tuples, but the planner
chooses an index scan anyway.  In many cases, sequential scan would be
faster than index scan + random heap visits.

> It's entirely possible that the current cost model needs adjustment to
> make the planner pick the bitmap scan in more cases.  However, it is
> easy to demonstrate (either by thought-experiment or actual trial) that
> a bitmap scan isn't necessarily a win.

I agree.  There's certainly a threshold below which the bitmap would not
make sense.  It's also possible that changing the btree scan to work in
groups of tuples instead of single tuples would make more sense, which
is why I ventured two different solution to the problem.

>  The overhead of maintaining the
> bitmap is far from zero, and you don't get anything unless the resulting
> pattern of heap page fetches is significantly cleaner than before.

Bitmaps scans naturally come back in TID order.  I realize this isn't
1:1 correspondence with disk order, but it's a good start.  I also like
the way the index scan and heap scan are decoupled in the bitmap code.
It leaves room for more serious optimization of disk access, which is
relatively difficult in the synchronous, one-at-a-time btree code.

> Therefore, a patch that eliminates cost-estimation in favor of trying to
> force one choice or the other is definitely not likely to be received
> favorably ...

That's not the idea.

-jwb




pgsql-hackers by date:

Previous
From: "Mark Cave-Ayland"
Date:
Subject: Re: Cost of XLogInsert CRC calculations
Next
From: Josh Berkus
Date:
Subject: Re: pgFoundry