On Sun, Mar 05, 2006 at 10:00:25PM +0100, PFC wrote:
>
> Bitmap index scan is bliss. Many thanks to the postgres team ! Now
> searching in tables with a lot of fields and conditions is no longer a
> pain.
>
> And just a thought :
>
> SELECT * FROM table WHERE category IN (1,2,3) ORDER BY price LIMIT
> 10;
>
> Suppose you have an index on category, and another index on price.
> Depending on the stats postgres has about the values, you'll either get :
>
> 0- seq scan + sort
> 1- Plain or Bitmap Index scan using "category", then sort by "price"
> 2- Index scan on "price", Filter on "category IN (1,2,3)", no sort.
>
> 1 is efficient if the category is rare. Postgres knows this and uses
> this plan well.
> Without a LIMIT, option 1 should be preferred.
>
> 2 is efficient if the items in the categories 1,2,3 are cheap (close
> to the start of the index on price). However if the items in question are
> on the other side of the index, it will index-scan a large part of the
> table. This can be a big hit. Postgres has no stats about the correlation
> of "category" and "price", so it won't know when there is going to be a
> problem.
>
> Another option would be interesting. It has two steps :
>
> - Build a bitmap using the index on "category" (just like in case 1)
> so we know which pages on the table have relevant rows
>
> - Index scan on "price", but only looking in the heap for pages
> which are flagged in the bitmap, and then "Recheck Cond" on "category".
> In other words, do an index scan to get the rows in the right order,
> but don't bother to check the heap for pages where the bitmap says there
> are no rows.
> In the worst case, you still have to run through the entire index,
> but at least not through the entire table !
>
> It can also speed up some merge joins.
The problem is that you're now talking about doing 2 index scans instead
of just one and a sort. If the correlation on price is high, it could
still win. As the cost estimator for index scan stands right now,
there's no way such a plan would be chosen unless correlation was
extremely high, however.
--
Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461