Re: Best way to scan on-disk bitmaps - Mailing list pgsql-hackers

From Greg Stark
Subject Re: Best way to scan on-disk bitmaps
Date
Msg-id 87br7foeza.fsf@stark.xeocode.com
Whole thread Raw
In response to Re: Best way to scan on-disk bitmaps  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Best way to scan on-disk bitmaps  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Tom Lane <tgl@sss.pgh.pa.us> writes:

> Plan B would be to remove that restriction and teach btree and gist to
> cope.  While a btree couldn't use a nonconsecutive restriction as part
> of its where-to-scan logic, I don't see any good reason why it couldn't
> still perform the test before returning the TID, thus possibly saving a
> trip to the heap.  Offhand it seems this should be true of gist as well,
> but I don't know that code well enough to be sure.

Not long ago there was some discussion about how gist indexes don't really
handle multicolumn indexes usefully currently. They use only the first column
to determine page splits. The discussion wandered and it became clear that it
wasn't even clear what a multicolumn gist index should mean.

I suggested treating each column as independent axes. Independently ask each
column's datatype for the "distance" value and then calculate the inner
product as a kind of geometric n-dimensional distance. There was some
resistance since this limits gist indexes to always basing page splits on a
single "distance" based algorithm. (Though all current gist index methods in
the postgres source tree do work this way, mostly with copy-pasted code in
fact.)

In this model the columns listed in the gist index are unordered. Any subset
of columns can be used to perform an index lookup. Making it more like the
bitmap index behaviour you're looking at than the btree index behaviour.

-- 
greg



pgsql-hackers by date:

Previous
From: Christopher Kings-Lynne
Date:
Subject: Re: SQL_ASCII vs. 7-bit ASCII encodings
Next
From: Tom Lane
Date:
Subject: Re: Best way to scan on-disk bitmaps