Thread: Indexam interface proposal

Indexam interface proposal

From
Heikki Linnakangas
Date:
Currently amgettuple returns one matching tuple at a time, in index 
order. I'm proposing two changes to add support for
- candidate matches
- partial ordering

Those two features are essential to make clustered indexes work, and in 
the future, binned bitmap indexes that don't have a bitmap for each 
distinct value but for ranges of values.

There's a third feature looming in the future, that I haven't addressed:
- returning index values, for index-only scans or at least for filtering 
rows before fetching heap tuples.


I'm proposing that we keep the one tuple per call nature of the 
interface, but add a flag to mark candidate matches. index_getnext or 
the executor would need to recheck the original quals for tuples marked 
as candidates.

Another flag would be used to mean "this tuple marks the boundary of a 
partial ordering". Let's call it boundary_mark for now.

For example, if an index scan returned tuples with the following keys, 
with tuples on same line meaning the index doesn't know their relative 
ordering.

1
3 4 2
5 8 6 7
9
10

amgettuple would return the above tuples like this:

1 3 4 2 5 8 6 7 9 10
* *     *       * *

where the tuples marked with * would have the boundary_mark-flag set. If 
the plan requires ordered results, index_getnext would have to sort 
tuples between two markers before returning them to the caller.

amgetmulti would also need to have the candidate-flag added as I 
proposed in the "Bitmapindexscan changes" patch I sent earlier to 
pgsql-patches.

This interface change would solve much of the ugliness of my GIT patch, 
by generalizing the index quals checking and sorting code to index_getnext.

Another source of ugliness in the patch is in inserting new tuples. 
Inserts need to reach to the heap to fetch heap tuples, to compare keys 
when splitting a group. I don't see a clean fix for that, but I don't 
think it's as bad as the index scan code.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com


Re: Indexam interface proposal

From
Martijn van Oosterhout
Date:
On Mon, Mar 19, 2007 at 12:23:01PM +0000, Heikki Linnakangas wrote:
> Currently amgettuple returns one matching tuple at a time, in index
> order. I'm proposing two changes to add support for
> - candidate matches

IIRC indexes can already ask to have the system recheck conditions on
returned tuples. For example GiST can return more tuples than actually
match. That's what the amopreqcheck column is for in pg_amop.

Or am I misunderstanding?

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.

Re: Indexam interface proposal

From
Heikki Linnakangas
Date:
Martijn van Oosterhout wrote:
> On Mon, Mar 19, 2007 at 12:23:01PM +0000, Heikki Linnakangas wrote:
>> Currently amgettuple returns one matching tuple at a time, in index 
>> order. I'm proposing two changes to add support for
>> - candidate matches
> 
> IIRC indexes can already ask to have the system recheck conditions on
> returned tuples. For example GiST can return more tuples than actually
> match. That's what the amopreqcheck column is for in pg_amop.

Right, except that flag is per operator in operator class, and what I'm 
proposing is that the index could pass a flag per tuple in the scan. 
Some tuples in the scan might need rechecking, some might not. The need 
for rechecking in clustered indexes is orthogonal to the need arising 
from the lossyness of GiST operators.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com


Re: Indexam interface proposal

From
Teodor Sigaev
Date:
> Right, except that flag is per operator in operator class, and what I'm 
> proposing is that the index could pass a flag per tuple in the scan. 

That might make sense even for GiST. Sometimes complex compressions is used in 
GiST opclasses. If indexing value is rather small then it's stored in index as 
is, but large value is compressed with lossy techniques. So, GiST might return a 
tuple which is allowed to not recheck.
-- 
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
  WWW: http://www.sigaev.ru/
 


Re: Indexam interface proposal

From
Heikki Linnakangas
Date:
Teodor Sigaev wrote:
>> Right, except that flag is per operator in operator class, and what 
>> I'm proposing is that the index could pass a flag per tuple in the scan. 
> 
> That might make sense even for GiST. Sometimes complex compressions is 
> used in GiST opclasses. If indexing value is rather small then it's 
> stored in index as is, but large value is compressed with lossy 
> techniques. So, GiST might return a tuple which is allowed to not recheck.

Interesting. So we'd add a flag to the index tuples in GiST indicating 
if the tuple is lossily compressed or not. The compress-function would 
set that flag when it performs lossy compression, and gistgettuple would 
return it to the caller.

That would completely replace the current RECHECK-option we have, right?

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com


Re: Indexam interface proposal

From
Martijn van Oosterhout
Date:
On Mon, Mar 19, 2007 at 04:40:52PM +0300, Teodor Sigaev wrote:
> >Right, except that flag is per operator in operator class, and what I'm
> >proposing is that the index could pass a flag per tuple in the scan.
>
> That might make sense even for GiST. Sometimes complex compressions is used
> in GiST opclasses. If indexing value is rather small then it's stored in
> index as is, but large value is compressed with lossy techniques. So, GiST
> might return a tuple which is allowed to not recheck.

Given that rechecking requires Expr and state structures, maybe it would
be easier to make the operators RECHECK so the planner does the right
thing now, but make a flag that tells the index scan *not* to recheck
this tuple. That would seem slightly less work and fit better with the
existing code. (In other words, it's an optimisation rather than a big
change).

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.

Re: Indexam interface proposal

From
Tom Lane
Date:
Heikki Linnakangas <heikki@enterprisedb.com> writes:
> Martijn van Oosterhout wrote:
>> IIRC indexes can already ask to have the system recheck conditions on
>> returned tuples. For example GiST can return more tuples than actually
>> match. That's what the amopreqcheck column is for in pg_amop.

> Right, except that flag is per operator in operator class, and what I'm 
> proposing is that the index could pass a flag per tuple in the scan. 

The reason for attaching the flag to operators is so that the system
(particularly the planner) can tell *which* conditions need to be
rechecked, and can prepare the necessary expression infrastructure.
I dislike the idea of having to be prepared to do that every time
for every indexscan.  The notion of having to be prepared to sort
(according to what?) is even worse.
        regards, tom lane


Re: Indexam interface proposal

From
Teodor Sigaev
Date:
> Interesting. So we'd add a flag to the index tuples in GiST indicating 
> if the tuple is lossily compressed or not. The compress-function would 
> set that flag when it performs lossy compression, and gistgettuple would 
> return it to the caller.

Keys in GiST index may be another type than column on which index was created, 
so gistgettuple can not return tuple in original form - but sometimes 
gistgettuple may be sure that recheck isn't needed.

> That would completely replace the current RECHECK-option we have, right?
Yeah, this is possible.


-- 
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
  WWW: http://www.sigaev.ru/
 


Re: Indexam interface proposal

From
Teodor Sigaev
Date:
> Given that rechecking requires Expr and state structures, maybe it would
> be easier to make the operators RECHECK so the planner does the right
> thing now, but make a flag that tells the index scan *not* to recheck
> this tuple. That would seem slightly less work and fit better with the
> existing code. (In other words, it's an optimisation rather than a big
> change).

I like your suggestion - it's useful for GiST/GIN fulltext indexing.


-- 
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
  WWW: http://www.sigaev.ru/
 


Re: Indexam interface proposal

From
Heikki Linnakangas
Date:
Tom Lane wrote:
> Heikki Linnakangas <heikki@enterprisedb.com> writes:
>> Martijn van Oosterhout wrote:
>>> IIRC indexes can already ask to have the system recheck conditions on
>>> returned tuples. For example GiST can return more tuples than actually
>>> match. That's what the amopreqcheck column is for in pg_amop.
> 
>> Right, except that flag is per operator in operator class, and what I'm 
>> proposing is that the index could pass a flag per tuple in the scan. 
> 
> The reason for attaching the flag to operators is so that the system
> (particularly the planner) can tell *which* conditions need to be
> rechecked, and can prepare the necessary expression infrastructure.
> I dislike the idea of having to be prepared to do that every time
> for every indexscan.  

I don't see any big down-side in preparing for that. We'd need to always 
store the original index quals in the executor node, like we do now with 
recheck-flagged operators, but that doesn't seem too bad to me.

I suppose we would want to keep the existing per-operator recheck-flag 
and quals as it is, and add another field like indexqualorig to be used 
to recheck tuples amgetnext flags as candidates.

> The notion of having to be prepared to sort
> (according to what?) is even worse.

That we wouldn't need for clustered indexes, if we change the current 
design a bit. Either:
* store a sorted list of offsetnumbers for each group, instead of a bitmap,
* or store a bitmap like now, but require that heap tuples in a grouped 
index tuple are in cluster order within the heap page.

The first option eats away some of the space savings, the second option 
makes clustered indexes to become declustered quicker if there's 
out-of-order updates or inserts.

Choosing either option would also reduce the CPU overhead of index 
scans, because we could use binary search within a grouped index tuple.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com


Re: Indexam interface proposal

From
Bruce Momjian
Date:
Added to TODO:
* Allow index scans to return matching index keys  http://archives.postgresql.org/pgsql-hackers/2007-03/msg01079.php


---------------------------------------------------------------------------

Heikki Linnakangas wrote:
> Currently amgettuple returns one matching tuple at a time, in index 
> order. I'm proposing two changes to add support for
> - candidate matches
> - partial ordering
> 
> Those two features are essential to make clustered indexes work, and in 
> the future, binned bitmap indexes that don't have a bitmap for each 
> distinct value but for ranges of values.
> 
> There's a third feature looming in the future, that I haven't addressed:
> - returning index values, for index-only scans or at least for filtering 
> rows before fetching heap tuples.
> 
> 
> I'm proposing that we keep the one tuple per call nature of the 
> interface, but add a flag to mark candidate matches. index_getnext or 
> the executor would need to recheck the original quals for tuples marked 
> as candidates.
> 
> Another flag would be used to mean "this tuple marks the boundary of a 
> partial ordering". Let's call it boundary_mark for now.
> 
> For example, if an index scan returned tuples with the following keys, 
> with tuples on same line meaning the index doesn't know their relative 
> ordering.
> 
> 1
> 3 4 2
> 5 8 6 7
> 9
> 10
> 
> amgettuple would return the above tuples like this:
> 
> 1 3 4 2 5 8 6 7 9 10
> * *     *       * *
> 
> where the tuples marked with * would have the boundary_mark-flag set. If 
> the plan requires ordered results, index_getnext would have to sort 
> tuples between two markers before returning them to the caller.
> 
> amgetmulti would also need to have the candidate-flag added as I 
> proposed in the "Bitmapindexscan changes" patch I sent earlier to 
> pgsql-patches.
> 
> This interface change would solve much of the ugliness of my GIT patch, 
> by generalizing the index quals checking and sorting code to index_getnext.
> 
> Another source of ugliness in the patch is in inserting new tuples. 
> Inserts need to reach to the heap to fetch heap tuples, to compare keys 
> when splitting a group. I don't see a clean fix for that, but I don't 
> think it's as bad as the index scan code.
> 
> -- 
>    Heikki Linnakangas
>    EnterpriseDB   http://www.enterprisedb.com
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Have you searched our list archives?
> 
>                http://archives.postgresql.org

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +