Thread: Grouped Index Tuples

Grouped Index Tuples

From
Heikki Linnakangas
Date:
I've brought the GIT patch up-to-date with CVS head. The latest version 
can be found at http://community.enterprisedb.com/git/

I also reran the CPU bound test cases with the latest patch.

I want this in 8.3 in some form, and I have the time to do any required 
changes. If someone wants to see more tests, I can arrange that as well.

The patch is pretty big at the moment. I think the best way to proceed 
with this is to extract some smaller, incremental patches from it that 
just refactor the current b-tree code. After that, the final patch that 
implements GIT should be much smaller and more readable. And there's 
still a bunch of todo items there as well...

But before I start doing that, I need some review and general agreement 
on the design. What I don't want to happen is that three days after the 
feature freeze, someone finally looks at it and finds a major issue or 
just thinks it's an unreadable mess, and we no longer have the time to 
fix it.

One question that I'm sure someone will ask is do we need this if we 
have bitmap indexes? Both aim at having a smaller index, after all. The 
use cases are quite different; GIT is effective whenever you have a 
table that's reasonably well-clustered. Unlike the bitmap indexam, GIT's 
effectiveness doesn't depend on the number of distinct values, in 
particular it works well with unique indexes. GIT is comparable to 
clustered indexes in other DBMSs (in fact we might want to call GIT that 
in the end).

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com


Re: Grouped Index Tuples

From
"Simon Riggs"
Date:
On Thu, 2007-02-22 at 12:47 +0000, Heikki Linnakangas wrote:

> One question that I'm sure someone will ask is do we need this if we 
> have bitmap indexes? Both aim at having a smaller index, after all.
> The use cases are quite different; GIT is effective whenever you have
> a table that's reasonably well-clustered. Unlike the bitmap indexam,
> GIT's effectiveness doesn't depend on the number of distinct values,
> in particular it works well with unique indexes. GIT is comparable to 
> clustered indexes in other DBMSs (in fact we might want to call GIT
> that in the end).

A few thoughts:

Whether its proof, or merely strong interest, we need to see where the
cross-over point is between bitmap indexes (BMI) and GIT. I'm assuming
there is one, but others may need convincing that there is a clear win
in common use cases. Graphs, numbers etc..

On your TODO list, it would be good to differentiate between ideas and
must-complete items. My suggested list would be:

Must complete
- cost model for optimizer
- user documentation
- performance tests

Good to complete
- duplicate key flag (should increase range of overlap with BMI)
- range re-check optimization (based upon your results)
- better index AM
- HOT, so that GIT can provide longer term benefits

Others
...

I seem to recall you mentioning there was a way to record that the heap
tuples are stored in monotonic order and so the sort can be avoided when
you return the tuples. Is that still true?

Also
"the patch roughly doubles the amount of WAL generated by WAL"
not sure what that means

--  Simon Riggs              EnterpriseDB   http://www.enterprisedb.com




Re: Grouped Index Tuples

From
Bruce Momjian
Date:
Heikki, are you going to revise this for 8.4?

---------------------------------------------------------------------------

Heikki Linnakangas wrote:
> I've brought the GIT patch up-to-date with CVS head. The latest version 
> can be found at http://community.enterprisedb.com/git/
> 
> I also reran the CPU bound test cases with the latest patch.
> 
> I want this in 8.3 in some form, and I have the time to do any required 
> changes. If someone wants to see more tests, I can arrange that as well.
> 
> The patch is pretty big at the moment. I think the best way to proceed 
> with this is to extract some smaller, incremental patches from it that 
> just refactor the current b-tree code. After that, the final patch that 
> implements GIT should be much smaller and more readable. And there's 
> still a bunch of todo items there as well...
> 
> But before I start doing that, I need some review and general agreement 
> on the design. What I don't want to happen is that three days after the 
> feature freeze, someone finally looks at it and finds a major issue or 
> just thinks it's an unreadable mess, and we no longer have the time to 
> fix it.
> 
> One question that I'm sure someone will ask is do we need this if we 
> have bitmap indexes? Both aim at having a smaller index, after all. The 
> use cases are quite different; GIT is effective whenever you have a 
> table that's reasonably well-clustered. Unlike the bitmap indexam, GIT's 
> effectiveness doesn't depend on the number of distinct values, in 
> particular it works well with unique indexes. GIT is comparable to 
> clustered indexes in other DBMSs (in fact we might want to call GIT that 
> in the end).
> 
> -- 
>    Heikki Linnakangas
>    EnterpriseDB   http://www.enterprisedb.com
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 7: You can help support the PostgreSQL project by donating at
> 
>                 http://www.postgresql.org/about/donate

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://postgres.enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: Grouped Index Tuples

From
"Heikki Linnakangas"
Date:
Bruce Momjian wrote:
> Heikki, are you going to revise this for 8.4?

Probably not. I have other features I want to work on at the moment.

> ---------------------------------------------------------------------------
> 
> Heikki Linnakangas wrote:
>> I've brought the GIT patch up-to-date with CVS head. The latest version 
>> can be found at http://community.enterprisedb.com/git/
>>
>> I also reran the CPU bound test cases with the latest patch.
>>
>> I want this in 8.3 in some form, and I have the time to do any required 
>> changes. If someone wants to see more tests, I can arrange that as well.
>>
>> The patch is pretty big at the moment. I think the best way to proceed 
>> with this is to extract some smaller, incremental patches from it that 
>> just refactor the current b-tree code. After that, the final patch that 
>> implements GIT should be much smaller and more readable. And there's 
>> still a bunch of todo items there as well...
>>
>> But before I start doing that, I need some review and general agreement 
>> on the design. What I don't want to happen is that three days after the 
>> feature freeze, someone finally looks at it and finds a major issue or 
>> just thinks it's an unreadable mess, and we no longer have the time to 
>> fix it.
>>
>> One question that I'm sure someone will ask is do we need this if we 
>> have bitmap indexes? Both aim at having a smaller index, after all. The 
>> use cases are quite different; GIT is effective whenever you have a 
>> table that's reasonably well-clustered. Unlike the bitmap indexam, GIT's 
>> effectiveness doesn't depend on the number of distinct values, in 
>> particular it works well with unique indexes. GIT is comparable to 
>> clustered indexes in other DBMSs (in fact we might want to call GIT that 
>> in the end).
>>
>> -- 
>>    Heikki Linnakangas
>>    EnterpriseDB   http://www.enterprisedb.com
>>
>> ---------------------------(end of broadcast)---------------------------
>> TIP 7: You can help support the PostgreSQL project by donating at
>>
>>                 http://www.postgresql.org/about/donate
> 


--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com