Re: proposal for smaller indexes on index-ordered tables - Mailing list pgsql-hackers

From Jeffrey Baker
Subject Re: proposal for smaller indexes on index-ordered tables
Date
Msg-id fd145f7d0806241408he74dd04i38f63f857f5d4adf@mail.gmail.com
Whole thread Raw
In response to Re: proposal for smaller indexes on index-ordered tables  (Zoltan Boszormenyi <zb@cybertec.at>)
Responses Re: proposal for smaller indexes on index-ordered tables
Re: proposal for smaller indexes on index-ordered tables
List pgsql-hackers
On Tue, Jun 24, 2008 at 1:59 PM, Zoltan Boszormenyi <zb@cybertec.at> wrote:
Jeffrey Baker írta:
> The way I read it, the current btree index stores the index value and
> the TID of every tuple having that value.  When you have a table with
> three columns, you index one of them and you get an index which is
> practically as large as the table itself.
>
> Supposing the table is generally or strictly ordered by the column to
> be indexed, it would be more compact if the index stored ranges of
> tuples.  Instead of storing the TID of every tuple with that value,
> the index would store a first and last TID, between which all tuples
> have the value.
>
> Example: table with one million rows indexed on a column having one
> thousand distinct values.  Table is in-order by the indexed column.
> The traditional index would contain a million TIDs, whereas a range
> index would contain only two thousand.  The range index would be 500
> times smaller, more likely to be cached, etc.
>
> Thoughts?
>
> -jwb

Example with your theory:
One (not yet committed) transaction changes one tuple
that was in the middle of a range before but the tuple's indexed
column changed. What would you do?

Insert the new tuple at the end of the table and add another range to the index.  Leave the old tuple in place and don't touch the original index range.
 
You need to keep track of multiple index versions:
1. the range has to be split for the not-yet-committed modifier transaction,
   it might need to re-read the same table.
2. the old range has to be kept for reader transactions that still see
the old data

This is only true if you update the tuple in-place. 
 
Imagine you have thousands of UPDATEs in flight on different rows.

I'm quite aware of the problems of maintaining such a table and index, but the fact is that data warehouse type tables may never be updated after being created.  The particular application I'm struggling with does a SELECT ... INTO ... ORDER BY to make an ordered table for querying every night.  The problem is it takes longer, much longer, to create the index than to create the table, and in the end the index is as big as half the table anyway.

So this type of index would only be useful for an essentially read-only table.  I agree.

Quite another proposal would be to somehow instruct the database that the table is strictly in-order by a column and allow a binary search access method.  Then you don't need any index at all.

-jwb

pgsql-hackers by date:

Previous
From: Zoltan Boszormenyi
Date:
Subject: Re: proposal for smaller indexes on index-ordered tables
Next
From: "Stephen R. van den Berg"
Date:
Subject: Re: Dept of ugly hacks: eliminating padding space in system indexes