Re: proposal for smaller indexes on index-ordered tables - Mailing list pgsql-hackers

From Zoltan Boszormenyi
Subject Re: proposal for smaller indexes on index-ordered tables
Date
Msg-id 486168E7.6010109@cybertec.at
Whole thread Raw
In response to Re: proposal for smaller indexes on index-ordered tables  ("Jeffrey Baker" <jwbaker@gmail.com>)
List pgsql-hackers
Jeffrey Baker írta:
> On Tue, Jun 24, 2008 at 1:59 PM, Zoltan Boszormenyi <zb@cybertec.at
> <mailto:zb@cybertec.at>> wrote:
>
>     Jeffrey Baker írta:
>     > The way I read it, the current btree index stores the index
>     value and
>     > the TID of every tuple having that value.  When you have a table
>     with
>     > three columns, you index one of them and you get an index which is
>     > practically as large as the table itself.
>     >
>     > Supposing the table is generally or strictly ordered by the
>     column to
>     > be indexed, it would be more compact if the index stored ranges of
>     > tuples.  Instead of storing the TID of every tuple with that value,
>     > the index would store a first and last TID, between which all tuples
>     > have the value.
>     >
>     > Example: table with one million rows indexed on a column having one
>     > thousand distinct values.  Table is in-order by the indexed column.
>     > The traditional index would contain a million TIDs, whereas a range
>     > index would contain only two thousand.  The range index would be 500
>     > times smaller, more likely to be cached, etc.
>     >
>     > Thoughts?
>     >
>     > -jwb
>
>     Example with your theory:
>     One (not yet committed) transaction changes one tuple
>     that was in the middle of a range before but the tuple's indexed
>     column changed. What would you do?
>
>
> Insert the new tuple at the end of the table and add another range to
> the index.  Leave the old tuple in place and don't touch the original
> index range.

This is what I described below but I only mentioned the index part:

>     You need to keep track of multiple index versions:
>     1. the range has to be split for the not-yet-committed modifier
>     transaction,
>        it might need to re-read the same table.
>     2. the old range has to be kept for reader transactions that still see
>     the old data
>
>
> This is only true if you update the tuple in-place.

Why? If you update in-place then the above is not needed.
You just need to serialize transactions but there goes concurrency.

>     Imagine you have thousands of UPDATEs in flight on different rows.
>
>
> I'm quite aware of the problems of maintaining such a table and index,
> but the fact is that data warehouse type tables may never be updated
> after being created.  The particular application I'm struggling with
> does a SELECT ... INTO ... ORDER BY to make an ordered table for
> querying every night.  The problem is it takes longer, much longer, to
> create the index than to create the table, and in the end the index is
> as big as half the table anyway.
>
> So this type of index would only be useful for an essentially
> read-only table.  I agree.
>
> Quite another proposal would be to somehow instruct the database that
> the table is strictly in-order by a column and allow a binary search
> access method.  Then you don't need any index at all.

CLUSTER tablename USING indexname;
It's useful for little changing very large tables.

> -jwb
>


-- 
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
http://www.postgresql.at/



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: [PATCHES] Patch for Prevent pg_dump/pg_restore from being affected by statement_timeout
Next
From: Tom Lane
Date:
Subject: Re: proposal for smaller indexes on index-ordered tables