Re: Index AM change proposals, redux - Mailing list pgsql-hackers

From Martijn van Oosterhout
Subject Re: Index AM change proposals, redux
Date
Msg-id 20080424111309.GE14647@svana.org
In response to Re: Index AM change proposals, redux  (Simon Riggs <simon@2ndquadrant.com>)
Responses Re: Index AM change proposals, redux  (Simon Riggs <simon@2ndquadrant.com>)
Re: Index AM change proposals, redux  (Gregory Stark <stark@enterprisedb.com>)
List pgsql-hackers
On Thu, Apr 24, 2008 at 10:11:02AM +0100, Simon Riggs wrote:
> Index compression is possible in many ways, depending upon the
> situation. All of the following sound similar at a high level, but each
> covers a different use case.

True, but there is one significant difference:

> * For Long, Similar data e.g. Text we can use Prefix Compression
> * For Highly Non-Unique Data we can use Duplicate Compression
> * Multi-Column Leading Value Compression - if you have a multi-column

None of these is lossy, so they are all candidates for use on any
b-tree, even by default. They don't materially affect plan
construction, except perhaps in cost calculations. Given the index
tuple overhead, I don't see how you could lose.
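
To make "not lossy" concrete, here is a rough Python sketch of the first
two schemes. It is not PostgreSQL code: the function names and the
(key, tid) representation are assumptions made up for illustration. The
point is only that the original index entries can be rebuilt exactly
from the compressed form.

```python
import os.path


def prefix_compress(sorted_keys):
    """Store each key as (shared_prefix_len, suffix) relative to the previous key."""
    out = []
    prev = ""
    for key in sorted_keys:
        n = len(os.path.commonprefix([prev, key]))
        out.append((n, key[n:]))
        prev = key
    return out


def prefix_decompress(entries):
    """Rebuild the full keys from (shared_prefix_len, suffix) entries."""
    keys = []
    prev = ""
    for n, suffix in entries:
        key = prev[:n] + suffix
        keys.append(key)
        prev = key
    return keys


def dedup_compress(sorted_pairs):
    """Collapse adjacent (key, tid) pairs with equal keys into (key, [tids])."""
    out = []
    for key, tid in sorted_pairs:
        if out and out[-1][0] == key:
            out[-1][1].append(tid)
        else:
            out.append((key, [tid]))
    return out


def dedup_decompress(posting_lists):
    """Expand (key, [tids]) back into one (key, tid) pair per heap tuple."""
    return [(key, tid) for key, tids in posting_lists for tid in tids]
```

In both cases decompress(compress(x)) == x, so a scan over the
compressed index sees exactly the same entries as before and the
planner has nothing new to consider beyond size and CPU cost.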

> * For Unique/nearly-Unique indexes we can use Range Compression

This one is lossy and so does affect possible plans. I believe the term
for this is a "sparse" index. Also, it seems harder to optimise, since
it would seem to me the index would have to have some idea of how
"close" together values are.
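
A rough Python sketch of what I mean by a sparse index follows. It is
purely illustrative: it assumes the heap blocks are physically ordered
by key, which is not something a real table guarantees, and the names
are invented for this example. The index keeps only the first key of
each block, so a lookup can only narrow the search to a block and must
then recheck the block itself.

```python
import bisect


def build_sparse_index(blocks):
    """Keep one entry per block: its first (smallest) key.

    Assumes each block is sorted and blocks are in global key order.
    """
    return [block[0] for block in blocks]


def lookup(sparse, blocks, key):
    """Return (candidate_block_no, found) for key.

    The index alone cannot say whether key exists; it only identifies
    the one block that could contain it, which must then be rechecked.
    """
    i = bisect.bisect_right(sparse, key) - 1
    if i < 0:
        return None, False  # key sorts before every indexed block
    return i, key in blocks[i]
```

For blocks [[1, 3, 5], [8, 9, 12], [20, 25]] the index is just
[1, 8, 20]. Keys 9 and 10 both map to block 1, and only the recheck
tells them apart, which is why this kind of index changes what plans
are possible.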

> As with HOT, all of these techniques need to be controlled by
> heuristics. Taken to extremes, these techniques can hurt other aspects
> of performance, so its important we don't just write the code but also
> investigate reasonable limits for behaviour also.

For the first three I can't imagine what the costs would be, except the
memory usage to store the unduplicated data once when it's read in.

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Please line up in a tree and maintain the heap invariant while
> boarding. Thank you for flying nlogn airlines.
