Home > mailing lists

Re: Including Snapshot Info with Indexes - Mailing list pgsql-hackers

From	Heikki Linnakangas
Subject	Re: Including Snapshot Info with Indexes
Date	October 8, 2007 05:41:31
Msg-id	4709ECFA.4020905@enterprisedb.com Whole thread Raw
In response to	Including Snapshot Info with Indexes ("Gokulakannan Somasundaram" <gokul007@gmail.com>)
Responses	Re: Including Snapshot Info with Indexes Re: Including Snapshot Info with Indexes
List	pgsql-hackers

Tree view

Gokulakannan Somasundaram wrote:
>     Currently The index implementation in Postgresql does not store the
> Snapshot information in the Index. If we add the snapshot information into
> the indexing structure, we will have the following advantages.

This idea has been discussed to death many times before. Please search
the archives.

> a) There can be index only scans like Oracle

IMO, the most promising approach to achieving index-only-scans at the
moment is the Dead Space Map, as discussed in the 8.3 dev cycle.

> b) Unique indexes will become less costly, as older index tuples can be
> found out.

Doesn't seem like a big benefit, considering that in most cases there
won't be any tuples in the index with a duplicate key. A common
exception to that is (non-HOT) updating a row. But in that case, the
page containing the old tuple is already in cache, so the lookup of the
visibility from the heap is cheap.

> c) Even the index scans will get faster, since some of the index tuples
> won't translate into HeapScans.

That's the same as doing an index-only-scan, right?

> d) Deletes and Updates will become slightly costly, as they have to update
> these indexes.

I think you're grossly underestimating the cost of that. For example, on
a table with 3 indexes. a delete currently requires one index lookup +
one heap lookup. With visibility in the indexes, that would require 3
index lookups + one heap lookup. That's 4 vs. 2 page accesses, not
taking into account the non-leaf b-tree pages. The real impact will
depend on what's in cache, but the cost can be very high.

Also, the full visibility information would need 12 bytes of space per
tuple. An index tuple on an int4 key currently takes 12 bytes, so that
would double the index size. Storage size has a big impact on
performance. More bytes means more I/O, less data fits in cache, and
more WAL traffic.

There's non-trivial implementation issues involved as well. You'd need a
way to reliably find all the index pointers for a given heap tuple
(search the archives for "retail vacuum" for the issues involved in
that. Broken user defined functions are a problem for example). And
you'd need to keep them all locked at the same time to modify them all
atomically, which is prone to deadlocks.

--  Heikki Linnakangas EnterpriseDB   http://www.enterprisedb.com

pgsql-hackers by date:

From: "Gokulakannan Somasundaram"
Date: 08 October 2007, 03:42:17
Subject: Including Snapshot Info with Indexes

From: Magnus Hagander
Date: 08 October 2007, 06:07:20
Subject: Re: 8.4 TODO item: make src/port support libpq and ecpg directly

Re: Including Snapshot Info with Indexes - Mailing list pgsql-hackers

Previous

Next