Home > mailing lists

Re: Including Snapshot Info with Indexes - Mailing list pgsql-hackers

From	Gokulakannan Somasundaram
Subject	Re: Including Snapshot Info with Indexes
Date	October 13, 2007 03:01:12
Msg-id	9362e74e0710122300y44665e75s9a937891d561191@mail.gmail.com Whole thread Raw
In response to	Re: Including Snapshot Info with Indexes (Tom Lane <tgl@sss.pgh.pa.us>)
Responses	Re: Including Snapshot Info with Indexes Re: Including Snapshot Info with Indexes
List	pgsql-hackers

Tree view

On 10/12/07, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Andreas Joseph Krogh <andreak@officenet.no> writes:
> Will $SUBJECT make it possible for count(*) to use index? This is a much
> wanted feature.

If you mean "count(*) will become instantaneous", no it won't.  It would
get faster, but probably not by more than a factor of 10 or so,
corresponding to the size ratio between index and table.  Remember that
the proposal is going to bloat indexes pretty badly: a simple index on
an int4 column will double in size, more or less.  So that ratio will
be much less favorable than it is today.

I accept that the indexes will be bigger in size for this approach. You might need more disk-space and you might need more memory to accomodate the same amount of information. But i think disk costs and memory costs have come down a lot, People can afford to buy more disk and memory. But the only fact that remains true is that the disk, the last mechanical device is slow in addressing Random I/Os. So this proposal is aimed at occupying more memory and disk space to reduce Random I/Os. Say, if we are accomodating 200 tuples per index page in today's index(for a typical configuration), and as you said in the worst case (only one index field), we will be occupying 100 tuples per index page. But we would take away probably 100 random I/Os (say with bitmap scan it reduces to 75). 1GB of memory is around $100 and 1GB of disk is around $1. But one hour of Performance tuner would cost around $200 :)). So that's the trade-off for the enterprises, if they want to shift between the two indexes.

Personally I think the bloat problem will doom this entire approach.
The distributed costs of that will outweigh the speedups that can be
achieved for specific queries.  The OP is free to try to prove this
fear wrong, but no amount of discussion will constitute such a proof;
only a testable implementation.

                        regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

               http://archives.postgresql.org

pgsql-hackers by date:

From: "Gokulakannan Somasundaram"
Date: 13 October 2007, 02:43:40
Subject: Re: Another Idea: Try Including snapshot with TOAS (was: Including Snapshot Info with Indexes)

From: Simon Riggs
Date: 13 October 2007, 03:44:33
Subject: Re: Another Idea: Try Including snapshot with TOAS (was: Including Snapshot Info with Indexes)

Re: Including Snapshot Info with Indexes - Mailing list pgsql-hackers

Previous

Next