Re: Including Snapshot Info with Indexes - Mailing list pgsql-hackers

From Gokulakannan Somasundaram
Subject Re: Including Snapshot Info with Indexes
Date
Msg-id 9362e74e0710122300y44665e75s9a937891d561191@mail.gmail.com
Whole thread Raw
In response to Re: Including Snapshot Info with Indexes  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Including Snapshot Info with Indexes
Re: Including Snapshot Info with Indexes
List pgsql-hackers
 


On 10/12/07, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Andreas Joseph Krogh <andreak@officenet.no> writes:
> Will $SUBJECT make it possible for count(*) to use index? This is a much
> wanted feature.

If you mean "count(*) will become instantaneous", no it won't.  It would
get faster, but probably not by more than a factor of 10 or so,
corresponding to the size ratio between index and table.  Remember that
the proposal is going to bloat indexes pretty badly: a simple index on
an int4 column will double in size, more or less.  So that ratio will
be much less favorable than it is today.

I accept that the indexes will be bigger in size for this approach. You might need more disk-space  and you might need more memory to accomodate the same amount of information. But i think disk costs and memory costs have come down a lot, People can afford to buy more disk and memory. But the only fact that remains true is that the disk, the last mechanical device is slow in addressing Random I/Os. So this proposal is aimed  at occupying more memory and disk space to reduce Random I/Os. Say, if we are accomodating 200 tuples per index page in today's index(for a typical configuration), and as you said in the worst case (only one index field), we will be occupying 100 tuples per index page. But we would take away probably 100 random I/Os (say with bitmap scan it reduces to 75). 1GB of memory is around $100 and 1GB of disk is around $1. But one hour of Performance tuner would cost around $200 :)). So that's the trade-off for the enterprises, if they want to shift between the two indexes.

Personally I think the bloat problem will doom this entire approach.
The distributed costs of that will outweigh the speedups that can be
achieved for specific queries.  The OP is free to try to prove this
fear wrong, but no amount of discussion will constitute such a proof;
only a testable implementation.

                        regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

               http://archives.postgresql.org

pgsql-hackers by date:

Previous
From: "Gokulakannan Somasundaram"
Date:
Subject: Re: Another Idea: Try Including snapshot with TOAS (was: Including Snapshot Info with Indexes)
Next
From: Simon Riggs
Date:
Subject: Re: Another Idea: Try Including snapshot with TOAS (was: Including Snapshot Info with Indexes)