Re: Why is indexonlyscan so darned slow? - Mailing list pgsql-hackers

From Merlin Moncure
Subject Re: Why is indexonlyscan so darned slow?
Date
Msg-id CAHyXU0xT=Se_zMdjEaC9W6uGuuP7LEOLKpJ7CRwrH8z+k4SQ-A@mail.gmail.com
Whole thread Raw
In response to Re: Why is indexonlyscan so darned slow?  (Jeff Janes <jeff.janes@gmail.com>)
Responses Re: Why is indexonlyscan so darned slow?
Re: Why is indexonlyscan so darned slow?
List pgsql-hackers
On Mon, May 21, 2012 at 4:17 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
> On Mon, May 21, 2012 at 1:42 PM, Josh Berkus <josh@agliodbs.com> wrote:
>>
>>> Earlier you said that this should be an ideal setup for IOS.  But it
>>> isn't really--the ideal set up is one in which the alternative to an
>>> IOS is a regular index scan which makes many uncached scattered reads
>>> into the heap.  I don't think that that situation can't really be
>>> engineered with a where-less query.
>>
>> Can you give me some suggested comparisons which *would* be ideal, then?
>
> Are you looking for vaguely real-life examples, or highly contrived
> examples used to dissect the server?
>
> For vaguely real life, take your example of pgbench -i -s200 -F 50,
> and I have 2Gig RAM, which seems to be the same as you do.
>
> With select only work load (pgbench -S -M prepared -T 30), I get
>
> tps = 193
>
> But now enable index-only scans:
>
> psql -c "create index on pgbench_accounts(aid, abalance);"
>
> and it goes up to.
>
> tps = 10137

Right -- the main driver here is that your index fits neatly in ram
and the heap does not -- so you're effectively measuring the
difference between a buffered and non-buffered access.  That's highly
contrived as you noted and unlikely to come up all *that* often in the
real world.

Generally though the real world wins (although the gains will be
generally less spectacular) are heavily i/o bound queries where the
indexed subset of data you want is nicely packed and the (non
clustered) heap records are all over the place.  By skipping the semi
random heap lookups you can see enormous speedups.  I figure 50-90%
improvement would be the norm there, but this is against queries that
are taking forever, being i/o bound.

See here: http://www.devheads.net/database/postgresql/performance/index-all-necessary-columns-postgres-vs-mssql.htm
for a 'in the wild' gripe about about not having index scans.

merlin


pgsql-hackers by date:

Previous
From: Jeff Janes
Date:
Subject: Re: Why is indexonlyscan so darned slow?
Next
From: Florian Pflug
Date:
Subject: Re: read() returns ERANGE in Mac OS X