Jeff Janes wrote on 26.08.2012 22:26:
>>> The seq scan is estimated to use sequential reads, while the
>>> index-only scan is estimated to use random reads (because the index is
>>> scanned in logical order, not physical order).
Sounds like scanning the index in physical order would be an enhancement.
That could improve the performance of for queries that don't require a special order and can be retrieved from an
index.
>> I tried it also with 750000 rows filled with 3 text columns of random string
>> (between 20 and 15000 characters long).
>
> Could you show that in a copy-and-paste-able example?
Not really - it's about 70MB.
I used a Benerator script (http://databene.org/databene-benerator) to generate the test
data which is attached to this email. The table itself is pretty straight forward:
CREATE TABLE pages
(
id integer NOT NULL,
url text,
last_error text,
content text
);
ALTER TABLE pages
ADD CONSTRAINT pages_pkey PRIMARY KEY (id);
> Did you change random_page_cost?
Yes I did. When setting it to 3.0 the planner will indeed use an index only scan.
And it does confirm Tom's suspicion that the planner is actually correct, because the index scan is indeed slower than
theseq scan.
I was inspired by this question on StackOverflow:
http://stackoverflow.com/questions/12128501/fastest-way-to-count-the-rows-in-any-database-table/12128545#12128545
Which shows Oracle's behaviour with an index scan for the count(*) operation.
I thought I'd try 9.2 to see how it compares, but it seems there is quite some way to go for my
beloved Postgres to catch up ;)
Thanks for all the input.
Regards
Thomas