Re: annoying query/planner choice - Mailing list pgsql-performance

From Tom Lane
Subject Re: annoying query/planner choice
Date
Msg-id 21520.1073886013@sss.pgh.pa.us
Whole thread Raw
In response to annoying query/planner choice  (Andrew Rawnsley <ronz@ravensfield.com>)
Responses Re: annoying query/planner choice  (Andrew Rawnsley <ronz@ravensfield.com>)
List pgsql-performance
Andrew Rawnsley <ronz@ravensfield.com> writes:
> I have a situation that is giving me small fits, and would like to see
> if anyone can shed any light on it.

In general, pulling 10% of a table *should* be faster as a seqscan than
an indexscan, except under the most extreme assumptions about clustering
(is the table clustered on site_id, by any chance?).  What I suspect is
that the table is a bit larger than your available RAM, so that a
seqscan ends up flushing all of the kernel's cache and forcing a lot of
I/O, whereas an indexscan avoids the cache flush by not touching (quite)
all of the table.  The trouble with this is that the index only looks
that good under test conditions, ie, when you repeat it just after an
identical query that pulled all of the needed pages into RAM.  Under
realistic load conditions where different site_ids are being hit, the
indexscan is not going to be as good as you think, because it will incur
substantial I/O.

You should try setting up a realistic test load hitting different random
site_ids, and see whether it's really a win to force seqscan off for
this query or not.

            regards, tom lane

pgsql-performance by date:

Previous
From: Christopher Browne
Date:
Subject: Re: annoying query/planner choice
Next
From: Richard Huxton
Date:
Subject: Re: COUNT & Pagination