Re: [HACKERS] Sorting costs (was Caution: tonight's commits force initdb) - Mailing list pgsql-hackers

From Hannu Krosing
Subject Re: [HACKERS] Sorting costs (was Caution: tonight's commits force initdb)
Date
Msg-id 37C2EE31.7B9A6CDC@trust.ee
Whole thread Raw
In response to Sorting costs (was Caution: tonight's commits force initdb)  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: [HACKERS] Sorting costs (was Caution: tonight's commits force initdb)
RE: [HACKERS] Sorting costs (was Caution: tonight's commits force initdb)
List pgsql-hackers
Tom Lane wrote:
> 
> "Hiroshi Inoue" <Inoue@tpf.co.jp> writes:
> > Hmm,Index scan is chosen to select all rows.
> > AFAIK,sequential scan + sort is much faster than index scan in
> > most cases.
> >       cost of index scan < cost of sequential scan + cost of sort
> > I have felt that the current cost estimation of index scan is too small,
> > though I have no alternative.

can the optimizer make use of LIMIT, or some other hint that reaction 
time is preferred over speed of full query ?

In web apps the index scan may often be fastre than seq scan + sort as
one 
may not actually need all the tuples but only a small fraction from near 
the beginning.

Getting the beginning fast also gives better responsiveness for other 
interactive uses.

> I am also suspicious that indexscan costs are underestimated.  The
> cost of reading the index is probably not too far off, but the cost
> of accessing the main table is bogus.  Worst case, for a table whose
> tuples are thoroughly scattered, you would have a main-table page fetch
> for *each* returned tuple.  In practice it's probably not anywhere near
> that bad, since you may have some clustering of tuples (so that the same
> page is hit several times in a row), and also the Postgres buffers and
> the underlying Unix system's disk cache will save trips to disk if there
> is any locality of reference at all.  I have no idea how to estimate
> that effect --- anyone?  But cost_index is clearly producing a
> ridiculously optimistic estimate at the moment.

The one way to find out would be actual benchmarking - if current 
optimizer prefers index scans it is possible to do a query using 
index scan, dro the index, somehow flush disk cache and then do the 
same query using seqscan+sort. 

If the latter is preferred anyway we would have no way to test ...

------------------
Hannu


pgsql-hackers by date:

Previous
From: Brian E Gallew
Date:
Subject: Re: [HACKERS] vacuum process size
Next
From: "Hub.Org News Admin"
Date:
Subject: ...