Re: Sort and index

From: Jim C. Nasby
Subject: Re: Sort and index
Date: ,
Msg-id: 20050423030002.GY58835@decibel.org
In response to: Re: Sort and index  (Tom Lane)
Responses: Re: Sort and index  (Tom Lane)
List: pgsql-performance

On Fri, Apr 22, 2005 at 10:08:06PM -0400, Tom Lane wrote:
> "Jim C. Nasby" writes:
> > I've run some performance tests. The actual test case is at
> > http://stats.distributed.net/~decibel/timing.sql, and the results are at
> > http://stats.distributed.net/~decibel/timing.log. In a nutshell, doing
> > an index scan appears to be about 2x faster than a sequential scan and a
> > sort.
>
> ... for one test case, on one platform, with a pretty strong bias to the
> fully-cached state since you ran the test multiple times consecutively.

The table is 6.5G and the box only has 4G of RAM, so I suspect it's not cached.

> Past experience has generally been that an explicit sort is quicker,
> so you'll have to pardon me for suspecting that this case may be
> atypical.  Is the table nearly in order by pkey, by any chance?

It might be, but there's no way I can check with a multi-key index,
right?

I'll re-run the tests with a single-column index on a column with a
correlation of 16%.
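
For context on what a "correlation of 16%" refers to: the `correlation`
column of the `pg_stats` view is a per-column statistic describing how
closely the physical row order tracks the sort order of that column's
values. The sketch below illustrates the idea only. It computes a plain
Pearson correlation between heap position and value rank over a full list;
the function names are made up for this example, and it is not how ANALYZE
derives the statistic (ANALYZE works from a sample).

```python
import math


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def physical_order_correlation(values):
    """Correlate heap position with sort rank.

    `values` is assumed to list one column's values in physical (heap) order.
    Hypothetical helper for illustration, not a PostgreSQL API.
    """
    positions = list(range(len(values)))
    # Rank each row by its value; ties keep their physical order.
    order = sorted(positions, key=lambda i: values[i])
    ranks = [0] * len(values)
    for rank, row in enumerate(order):
        ranks[row] = rank
    return pearson(positions, ranks)


if __name__ == "__main__":
    print(physical_order_correlation([1, 2, 3, 4, 5]))     # rows stored in key order
    print(physical_order_correlation([5, 4, 3, 2, 1]))     # rows stored in reverse order
    print(physical_order_correlation([2, 1, 4, 3, 6, 5]))  # mostly ordered
```

A value near 1 or -1 means an index scan on that column touches heap pages
almost sequentially; a value near 0, such as 0.16, means the heap fetches
are scattered.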

> > In any case, it's clear that the planner is making the wrong choice
> > here. BTW, changing random_page_cost to 3 or 4 doesn't change the plan.
>
> Feel free to propose better cost equations.

Where would I look in the code to see what's used now?
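
For readers following along, the planner's cost functions live in
src/backend/optimizer/path/costsize.c in the PostgreSQL source tree. The
sketch below is a deliberately simplified toy model of the comparison under
discussion, not a copy of those equations. It prices a full index scan by
interpolating between a best case (heap pages read sequentially) and a
worst case (one random page fetch per tuple) using the squared correlation,
and prices the alternative as a sequential scan plus an N log N sort. The
function names, the sort term, and the 50 million row count are assumptions
made for illustration; caching and on-disk sort passes are ignored.

```python
import math


def seqscan_plus_sort_cost(pages, tuples, seq_page_cost=1.0,
                           cpu_tuple_cost=0.01, cpu_operator_cost=0.0025):
    """Toy cost of reading every page in order, then sorting all tuples."""
    scan = pages * seq_page_cost + tuples * cpu_tuple_cost
    # Assumed comparison cost: about 2 operator calls per comparison,
    # N * log2(N) comparisons. Disk-based sort passes are not modeled.
    sort = 2.0 * cpu_operator_cost * tuples * math.log2(max(tuples, 2))
    return scan + sort


def full_index_scan_cost(pages, tuples, correlation, random_page_cost=4.0,
                         seq_page_cost=1.0, cpu_tuple_cost=0.01):
    """Toy cost of fetching every tuple in index order."""
    min_io = pages * seq_page_cost        # heap perfectly ordered by the key
    max_io = tuples * random_page_cost    # every tuple is a random page fetch
    c2 = correlation ** 2
    io = max_io + c2 * (min_io - max_io)  # interpolate by squared correlation
    return io + tuples * cpu_tuple_cost


if __name__ == "__main__":
    pages = int(6.5 * 1024 * 1024 / 8)  # 6.5G table in 8kB pages
    tuples = 50_000_000                 # hypothetical row count, not from the thread
    for rpc in (4.0, 3.0):
        idx = full_index_scan_cost(pages, tuples, 0.16, random_page_cost=rpc)
        srt = seqscan_plus_sort_cost(pages, tuples)
        winner = "index scan" if idx < srt else "seqscan + sort"
        print(f"random_page_cost={rpc}: model picks {winner}")
```

In a model of this shape, a low correlation leaves the index scan estimate
dominated by the per-tuple random fetch term, so moving random_page_cost
between 3 and 4 is not enough to flip the choice. Whether that term
overstates the real cost on a given machine is exactly what the timing runs
are meant to test.
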
--
Jim C. Nasby, Database Consultant               
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"

