Re: Custom operator class costs - Mailing list pgsql-performance

From Robert Haas
Subject Re: Custom operator class costs
Msg-id BANLkTim+L++hVHwvYRLPkuHsHptvBWKLow@mail.gmail.com
In response to Custom operator class costs  (Ben Beecher <benbeecher@gmail.com>)
List pgsql-performance
On Wed, Mar 16, 2011 at 10:10 AM, Ben Beecher <benbeecher@gmail.com> wrote:
> Hey!
> I'm having some trouble optimizing a query that uses a custom operator class.
> #Postgres has given me a solution for natural sort -
> http://www.rhodiumtoad.org.uk/junk/naturalsort.sql
>
> I'm trying to run it over a huge table - when running it on demand,
> the data needs to be dumped to memory and sorted.
>
> Sort  (cost=31299.83..31668.83 rows=369 width=31)
>  Sort Key: name
>  ->  Seq Scan on solutions_textbookpage  (cost=0.00..25006.55
> rows=369 width=31)
>        Filter: (active AND (textbook_id = 263))
>
> That's obviously too slow. I've created an index using the custom
> operator class, so I don't have to do the sort every time I run the
> query.
>
>  Index Scan Backward using natural_page_name_textbook on
> solutions_textbookpage  (cost=0.00..650.56 rows=371 width=31) (actual
> time=0.061..0.962 rows=369 loops=1)
>   Index Cond: (textbook_id = 263)
>   Filter: active
>
> Obviously a little faster!

Not totally obvious, since the sort output doesn't show how long it
actually took.

> The problem I'm having is that because the operator class functions
> have a low default cost estimate, pg misestimates and tries to do
> the sort on demand rather than using the index.
>
> I can get pg to use the index by either jacking up cpu_operator_cost
> or lowering random_page_cost. Is this the best way to do that, or is
> there a smarter way to ensure that pg uses this index when I need it?

It's pretty often necessary to lower random_page_cost, and sometimes
seq_page_cost, too.  If, for example, the database is fully cached,
you might try 0.1/0.1 rather than the default 4/1.  Raising the cpu_*
costs is equivalent, but I think it's easier to keep in your head if
you think about 1 as the nominal cost of reading a page sequentially
from disk, and then lower the value you actually assign to reflect the
fact that you'll normally be reading from the OS cache or perhaps even
hitting shared_buffers.
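As a sketch of how you might try this (the cost values are the 0.1/0.1
suggestion above; the table and columns are taken from the plans you
posted, but adjust the ORDER BY to however your query invokes the
natural-sort operator class):

```sql
-- Lower the page costs for this session only, before touching
-- postgresql.conf, and see whether the planner switches plans.
SET random_page_cost = 0.1;
SET seq_page_cost = 0.1;

-- EXPLAIN ANALYZE shows actual timings, not just estimates, so you
-- can compare the sort plan and the index-scan plan directly.
EXPLAIN ANALYZE
SELECT name
FROM solutions_textbookpage
WHERE active AND textbook_id = 263
ORDER BY name;
```

If the index scan wins with the session-level settings, you can then
decide whether to change them globally or, say, per-database with
ALTER DATABASE ... SET.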

You might also need to tune effective_cache_size.
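effective_cache_size doesn't change any cost constants directly; it
tells the planner how much of an index is likely to be cached across
shared_buffers and the OS page cache, which makes repeated index
fetches look cheaper. A common rule of thumb (not specific to this
thread) is somewhere around half to three-quarters of total RAM:

```sql
-- Example value for a machine with ~8GB of RAM; tune to your box.
-- This is an estimate for the planner, not an allocation.
SET effective_cache_size = '6GB';
SHOW effective_cache_size;
```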

Is your operator class function unusually expensive?  Are you having
trouble with PG not using other indexes it should be picking up, or
just your custom one?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
