Re: Bad query plan with high-cardinality column - Mailing list pgsql-performance

From Kevin Grittner
Subject Re: Bad query plan with high-cardinality column
Date
Msg-id 1361566076.10158.YahooMailNeo@web162905.mail.bf1.yahoo.com
Whole thread Raw
In response to Bad query plan with high-cardinality column  (Alexander Staubo <alex@bengler.no>)
Responses Re: Bad query plan with high-cardinality column
List pgsql-performance
Alexander Staubo <alex@bengler.no> wrote:

> This is my schema:
>
>   create table comments (
>     id serial primary key,
>     conversation_id integer,
>     created_at timestamp
>   );
>   create index comments_conversation_id_index on comments (conversation_id);
>   create index comments_created_at_index on comments (created_at);

I suspect you would be better off without those two indexes, and
instead having an index on (conversation_id, created_at).  Not just
for the query you show, but in general.

>   select comments.id from comments where
>     conversation_id = 3975979 order by created_at limit 13
>
> This filters about 5000 rows and returns the oldest 13 rows. But
> the query is consistently planned wrong:

> [planner thinks it will be cheaper to read index in ORDER BY
> sequence and filter rows until it has 13 than to read 5471 rows
> and sort them to pick the top 13 after the sort.]

In my experience these problems come largely from the planner not
knowing the cost of dealing with each tuple.  I see a lot less of
this if I raise cpu_tuple_cost to something in the 0.03 to 0.05
range.

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


pgsql-performance by date:

Previous
From: Tom Lane
Date:
Subject: Re: Bad query plan with high-cardinality column
Next
From: Alexander Staubo
Date:
Subject: Re: Bad query plan with high-cardinality column