Re: multiple joins + Order by + LIMIT query performance issue - Mailing list pgsql-performance

From Shaun Thomas
Subject Re: multiple joins + Order by + LIMIT query performance issue
Date
Msg-id 1210098273.14833.55.camel@berners-lee
Whole thread Raw
In response to Re: multiple joins + Order by + LIMIT query performance issue  (Antoine Baudoux <ab@taktik.be>)
Responses Re: multiple joins + Order by + LIMIT query performance issue
List pgsql-performance
On Tue, 2008-05-06 at 18:24 +0100, Antoine Baudoux wrote:

> Isnt the planner fooled by the index on the sorting column?
> If I remove the index the query runs OK.

In your case, for whatever reason, the stats say doing the index scan on
the sorted column will give you the results faster.  That isn't always
the case, and sometimes you can give the same query different where
clauses and that same slow-index-scan will randomly be fast.  It's all
based on the index distribution and the particular values being fetched.

This goes back to what Tom said.  If you know a "miss" can result in
terrible performance, it's best to just recode the query to avoid the
situation.

> This is crazy, so simply by adding a LIMIT to a query, the planning is
> changed in a very bad way. Does the planner use the LIMIT as a sort of
> hint?

Yes.  That's actually what tells it the index scan can be a "big win."
If it scans the index backwards on values returned from some of your
joins, it may just have to find 25 rows and then it can immediately stop
scanning and just give you the results.  In normal cases, this is a
massive performance boost when you have an order clause and are
expecting a ton of results, (say you're getting the first 25 rows of
10000 or something).  But if it would be faster to generate the results
and *then* sort, but Postgres thinks otherwise, you're pretty much
screwed.

But that's the long answer.  You have like 3 ways to get around this
now, so pick one. ;)

--

Shaun Thomas
Database Administrator

Leapfrog Online
807 Greenwood Street
Evanston, IL 60201
Tel. 847-440-8253
Fax. 847-570-5750
www.leapfrogonline.com



pgsql-performance by date:

Previous
From: Shaun Thomas
Date:
Subject: Re: multiple joins + Order by + LIMIT query performance issue
Next
From: Craig James
Date:
Subject: Re: RAID 10 Benchmark with different I/O schedulers