Re: Need advice to avoid ORDER BY - Mailing list pgsql-general

From Merlin Moncure
Subject Re: Need advice to avoid ORDER BY
Date
Msg-id CAHyXU0yzU9S+oSmYXxzBPDg4+yy9-H5gm8721bvH7g0E9m=QzQ@mail.gmail.com
Whole thread Raw
In response to Re: Need advice to avoid ORDER BY  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Need advice to avoid ORDER BY
List pgsql-general
On Thu, Apr 4, 2013 at 5:15 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Merlin Moncure <mmoncure@gmail.com> writes:
>> problem is that you are looking for needles (valids = 0) in the
>> haystack.   the problem wasn't really the order, but the fact that you
>> had to scan an arbitrary amount of rows before finding a candidate
>> record.  so the partial index manages this problem by creating index
>> entries *only for records that match a criteria*, and the planner
>> recognizes this and prefers that index when the criteria is also
>> present in the query.  In other words, index only the needles.
>
> The other way to fix it is a two-column index on (valids, id), which
> will be more useful if sometimes you need the minimum/maximum id
> for some nonzero value of valids.

right -- that's a more general solution -- here we are exploiting that
A: the OP only needs access to "=0" rows and especially B: "=0" rows
are a tiny fraction of the overall set (we know this because otherwise
the query would have returned quickly anyways).  So we get to squeak
out with a tiny index pointing to only the candidate rows.

Partial indexes are an underutilized trick -- the efficiency savings
can be enormous.  They are often useful when coding ad hoc queue
operations in the database where the queued items are intermixed with
items that have been resolved.

merlin


pgsql-general by date:

Previous
From: Tom Lane
Date:
Subject: Re: Need advice to avoid ORDER BY
Next
From: David Noel
Date:
Subject: Re: PostgreSQL: CPU utilization creeping to 100%