Thread: quickly getting the top N rows
If I have this:

    create table foo (bar int primary key);

...then in my ideal world, Postgres would be able to use that index on bar
to help me with this:

    select bar from foo order by bar desc limit 20;

But in my experience, PG8.2 is doing a full table scan on foo, then
sorting it, then doing the limit. I have a more complex primary key, but I
was hoping the same concept would still apply. Am I doing something wrong,
or just expecting something that doesn't exist?
On Thu, 2007-10-04 at 11:00 -0700, Ben wrote:
> If I have this:
>
>     create table foo (bar int primary key);
>
> ...then in my ideal world, Postgres would be able to use that index on bar
> to help me with this:
>
>     select bar from foo order by bar desc limit 20;
>
> But in my experience, PG8.2 is doing a full table scan on foo, then
> sorting it, then doing the limit. I have a more complex primary key, but I
> was hoping the same concept would still apply. Am I doing something wrong,
> or just expecting something that doesn't exist?

It has to do with the way that NULL values are stored in the index. This
page has details and instructions for how to get it to work:

http://developer.postgresql.org/pgdocs/postgres/indexes-ordering.html

-- 
Mark Lewis
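[A rough sketch of what the linked docs describe, using the table from the
thread; the index name below is hypothetical. A plain btree can be walked
in either direction, so no special index is strictly required for this
query, and from 8.3 onward the ordering can also be declared explicitly:]

```sql
-- A btree index can be scanned in either direction, so the primary key
-- index on bar can already serve ORDER BY bar DESC via a backward scan:
CREATE TABLE foo (bar int PRIMARY KEY);

-- From 8.3 onward, index ordering (and NULL placement) can also be
-- declared explicitly at index creation time:
CREATE INDEX foo_bar_desc ON foo (bar DESC NULLS LAST);
```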
In response to Ben <bench@silentmedia.com>:
> If I have this:
>
>     create table foo (bar int primary key);
>
> ...then in my ideal world, Postgres would be able to use that index on bar
> to help me with this:
>
>     select bar from foo order by bar desc limit 20;
>
> But in my experience, PG8.2 is doing a full table scan on foo, then
> sorting it, then doing the limit. I have a more complex primary key, but I
> was hoping the same concept would still apply. Am I doing something wrong,
> or just expecting something that doesn't exist?

Show us the explain. However, 2 guesses:

1) You never analyzed the table, thus PG has awful statistics and
   doesn't know how to pick a good plan.
2) You have so few rows in the table that a seq scan is actually
   faster than an index scan, which is why PG uses it instead.

-- 
Bill Moran
Collaborative Fusion Inc.
http://people.collaborativefusion.com/~wmoran/
wmoran@collaborativefusion.com
Phone: 412-422-3463x4023
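[Both of Bill's guesses can be checked directly; a sketch against the
thread's example table:]

```sql
-- Refresh the statistics the planner relies on, then inspect the plan:
ANALYZE foo;
EXPLAIN ANALYZE SELECT bar FROM foo ORDER BY bar DESC LIMIT 20;

-- On a tiny table, a "Seq Scan" node here is expected and is genuinely
-- the cheaper plan; on a large, analyzed table one would hope to see
-- an "Index Scan Backward" over the primary key index instead.
```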
Ben <bench@silentmedia.com> schrieb:
> If I have this:
>
>     create table foo (bar int primary key);
>
> ...then in my ideal world, Postgres would be able to use that index on bar
> to help me with this:
>
>     select bar from foo order by bar desc limit 20;
>
> But in my experience, PG8.2 is doing a full table scan on foo, then
> sorting it, then doing the limit. I have a more complex primary key, but I

Please show us the output from

    EXPLAIN ANALYSE select bar from foo order by bar desc limit 20;

I tried it with 8.1, and I can see an index scan. You may have wrong
statistics, or not enough (too few) rows in your table.

Andreas
-- 
Really, I'm not out to destroy Microsoft. That will just be a completely
unintentional side effect. (Linus Torvalds)
"If I was god, I would recompile penguin with --enable-fly." (unknown)
Kaufbach, Saxony, Germany, Europe. N 51.05082°, E 13.56889°
On Thu, 4 Oct 2007, Bill Moran wrote:

> However, 2 guesses:
> 1) You never analyzed the table, thus PG has awful statistics and
>    doesn't know how to pick a good plan.
> 2) You have so few rows in the table that a seq scan is actually
>    faster than an index scan, which is why PG uses it instead.

No, the tables are recently analyzed and there are a couple hundred
thousand rows in there. But I think I just figured it out... it's a
3-column index, and two columns of that index are the same for every row.
When I drop those two columns from the ordering restriction, the index
gets used and things speed up 5 orders of magnitude.

Maybe the planner is smart enough to think that if a column in the order
by clause is identical for most rows, then using an index won't help...
but not smart enough to realize that if said column is at the *end* of the
order by arguments, after columns which do sort quite well, then it should
use an index after all.
Ben <bench@silentmedia.com> writes:
> No, the tables are recently analyzed and there are a couple hundred
> thousand rows in there. But I think I just figured it out... it's a
> 3-column index, and two columns of that index are the same for every row.
> When I drop those two columns from the ordering restriction, the index
> gets used and things speed up 5 orders of magnitude.
>
> Maybe the planner is smart enough to think that if a column in the order
> by clause is identical for most rows, then using an index won't help...
> but not smart enough to realize that if said column is at the *end* of the
> order by arguments, after columns which do sort quite well, then it should
> use an index after all.

You're being about as clear as mud here, except that you obviously lied
about what you were doing in your first message. If you have a planner
problem, show us the *exact* query, the *exact* table definition, and
unfaked EXPLAIN ANALYZE output.

			regards, tom lane
On Thu, 4 Oct 2007, Tom Lane wrote:

> You're being about as clear as mud here, except that you obviously lied
> about what you were doing in your first message. If you have a planner
> problem, show us the *exact* query, the *exact* table definition, and
> unfaked EXPLAIN ANALYZE output.

I didn't realize that simplification was viewed as so sinister, but
thanks, I'll remember that in the future.

The table:

                     Table "public.log"
     Column     |            Type             |      Modifiers
 ---------------+-----------------------------+---------------------
  clientkey     | character(30)               | not null
  premiseskey   | character(30)               | not null
  logkey        | character(30)               | not null
  logicaldel    | character(1)                | default 'N'::bpchar
  lockey        | character(30)               |
  devlockey     | character(30)               |
  eventkey      | character(30)               |
  logshorttext  | character varying(255)      |
  logdesc       | character varying(255)      |
  loguserkey    | character(30)               |
  logassetkey   | character(30)               |
  logresourcekey| character(30)               |
  logtime       | timestamp without time zone |
  logip         | character varying(50)       |
  logarchived   | character(1)                |
  logarchivedate| timestamp without time zone |
  loghasvideo   | character(1)                |
  loghasaudio   | character(1)                |
  resvehiclekey | character(30)               |
  synccreated   | character(1)                |
  logtypekey    | character(30)               |
 Indexes:
     "log_pkey" PRIMARY KEY, btree (clientkey, premiseskey, logkey)
     "eventkey_idx" btree (eventkey)
     "log_ak1" btree (clientkey, premiseskey, logtime, logkey)

The original, slow query:

explain analyze SELECT * FROM log
WHERE clientkey in ('000000004500000000010000000001')
  AND premiseskey in ('000000004500000000010000000001')
  and logicaldel = 'N'
ORDER BY logtime desc, logkey desc, clientkey desc, premiseskey desc
LIMIT 20 offset 0;

                              QUERY PLAN
------------------------------------------------------------------------
 Limit  (cost=356402.58..356402.63 rows=20 width=563) (actual time=215858.481..215858.527 rows=20 loops=1)
   ->  Sort  (cost=356402.58..357598.25 rows=478267 width=563) (actual time=215858.478..215858.498 rows=20 loops=1)
         Sort Key: logtime, logkey, clientkey, premiseskey
         ->  Seq Scan on log  (cost=0.00..52061.67 rows=478267 width=563) (actual time=29.340..100043.313 rows=475669 loops=1)
               Filter: ((clientkey = '000000004500000000010000000001'::bpchar) AND (premiseskey = '000000004500000000010000000001'::bpchar) AND (logicaldel = 'N'::bpchar))
 Total runtime: 262462.582 ms
(6 rows)

Every row in log has identical clientkey and premiseskey values, so if I
just remove those columns from the order by clause, I get this far
superior plan:

explain analyze SELECT * FROM log
WHERE clientkey in ('000000004500000000010000000001')
  AND premiseskey in ('000000004500000000010000000001')
  and logicaldel = 'N'
ORDER BY logtime desc, logkey desc LIMIT 20 offset 0;

                              QUERY PLAN
------------------------------------------------------------------------
 Limit  (cost=0.00..12.33 rows=20 width=563) (actual time=0.047..0.105 rows=20 loops=1)
   ->  Index Scan Backward using log_ak1 on log  (cost=0.00..294735.70 rows=478267 width=563) (actual time=0.044..0.076 rows=20 loops=1)
         Index Cond: ((clientkey = '000000004500000000010000000001'::bpchar) AND (premiseskey = '000000004500000000010000000001'::bpchar))
         Filter: (logicaldel = 'N'::bpchar)
 Total runtime: 0.165 ms
(5 rows)

...which made me think that maybe postgres is not using log_ak1 in the
former case because two of the columns in the order by match every row.

Unfortunately, in this case it's not an option to alter the query. I'm
just trying to figure out an explanation.
On Thu, 2007-10-04 at 12:52 -0700, Ben wrote:
> The original, slow query:
>
> explain analyze SELECT * FROM log
> WHERE clientkey in ('000000004500000000010000000001')
>   AND premiseskey in ('000000004500000000010000000001')
>   and logicaldel = 'N'
> ORDER BY logtime desc, logkey desc, clientkey desc, premiseskey desc
> LIMIT 20 offset 0;
>
>  Limit  (cost=356402.58..356402.63 rows=20 width=563) (actual time=215858.481..215858.527 rows=20 loops=1)
>    ->  Sort  (cost=356402.58..357598.25 rows=478267 width=563) (actual time=215858.478..215858.498 rows=20 loops=1)
>          Sort Key: logtime, logkey, clientkey, premiseskey
>          ->  Seq Scan on log  (cost=0.00..52061.67 rows=478267 width=563) (actual time=29.340..100043.313 rows=475669 loops=1)
>                Filter: ((clientkey = '000000004500000000010000000001'::bpchar) AND (premiseskey = '000000004500000000010000000001'::bpchar) AND (logicaldel = 'N'::bpchar))
>  Total runtime: 262462.582 ms
> (6 rows)
>
> Every row in log has identical clientkey and premiseskey values, so if I
> just remove those columns from the order by clause, I get this far
> superior plan:
>
> explain analyze SELECT * FROM log
> WHERE clientkey in ('000000004500000000010000000001')
>   AND premiseskey in ('000000004500000000010000000001')
>   and logicaldel = 'N'
> ORDER BY logtime desc, logkey desc LIMIT 20 offset 0;
>
>  Limit  (cost=0.00..12.33 rows=20 width=563) (actual time=0.047..0.105 rows=20 loops=1)
>    ->  Index Scan Backward using log_ak1 on log  (cost=0.00..294735.70 rows=478267 width=563) (actual time=0.044..0.076 rows=20 loops=1)
>          Index Cond: ((clientkey = '000000004500000000010000000001'::bpchar) AND (premiseskey = '000000004500000000010000000001'::bpchar))
>          Filter: (logicaldel = 'N'::bpchar)
>  Total runtime: 0.165 ms
> (5 rows)
>
> ...which made me think that maybe postgres is not using log_ak1 in the
> former case because two of the columns in the order by match every row.
>
> Unfortunately, in this case it's not an option to alter the query. I'm
> just trying to figure out an explanation.

In the first query, Postgres cannot use the index because the sort order
of the index does not match the sort order of the query. When you change
the sort order of the query so that it matches that of the index, then
the index is used.

If you define your index on (logtime, logkey, clientkey, premiseskey)
rather than on (clientkey, premiseskey, logtime, logkey) you will have a
fast query. Yes, the column order matters.

-- 
Simon Riggs
2ndQuadrant
http://www.2ndQuadrant.com
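[Simon's suggestion, sketched as DDL; the index name is hypothetical, and
building it on a table of this size will take time and a lock. Scanned
backward, such an index returns rows already in the order the original
four-column ORDER BY asks for, so no Sort step is needed:]

```sql
-- Hypothetical index whose column order matches the original ORDER BY
-- (logtime desc, logkey desc, clientkey desc, premiseskey desc); a
-- backward scan of it satisfies that ordering directly, with the
-- clientkey/premiseskey/logicaldel predicates applied as a filter:
CREATE INDEX log_order_idx ON log (logtime, logkey, clientkey, premiseskey);
```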
On 10/4/07, Ben <bench@silentmedia.com> wrote:
> If I have this:
>
>     create table foo (bar int primary key);
>
> ...then in my ideal world, Postgres would be able to use that index on bar
> to help me with this:
>
>     select bar from foo order by bar desc limit 20;
>
> But in my experience, PG8.2 is doing a full table scan on foo, then
> sorting it, then doing the limit. I have a more complex primary key, but I
> was hoping the same concept would still apply. Am I doing something wrong,
> or just expecting something that doesn't exist?

pg uses an intelligent planner. It looks at the table, the number of
rows, the distribution of values, and makes a decision whether to use a
seq scan or an index. Do you have any evidence that in your case a seq
scan is a bad choice?

Try this experiment:

    psql mydb
    =# select * from foo; -- primes the db; puts the table in memory if it fits
    =# \timing
    =# set enable_seqscan=off;
    =# select bar from foo order by bar desc limit 20;
    =# set enable_seqscan=on;
    =# select bar from foo order by bar desc limit 20;

and compare the times each takes. Run each way a few times to be sure
you're not getting random variance.

On my reporting db, with a table of somewhere around 75 million rows, a
similar query takes 0.894 ms and uses an index scan. Which is good,
because a sequential scan on that table takes about 15 to 30 minutes.
On Thu, 4 Oct 2007, Simon Riggs wrote:

> In the first query, Postgres cannot use the index because the sort order
> of the index does not match the sort order of the query. When you change
> the sort order of the query so that it matches that of the index, then
> the index is used.
>
> If you define your index on (logtime, logkey, clientkey, premiseskey)
> rather than on (clientkey, premiseskey, logtime, logkey) you will have a
> fast query. Yes, the column order matters.

I thought that might explain it, but then I'm surprised that it can still
use an index when the first two columns of the index aren't in the query.
Wouldn't that mean that it might have to walk the entire index to find
matching rows?

...unless it's smart enough to realize that the first two columns will
match everything. Which would be cool.
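[What Ben is wondering about can be seen in the fast plan earlier in the
thread; a sketch using the thread's own names. The scan does not walk the
whole index, because equality predicates on the leading columns confine it
to one contiguous slice:]

```sql
-- log_ak1 is btree (clientkey, premiseskey, logtime, logkey).
-- Equality on the two leading columns pins the scan to a single
-- contiguous range of the index; within that range, entries are already
-- sorted by (logtime, logkey), so a backward scan can stop after 20 rows:
SELECT * FROM log
WHERE clientkey = '000000004500000000010000000001'
  AND premiseskey = '000000004500000000010000000001'
ORDER BY logtime DESC, logkey DESC
LIMIT 20;
```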
On 10/4/07, Ben <bench@silentmedia.com> wrote:
> On Thu, 4 Oct 2007, Tom Lane wrote:
>
> > You're being about as clear as mud here, except that you obviously lied
> > about what you were doing in your first message. If you have a planner
> > problem, show us the *exact* query, the *exact* table definition, and
> > unfaked EXPLAIN ANALYZE output.
>
> I didn't realize that simplification was viewed as so sinister, but
> thanks, I'll remember that in the future.

It's not sinister, it's counterproductive and wastes people's time.
Scott Marlowe wrote:
> On 10/4/07, Ben <bench@silentmedia.com> wrote:
>> On Thu, 4 Oct 2007, Tom Lane wrote:
>>
>>> You're being about as clear as mud here, except that you obviously lied
>>> about what you were doing in your first message. If you have a planner
>>> problem, show us the *exact* query, the *exact* table definition, and
>>> unfaked EXPLAIN ANALYZE output.
>>
>> I didn't realize that simplification was viewed as so sinister, but
>> thanks, I'll remember that in the future.
>
> It's not sinister, it's counterproductive and wastes people's time.

Unless "Ben" is an alias for someone high up with an unnamed rival
database, seeking to distract us all...

How do you find out the IP address of a yacht? ;-)

-- 
Richard Huxton
Archonet Ltd
Ben <bench@silentmedia.com> writes:
> On Thu, 4 Oct 2007, Simon Riggs wrote:
> I thought that might explain it, but then I'm surprised that it can still
> use an index when the first two columns of the index aren't in the query.
> Wouldn't that mean that it might have to walk the entire index to find
> matching rows?
> ...unless it's smart enough to realize that the first two columns will
> match everything. Which would be cool.

There's some limited smarts in there about deciding that leading columns
of an index don't matter to the sort ordering if they're constrained to
just one value by the query. But it doesn't catch the case you need,
which is that columns of an ORDER BY request are no-ops when they're
constrained to just one value.

That whole area has been rewritten for 8.3 and I believe it will handle
this case. No time to try it right now though.

			regards, tom lane
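[A sketch of the equivalence Tom describes, not output from any release.
Because the WHERE clause pins clientkey and premiseskey to single values,
the two ORDER BY lists below request the same row ordering; 8.2 only
proves this kind of equivalence for leading index columns, while the 8.3
rewrite is expected to also discard the constant ORDER BY columns:]

```sql
-- With clientkey and premiseskey each constrained to one value, these
-- two clauses describe the same ordering over the qualifying rows:
--   ORDER BY logtime DESC, logkey DESC, clientkey DESC, premiseskey DESC
--   ORDER BY logtime DESC, logkey DESC
-- On 8.2 the first form gets an explicit Sort over a Seq Scan; stripping
-- the constant columns (second form) lets the planner use the backward
-- scan on log_ak1 instead.
```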
On Thu, 4 Oct 2007, Tom Lane wrote:

> There's some limited smarts in there about deciding that leading columns
> of an index don't matter to the sort ordering if they're constrained to
> just one value by the query. But it doesn't catch the case you need,
> which is that columns of an ORDER BY request are no-ops when they're
> constrained to just one value.

Oh, no, that explains it perfectly, because that's precisely the case I
have - I dropped the columns from the ordering, but not the where clause.
Thanks, now I understand the current behavior.