Re: Window functions and index usage - Mailing list pgsql-performance

From Tom Lane
Subject Re: Window functions and index usage
Date
Msg-id 7900.1317739011@sss.pgh.pa.us
Whole thread Raw
In response to Window functions and index usage  (Anssi Kääriäinen <anssi.kaariainen@thl.fi>)
Responses Re: Window functions and index usage
List pgsql-performance
=?ISO-8859-1?Q?Anssi_K=E4=E4ri=E4inen?= <anssi.kaariainen@thl.fi> writes:
> I have the following setup:
> create table test(id integer, seq integer);
> insert into test select generate_series(0, 100), generate_series(0, 1000);
> create unique index test_idx on test(id, seq);
> analyze test;

> Now I try to fetch the latest 5 values per id, ordered by seq from the
> table:

> select * from (
> select id, seq, row_number() over (partition by id order by seq)
>    from test
>   where id in (1, 2, 3)
> ) where row_number() <= 5;

> This does not use the index on id, seq for sorting the data. It uses a
> bitmap index scan, and sequential scan when issued SET enable_bitmapscan
> to false.

The cost estimates I get are 806 for bitmap scan and sort, 2097 for
seqscan and sort, 4890 for indexscan without sort.  It *can* use the
index for that query ... it just doesn't think it's a good idea.  It's
probably right, too.  At least, the actual runtimes go in the same order
on my machine.  Seqscan-and-sort very often beats an indexscan for
sorting a table, unless the table is clustered on the index or nearly so.

Note that it cannot use the index for both ordering and satisfying the
IN condition.  If it used the =ANY clause as an index condition, what
that would imply is three separate index searches and so the results
wouldn't necessarily be correctly ordered.  This is why the plain
indexscan costs out so expensive: it's a full-table scan.

            regards, tom lane

pgsql-performance by date:

Previous
From: pasman pasmański
Date:
Subject: Re: : PG9.0 - Checkpoint tuning and pg_stat_bgwriter
Next
From: Anssi Kääriäinen
Date:
Subject: Re: Window functions and index usage