Thread: Joint index including MAX() ?

Joint index including MAX() ?

From

Richard Neill

Date:

09 January 2010, 08:46:21

Dear All,

I'm trying to optimise the speed of some selects with the where condition:

WHERE id =
  (SELECT MAX(id) FROM tbl_sort_report WHERE parcel_id_code='43024')


This is relatively slow, taking about 15-20ms, even though I have a
joint index on both fields:

CREATE INDEX testidx3 ON tbl_sort_report (id, parcel_id_code);


So, my question is, is there any way to improve this? I'd expect that an
index on   ( max(id),parcel_id_code ) would be ideal, excepting that
postgres won't allow that (and such an index probably doesn't make much
conceptual sense).


Explain Analyze is below.

Thanks,

Richard



Here is part of the schema. id is the primary key; parcel_id_code loops
from 0...99999 and back again every few hours.

fsc_log=> \d tbl_sort_report
                                     Table "public.tbl_sort_report"
         Column        |           Type           |
  Modifiers
----------------------+--------------------------+-----------------------------------------------------
  id                   | bigint                   | not null default
nextval('master_id_seq'::regclass)
  timestamp            | timestamp with time zone |
  parcel_id_code       | integer                  |
(etc)




EXPLAIN ANALYZE (SELECT MAX(id) FROM tbl_sort_report WHERE
parcel_id_code='43024');

QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------------------------------
  Result  (cost=7.34..7.35 rows=1 width=0) (actual time=17.712..17.714
rows=1 loops=1)
    InitPlan 1 (returns $0)
      ->  Limit  (cost=0.00..7.34 rows=1 width=8) (actual
time=17.705..17.705 rows=0 loops=1)
            ->  Index Scan Backward using testidx3 on tbl_sort_report
(cost=0.00..14.67 rows=2 width=8) (actual time=17.700..17.700 rows=0
loops=1)
                  Index Cond: (parcel_id_code = 43024)
                  Filter: (id IS NOT NULL)
  Total runtime: 17.786 ms

Re: Joint index including MAX() ?

From

Grzegorz Jaśkiewicz

Date:

09 January 2010, 08:51:54

you can also try :

select val FROM table ORDER BY val DESC LIMIT 1;

which usually is much quicker.

Re: Joint index including MAX() ?

From

Lefteris

Date:

09 January 2010, 08:52:44

Hi,

I first suggestion would be to either build the index only on
parcel_id_code or on (parcel_id_code, id).

But I am not sure because I am new in pg:)

cheers,
lefteris

On Sat, Jan 9, 2010 at 1:46 PM, Richard Neill <rn214@cam.ac.uk> wrote:
> Dear All,
>
> I'm trying to optimise the speed of some selects with the where condition:
>
> WHERE id =
>  (SELECT MAX(id) FROM tbl_sort_report WHERE parcel_id_code='43024')
>
>
> This is relatively slow, taking about 15-20ms, even though I have a joint
> index on both fields:
>
> CREATE INDEX testidx3 ON tbl_sort_report (id, parcel_id_code);
>
>
> So, my question is, is there any way to improve this? I'd expect that an
> index on   ( max(id),parcel_id_code ) would be ideal, excepting that
> postgres won't allow that (and such an index probably doesn't make much
> conceptual sense).
>
>
> Explain Analyze is below.
>
> Thanks,
>
> Richard
>
>
>
> Here is part of the schema. id is the primary key; parcel_id_code loops from
> 0...99999 and back again every few hours.
>
> fsc_log=> \d tbl_sort_report
>                                    Table "public.tbl_sort_report"
>        Column        |           Type           |  Modifiers
> ----------------------+--------------------------+-----------------------------------------------------
>  id                   | bigint                   | not null default
> nextval('master_id_seq'::regclass)
>  timestamp            | timestamp with time zone |
>  parcel_id_code       | integer                  |
> (etc)
>
>
>
>
> EXPLAIN ANALYZE (SELECT MAX(id) FROM tbl_sort_report WHERE
> parcel_id_code='43024');
>
> QUERY PLAN
>
----------------------------------------------------------------------------------------------------------------------------------------------------
>  Result  (cost=7.34..7.35 rows=1 width=0) (actual time=17.712..17.714 rows=1
> loops=1)
>   InitPlan 1 (returns $0)
>     ->  Limit  (cost=0.00..7.34 rows=1 width=8) (actual time=17.705..17.705
> rows=0 loops=1)
>           ->  Index Scan Backward using testidx3 on tbl_sort_report
> (cost=0.00..14.67 rows=2 width=8) (actual time=17.700..17.700 rows=0
> loops=1)
>                 Index Cond: (parcel_id_code = 43024)
>                 Filter: (id IS NOT NULL)
>  Total runtime: 17.786 ms
>
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance
>

Re: Joint index including MAX() ?

From

Tom Lane

Date:

09 January 2010, 13:07:27

Richard Neill <rn214@cam.ac.uk> writes:
> I'm trying to optimise the speed of some selects with the where condition:
> WHERE id =
>   (SELECT MAX(id) FROM tbl_sort_report WHERE parcel_id_code='43024')
> This is relatively slow, taking about 15-20ms, even though I have a
> joint index on both fields:
> CREATE INDEX testidx3 ON tbl_sort_report (id, parcel_id_code);

You've got the index column order backwards: to make this query fast,
it has to be on (parcel_id_code, id).  The reason should be apparent
if you think about the index ordering.  With the correct index, the
backend can descend the btree looking for the last entry with
parcel_id_code='43024', and when it hits it, that's the max id.
The other way round, the best available strategy using the index
is to search backwards from the end (highest id) hoping to hit a
row with parcel_id_code='43024'.  That could take a long time.
Frequently the planner will think it's so slow that it shouldn't
even bother with the index, just seqscan.

            regards, tom lane