Re: Multi-column indexes - Mailing list pgsql-general

From Tom Lane
Subject Re: Multi-column indexes
Date
Msg-id 12422.1105828057@sss.pgh.pa.us
Whole thread Raw
In response to Multi-column indexes  (Edmund Dengler <edmundd@eSentire.com>)
Responses Re: Multi-column indexes
List pgsql-general
Edmund Dengler <edmundd@eSentire.com> writes:
>     "record_to_process_idx" unique, btree (host_luid, log_luid, luid) WHERE (error IS NULL)

> explain analyze
> select record
> from agent.record
> where host_luid = 3::bigint
>   and log_luid = 2::bigint
>   and error is null
> order by host_luid desc, log_luid desc, luid desc
> limit 1

>  Limit  (cost=0.00..1.47 rows=1 width=286) (actual time=249064.949..249064.950 rows=1 loops=1)
>    ->  Index Scan Backward using record_to_process_idx on record  (cost=0.00..13106.73 rows=8898 width=286) (actual
time=249064.944..249064.944rows=1 loops=1) 
>          Index Cond: ((host_luid = 3::bigint) AND (log_luid = 2::bigint))
>          Filter: (error IS NULL)
>  Total runtime: 249065.004 ms

Are there a whole lotta rows with that host_luid/log_luid combination?

What's happening is that the index search initially finds the first such
row, and then it has to step to the last such row to start the backwards
scan.  This is fixed as of 8.0, but all earlier releases are going to be
slow in that scenario.  It's got nothing to do with single vs multi
column indexes, it is just a shortcoming of the startup code for
backwards index scans.  (I get the impression that the original
implementation of Postgres' btree indexes only supported unique indexes,
because there were a number of places where it was horridly inefficient
for large numbers of equal keys.  I think this 8.0 fix is the last such
issue.)

Since your index has an additional column, there is a hack you can use
to get decent performance in 7.4 and before.  Add a dummy condition on
the last column:
    where host_luid = 3::bigint
      and log_luid = 2::bigint
      AND LUID <= someverylargevalue::bigint
      and error is null
    order by host_luid desc, log_luid desc, luid desc
    limit 1
Now, instead of positioning to the first row with value (3,2,anything)
the initial btree descent will home in on the end of that range, and
so the expensive stepping over all the rows between is avoided.

            regards, tom lane

pgsql-general by date:

Previous
From: Martijn van Oosterhout
Date:
Subject: Re: Multi-column indexes
Next
From: Bo Lorentsen
Date:
Subject: Re: Index optimization ?