Home > mailing lists

Multi-column indexes - Mailing list pgsql-general

From	Edmund Dengler
Subject	Multi-column indexes
Date	January 15, 2005 21:00:04
Msg-id	Pine.BSO.4.58.0501151533260.17212@cyclops4.internal Whole thread Raw
Responses	Re: Multi-column indexes Re: Multi-column indexes
List	pgsql-general

Tree view

Greetings!

I have a technical question concerning multi-column indexes and their
implementation. I tried looking for the answr in the docs but couldn't
find anything.

I have the following table:

eventlog=> \d agent.record
                                          Table "agent.record"
       Column       |           Type           |
Modifiers
--------------------+--------------------------+---------------------------------------------------------
 luid               | bigint                   | not null default nextval('agent.record_luid_seq'::text)
 host_luid          | bigint                   |
 remote_system_luid | bigint                   |
 log_luid           | bigint                   | not null
 time_logged        | timestamp with time zone | not null default now()
 record             | bytea                    | not null
 error              | boolean                  |
 error_reason       | text                     |
Indexes:
    "record_pkey" primary key, btree (luid)
    "record_to_process_idx" unique, btree (host_luid, log_luid, luid) WHERE (error IS NULL)
    "record_to_process_idx2" unique, btree (luid) WHERE (error IS NULL)
    "record_last_logged_idx" btree (time_logged, host_luid, log_luid, luid)
Foreign-key constraints:
    "$1" FOREIGN KEY (host_luid) REFERENCES eventlog.host(luid) ON UPDATE CASCADE ON DELETE CASCADE
    "$2" FOREIGN KEY (remote_system_luid) REFERENCES eventlog.remote_system(luid)
    "$3" FOREIGN KEY (log_luid) REFERENCES eventlog.log(luid) ON UPDATE CASCADE ON DELETE CASCADE

consisting of 27306578 rows.


So I try running the following query:

explain analyze
select record
from agent.record
where host_luid = 3::bigint
  and log_luid = 2::bigint
  and error is null
order by host_luid desc, log_luid desc, luid desc
limit 1

I get the following query plan:


----------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..1.47 rows=1 width=286) (actual time=249064.949..249064.950 rows=1 loops=1)
   ->  Index Scan Backward using record_to_process_idx on record  (cost=0.00..13106.73 rows=8898 width=286) (actual
time=249064.944..249064.944rows=1 loops=1) 
         Index Cond: ((host_luid = 3::bigint) AND (log_luid = 2::bigint))
         Filter: (error IS NULL)
 Total runtime: 249065.004 ms
(5 rows)

Now, this plan seems kinda slow, in the sense of scanning backwards. And
it takes quite a long time (compared to seeking the last row based only on
<luid>, for example). It feels that if I have <host_luid> values of
(1,2,3,4,5), that the above is scanning through _all_ 5 entries, then 4
entries, and then finally gets to 3.

So, now to my question: is this really happening?

I guess it breaks down to how these indexes are implemented. Are
multi-column indexes implemented as true multiple level indexes, in the
sense there is a level 1 index on <host_luid>, pointing to a level 2 index
on <log_luid>, pointing to a level 3 index on <luid>? Or are they the
equivalent of a <host_luid,log_luid,luid> single index (ie, as if I
created a functional index consisting of
        host_luid || ',' || log_luid || ',' || luid
)?

My initial guess was that Postgresql would search first to the <host_luid>
desc, then from there to the specific <log_luid> desc, and then from there
to the <luid> (ie, the equivalent of 3 btree jumps), essentialy skipping
over the inappropriate <host_luid>'s of 5 and 4. But it seems to be
scanning through them, even though I have a low cost for random page
accesses within my postgresql.conf file. Are they components of the index
to allow it to "skip" backwards lots of pages rather than loading them
from disk?

Any ideas? How does multi-column indexes really work? I would hate to have
to define specific indexes for each <host_luid> as this is an
unmaintainable situation.

Thanks!
Ed

pgsql-general by date:

From: Martijn van Oosterhout
Date: 15 January 2005, 20:28:39
Subject: Re: PQisBusy() always returns TRUE

From: Tom Lane
Date: 15 January 2005, 21:00:27
Subject: Re: OID Usage

Multi-column indexes - Mailing list pgsql-general

Previous

Next