Index Performance Help - Mailing list pgsql-performance

From Damien Dougan
Subject Index Performance Help
Date
Msg-id 004b01c3ebe1$7e0c9a70$7701a8c0@pestilence
Whole thread Raw
Responses Re: Index Performance Help
Re: Index Performance Help
List pgsql-performance
Hi All,

I've been seeing very slow read performance on a database of 1 million
indexed subscribers, which I believe is nothing to do with the data
itself, but delays on processing the index.

If I make a random jump into the index (say X), it can take about 50ms
to read the subscriber. If I then make a "close by" lookup (say X+10),
it takes only about 0.5ms to read the subscriber. Making another lookup
to a "far away" (say X+1000), it again takes about 50ms to read.

From the analyze output, it looks like most of the work is being done in
the index scan of the subscriber table - reading the actual data from
the PublicView is quite fast.

Am I correct in my analysis? Is there anything I can do to improve the
performance of the index lookups?

(The indexes in question are all created as B-TREE.)

I've tried increasing the index memory and making a number of queries
around the index range, but a stray of several hundred indexes from a
cached entry always results in a major lookup delay.

I've also increased the shared memory available to Postgres to 80MB
incase this is a paging of the index, but it hasn't seemed to have any
effect.




Sample analyze output for an initial query:

hydradb=# explain analyze select * from pvsubscriber where actorid =
'b3432-asdas-232-Subscriber793500';

QUERY PLAN

------------------------------------------------------------------------
------------------------------------------------------------------------
----------------------------------------
 Nested Loop Left Join  (cost=0.00..13.19 rows=1 width=100) (actual
time=49.688..49.699 rows=1 loops=1)
   ->  Nested Loop Left Join  (cost=0.00..10.16 rows=1 width=69) (actual
time=49.679..49.689 rows=1 loops=1)
         Join Filter: ("inner".mc_childactor_id = "outer".id)
         ->  Nested Loop  (cost=0.00..10.15 rows=1 width=69) (actual
time=49.669..49.677 rows=1 loops=1)
               ->  Nested Loop  (cost=0.00..7.12 rows=1 width=73)
(actual time=43.969..43.974 rows=1 loops=1)
                     ->  Index Scan using mc_actor_key on mc_actor
(cost=0.00..4.08 rows=1 width=69) (actual time=39.497..39.499 rows=1
loops=1)
                           Index Cond: ((actorid)::text =
'b3432-asdas-232-Subscriber793500'::text)
                     ->  Index Scan using rel_actor_has_subscriber_idx1
on rel_actor_has_subscriber rel_sub  (cost=0.00..3.02 rows=1 width=8)
(actual time=4.458..4.460 rows=1 loops=1)
                           Index Cond: ("outer".id =
rel_sub.mc_actor_id)
               ->  Index Scan using mc_subscriber_id_idx on
mc_subscriber sub  (cost=0.00..3.02 rows=1 width=4) (actual
time=5.689..5.691 rows=1 loops=1)
                     Index Cond: (sub.id = "outer".mc_subscriber_id)
         ->  Seq Scan on rel_actor_has_actor rel_parent
(cost=0.00..0.00 rows=1 width=8) (actual time=0.002..0.002 rows=0
loops=1)
   ->  Index Scan using mc_actor_id_idx on mc_actor  (cost=0.00..3.02
rows=1 width=39) (actual time=0.002..0.002 rows=0 loops=1)
         Index Cond: ("outer".mc_parentactor_id = mc_actor.id)
 Total runtime: 49.845 ms
(15 rows)



And the analyze output for a "nearby" subscriber (10 indexes away):

hydradb=# explain analyze select * from pvsubscriber where actorid =
'b3432-asdas-232-Subscriber793510';

QUERY PLAN

------------------------------------------------------------------------
------------------------------------------------------------------------
----------------------------------------
 Nested Loop Left Join  (cost=0.00..13.19 rows=1 width=100) (actual
time=0.278..0.288 rows=1 loops=1)
   ->  Nested Loop Left Join  (cost=0.00..10.16 rows=1 width=69) (actual
time=0.271..0.280 rows=1 loops=1)
         Join Filter: ("inner".mc_childactor_id = "outer".id)
         ->  Nested Loop  (cost=0.00..10.15 rows=1 width=69) (actual
time=0.264..0.272 rows=1 loops=1)
               ->  Nested Loop  (cost=0.00..7.12 rows=1 width=73)
(actual time=0.246..0.251 rows=1 loops=1)
                     ->  Index Scan using mc_actor_key on mc_actor
(cost=0.00..4.08 rows=1 width=69) (actual time=0.220..0.221 rows=1
loops=1)
                           Index Cond: ((actorid)::text =
'b3432-asdas-232-Subscriber793510'::text)
                     ->  Index Scan using rel_actor_has_subscriber_idx1
on rel_actor_has_subscriber rel_sub  (cost=0.00..3.02 rows=1 width=8)
(actual time=0.017..0.018 rows=1 loops=1)
                           Index Cond: ("outer".id =
rel_sub.mc_actor_id)
               ->  Index Scan using mc_subscriber_id_idx on
mc_subscriber sub  (cost=0.00..3.02 rows=1 width=4) (actual
time=0.012..0.013 rows=1 loops=1)
                     Index Cond: (sub.id = "outer".mc_subscriber_id)
         ->  Seq Scan on rel_actor_has_actor rel_parent
(cost=0.00..0.00 rows=1 width=8) (actual time=0.002..0.002 rows=0
loops=1)
   ->  Index Scan using mc_actor_id_idx on mc_actor  (cost=0.00..3.02
rows=1 width=39) (actual time=0.001..0.001 rows=0 loops=1)
         Index Cond: ("outer".mc_parentactor_id = mc_actor.id)
 Total runtime: 0.428 ms
(15 rows)




Many thanks,

Damien

----------------------------------------------------------------------
Damien Dougan, Software Architect
Mobile Cohesion - http://www.mobilecohesion.com
Email: damien.dougan@mobilecohesion.com
Mobile: +44 7766477997
----------------------------------------------------------------------



pgsql-performance by date:

Previous
From: "Carlos Eduardo Smanioto"
Date:
Subject: Re: 7.3 vs 7.4 performance
Next
From: Christopher Browne
Date:
Subject: Re: 7.3 vs 7.4 performance