Re: CREATE INDEX speeds up query on 31 row table ... - Mailing list pgsql-hackers

From Marc G. Fournier
Subject Re: CREATE INDEX speeds up query on 31 row table ...
Date
Msg-id 20040930165011.F23868@ganymede.hub.org
Whole thread Raw
In response to Re: CREATE INDEX speeds up query on 31 row table ...  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Thu, 30 Sep 2004, Tom Lane wrote:

> Greg Stark <gsstark@mit.edu> writes:
>> You say it's "*very* busy" is it possible there are hundreds or thousands of
>> tuples in there that are uncommitted or committed after this query starts?
>
> More specifically, I bet there's a huge number of completely empty
> pages, which would be read by a seqscan but not an indexscan.  VACUUM
> FULL should fix it nicely, but it's odd that autovacuum isn't keeping
> a lid on the file size.  Maybe with so few live rows, it's confused into
> thinking it doesn't need to vacuum the table often?

in the last hour or so:

[2004-09-30 03:41:18 PM] Performing: ANALYZE "public"."shown"
[2004-09-30 03:59:50 PM] Performing: VACUUM ANALYZE "public"."shown"
[2004-09-30 04:10:27 PM] Performing: VACUUM ANALYZE "public"."shown"
[2004-09-30 04:21:36 PM] Performing: ANALYZE "public"."shown"
[2004-09-30 04:27:05 PM] Performing: VACUUM ANALYZE "public"."shown"
[2004-09-30 04:32:17 PM] Performing: ANALYZE "public"."shown"
[2004-09-30 04:37:50 PM] Performing: VACUUM ANALYZE "public"."shown"
[2004-09-30 04:49:05 PM] Performing: VACUUM ANALYZE "public"."shown"

database directory itself is:

db# du 17144
28696   17144

after a vacuum full:

du 17144
6530    17144

Based on the following, I don't think that I need to raise my fsm settings 
enough to make much of a difference ... or am I reading it wrong?

INFO:  free space map: 674 relations, 9298 pages stored; 21360 total pages needed
DETAIL:  Allocated FSM size: 1000 relations + 20000 pages = 178 kB shared memory.

Running with the index now results in:
 Limit  (cost=4.85..6.57 rows=1 width=50) (actual time=3.370..3.376 rows=1 
loops=1)   InitPlan     ->  Aggregate  (cost=3.43..3.43 rows=1 width=8) (actual 
time=2.562..2.567 rows=1 loops=1)           ->  Hash Join  (cost=1.57..3.35 rows=29 width=8) (actual 
time=1.739..2.338 rows=27 loops=1)                 Hash Cond: ("outer".banner_id = "inner".banner_id)
-> Seq Scan on shown s  (cost=0.00..1.33 rows=33 
 
width=12) (actual time=0.011..0.309 rows=33 loops=1)                 ->  Hash  (cost=1.50..1.50 rows=29 width=4)
(actual
 
time=0.589..0.589 rows=0 loops=1)                       ->  Seq Scan on banner b  (cost=0.00..1.50 rows=29 
width=4) (actual time=0.122..0.399 rows=27 loops=1)                             Filter: ((end_date >= now()) OR
(end_dateIS 
 
NULL))   ->  Hash Join  (cost=1.42..3.14 rows=1 width=50) (actual 
time=3.355..3.355 rows=1 loops=1)         Hash Cond: ("outer".banner_id = "inner".banner_id)         ->  Seq Scan on
bannerb  (cost=0.00..1.66 rows=10 width=50) 
 
(actual time=0.030..0.151 rows=16 loops=1)               Filter: ((start_date <= now()) AND ((end_date >= now()) OR 
(end_date IS NULL)))         ->  Hash  (cost=1.41..1.41 rows=3 width=4) (actual 
time=2.800..2.800 rows=0 loops=1)               ->  Seq Scan on shown s  (cost=0.00..1.41 rows=3 width=4) 
(actual time=2.684..2.735 rows=6 loops=1)                     Filter: (counter = $0) Total runtime: 3.913 ms
(17 rows)

and without:
 Limit  (cost=4.85..6.57 rows=1 width=50) (actual time=2.111..2.116 rows=1 
loops=1)   InitPlan     ->  Aggregate  (cost=3.43..3.43 rows=1 width=8) (actual 
time=1.430..1.435 rows=1 loops=1)           ->  Hash Join  (cost=1.57..3.35 rows=29 width=8) (actual 
time=0.676..1.236 rows=27 loops=1)                 Hash Cond: ("outer".banner_id = "inner".banner_id)
-> Seq Scan on shown s  (cost=0.00..1.33 rows=33 
 
width=12) (actual time=0.007..0.290 rows=33 loops=1)                 ->  Hash  (cost=1.50..1.50 rows=29 width=4)
(actual
 
time=0.422..0.422 rows=0 loops=1)                       ->  Seq Scan on banner b  (cost=0.00..1.50 rows=29 
width=4) (actual time=0.025..0.246 rows=27 loops=1)                             Filter: ((end_date >= now()) OR
(end_dateIS 
 
NULL))   ->  Hash Join  (cost=1.42..3.14 rows=1 width=50) (actual 
time=2.098..2.098 rows=1 loops=1)         Hash Cond: ("outer".banner_id = "inner".banner_id)         ->  Seq Scan on
bannerb  (cost=0.00..1.66 rows=10 width=50) 
 
(actual time=0.024..0.225 rows=25 loops=1)               Filter: ((start_date <= now()) AND ((end_date >= now()) OR 
(end_date IS NULL)))         ->  Hash  (cost=1.41..1.41 rows=3 width=4) (actual 
time=1.562..1.562 rows=0 loops=1)               ->  Seq Scan on shown s  (cost=0.00..1.41 rows=3 width=4) 
(actual time=1.517..1.537 rows=1 loops=1)                     Filter: (counter = $0) Total runtime: 2.393 ms
(17 rows)

so now we're in the same ball park, at least ...

I'll keep an eye on things to see if pg_autovacuum can 'keep up' without 
having to re-add the index ...


----
Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
Email: scrappy@hub.org           Yahoo!: yscrappy              ICQ: 7615664


pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Index locking considerations
Next
From: Bruce Momjian
Date:
Subject: Re: More pgindent bizarreness