Improving a simple query? - Mailing list pgsql-performance
From | Steve Wampler |
---|---|
Subject | Improving a simple query? |
Date | |
Msg-id | 1058119515.23012.114.camel@weaver.tuc.noao.edu Whole thread Raw |
Responses |
Re: Improving a simple query?
Re: Improving a simple query? |
List | pgsql-performance |
I'm not an SQL or PostgreSQL expert. I'm getting abysmal performance on a nested query and need some help on finding ways to improve the performance: Background: RH 8.0 dual-CPU machine (1.2GHz athlon) Postgresql 7.2 1GB ram (Machine is dedicated to postgres, so there's not much else running.) The table has ~500K rows. Table definition: lab.devel.configdb=# \d attributes_table Table "attributes_table" Column | Type | Modifiers --------+--------------------------+--------------- id | character varying(64) | not null name | character varying(64) | not null units | character varying(32) | value | text | time | timestamp with time zone | default now() Indexes: id_index, name_index Primary key: attributes_table_pkey Triggers: trigger_insert View definition: lab.devel.configdb=# \d attributes; View "attributes" Column | Type | Modifiers --------+-----------------------+----------- id | character varying(64) | name | character varying(64) | units | character varying(32) | value | text | View definition: SELECT attributes_table.id, attributes_table.name, attributes_table.units, attributes_table.value FROM attributes_table; Query: select * from attributes_table where id in (select id from attributes where (name='obsid') and (value='oid00066')); Now, the inner SELECT is fast: lab.devel.configdb=# explain analyze select id from attributes where (name='obsid') and (value='oid00066'); NOTICE: QUERY PLAN: Index Scan using name_index on attributes_table (cost=0.00..18187.48 rows=15 width=25) (actual time=0.33..238.06 rows=2049 loops=1) Total runtime: 239.28 msec EXPLAIN But the outer SELECT insists on using a sequential scan [it should pick up about 20K-40K rows (normally, access is through a script]. How slow? Slow enough that: explain analyze select * from attributes_table where id in (select id from attributes where (name='obsid') and (value='oid00066')); hasn't completed in the last 15 minutes. Removing the analyze gives: lab.devel.configdb=# explain select * from attributes_table where id in (select id from attributes where (name='obsid') and (value='oid00066')); NOTICE: QUERY PLAN: Seq Scan on attributes_table (cost=100000000.00..8873688920.07 rows=241201 width=59) SubPlan -> Materialize (cost=18187.48..18187.48 rows=15 width=25) -> Index Scan using name_index on attributes_table (cost=0.00..18187.48 rows=15 width=25) EXPLAIN Obviously, something is forcing the outer select into a sequential scan, which is what I assume is the bottleneck (see above about lack of expert-ness...). I've played with the settings in postgresql.conf, using the on-line performance tuning guide: shared_buffers = 8192 # 2*max_connections, min 16 max_fsm_relations = 1000 # min 10, fsm is free space map max_fsm_pages = 10000 # min 1000, fsm is free space map max_locks_per_transaction = 128 # min 10 wal_buffers = 64 # min 4 sort_mem = 128 # min 32 vacuum_mem = 4096 # min 1024 wal_files = 32 # range 0-64 (default was 0) effective_cache_size = 96000 # default in 8k pages random_page_cost = 3 but haven't noticed an significant change with these settings over more conservative settings. Any suggestions? Is there a better way to phrase the query that would provide order-of-magnitude improvement? Thanks! Steve -- Steve Wampler -- swampler@noao.edu Quantum materiae materietur marmota monax si marmota monax materiam possit materiari?
pgsql-performance by date: