'best practises' to speed up sorting? tuning postgresql.conf - Mailing list pgsql-performance
From | Chris Hutchinson |
---|---|
Subject | 'best practises' to speed up sorting? tuning postgresql.conf |
Date | |
Msg-id | IDEOKBCDGGIDOBADNGAPEEGADNAA.chris@hutchinsonsoftware.com Whole thread Raw |
Responses |
Re: 'best practises' to speed up sorting? tuning postgresql.conf
Re: 'best practises' to speed up sorting? tuning postgresql.conf |
List | pgsql-performance |
I'm wondering how to speed up sorting which is slowing a query run regularly on a linux postgresql 7.3.3 system. The box is a dual PIII with 1Gb ram. The database is located on a 20Gb SCSI disk, with WAL on a separate disk. The only changes I've made to postgresql.conf so far are: shared_buffers=4000 sort_mem=4000 The results of explain analyze are below. The largest tables in the join, genotypes and ebv, contain 440681 and 3060781 rows respectively, the others contain perhaps a couple of hundred each. I've reduced the runtime of the query to 2 minutes down from 10 minutes by eliminating a left join and ORDER BY, but it would be nice to speed it up further. The time goes up to nearly 6 minutes if I put the ORDER BY back into the query. Any suggestions gratefully received. Regards, Chris Hutchinson ---------------------------------------------------------------------- 'fast query' without sort explain analyze SELECT A.genid,A.name,B.familyid,B.name,B.mumid,B.mumtype,B.dadid, B.dadtype,CBA.title,CA.sctrait,C.nvalue,C.accuracy FROM genotypes A , ebv C , sctraitdefinition CA , tpr CB , tpsystems CBA , geneticfamily B WHERE (C.genid=A.genid) and ((A.speciesid='2') and (B.familyid=A.familyid)) and ((C.runid='72') and (CA.sctraitid=C.sctraitid) and (CB.runid=C.runid) and (CBA.systemid=CB.systemid)); QUERY PLAN ------------------------------------------------------- Hash Join (cost=18114.84..287310.15 rows=678223 width=130) (actual time=11824.24..139737.82 rows=1460290 loops=1) Hash Cond: ("outer".sctraitid = "inner".sctraitid) -> Hash Join (cost=18113.32..273744.17 rows=678223 width=110) (actual time=11813.27..124934.74 rows=1460290 loops=1) Hash Cond: ("outer".runid = "inner".runid) -> Hash Join (cost=18106.69..260173.07 rows=678223 width=73) (actual time=11782.12..107005.61 rows=1460290 loops=1) Hash Cond: ("outer".genid = "inner".genid) -> Seq Scan on ebv c (cost=0.00..195640.76 rows=1392655 width=20) (actual time=4684.71..68728.26 rows=1460290 loops=1) Filter: (runid = 72) -> Hash (cost=15474.16..15474.16 rows=214612 width=53) (actual time=7089.57..7089.57 rows=0 loops=1) -> Hash Join (cost=147.83..15474.16 rows=214612 width=53) (actual time=58.74..6597.01 rows=226561 loops=1) Hash Cond: ("outer".familyid = "inner".familyid) -> Seq Scan on genotypes a (cost=0.00..9424.51 rows=214612 width=17) (actual time=10.47..3914.89 rows=226561 loops=1) Filter: (speciesid = 2) -> Hash (cost=132.46..132.46 rows=6146 width=36) (actual time=48.06..48.06 rows=0 loops=1) -> Seq Scan on geneticfamily b (cost=0.00..132.46 rows=6146 width=36) (actual time=2.90..36.74 rows=6146 loops=1) -> Hash (cost=6.60..6.60 rows=13 width=37) (actual time=31.06..31.06 rows=0 loops=1) -> Hash Join (cost=5.16..6.60 rows=13 width=37) (actual time=30.93..31.04 rows=13 loops=1) Hash Cond: ("outer".systemid = "inner".systemid) -> Seq Scan on tpsystems cba (cost=0.00..1.16 rows=16 width=29) (actual time=6.68..6.72 rows=16 loops=1) -> Hash (cost=5.13..5.13 rows=13 width=8) (actual time=24.17..24.17 rows=0 loops=1) -> Seq Scan on tpr cb (cost=0.00..5.13 rows=13 width=8) (actual time=4.75..24.15 rows=13 loops=1) -> Hash (cost=1.41..1.41 rows=41 width=20) (actual time=10.90..10.90 rows=0 loops=1) -> Seq Scan on sctraitdefinition ca (cost=0.00..1.41 rows=41 width=20) (actual time=10.72..10.82 rows=41 loops=1) Total runtime: 140736.98 msec (24 rows) 'slow query' with sort explain analyze SELECT A.genid,A.name,B.familyid,B.name,B.mumid,B.mumtype,B.dadid, B.dadtype,CBA.title,CA.sctrait,C.nvalue,C.accuracy FROM genotypes A , ebv C , sctraitdefinition CA , tpr CB , tpsystems CBA , geneticfamily B WHERE (C.genid=A.genid) and ((A.speciesid='2') and (B.familyid=A.familyid)) and ((C.runid='72') and (CA.sctraitid=C.sctraitid) and (CB.runid=C.runid) and (CBA.systemid=CB.systemid)) ORDER BY A.genid ASC,B.familyid ASC,CA.sctrait ASC; QUERY PLAN ------------------------------------------------------- Sort (cost=540740.81..542436.37 rows=678223 width=130) (actual time=322602.06..346710.43 rows=1460290 loops=1) Sort Key: a.genid, b.familyid, ca.sctrait -> Hash Join (cost=18114.84..287310.15 rows=678223 width=130) (actual time=10398.55..144991.62 rows=1460290 loops=1) Hash Cond: ("outer".sctraitid = "inner".sctraitid) -> Hash Join (cost=18113.32..273744.17 rows=678223 width=110) (actual time=10384.84..129637.23 rows=1460290 loops=1) Hash Cond: ("outer".runid = "inner".runid) -> Hash Join (cost=18106.69..260173.07 rows=678223 width=73) (actual time=10353.69..111239.94 rows=1460290 loops=1) Hash Cond: ("outer".genid = "inner".genid) -> Seq Scan on ebv c (cost=0.00..195640.76 rows=1392655 width=20) (actual time=4499.94..74509.34 rows=1460290 loops=1) Filter: (runid = 72) -> Hash (cost=15474.16..15474.16 rows=214612 width=53) (actual time=5845.85..5845.85 rows=0 loops=1) -> Hash Join (cost=147.83..15474.16 rows=214612 width=53) (actual time=58.75..5346.04 rows=226561 loops=1) Hash Cond: ("outer".familyid = "inner".familyid) -> Seq Scan on genotypes a (cost=0.00..9424.51 rows=214612 width=17) (actual time=7.00..2799.43 rows=226561 loops=1) Filter: (speciesid = 2) -> Hash (cost=132.46..132.46 rows=6146 width=36) (actual time=51.54..51.54 rows=0 loops=1) -> Seq Scan on geneticfamily b (cost=0.00..132.46 rows=6146 width=36) (actual time=2.88..39.66 rows=6146 loops=1) -> Hash (cost=6.60..6.60 rows=13 width=37) (actual time=31.05..31.05 rows=0 loops=1) -> Hash Join (cost=5.16..6.60 rows=13 width=37) (actual time=30.92..31.03 rows=13 loops=1) Hash Cond: ("outer".systemid = "inner".systemid) -> Seq Scan on tpsystems cba (cost=0.00..1.16 rows=16 width=29) (actual time=6.67..6.72 rows=16 loops=1) -> Hash (cost=5.13..5.13 rows=13 width=8) (actual time=24.17..24.17 rows=0 loops=1) -> Seq Scan on tpr cb (cost=0.00..5.13 rows=13 width=8) (actual time=4.72..24.14 rows=13 loops=1) -> Hash (cost=1.41..1.41 rows=41 width=20) (actual time=13.62..13.62 rows=0 loops=1) -> Seq Scan on sctraitdefinition ca (cost=0.00..1.41 rows=41 width=20) (actual time=13.43..13.54 rows=41 loops=1) Total runtime: 347780.04 msec (26 rows) The tables look like this: Table "public.ebv" Column | Type | Modifiers -----------+---------+----------- runid | integer | not null genid | integer | not null sctraitid | integer | not null nvalue | real | not null accuracy | real | Indexes: idx_ebg btree (genid), idx_ebv btree (runid), idx_ebvr btree (runid) Table "public.genotypes" Column | Type | Modifiers ----------------+--------------------------+-------------------------------- ------------------------- genid | integer | not null default nextval('"genotypes_genid_seq"'::text) speciesid | integer | not null familyid | integer | not null created | timestamp with time zone | not null batchid | integer | not null lastmodifiedby | integer | not null lastmodified | timestamp with time zone | not null name | character varying(100) | Indexes: genotypes_pkey primary key btree (genid), idx_gnm unique btree (speciesid, name), idx_gb btree (batchid), idxc btree (created), idxgf btree (familyid), idxgsp btree (speciesid) Table "public.tpr" Column | Type | Modifiers ---------------+--------------------------+--------------------------------- -------------------------- runid | integer | not null default nextval('"tpr_runid_seq"'::text) description | text | not null systemid | integer | not null runlog | text | ranby | integer | not null whenrun | timestamp with time zone | not null valid_p | character(1) | not null default 'f' whenvalidated | timestamp with time zone | whovalidated | integer | notes | text | Indexes: tpr_pkey primary key btree (runid), idx_tprv btree (valid_p) Table "public.geneticfamily" Column | Type | Modifiers ----------------+--------------------------+-------------------------------- -------------------------------- familyid | integer | not null default nextval('"geneticfamily_familyid_seq"'::text) speciesid | integer | not null mumid | integer | not null mumtype | character(1) | not null dadid | integer | not null dadtype | character(1) | not null batchid | integer | not null lastmodifiedby | integer | not null lastmodified | timestamp with time zone | name | character varying(100) | Indexes: geneticfamily_pkey primary key btree (familyid), idu_gfam unique btree (mumid, dadid, mumtype, dadtype), idx_gfnm unique btree (speciesid, name), idx_gfb btree (batchid), idx_gfun btree (upper(name)), idxgfd btree (dadid), idxgff btree (mumid, dadid), idxgfm btree (mumid), idxpsp btree (speciesid) Table "public.tpsystems" Column | Type | Modifiers ------------------+--------------------------+------------------------------ ------------------------------ systemid | integer | not null default nextval('"tpsystems_systemid_seq"'::text) speciesid | integer | not null title | character varying(250) | not null description | text | lastmodifiedby | integer | not null lastmodified | timestamp with time zone | not null variancechecksum | character varying(32) | Indexes: tpsystems_pkey primary key btree (systemid) Table "public.sctraitdefinition" Column | Type | Modifiers --------------------+--------------------------+---------------------------- ----------------------------------------- sctraitid | integer | not null default nextval('"sctraitdefinition_sctraitid_seq"'::text) speciesid | integer | not null sctrait | character varying(32) | not null lastmodifiedby | integer | not null lastmodified | timestamp with time zone | not null datatype | character varying(32) | datatransformation | character varying(32) | Indexes: sctraitdefinition_pkey primary key btree (sctraitid), idu_sctd unique btree (speciesid, sctrait) -------------------------------------------------------------------------- 'top' looks like this most of the time.. (except when queries are running) 8:40pm up 54 days, 7:45, 2 users, load average: 0.00, 0.19, 0.32 44 processes: 43 sleeping, 1 running, 0 zombie, 0 stopped CPU0 states: 0.1% user, 0.0% system, 0.0% nice, 99.4% idle CPU1 states: 0.0% user, 0.0% system, 0.0% nice, 100.0% idle Mem: 1028428K av, 846776K used, 181652K free, 0K shrd, 1580K buff Swap: 530104K av, 53540K used, 476564K free 804148K cached
pgsql-performance by date: