A "set seqscan='f'" query improvement example - Mailing list pgsql-general
From | Jon Lapham |
---|---|
Subject | A "set seqscan='f'" query improvement example |
Date | |
Msg-id | 3CFE2EE2.3080709@extracta.com.br Whole thread Raw |
List | pgsql-general |
Hello- I'm not sure this is a problem, but I thought I woud report this anyway. If I "set seqscan='f'" on the following query, I see a HUGE improvement in query time (factor of 6). Maybe this example will be helpful to someone working on the planner. Note that tmp_platedataset is a temporary table. Everything has been VACUUM ANALYZE'd before issuing any queries, including the temp table. The 3 table join query: SELECT pp.sampleid FROM tmp_platedataset tpd, plate_dataset_result pdr, plate_pos pp WHERE pdr.pp_plateid=pp.plateid AND pdr.col=pp.col AND pdr.row=pp.row AND pdr.platedatasetid=tpd.platedatasetid AND tpd.toview='t'; The tables: It seems that "\d" doesn't work with temp tables... hmm, anyway, tmp_platedataset contains 8 columns, but the important one is "platedatasetid int4". There are no triggers, rules, associated with this table. main_v0_8=# select count(*) from tmp_platedataset; count ------- 401 (1 row) main_v0_8=# \d plate_dataset_result Table "plate_dataset_result" Column | Type | Modifiers ----------------+--------------+----------- platedatasetid | integer | not null row | character(1) | col | smallint | result | real | pp_plateid | integer | not null Indexes: pdsr_pp_plateid Unique keys: plate_dataset_platedataseti_key Triggers: RI_ConstraintTrigger_26388004, RI_ConstraintTrigger_23833439 main_v0_8=# select count(*) from plate_dataset_result; count --------- 2192544 (1 row) main_v0_8=# \d plate_pos Table "plate_pos" Column | Type | Modifiers -------------+------------------------+----------- plateid | integer | not null sampleid | integer | description | character varying(255) | row | character(1) | col | smallint | conc | real | Indexes: plate_pos_sampleid Unique keys: plate_pos_plateid_key Triggers: RI_ConstraintTrigger_23833631, RI_ConstraintTrigger_23833625 main_v0_8=# select count(*) from plate_pos; count ------- 46393 (1 row) Here is the long query. main_v0_8=# set enable_seqscan='t'; SET VARIABLE main_v0_8=# explain ANALYZE SELECT pp.sampleid FROM tmp_platedataset tpd, plate_dataset_result pdr, plate_pos pp WHERE pdr.pp_plateid=pp.plateid AND pdr.col=pp.col AND pdr.row=pp.row AND pdr.platedatasetid=tpd.platedatasetid AND tpd.toview='t'; NOTICE: QUERY PLAN: Merge Join (cost=48222.32..50949.26 rows=30380 width=34) (actual time=18350.80..19147.17 rows=32078 loops=1) -> Index Scan using plate_pos_plateid_key on plate_pos pp (cost=0.00..1989.92 rows=46393 width=15) (actual time=0.05..195.70 rows=19342 loops=1) -> Sort (cost=48222.32..48222.32 rows=35759 width=19) (actual time=18350.60..18396.26 rows=38496 loops=1) -> Hash Join (cost=22.02..45517.89 rows=35759 width=19) (actual time=10184.60..17237.49 rows=38496 loops=1) -> Seq Scan on plate_dataset_result pdr (cost=0.00..35073.93 rows=1994993 width=15) (actual time=0.07..12277.11 rows=2192544 loops=1) -> Hash (cost=21.01..21.01 rows=401 width=4) (actual time=3.57..3.57 rows=0 loops=1) -> Seq Scan on tmp_platedataset tpd (cost=0.00..21.01 rows=401 width=4) (actual time=0.41..2.60 rows=401 loops=1) Total runtime: 19199.12 msec Now, if I run this with heavy seqscan penalties, I can use "EXPLAIN ANALYZE": main_v0_8=# set enable_seqscan='f'; SET VARIABLE main_v0_8=# explain ANALYZE SELECT pp.sampleid FROM tmp_platedataset tpd, plate_dataset_result pdr, plate_pos pp WHERE pdr.pp_plateid=pp.plateid AND pdr.col=pp.col AND pdr.row=pp.row AND pdr.platedatasetid=tpd.platedatasetid AND tpd.toview='t'; NOTICE: QUERY PLAN: Merge Join (cost=100134246.14..100136973.08 rows=30380 width=34) (actual time=1697.54..3011.73 rows=32078 loops=1) -> Sort (cost=100134246.14..100134246.14 rows=35759 width=19) (actual time=1697.30..1745.48 rows=38496 loops=1) -> Nested Loop (cost=100000000.00..100131541.72 rows=35759 width=19) (actual time=0.44..619.23 rows=38496 loops=1) -> Seq Scan on tmp_platedataset tpd (cost=100000000.00..100000021.01 rows=401 width=4) (actual time=0.37..4.05 rows=401 loops=1) -> Index Scan using plate_dataset_platedataseti_key on plate_dataset_result pdr (cost=0.00..326.87 rows=89 width=15) (actual time=0.03..1.01 rows=96 loops=401) -> Index Scan using plate_pos_plateid_key on plate_pos pp (cost=0.00..1989.92 rows=46393 width=15) (actual time=0.08..429.31 rows=43181 loops=1) Total runtime: 3069.29 msec -Jon PS: If I can provide or do anything else that may be useful to anyone, let me know! -- -**-*-*---*-*---*-*---*-----*-*-----*---*-*---*-----*-----*-*-----*--- Jon Lapham Extracta Moléculas Naturais, Rio de Janeiro, Brasil email: lapham@extracta.com.br web: http://www.extracta.com.br/ ***-*--*----*-------*------------*--------------------*---------------
pgsql-general by date: