Re: How does the query planner make its plan? - Mailing list pgsql-general
From | Reg Me Please |
---|---|
Subject | Re: How does the query planner make its plan? |
Date | |
Msg-id | 200711061644.34621.regmeplease@gmail.com Whole thread Raw |
In response to | How does the query planner make its plan? (Christian Schröder <cs@deriva.de>) |
List | pgsql-general |
It may depend on the index itself against the locales. 1. define the insex with the text_pattern_ops operato class 2. run vacuum analyze on the table 3. re-run the explain See chapter 11 (especiallu 11.8) for v8.2 Il Tuesday 06 November 2007 16:25:09 Christian Schröder ha scritto: > Hi list, > once again I do not understand how the query planner works and why it > apparently does not find the best result. > I have a table with about 125 million rows. There is a char(5) column > with a (non-unique) index. When I try to find the distinct values in > this column using the following sql statement: > > select distinct exchange from foo > > the query planner chooses not to use the index, but performs a > sequential scan. When I disfavour the use of sequential scans ("set > enable_seqscan = off") the performance is more than 6 times better. Why > does the query planner's plan go wrong? The table has been vacuum > analyzed just before I ran the queries. > > Here is the plan when I let the query planner alone: > > QUERY PLAN > --------------------------------------------------------------------------- >----- Unique (cost=23057876.40..23683350.48 rows=4 width=9) > -> Sort (cost=23057876.40..23370613.44 rows=125094816 width=9) > Sort Key: exchange > -> Seq Scan on quotes (cost=0.00..3301683.16 rows=125094816 > width=9) > (4 rows) > > This is what really happens: > > QUERY PLAN > --------------------------------------------------------------------------- >------------------------------------------------------------ Unique > (cost=23057876.40..23683350.48 rows=4 width=9) (actual > time=1577159.744..1968911.024 rows=4 loops=1) > -> Sort (cost=23057876.40..23370613.44 rows=125094816 width=9) > (actual time=1577159.742..1927400.118 rows=125094818 loops=1) > Sort Key: exchange > -> Seq Scan on quotes (cost=0.00..3301683.16 rows=125094816 > width=9) (actual time=0.022..169744.162 rows=125094818 loops=1) > Total runtime: 1969844.753 ms > (5 rows) > > With "enable_seqscan = off" I get this plan: > > QUERY PLAN > --------------------------------------------------------------------------- >---------------------------- Unique (cost=0.00..89811549.81 rows=4 width=9) > -> Index Scan using quotes_exchange_key on quotes > (cost=0.00..89498812.77 rows=125094816 width=9) > (2 rows) > > And again with execution times: > > > QUERY PLAN > --------------------------------------------------------------------------- >---------------------------------------------------------------------------- >------- Unique (cost=0.00..89811549.81 rows=4 width=9) (actual > time=0.079..313068.922 rows=4 loops=1) > -> Index Scan using quotes_exchange_key on quotes > (cost=0.00..89498812.77 rows=125094816 width=9) (actual > time=0.078..273787.493 rows=125094818 loops=1) > Total runtime: 313068.967 ms > (3 rows) > > I understand that from looking at the estimations (89811549.81 with > index scan vs. 23683350.48 with sequential scan) the query planner had > to choose the sequential scan. So maybe I have to tune the planner cost > constants? Indeed I did some changes to these values, but in my opinion > this should make index scans preferable: > > #seq_page_cost = 1.0 # measured on an arbitrary scale > #random_page_cost = 4.0 # same scale as above > random_page_cost = 1.0 > #cpu_tuple_cost = 0.01 # same scale as above > #cpu_index_tuple_cost = 0.005 # same scale as above > cpu_index_tuple_cost = 0.001 > #cpu_operator_cost = 0.0025 # same scale as above > #effective_cache_size = 128MB > effective_cache_size = 4GB > > The machine is a dedicated database server with two dual-core xeon > processors and 8 GB memory. > > Thanks for your help, > Christian -- Reg me Please
pgsql-general by date: