Thread: factors determining the execution plan
The osdl-dbt3 test starts with building and vacuuming the database. The execution plans were taken after the vacuuming. I did two tests with the same database parameters: 1. run two osdl-dbt3 runs on one box without rebooting the stystem. Though the execution plans are the same, the costs are different. The system status are different for the two runs, for example, some page cache are not released after the first one. Does that make the cost different? 2. run two osdl-dbt3 runs on two boxes. The hardware of the two boxes are the same. And each run starts from scratch (building linux kernel, pgsql ect, and reboot). To my surprise, not only the cost are different between the two runs, the execution plan changed for Query 9. The execution plans can be found at: http://khack.osdl.org/stp/277780/results/plan/power_query9.txt http://khack.osdl.org/stp/277779/results/plan/power_query9.txt My test leads me to the following questions: What are the factors which determine the execution plan and cost? Does PostgreSQL check the system resource dynamically? If Inserts and Updates happened but vacuuming is not executed, dose the execution plan change? Thanks, -- Jenny Zhang Open Source Development Lab Inc 12725 SW Millikan Way Suite 400 Beaverton, OR 97005 (503)626-2455 ext 31
Hi, Last week I reported getting different execution plans for the same query against the same database. I did further investigation. After building the database, I did: vacuumdb -z DBT3 psql DBT3 -c "analyze supplier" psql DBT3 -c "analyze part" psql DBT3 -c "analyze partsupp" psql DBT3 -c "analyze customer" psql DBT3 -c "analyze orders" psql DBT3 -c "analyze lineitem" psql DBT3 -c "analyze nation" psql DBT3 -c "analyze region" Then I check the data in pg_class and pg_statistic and get the execution plans. I did this twice (say run_8 and run_9). I found that while the pg_class are the same for run_8 and run_9, the pg_statistic output are different. I posted the query, the execution plan, pg_class and pg_statistic output at: http://www.osdl.org/archive/jenny/ The difference in pg_statistic results in a different execution plan for query 18.sql. In fact, I updated the pg_statistic table for run_9 with the values got from run_8, then it gives me the run_8 execution plan, and the cost is very close(within 1). This makes me to believe that the optimizer picks the plan based on the values in pg_statistic and pg_class. But why the pg_statistic value changes each time I build the database? Did I do something wrong? Thanks for your help, Jenny
On 22 Aug 2003, Jenny Zhang wrote: > Last week I reported getting different execution plans for the same > query against the same database. I did further investigation. After > building the database, I did: > vacuumdb -z DBT3 > psql DBT3 -c "analyze supplier" > psql DBT3 -c "analyze part" > psql DBT3 -c "analyze partsupp" > psql DBT3 -c "analyze customer" > psql DBT3 -c "analyze orders" > psql DBT3 -c "analyze lineitem" > psql DBT3 -c "analyze nation" > psql DBT3 -c "analyze region" > > Then I check the data in pg_class and pg_statistic and get the execution > plans. > > I did this twice (say run_8 and run_9). I found that while the pg_class > are the same for run_8 and run_9, the pg_statistic output are > different. I posted the query, the execution plan, pg_class and > pg_statistic output at: > http://www.osdl.org/archive/jenny/ > > The difference in pg_statistic results in a different execution plan for > query 18.sql. In fact, I updated the pg_statistic table for run_9 with > the values got from run_8, then it gives me the run_8 execution plan, > and the cost is very close(within 1). This makes me to believe that the > optimizer picks the plan based on the values in pg_statistic and > pg_class. > > But why the pg_statistic value changes each time I build the database? > Did I do something wrong? Analyze only does a statistical sampling so it'd be expected that you can get different results from analyze on the same data. There are ways to affect the sample size and some other analyze parameters if you're interested in look at it (default_statistics_target and ALTER TABLE ALTER COLUMN SET STATISTICS).