Thread: comparison of 8.3.10 to 8.3.14 reveals unexpected difference in explain plan

comparison of 8.3.10 to 8.3.14 reveals unexpected difference in explain plan

From
Mark Rostron
Date:

Hi

 

My question is:         Was there any major optimizer change between 8.3.10 to 8.3.14?

 

I’m getting a difference in explain plans that I need to account for.

 

We are running production pg8.3.10, and are considering upgrading to 8.4.x (maybe 9.0), because we expected to benefit from some of the performance fixes of 8.4, in particular the improved use of the posix fadvise on bitmap index scans, mentioned in the 8.4.0 release notes.

 

So, I installed the latest 8.3.14 and did a comparison query between the test machine and prod 8.3.10, to establish a machine power difference.

To do this, I am running a query across two pg 8.3.x installs – prod 8.3.10 and new 8.3.14.

The database used as the test in each instance is a new database, with an identical data import on both.

The 8.3.10 prod machine is a faster cpu ( prod 8.3.10: 3ghz intel E5700, 8.3.14: 2.2ghz intel E5345 (and less ram)).

The memory settings (shared_buffers, effective_cache_size, work_mem, maintenance_work_mem) are equal.

The results are against repeated queries, so there is no I/O component in the comparison – it is simply cpu and memory.

 

So, I expected the query response on  8.3.14 to be slower, due to being on a less powerful machine.

However, not so: I am actually getting a faster result on the 8.3.14 installation (in spite of the machine being less powerful).

 

Looking at the explain plan, something changed.

 

For some reason, the “index scan” and “index cond” ops used by 8.3.10 are replaced by a “bitmap index scan” and “index cond” in the 8.3.14.

I’m pretty sure this is giving me the better result in 8.3.14.

(in spite of the reduced machine power).

 

 

Obviously this result is quite unexpected and I am trying to work out why.

(The only other mention that I have seen of bitmap index scan improvements was in the 8.4.0 release notes).

 

So, I am looking for information as to why this change occurred.

I reckon either it is a real version difference between 8.3.10 and 8.3.14, or else a difference in configuration.

 

Does anyone have any comments?

 

 

 

The 8.3.10 plan is:

 

explain select * from view_v1 where action_date between '2010-10-01' and '2010-12-08'

                                                  QUERY PLAN

--------------------------------------------------------------------------------------------------------------

Hash Join  (cost=755.53..546874.02 rows=3362295 width=99)

   Hash Cond: (gp.ql2_id = dt.ql2_id)

   ->  Index Scan using gpn_nk_1 on data_stuff_new gp  (cost=0.00..495732.14 rows=4465137 width=40)

         Index Cond: ((action_date >= '2010-10-01'::date) AND (action_date <= '2010-12-08'::date))

         Filter: (action_hour = ANY ('{8,10,11,12,13,14,15,16}'::integer[]))

   ->  Hash  (cost=720.80..720.80 rows=2779 width=67)

         ->  Hash Join  (cost=561.38..720.80 rows=2779 width=67)

               Hash Cond: (dtxgm.ql2_id = dt.ql2_id)

               ->  Seq Scan on data_thindt_xref_group_membership dtxgm  (cost=0.00..93.41 rows=2779 width=10)

                     Filter: (org_id = 1288539986)

               ->  Hash  (cost=451.17..451.17 rows=8817 width=57)

                     ->  Seq Scan on data_thing dt  (cost=0.00..451.17 rows=8817 width=57)

(12 rows)

 

 

The plan on 8.3.14 is:

 

explain select * from view_v1 where action_date between '2010-10-01' and '2010-12-08'

 

                                                   QUERY PLAN

-----------------------------------------------------------------------------------------------------------------

Hash Join  (cost=190151.42..684420.67 rows=3403329 width=99)

   Hash Cond: (gp.ql2_id = dt.ql2_id)

   ->  Bitmap Heap Scan on data_stuff_new gp  (cost=189395.38..633046.20 rows=4471358 width=40)

         Recheck Cond: ((action_date >= '2010-10-01'::date) AND (action_date <= '2010-12-08'::date))

         Filter: (action_hour = ANY ('{8,10,11,12,13,14,15,16}'::integer[]))

         ->  Bitmap Index Scan on gpn_nk_1  (cost=0.00..188277.54 rows=7090513 width=0)

               Index Cond: ((action_date >= '2010-10-01'::date) AND (action_date <= '2010-12-08'::date))

   ->  Hash  (cost=721.13..721.13 rows=2793 width=67)

         ->  Hash Join  (cost=561.38..721.13 rows=2793 width=67)

               Hash Cond: (dtxgm.ql2_id = dt.ql2_id)

               ->  Seq Scan on data_thindt_xref_group_membership dtxgm  (cost=0.00..93.41 rows=2793 width=10)

                     Filter: (org_id = 1288539986)

               ->  Hash  (cost=451.17..451.17 rows=8817 width=57)

                     ->  Seq Scan on data_thing dt  (cost=0.00..451.17 rows=8817 width=57)

(14 rows)

 

 

 

Re: comparison of 8.3.10 to 8.3.14 reveals unexpected difference in explain plan

From
Scott Marlowe
Date:
If you diff the postgresql.conf files for both installs, what's different?

Mark Rostron wrote:

Was there any major optimizer change between 8.3.10 to 8.3.14? 

I’m getting a difference in explain plans that I need to account for.


There were some major changes in terms of how hashing is used for some types of query plans.  And one of the database parameters, default_statistics_target, increased from 10 to 100 between those two versions.  You can check what setting you have on each by doing:

show default_statistics_target;

From within psql.  It's possible the 8.3 optimizer is just getting lucky running without many statistics, and collecting more of them is making things worse.  It's also possible you're running into a situation where one of the new hash approaches in 8.4 just isn't working out well for you.

It would be easier to suggest what might be wrong if you included "EXPLAIN ANALYZE" output instead of just EXPLAIN.  It's not obvious whether 8.3 or 8.4 is estimating things better.

-- 
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
"PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books

Re: comparison of 8.3.10 to 8.3.14 reveals unexpected difference in explain plan

From
Mark Rostron
Date:
I found the difference.
Random_page_cost is 1 in the production 8.3.10, I guess weighting the decision to use "index scan".
Thanks for the replies, gentlemen.

> If you diff the postgresql.conf files for both installs, what's different?

In the list below, 8.3.10 parameter value is in the clear, (8.3.14 is in brackets)

Max_fsm_pages 819200 vs (204800)
Max_fsm_relations 4000 vs (dflt 1000)
Synchronous_commit off vs (dflt on)
Wal_buffers 256kb vs (dflt 64kb)
Checkpoint_segments 128 vs (dflt 3)
Random_page_cost 1 vs (dflt 4)            #!!! Actually this is the difference in the explain plans
Constraint_exclusion on vs (dflt off)
.... a bunch of logging parameters have been set ....
Autovacuum_freeze_max_age 900000000 vs (dflt 200000000)
vacuum_freeze_min_age = 50000000 vs (dflt 100000000)
deadlock_timeout = 20s (vs dflt 1s)
add_missing_from = on (vs dflt off)




Re: comparison of 8.3.10 to 8.3.14 reveals unexpected difference in explain plan

From
Mark Rostron
Date:


> It would be easier to suggest what might be wrong if you included "EXPLAIN ANALYZE" output instead of just EXPLAIN. 

> It's not obvious whether 8.3 or 8.4 is estimating things better.

Thanks for reply man

Turns out random_page_cost was set low in the 8.3.10 version – when I reset it to 4(dflt), the explain plans are the same.

We’ll double check our other queries, and then I’ll see if I can reset it to dflt for the database.