Re: 8.4.4, 9.0, and 9.1 Planner Differences - Mailing list pgsql-performance

From Tom Lane
Subject Re: 8.4.4, 9.0, and 9.1 Planner Differences
Date
Msg-id 9317.1319299106@sss.pgh.pa.us
Whole thread Raw
In response to 8.4.4, 9.0, and 9.1 Planner Differences  (Anthony Presley <anthony@resolution.com>)
Responses Re: 8.4.4, 9.0, and 9.1 Planner Differences
List pgsql-performance
Anthony Presley <anthony@resolution.com> writes:
> We have a dev machine running 9.0.1 (an i3 laptop, with a regular hard disk,
> with 4GB of RAM, and a mostly untuned postgresql.conf file).  The changed
> lines are:
>   shared_buffers = 512MB
>   temp_buffers = 48MB
>   work_mem = 32MB
>   maintenance_work_mem = 348MB
>   checkpoint_segments = 10
>   effective_cache_size = 512MB

> The same database is loaded onto a production server running 9.1.1 (dual QC
> processors, RAID-10 SAS drives, 36GB of RAM), which replicates to a backup
> server.  This has a lot of changed properties:
>   shared_buffers = 8500MB
>   work_mem = 35MB
>   maintenance_work_mem = 512MB
>   wal_level = hot_standby
>   checkpoint_segments = 50
>   max_wal_senders = 3
>   wal_keep_segments = 144
>   random_page_cost = 1.0
>   effective_cache_size = 16384MB
>   effective_io_concurrency = 6

That random_page_cost setting is going to have a huge effect on the
planner's choices, and the larger effective_cache_size setting will
likely affect plans too.  I don't find it surprising in the least
that you're getting different plan choices ... and even less so when
your "dev" and "production" DBs aren't even the same major version.
You might want to think about making your dev environment more like
your production.

> The same DB is loaded on both the production and the dev environment, and in
> all cases (about 5000 distinct different queries), the production
> environment is about 500x faster, except for one type of query (both
> databases were loaded from the same pg_dump on an 8.4.4 database):

>     On the dev box, we have:  http://explain.depesz.com/s/rwU   - about 131
> seconds
>     On the production box, we have:  http://explain.depesz.com/s/3dt  -
> about .25 seconds

Did you mislabel these?  Because if you didn't, the numbers are right
in line with what you say above.  But anyway, the problem with the
slower query appears to be poor rowcount estimates, leading the planner
to use a nestloop join when it shouldn't.  You haven't provided nearly
enough context to let anyone guess why the estimates are off, other
than boilerplate suggestions like making sure the tables have been
ANALYZEd recently, and maybe increasing the statistics targets.

            regards, tom lane

pgsql-performance by date:

Previous
From: Radhya sahal
Date:
Subject: explain workload
Next
From: Anthony Presley
Date:
Subject: Re: 8.4.4, 9.0, and 9.1 Planner Differences