On Tue, Mar 13, 2012 at 09:28:58AM -0500, Kevin Grittner wrote:
> Bruce Momjian <bruce@momjian.us> wrote:
>
> > I just received a sobering blog comment stating that pg_upgrade
> > took 5 minutes on a 0.5TB database, but analyze took over an hour:
>
> Yeah, we have had similar experiences. Even if this can't be done
> for every release or for every data type, bringing over statistics
> from the old release as a starting point would really help minimize
> downtime on large databases.
>
> Of course, release docs should indicate which statistics *won't* be
> coming across, and should probably recommend a database ANALYZE or
> VACUUM ANALYZE be done when possible.
Having a "works timetimes" behavior is really not good; some users
aren't going to notice until it is too late that they need to run
analyze. It is fine for hard-core folks like Kevin, but not for the
average user.
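
For what it's worth, a user can at least see which tables have never
been analyzed, manually or by autovacuum, with something like this
(untested sketch):

    SELECT schemaname, relname, last_analyze, last_autoanalyze
    FROM pg_stat_user_tables
    ORDER BY last_analyze NULLS FIRST;

but expecting the average user to think of running that is exactly
the problem.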

Ideally, pg_upgrade needs to copy over the statistics it can, and
adjust the system statistics to cause autoanalyze to run on those that
can't be migrated. Frankly, as Tom stated, we have been adjusting the
system statistics collection so often that I have avoided hard-coding
that information into pg_upgrade --- it could potentially make
pg_upgrade less reliable, whereas vacuumdb --all --analyze always
works.
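
As a rough, untested sketch, the tables whose statistics did not come
across could be found by looking for relations with no pg_statistic
entries:

    SELECT n.nspname, c.relname
    FROM pg_class c
    JOIN pg_namespace n ON n.oid = c.relnamespace
    WHERE c.relkind = 'r'
      AND n.nspname NOT IN ('pg_catalog', 'information_schema')
      AND NOT EXISTS
          (SELECT 1 FROM pg_statistic s WHERE s.starelid = c.oid);

and pg_upgrade could then nudge the activity counters on just those
tables so autoanalyze picks them up.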

We might decide we want a consistently slow process rather than one
that is sometimes fast and sometimes slow.

As you can see, I am at a loss as to how to improve this.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +