Re: Dump/Reload pg_statistic to cut time from pg_upgrade? - Mailing list pgsql-general

From Jeff Davis
Subject Re: Dump/Reload pg_statistic to cut time from pg_upgrade?
Date
Msg-id 1378056957.21540.14.camel@jdavis
Whole thread Raw
In response to Dump/Reload pg_statistic to cut time from pg_upgrade?  (Jerry Sievers <gsievers19@comcast.net>)
Responses Re: Dump/Reload pg_statistic to cut time from pg_upgrade?  (Bruce Momjian <bruce@momjian.us>)
List pgsql-general
On Mon, 2013-07-08 at 14:20 -0500, Jerry Sievers wrote:
> Planning to pg_upgrade some large (3TB) clusters using hard link
> method.  Run time for the upgrade itself takes around 5 minutes.
> Nice!!  Origin version 8.4 and destination version 9.1.
>
> Unfortunately the post-upgrade analyze of the entire cluster is going
> to take a minimum of 1.5 hours running several threads to analyze all
> tables.  This was measured in an R&D environment.

...

> Anyway, perhaps there are other good reasons I should *not* attempt
> this but it turns out that the stats table can't be reloaded with it's
> own dump so this of course is a show-stopper.
>
> psql:d:456: ERROR:  cannot accept a value of type anyarray
> CONTEXT:  COPY pg_statistic, line 1, column stavalues1: "{_assets,_income,_liabilities,assets,income,liabilities}"

[ late response, but might still be useful to someone ]

You can work around the problem with a little effort if you call
array_in directly. It takes the type output (cstring), element type
(oid), and element typmod (integer).

To dump the pg_statistics table, you have to output all of the columns
plus the type ID and the typmod, and then load it back in by doing
something like:

   insert into pg_statistic
      select starelid, ...,
         array_in(stavalues1, the_element_type, -1), ...
      from my_statistics_dump;

The element typmod is always -1 for pg_statistic. To get the element
type, you can join against pg_attribute. The only problem is, you don't
actually want the attribute type, you want the type used for the
statistics, which is normally the same but could be different. I don't
think the statypid is stored in the catalog, so you'd have to inventory
the types that you use and figure out a mapping of the type to it's
statistics type looking at the typanalyze routines.

So, it's possible to do, but not worth the effort unless you are quite
concerned about the analyze time post-upgrade.

It would be nice if we had a better way to backup, transfer, and upgrade
statistics. However, allowing statistics to be upgraded could be a
challenge if the statistics format changes between releases.

Regards,
    Jeff Davis







pgsql-general by date:

Previous
From: Jeff Janes
Date:
Subject: Re: My Experiment of PG crash when dealing with huge amount of data
Next
From: Jeff Davis
Date:
Subject: Re: Why is NULL = unbounded for rangetypes?