Re: Speeding up pg_upgrade - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Speeding up pg_upgrade
Msg-id 14416.1512672741@sss.pgh.pa.us
In response to Re: Speeding up pg_upgrade (Alvaro Herrera <alvherre@alvh.no-ip.org>)
List pgsql-hackers
Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> Tom Lane wrote:
>> The reason pg_upgrade hasn't done that in the past is not wishing to
>> assume that the new version does stats identically to the old version.
>> Since we do in fact add stats or change stuff around from time to time,
>> that's not a negligible consideration.

> Sure, but the new version can probably limp along with incomplete stats
> until the next natural ANALYZE runs -- the system is operational in much
> shorter time than if you have to make it all wait for the post-upgrade
> full-database ANALYZE run.  The serialization step is so that the
> underlying representation doesn't have to remain identical -- surely the
> new server would be able to represent whatever the old server was able
> to, regardless of any improvement made.

Well, this is assuming a lot of stuff not in evidence about how the
"serialization format" is designed and how we insert the data in the
new installation.  But if you think you can come up with something
that can handle such cases, go for it.
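
To make the cross-version concern concrete, here is a minimal sketch of one
possible shape for such a serialization format. It assumes a hypothetical
version-tagged JSON payload and a hypothetical set of statistic names
(KNOWN_STATS); none of the names, fields, or functions below come from
PostgreSQL or pg_upgrade. The loader keeps only the statistics the new
server understands, and reports both what it skipped and what is still
missing, on the assumption that a later ANALYZE would fill the gaps.

```python
import json

# Hypothetical: the statistic kinds this "new server" knows how to store.
KNOWN_STATS = {"null_frac", "n_distinct", "most_common_vals", "histogram_bounds"}


def dump_stats(version, table, column, stats):
    """Serialize one column's statistics to a version-tagged JSON string."""
    return json.dumps(
        {
            "format_version": version,
            "table": table,
            "column": column,
            "stats": stats,
        },
        sort_keys=True,
    )


def load_stats(payload, known=KNOWN_STATS):
    """Load a payload written by dump_stats, tolerating version drift.

    Returns (loaded, skipped, missing):
      loaded  - statistics the new server understands, kept as-is
      skipped - statistic names present in the payload but unknown here
      missing - known statistic names absent from the payload
    """
    doc = json.loads(payload)
    stats = doc.get("stats", {})
    loaded = {name: value for name, value in stats.items() if name in known}
    skipped = sorted(name for name in stats if name not in known)
    missing = sorted(known - set(stats))
    return loaded, skipped, missing


if __name__ == "__main__":
    old_payload = dump_stats(
        1, "orders", "customer_id",
        {"null_frac": 0.1, "old_only_stat": [1, 2]},
    )
    loaded, skipped, missing = load_stats(old_payload)
    print("loaded:", loaded)
    print("skipped:", skipped)
    print("missing:", missing)
```

A sketch like this only covers the easy case, where a statistic keeps its
meaning across versions. It does not address a statistic whose definition
changes while its name stays the same, which is the harder part of the
objection above.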

(In the spirit of full disclosure, I actually wrote code that allowed
dumping and reloading stats while I was at Salesforce.  But I've forgotten
the details of that design, and anyway I'm pretty sure it didn't handle
any cross-version scenarios, else I probably would have offered it to
the community.)

            regards, tom lane

