Re: Rearchitecting for storage - Mailing list pgsql-general

From Matthew Pounsett
Subject Re: Rearchitecting for storage
Date
Msg-id CAAiTEH-7quQaF4QzQC8YL3O8v_v01=-0ZNCZd8ri1EXkzsL7tA@mail.gmail.com
In response to Re: Rearchitecting for storage  ("Peter J. Holzer" <hjp-pgsql@hjp.at>)
Responses Re: Rearchitecting for storage  ("Peter J. Holzer" <hjp-pgsql@hjp.at>)
Re: Rearchitecting for storage  (Stephen Frost <sfrost@snowman.net>)
List pgsql-general


On Fri, 19 Jul 2019 at 11:25, Peter J. Holzer <hjp-pgsql@hjp.at> wrote:
> On 2019-07-19 10:41:31 -0400, Matthew Pounsett wrote:
> > Okay.  So I guess the short answer is no, nobody really knows how to
> > judge how much space is required for an upgrade?  :)
>
> As I understand it, a pg_upgrade --link uses only negligible extra
> space. It duplicates a bit of housekeeping information, but not your
> data tables or indexes. Your 18 TB table will definitely not be duplicated
> during the upgrade if you can use --link.

The documentation for pg_upgrade --link says that the old copy is no longer usable, which means it's modifying files that are linked.  If it were only modifying small housekeeping files, then it would be most efficient not to link those, which would keep both copies of the db usable.  That seems incompatible with your suggestion that it doesn't need to modify the data files.  Depending on how it goes about doing that, it could mean a significant short-term increase in storage requirements while the data is being converted.  
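
To make the hard-link point concrete, here is a minimal Python sketch of how hard links behave on a POSIX filesystem. It says nothing about what pg_upgrade does internally; the file names are hypothetical and the only claim is the general one: two hard-linked names share one inode, so linking costs almost no extra space, and an in-place write through either name is visible through the other.

```python
import os
import tempfile


def hard_link_demo():
    """Create a file, hard-link it, and report what the two names share."""
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical names standing in for "old cluster" and "new cluster" files.
        old_path = os.path.join(tmp, "old_cluster_file")
        new_path = os.path.join(tmp, "new_cluster_file")

        with open(old_path, "wb") as f:
            f.write(b"original")

        # A hard link adds a second directory entry for the same inode;
        # no file data is copied.
        os.link(old_path, new_path)

        old_stat = os.stat(old_path)
        new_stat = os.stat(new_path)
        same_inode = old_stat.st_ino == new_stat.st_ino
        link_count = old_stat.st_nlink

        # Overwrite the contents in place through the new name only.
        with open(new_path, "r+b") as f:
            f.write(b"MODIFIED")

        # Read back through the old name: it sees the modification.
        with open(old_path, "rb") as f:
            seen_via_old = f.read()

    return same_inode, link_count, seen_via_old


if __name__ == "__main__":
    print(hard_link_demo())
```

This is why any in-place change made to a linked file would affect both copies, whatever the size of the file being changed.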

Going back to our recent 'reindex database' attempt, pgsql does not necessarily do these things in the most storage-efficient manner; it seems entirely likely that it would choose to use links to duplicate the data directory, then create copies of each data file as it converts them over, then link that back to the original for an atomic replacement.  That could eat up a HUGE amount of storage during the conversion process without the start and end sizes being very different at all.  
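
The worry above can be put as rough arithmetic. The sketch below is purely hypothetical: it assumes a copy-then-replace conversion in which some number of converted copies exist alongside their originals at the same time, which is my guess at a worst case and not a description of pg_upgrade. The function name, the `files_in_flight` parameter, and the segment sizes are all made up for illustration.

```python
def peak_extra_bytes(file_sizes, files_in_flight=1):
    """Worst-case temporary space for a hypothetical copy-then-replace pass.

    Assumes at most `files_in_flight` converted copies coexist with their
    originals before being swapped in, and that the largest files happen
    to be in flight together.
    """
    if files_in_flight < 1:
        raise ValueError("files_in_flight must be at least 1")
    largest_first = sorted(file_sizes, reverse=True)
    return sum(largest_first[:files_in_flight])


if __name__ == "__main__":
    GIB = 1024 ** 3
    # Hypothetical layout: an 18 TiB table stored as 1 GiB segment files.
    segments = [GIB] * (18 * 1024)

    one_at_a_time = peak_extra_bytes(segments, files_in_flight=1)
    all_at_once = peak_extra_bytes(segments, files_in_flight=len(segments))

    print(one_at_a_time // GIB, "GiB extra if converted one segment at a time")
    print(all_at_once // GIB, "GiB extra if every copy exists before the swap")
```

Under those assumptions the start and end sizes are identical, but the peak ranges from one segment to the whole table, which is the range I'm trying to pin down.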

Sorry, but I can't reconcile your use of "as I understand it" with your use of "definitely".  It sounds like you're guessing, rather than speaking from direct knowledge of how the internals of pg_upgrade work.
