Re: Rearchitecting for storage - Mailing list pgsql-general

From Matthew Pounsett
Subject Re: Rearchitecting for storage
Date
Msg-id CAAiTEH_VY_hPVBFNrkdD4YHHst5WJff9=gCt2cE3B0xeFV_2Hg@mail.gmail.com
In response to Re: Rearchitecting for storage  (Kenneth Marshall <ktm@rice.edu>)
Responses Re: Rearchitecting for storage
List pgsql-general


On Thu, 18 Jul 2019 at 13:34, Kenneth Marshall <ktm@rice.edu> wrote:
> Hi Matt,

Hi!  Thanks for your reply.
 
> Have you considered using the VDO compression for tables that are less
> update intensive? Using just compression you can get almost 4X size
> reduction. For a database, I would forgo the deduplication function.
> You can then use a non-compressed tablespace for the heavier I/O tables
> and indexes.

VDO is a RedHat-only thing, isn't it?  We're not running RHEL... Debian.  Anyway, the bulk of the data (nearly 80%) is in a single table and its indexes: ~6TB in the table and ~12TB in its indices.  So even if we switched over to RedHat, there's little value in compressing only the lesser-used tables.
 

>> My understanding of the standard
>> upgrade process is that this requires that the data directory be smaller
>> than the free storage (so that there is room to hold two copies of the data
>> directory simultaneously).
>
> The link option with pg_upgrade does not require 2X the space, since it
> uses hard links instead of copying the files to the new cluster.
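The space saving comes from ordinary filesystem hard links: with --link, pg_upgrade creates a second directory entry for each old-cluster data file instead of a copy, so no data blocks are duplicated. A quick stand-alone illustration (paths are made up; GNU stat assumed):

```shell
# Create a 1 MiB file standing in for a table segment, then hard-link it
# the way pg_upgrade --link links old-cluster files into the new cluster.
dd if=/dev/zero of=/tmp/demo_seg bs=1M count=1 2>/dev/null
ln /tmp/demo_seg /tmp/demo_seg_link

# Both names share one inode, so no extra data blocks are consumed;
# the hard-link count is now 2.
stat -c '%h' /tmp/demo_seg
```

The corresponding caveat is that once the new cluster is started, the old cluster can no longer be used, since both point at the same files.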

That would likely keep the extra storage requirement small, but still non-zero.  Presumably the upgrade would be unnecessary if it could be done without rewriting any files.  Is there any rule of thumb for making sure one has enough space available for the upgrade?  I suppose that comes down to exactly what needs to be rewritten, and in what order, but the pg_upgrade docs don't seem to cover it at that level of detail.  For example, since we've got an ~18TB table (including its indices), if that table needs to be rewritten then we're still looking at significant extra storage.  Recent experience suggests postgres won't necessarily do things in the most storage-efficient way: we just had a reindex on that database fail (in --single-user mode) because 17TB of free storage was insufficient for the db to grow into.
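One way to put numbers on the headroom question before attempting an upgrade or REINDEX is to break relation sizes out of the catalog; something along these lines (a sketch, run against the live database):

```sql
-- Largest relations, with heap and index sizes shown separately.
-- A REINDEX needs roughly the size of the index being rebuilt as free
-- space, since the new index is written out before the old one is dropped.
SELECT relname,
       pg_size_pretty(pg_table_size(oid))   AS table_size,
       pg_size_pretty(pg_indexes_size(oid)) AS index_size
FROM pg_class
WHERE relkind = 'r'
ORDER BY pg_total_relation_size(oid) DESC
LIMIT 10;
```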
