Re: Rearchitecting for storage - Mailing list pgsql-general

From Matthew Pounsett
Subject Re: Rearchitecting for storage
Date
Msg-id CAAiTEH8SbbhdXcpNuu6kZzP7kJZKL_dFUcJ=TxBB_VDopzmKhg@mail.gmail.com
In response to Re: Rearchitecting for storage  (Luca Ferrari <fluca1978@gmail.com>)
Responses Re: Rearchitecting for storage  (Kenneth Marshall <ktm@rice.edu>)
Re: Rearchitecting for storage  ("Peter J. Holzer" <hjp-pgsql@hjp.at>)
Re: Rearchitecting for storage  (Luca Ferrari <fluca1978@gmail.com>)
List pgsql-general


On Fri, 19 Jul 2019 at 04:21, Luca Ferrari <fluca1978@gmail.com> wrote:

This could be trivial, but any chance you can partition the table
and/or archive unused records (at least temporarily)? An 18 TB table
quite frankly sounds like a good candidate to contain records no one
will be interested in for the near future.

Partitioning is a possibility.  The whole database is historical test results, stored specifically for doing comparisons over time, so I'm not sure we can actually archive anything.  Expiring old test data is a discussion we might have to have, eventually.
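For what it's worth, if we did go down that road, a time-range layout is the usual shape for this kind of historical data. A rough sketch (table and column names are invented, and this assumes PostgreSQL 10+ declarative partitioning):

```shell
# Hypothetical DDL: partition test results by run date, so whole
# years could later be detached or expired independently.
psql -d testdb <<'SQL'
CREATE TABLE results (
    run_at  timestamptz NOT NULL,
    test_id bigint      NOT NULL,
    outcome jsonb
) PARTITION BY RANGE (run_at);

CREATE TABLE results_2018 PARTITION OF results
    FOR VALUES FROM ('2018-01-01') TO ('2019-01-01');
CREATE TABLE results_2019 PARTITION OF results
    FOR VALUES FROM ('2019-01-01') TO ('2020-01-01');
SQL
```

Detaching a partition (ALTER TABLE ... DETACH PARTITION) is quick and leaves the data behind as a plain table, which is what would make an eventual expiry discussion tractable.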
 
In any case, if you can partition the table, chances are you can at
least do per-table backups, which could simplify maintenance of the
database.

My current backup plan for this database is on-site replication, and a monthly pg_dump from the standby to be copied off-site.  Doing per-table backups sounds like a great way to end up with an inconsistent backup, but perhaps I misunderstand what you mean.  
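Concretely, the current plan amounts to something like this (host, database, and path names are placeholders). A single pg_dump run takes one consistent snapshot of the whole database, which is the property separate per-table dumps taken at different times would lose:

```shell
# Monthly off-site dump taken from the standby, not the primary.
# Custom format (-Fc) is compressed and restorable table-by-table
# with pg_restore if that's ever needed.
pg_dump -h standby.example.net -d testdb -Fc \
    -f /backups/testdb-$(date +%Y%m).dump
```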

Another possibility is putting the server into backup mode and taking a snapshot of the filesystem, but coordinating that across chassis (in the case where partitioning is used) in such a way that the db stays consistent sounds like a hard problem... unless issuing pg_start_backup on the chassis holding the master table also puts all the chassis holding child tables into backup mode at the same time?  I haven't read enough on that yet.
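From what I can tell, pg_start_backup() acts only on the cluster it's run against; it knows nothing about other servers, so any cross-chassis coordination would have to live in the snapshot script itself. A single-cluster sketch (database name, snapshot command, and filesystem path are placeholders):

```shell
# Put this one cluster into backup mode, snapshot its filesystem,
# then end backup mode. Repeat/coordinate per chassis externally.
psql -d testdb -c "SELECT pg_start_backup('fs-snapshot', true);"
zfs snapshot tank/pgdata@monthly   # or LVM/SAN snapshot, as available
psql -d testdb -c "SELECT pg_stop_backup();"
```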
 

As a last resort, I would also check the log files (I mean the
textual logs, not the WALs), because I have occasionally found them
taking a few GBs on my disk, and they can easily be archived to
regain some extra space.
Then I would go for some commodity NAS to attach as extra storage, at
least for the upgrade process.

Okay.  So I guess the short answer is no, nobody really knows how to judge how much space is required for an upgrade?  :)

Any logs we have are going to be a rounding error compared to the database itself.  And buying storage last-minute because an upgrade failed is exactly the sort of thing that a resource-constrained not-for-profit can't do.  We really need to be able to plan this out long term so that we get as much as possible out of every dollar.
 
If all of the above fails, I would probably drop all the indexes
to regain extra space, perform the upgrade, and then reindex (removing
the old cluster, in case it has not been upgraded with the link
option).
 
Yeah, this sort of trial-and-error approach to getting upgrades done will bother me, but it seems like it'll be necessary once we start growing into whatever new home we get for the db.  
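If we do end up doing it that way, I imagine the order of operations looks roughly like this (paths, version numbers, and the schema filter are all placeholders; primary-key indexes would need ALTER TABLE rather than DROP INDEX):

```shell
# Save index definitions so they can be recreated after the upgrade.
psql -d testdb -Atc \
  "SELECT indexdef || ';' FROM pg_indexes WHERE schemaname = 'public';" \
  > recreate_indexes.sql

# ... DROP INDEX each one to free space, then run the upgrade:
pg_upgrade -b /usr/lib/postgresql/10/bin  -B /usr/lib/postgresql/11/bin \
           -d /var/lib/postgresql/10/data -D /var/lib/postgresql/11/data \
           --link

# Rebuild the indexes on the new cluster.
psql -d testdb -f recreate_indexes.sql
```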

Thanks very much for your time on this.
