Re: Rearchitecting for storage
From: Matthew Pounsett
Subject: Re: Rearchitecting for storage
Msg-id: CAAiTEH_mz4K4gxPGsFcYRd6AdzGRkPmRTBsMXTG49XhHCUuNmQ@mail.gmail.com
In response to: Rearchitecting for storage (Matthew Pounsett <matt@conundrum.com>)
List: pgsql-general
On Thu, 18 Jul 2019 at 09:44, Matthew Pounsett <matt@conundrum.com> wrote:
> I've recently inherited a database that is dangerously close to outgrowing the available storage on its existing hardware. I'm looking for (pointers to) advice on scaling the storage in a financially constrained not-for-profit.
Thanks for your replies, everyone. Here's a quick summary of what I've got out of this.
Although nobody directly addressed the core question of the performance tradeoffs between different storage architectures, perhaps that silence means there really aren't any significant ones. We'll proceed on the assumption that externally attached storage doesn't make a meaningful difference. NAS storage still seems like a poor choice to me for performance reasons, and nobody has said anything to disabuse me of that notion.
We're going to look at the relative costs of single-head solutions as well as dual-head (server + JBOD) setups, and see what gets us the most growth for the least cost. We'll plan for enough storage to cover five years of growth, and accept that by the fifth year we may not be able to do in-place upgrades without dropping indexes first.
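As an aside, for anyone doing similar sizing: a quick way to see how much of the on-disk footprint is index data (and so roughly how much could be reclaimed by dropping indexes ahead of an upgrade) is something like the following, run in psql. The schema name is just an example:

    -- Total size of each database in the cluster
    SELECT datname, pg_size_pretty(pg_database_size(datname)) AS size
      FROM pg_database
     ORDER BY pg_database_size(datname) DESC;

    -- Heap vs. index bytes for the largest tables in the current database
    SELECT c.relname,
           pg_size_pretty(pg_table_size(c.oid))   AS heap_size,
           pg_size_pretty(pg_indexes_size(c.oid)) AS index_size
      FROM pg_class c
      JOIN pg_namespace n ON n.oid = c.relnamespace
     WHERE c.relkind = 'r'
       AND n.nspname = 'public'
     ORDER BY pg_total_relation_size(c.oid) DESC
     LIMIT 10;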
Horizontal scalability through range partitioning sounds interesting, but I don't think it's a cost-effective solution for us right now. As long as the DB can fit on a single server (or JBOD) using commodity hardware, the incremental cost of adding more chassis (and therefore more motherboards, more CPUs, more memory) isn't offset by a reduced cost anywhere else (e.g. cheaper drives). And that's not even accounting for the increased operational cost of coordinating the DB across multiple servers. It could be a useful approach if DB growth outpaces historical averages and we need to add hardware before a normal replacement cycle. It could also be useful at the end of that replacement cycle, if DB growth has outpaced commodity hardware improvements and single-server solutions are no longer viable.
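For the archives, the kind of range partitioning I mean is PostgreSQL's declarative partitioning (version 10 and later). A minimal sketch, with made-up table and column names:

    -- Parent table, partitioned by a date column
    CREATE TABLE measurements (
        recorded_at date NOT NULL,
        payload     text
    ) PARTITION BY RANGE (recorded_at);

    -- One child partition per year; older years could eventually be
    -- moved to cheaper tablespaces or to another node
    CREATE TABLE measurements_2019 PARTITION OF measurements
        FOR VALUES FROM ('2019-01-01') TO ('2020-01-01');

Spreading those partitions across multiple servers is where the coordination cost I mentioned above comes in.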
The DB server we inherited is currently a single instance, but once we expand and have replication in place I'll have to do some testing to see whether LVM compression gives us a performance boost (or incurs an acceptable performance cost). The big question is whether the CPU time spent compressing and decompressing is outweighed by the reduction in disk reads... I suspect that depends on the data and how it's accessed. It certainly seems like it could give us some benefits, but it's not an experiment I want to attempt with only a single production copy of the DB; the downtime required to rebuild the DB server for A/B comparisons would be unacceptable.
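If we do eventually run that experiment, the crude form of the comparison would be timing the same cold-cache scan on otherwise-identical replicas, with and without compression under the data directory. In psql, something like the following (table name again made up):

    \timing on
    -- Restart PostgreSQL and drop the OS page cache between runs,
    -- so the timing reflects actual disk reads rather than cache hits
    SELECT count(*) FROM measurements;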
Thanks again everyone. This has been educational.