Re: 1 or 2 servers for large DB scenario. - Mailing list pgsql-performance
From | Merlin Moncure |
---|---|
Subject | Re: 1 or 2 servers for large DB scenario. |
Date | |
Msg-id | b42b73150801272025n6f0cd12cobaa131dd87fcd37f@mail.gmail.com Whole thread Raw |
In response to | 1 or 2 servers for large DB scenario. (David Brain <dbrain@bandwidth.com>) |
List | pgsql-performance |
On Jan 25, 2008 11:36 AM, David Brain <dbrain@bandwidth.com> wrote: > I'd appreciate some assistance in working through what would be the > optimal configuration for the following situation. > > We currently have one large DB (~1.2TB on disk), that essentially > consists of 1 table with somewhere in the order of 500 million rows , > this database has daily inserts as well as being used for some semi- > data mining type operations, so there are a fairly large number of > indices on the table. The hardware storing this DB (a software RAID6) > array seems to be very IO bound for writes and this is restricting our > insert performance to ~50TPS. > > As we need to achieve significantly faster insert performance I have > been considering splitting the table into 'new' and 'old' data, > making inserts into the 'new' table (which will also help as there are > really 1 insert, an update and some selects involved in populating the > table), then moving the data over to the 'old' DB on a periodic > basis. There would be new hardware involved, I'm thinking of HW RAID > 10 to improve the write performance. > > The question really is, is it best to use two separate servers and > databases (and have to come up with some copy process to move data > from one to another), or to just add the faster storage hardware to > the existing server and create a new tablespace for the 'new data' > table on that hardware. Doing this would enable us to potentially > move data more easily from new to old (we can't use partitioning > because there is some logic involved in when things would need to be > moved to 'old'). Are there any global resources that make just adding > the faster storage to the existing box a bad idea (the wal_log springs > to mind - although that could be moved too), that would make adding an > additional server instead a better idea? > > Also are there any settings that might boost our TPS on the existing > hardware (sync=off isn't an option.. (-: ). I have already > significantly increased the various buffers, but this was mainly to > improve select performance? I would (amalgamating suggestions from others and adding my own): *) get off raid 6 asap. raid 6 is wild wild west in database terms *) partition this table. if you have a lot of indexes on the table you might be running into random read problems. I'm not a huge fan in partitioning in most cases, but your case passes the smell test. Unique constraints are a problem, so partition wisely. *) move wal to separate device(s). you could see as much as double tps, but probably less than that. a single 15k drive will do, or two in a raid 1. contrary to the others, I would advise _against_ a ssd for the wal...wal writing is mostly sequential and ssd is unlikely to help (where ssd is most likely to pay off is in the database volume for faster random reads...likely not cost effective). *) and, for heaven's sake, if there is any way for you to normalize your database into more than one table, do so :-) merlin
pgsql-performance by date: