Thread: DB on mSATA SSD
Dear PostgreSQL mailing list,
we use PostgreSQL 8.4.x on our Linux firewall distribution.
We are currently moving from standard SATA disks to mSATA SSDs, and we noticed that the DB, which uses lots of indexes, writes a lot.
Within a few months, two test machines had their SSDs fail, and we are studying how to reduce the write impact of the DB.
Are there any suggestions for working with SSD drives?
Is putting the DB into RAM and backing it up periodically to disk a valid solution?
Or is it possible to store the indexes on a RAM drive?
Thank you in advance for your interest!
Best regards,
Francesco
That's a really old release. But I finally found some doc on it, and 8.4 does appear to have TABLESPACEs in it: http://www.postgresql.org/docs/8.4/static/manage-ag-tablespaces.html

To define a tablespace, use the CREATE TABLESPACE command, for example:
CREATE TABLESPACE fastspace LOCATION '/mnt/sda1/postgresql/data';
The location must be an existing, empty directory that is owned by the PostgreSQL system user. All objects subsequently created within the tablespace will be stored in files underneath this directory.
- temp_tablespaces (string)
This variable specifies tablespace(s) in which to create temporary objects (temp tables and indexes on temp tables) when a CREATE command does not explicitly specify a tablespace. Temporary files for purposes such as sorting large data sets are also created in these tablespace(s).
The value is a list of names of tablespaces. When there is more than one name in the list, PostgreSQL chooses a random member of the list each time a temporary object is to be created; except that within a transaction, successively created temporary objects are placed in successive tablespaces from the list. If the selected element of the list is an empty string, PostgreSQL will automatically use the default tablespace of the current database instead.
When temp_tablespaces is set interactively, specifying a nonexistent tablespace is an error, as is specifying a tablespace for which the user does not have CREATE privilege. However, when using a previously set value, nonexistent tablespaces are ignored, as are tablespaces for which the user lacks CREATE privilege. In particular, this rule applies when using a value set in postgresql.conf.
The default value is an empty string, which results in all temporary objects being created in the default tablespace of the current database.
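As a sketch of how that could be combined with a RAM disk (the mount point and tablespace name here are hypothetical, and note the warning later in this thread about tablespaces on volatile storage):

-- the directory must exist, be empty, and be owned by the postgres user;
-- here it lives on a tmpfs, so it vanishes on reboot - temp objects only,
-- never regular tables
CREATE TABLESPACE ramtemp LOCATION '/mnt/pgram/pg_temp';

-- per session:
SET temp_tablespaces = 'ramtemp';
-- or for everyone, in postgresql.conf: temp_tablespaces = 'ramtemp'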
He's about as useful as a wax frying pan.
10 to the 12th power microphones = 1 Megaphone
Maranatha! <><
John McKown
> Are there some suggestions with SSD drives?
> Putting the DB into RAM and backing up periodically to disk is a valid solutions?

Hi,

I don't think that today's SSDs - and certainly not the server-grade ones - will break due to write-intensive loads.

Have a look at the SMART data for your drives; there should be some metric called "wear level count" or similar that gives some indication. I wouldn't be surprised if you find that your broken drives had failures not related to wear level. If you're on Linux, use smartctl.

Also, as others have pointed out, 8.4 is out of support, so consider upgrading.

Bye,
Chris.
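For example, a sketch with smartctl (the device path is a placeholder, and the exact attribute names vary by vendor, e.g. Wear_Leveling_Count on Samsung or Media_Wearout_Indicator on Intel drives):

smartctl -a /dev/sda                  # full SMART report
smartctl -A /dev/sda | grep -i wear   # just the wear-related attributes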
Hi,

On 04/23/15 14:33, John McKown wrote:
> That's a really old release. But I finally found some doc on it, and
> 8.4 does appear to have TABLESPACEs in it:
> http://www.postgresql.org/docs/8.4/static/manage-ag-tablespaces.html
>
> To define a tablespace, use the CREATE TABLESPACE command, for example:
>
> CREATE TABLESPACE fastspace LOCATION '/mnt/sda1/postgresql/data';
>
> The location must be an existing, empty directory that is owned by the
> PostgreSQL system user.

I think you should read this:

http://thebuild.com/blog/2013/03/10/you-cannot-recover-from-the-loss-of-a-tablespace/

In other words, using tablespaces for placing some of the data into a RAM filesystem (which may disappear) is a rather terrible idea. In case of a crash you won't even be able to start the database, because it will try to recover the tablespace.

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Thu, Apr 23, 2015 at 7:07 AM, Job <Job@colliniconsulting.it> wrote:
> Are there some suggestions with SSD drives?
> Putting the DB into RAM and backing up periodically to disk is a valid solutions?

I have some very busy databases on SSD-only systems. I think you're using SSDs that are not rated for server use.

Your strategy of using an in-RAM disk and backing up to the SSD is sensible, depending on what guarantees you need for the data to survive an unclean system shutdown. You will want to use a file system that lets you take snapshots for backups, or use logical DB backups. Postgres 9.4 has some features that will make taking the backup from the file system much easier and cleaner, too.
FS snapshots are an option, but one should make sure that all file systems are snapshotted atomically, which is not very common unless you use ZFS or a similarly high-end FS.
Regarding filesystem-based backups: apart from pg_basebackup, which is a nice utility built on top of the existing "Continuous Archiving" philosophy, the underlying feature has been available since the 8.x releases.
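For reference, pg_basebackup itself only appeared in 9.1; on 8.4 the same continuous-archiving machinery is driven from SQL, wrapped around an ordinary file copy. A sketch:

-- requires archive_mode = on and a working archive_command
SELECT pg_start_backup('nightly');
-- ... now copy the data directory ($PGDATA) with cp/rsync/tar ...
SELECT pg_stop_backup();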
--
Achilleas Mantzios
Head of IT DEV, IT DEPT
Dynacom Tankers Mgmt
Hi,

On 04/23/15 14:50, Chris Mair wrote:
> I don't think that today's SSDs - and certainly not the server-grade
> ones - will break due to write-intensive loads.

Exactly. If you want an SSD for a write-intensive database, you need a reasonably good SSD. Sadly, the OP mentioned they're going for an mSATA drive, and those suck when used for this purpose.

Theoretically it's possible to improve the lifetime by only allocating part of the SSD and leaving some additional free space for the wear leveling - the manufacturer already does that, but only reserves a small amount of space on the cheaper SSDs (say ~10%, while the "server-grade" SSDs may have ~25% of unallocated space for this purpose). So by allocating only, say, 75% of the drive to the filesystem, it may last longer.

> Have a look at the SMART data for your drives; there should be some
> metric called "wear level count" or similar that gives some
> indication. I wouldn't be surprised if you find that your broken
> drives had failures not related to wear level.

My experience with mSATA drives is rather bad - the SMART data is rather unreliable, and most of them don't even have power-loss protection (which you need for a database, although a UPS may help a bit here). But maybe that has changed recently.

> If you're on Linux, use smartctl.
>
> Also, as others have pointed out, 8.4 is out of support, so consider
> upgrading.

+1 to this

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 04/23/15 13:07, Job wrote:
> Are there some suggestions with SSD drives?

There are ways to make the writes less frequent, both at the database and OS level. We don't know what your PostgreSQL config looks like, but making the checkpoints less frequent and tuning the kernel/mount options may help a lot.

> Putting the DB into RAM and backing up periodically to disk is a valid solutions?

Well, that depends on your requirements. You may lose the changes since the last backup.

> Or, is storing indexes on a ram drive possible?

No, not really.

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
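A minimal sketch of the checkpoint knobs in question, using the 8.4 parameter names (the values are illustrative, not recommendations):

# postgresql.conf - fewer, more spread-out checkpoints
checkpoint_segments = 32             # default 3: allow more WAL between checkpoints
checkpoint_timeout = 30min           # default 5min
checkpoint_completion_target = 0.9   # smear checkpoint writes over the interval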
Hello, thank you first of all for your wonderful help!

Tomas, regarding:

> There are ways to make the writes less frequent, both at the database
> and OS level. We don't know what your PostgreSQL config looks like,
> but making the checkpoints less frequent and tuning the kernel/mount
> options may help a lot.

We can make checkpoints less frequent; on the kernel side, which options do you suggest tuning to minimize disk writes?

We have a table, about 500 MB, that is updated and rewritten every day. When a machine updates, the table is truncated and then re-populated with pg_bulk. But I think we write heavily when importing the new data tables. That is why we thought of putting some tables into a RAM drive...

Thank you,
Francesco
> We have a table, about 500 MB, that is updated and rewritten every day.
> When a machine updates, the table is truncated and then re-populated with pg_bulk.
> But I think we write heavily when importing the new data tables.
So this is static data that you update once per day from some external source? It seems like a good candidate to put into RAM, if you have enough. Make sure your RAM disk is not backed by swap, though.
On 04/23/15 15:36, Job wrote:
> We can make checkpoints less frequent; on the kernel side, which
> options do you suggest tuning to minimize disk writes?

You may make the pdflush configuration less aggressive, but that may not help with your workload. You should use TRIM (or run fstrim regularly), move /tmp onto a tmpfs, and not put swap on the SSD.

> We have a table, about 500 MB, that is updated and rewritten every day.
> When a machine updates, the table is truncated and then re-populated
> with pg_bulk. But I think we write heavily when importing the new data
> tables.

In that case the checkpoint optimizations or kernel tuning probably won't help much. But if you can easily recreate the database, and it fits into RAM, then you can just place it on a tmpfs.

> That is why we thought of putting some tables into a RAM drive...

Well, technically you can do that, but don't expect the database to work after a crash or a reboot. You might keep a snapshot of the database (e.g. using pg_basebackup), and use it to 'seed' the database after a server restart. But don't expect the database to start without a tablespace that got lost because it was placed on a tmpfs or something.

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
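A sketch of those OS-side pieces (paths and values are illustrative; adapt them to your distribution):

# /etc/sysctl.conf - less aggressive dirty-page writeback
vm.dirty_background_ratio = 5
vm.dirty_ratio = 40

# /etc/fstab - noatime on the SSD filesystem, /tmp on tmpfs
/dev/sda1  /var/lib/pgsql  ext4   noatime    0 2
tmpfs      /tmp            tmpfs  size=512m  0 0

# crontab - periodic TRIM instead of the 'discard' mount option
@weekly /sbin/fstrim /var/lib/pgsql

Note that tmpfs pages can be swapped out, so keep swap off such a machine (or use ramfs) if that matters.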
- Another configuration parameter to look into is effective_io_concurrency. For SSDs we typically set it to one I/O per channel of the controller card, not including the RAID parity drives. If you decrease this value, PostgreSQL will not try to initiate as many parallel I/O operations.
http://www.postgresql.org/docs/8.4/static/runtime-config-resource.html
- A strategy Instagram used is to (manually) vertically partition tables that have some, but not all, columns updated frequently. When PostgreSQL updates a data value in a column, it writes a new copy of the entire row and marks the original row as garbage. If you have tables with many rows where most columns are updated infrequently (Instagram's example was a user-information table where "last login" was updated very frequently but other information about the user rarely changed), you can split the frequently updated columns into a separate table to reduce I/O. Note, however, that PostgreSQL has no built-in support for vertical partitioning, so to employ this technique you would need to partition manually and update your SQL code accordingly (see the sketch after this list).
- Also check your WAL and logging settings to make sure you aren't writing more to them than you need to.
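As referenced above, a minimal sketch of the manual vertical split (all table and column names are hypothetical):

-- original wide table: users(id, name, email, last_login), where only
-- last_login changes often; split the hot column into its own table
CREATE TABLE users (
    id    integer PRIMARY KEY,
    name  text,
    email text
);

CREATE TABLE user_last_login (
    user_id    integer PRIMARY KEY REFERENCES users (id),
    last_login timestamp with time zone
);

-- a login now rewrites one narrow row instead of the whole user row
UPDATE user_last_login SET last_login = now() WHERE user_id = 42;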
On 4/23/15 8:36 AM, Job wrote:
> Hello, thank you first of all for your wonderful help!
>
> Tomas, regarding:
>
>> There are ways to make the writes less frequent, both at the database
>> and OS level. We don't know what your PostgreSQL config looks like,
>> but making the checkpoints less frequent and tuning the kernel/mount
>> options may help a lot.
>
> We can make checkpoints less frequent; on the kernel side, which
> options do you suggest tuning to minimize disk writes?
Certainly disable atime updates if you haven't already.
Having a long checkpoint period is somewhat similar to running in a RAM disk and doing periodic backups.

BTW, if you want to try using backups, I recommend you set up actual PITR archiving to the SSD. That will write data sequentially and in larger chunks, which should help the SSD better deal with the writes. This will give you more control over how much data you lose during a crash. Though, if all you do is a single large update once a day, you're probably better off just taking a backup right after the update.

I would also look at the backup size and recovery time of pg_dump vs. PITR or a filesystem snapshot; it could be significantly smaller. It might take longer to restore, though.
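A sketch of that PITR archiving setup in postgresql.conf (the archive path is a placeholder; both parameters exist on 8.4):

archive_mode = on
archive_command = 'test ! -f /mnt/ssd/wal_archive/%f && cp %p /mnt/ssd/wal_archive/%f'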
BTW, if you go the RAM disk route you should turn off fsync; there's no point in the extra calls to the kernel. Only do that if the ENTIRE database is in a RAM disk, though.

> We have a table, about 500 MB, that is updated and rewritten every day.
> When a machine updates, the table is truncated and then re-populated with pg_bulk.
> But I think we write heavily when importing the new data tables.
That depends on how much data has actually changed. If most of the data in the table is changed then truncate and load will be your best bet. OTOH if relatively little of the data has changed you'll probably get much better results by loading the data into a loading table and then updating changed data, deleting data that shouldn't be there anymore, and inserting new data.
You definitely want the loading table not to be on the SSD, and not to be WAL-logged. On 8.4 that means putting it in a tablespace on a RAM disk, or using a true temporary table (CREATE TEMP TABLE), which is not logged, though I don't know how well that will work with pg_bulk. (UNLOGGED tables, which give the same effect on a regular table, only arrived in PostgreSQL 9.1, so they aren't available on 8.4.)
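A sketch of that load-and-merge pattern, assuming a hypothetical target(id, val) table and CSV input (on 8.4 a temp table stands in for an unlogged one):

-- temp table: not WAL-logged, dropped automatically at session end
CREATE TEMP TABLE staging (LIKE target);
COPY staging FROM '/tmp/new_data.csv' WITH CSV;

-- apply only the differences to the real table
UPDATE target t
   SET val = s.val
  FROM staging s
 WHERE s.id = t.id AND s.val IS DISTINCT FROM t.val;

DELETE FROM target WHERE id NOT IN (SELECT id FROM staging);

INSERT INTO target
SELECT s.* FROM staging s
 WHERE NOT EXISTS (SELECT 1 FROM target t WHERE t.id = s.id);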
You also need to consider the indexes. First, make absolutely certain you need all of them; get rid of every one that isn't required. Second, you need to test the amount of data that's written during an update with the indexes in place *and doing a subsequent VACUUM*, compared to dropping all the indexes and re-creating them.
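And for the daily truncate-and-reload itself, the drop-and-recreate variant is a sketch like this (index and table names hypothetical):

DROP INDEX IF EXISTS target_val_idx;   -- IF EXISTS works on 8.4 (since 8.2)
TRUNCATE target;
-- ... bulk load here (COPY or pg_bulk) ...
CREATE INDEX target_val_idx ON target (val);
ANALYZE target;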
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com