Thread: DB on mSATA SSD
Dear PostgreSQL mailing list,
we use PostgreSQL 8.4.x on our Linux firewall distribution.
We are currently moving from standard SATA disks to mSATA SSDs, and we noticed that the DB, which uses lots of indexes, writes a lot.
Within a few months, two test machines had their SSDs fail, and we are studying how to reduce the write impact of the DB.
Are there any suggestions for working with SSD drives?
Is putting the DB into RAM and backing it up periodically to disk a valid solution?
Or is it possible to store the indexes on a RAM drive?
Thank you in advance for your interest!
Best regards,
Francesco
That's a really old release. But I finally found some doc on it, and 8.4 does appear to have TABLESPACEs in it: http://www.postgresql.org/docs/8.4/static/manage-ag-tablespaces.html

To define a tablespace, use the CREATE TABLESPACE command, for example:
CREATE TABLESPACE fastspace LOCATION '/mnt/sda1/postgresql/data';
The location must be an existing, empty directory that is owned by the PostgreSQL system user. All objects subsequently created within the tablespace will be stored in files underneath this directory.
- temp_tablespaces (string)
This variable specifies tablespace(s) in which to create temporary objects (temp tables and indexes on temp tables) when a CREATE command does not explicitly specify a tablespace. Temporary files for purposes such as sorting large data sets are also created in these tablespace(s).
The value is a list of names of tablespaces. When there is more than one name in the list, PostgreSQL chooses a random member of the list each time a temporary object is to be created; except that within a transaction, successively created temporary objects are placed in successive tablespaces from the list. If the selected element of the list is an empty string, PostgreSQL will automatically use the default tablespace of the current database instead.
When temp_tablespaces is set interactively, specifying a nonexistent tablespace is an error, as is specifying a tablespace for which the user does not have CREATE privilege. However, when using a previously set value, nonexistent tablespaces are ignored, as are tablespaces for which the user lacks CREATE privilege. In particular, this rule applies when using a value set in postgresql.conf.
The default value is an empty string, which results in all temporary objects being created in the default tablespace of the current database.
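As a sketch of how that could be combined with a RAM disk (the mount point and tablespace name here are hypothetical, and note the warning later in this thread about tablespaces on volatile storage):

-- the directory must exist, be empty, and be owned by the postgres user;
-- here it lives on a tmpfs, so it vanishes on reboot - temp objects only,
-- never regular tables
CREATE TABLESPACE ramtemp LOCATION '/mnt/pgram/pg_temp';

-- per session:
SET temp_tablespaces = 'ramtemp';
-- or for everyone, in postgresql.conf: temp_tablespaces = 'ramtemp'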
He's about as useful as a wax frying pan.
10 to the 12th power microphones = 1 Megaphone
Maranatha! <><
John McKown
> Are there some suggestions with SSD drives?
> Putting the DB into RAM and backing up periodically to disk is a valid solutions?

Hi,

I don't think that today's SSDs - and certainly not the server-grade ones - will break due to write-intensive loads.

Have a look at the SMART data for your drives; there should be some metric called "wear level count" or similar that gives some indication. I wouldn't be surprised if you find that your broken drives had failures not related to wear level. If you're on Linux, use smartctl.

Also, as others have pointed out, 8.4 is out of support, so consider upgrading.

Bye,
Chris.
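For example, a sketch with smartctl (the device path is a placeholder, and the exact attribute names vary by vendor, e.g. Wear_Leveling_Count on Samsung or Media_Wearout_Indicator on Intel drives):

smartctl -a /dev/sda                  # full SMART report
smartctl -A /dev/sda | grep -i wear   # just the wear-related attributes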
Hi,

On 04/23/15 14:33, John McKown wrote:
> That's a really old release. But I finally found some doc on it, and
> 8.4 does appear to have TABLESPACEs in it:
> http://www.postgresql.org/docs/8.4/static/manage-ag-tablespaces.html
>
> To define a tablespace, use the CREATE TABLESPACE command, for example:
>
> CREATE TABLESPACE fastspace LOCATION '/mnt/sda1/postgresql/data';
>
> The location must be an existing, empty directory that is owned by the
> PostgreSQL system user.

I think you should read this:

http://thebuild.com/blog/2013/03/10/you-cannot-recover-from-the-loss-of-a-tablespace/

In other words, using tablespaces for placing some of the data into a RAM filesystem (which may disappear) is a rather terrible idea. In case of a crash you won't even be able to start the database, because it will try to recover the tablespace.

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Thu, Apr 23, 2015 at 7:07 AM, Job <Job@colliniconsulting.it> wrote:
> Are there some suggestions with SSD drives?
> Putting the DB into RAM and backing up periodically to disk is a valid solutions?

I have some very busy databases on SSD-only systems. I think you're using SSDs that are not rated for server use.

Your strategy of using an in-RAM disk and backing up to the SSD is sensible, depending on what guarantees you need for the data to survive an unclean system shutdown. You will want to use a file system that lets you take snapshots for backups, or use logical DB backups. Postgres 9.4 has some features that will make taking the backup from the file system much easier and cleaner, too.
FS snapshots are an option, but one should make sure that all file systems are snapshotted atomically, which is not very common unless you use ZFS or a similarly high-end FS.
Regarding filesystem-based backups: apart from pg_basebackup, which is a nice utility built on top of the existing "Continuous Archiving" philosophy, the underlying feature has been available since the 8.x releases.
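For reference, pg_basebackup itself only appeared in 9.1; on 8.4 the same continuous-archiving machinery is driven from SQL, wrapped around an ordinary file copy. A sketch:

-- requires archive_mode = on and a working archive_command
SELECT pg_start_backup('nightly');
-- ... now copy the data directory ($PGDATA) with cp/rsync/tar ...
SELECT pg_stop_backup();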
--
Achilleas Mantzios
Head of IT DEV, IT DEPT
Dynacom Tankers Mgmt
Hi,

On 04/23/15 14:50, Chris Mair wrote:
> I don't think that today's SSDs - and certainly not the server-grade
> ones - will break due to write-intensive loads.

Exactly. If you want an SSD for a write-intensive database, you need a reasonably good SSD. Sadly, the OP mentioned they're going for an mSATA drive, and those suck when used for this purpose.

Theoretically it's possible to improve the lifetime by only allocating part of the SSD and leaving some additional free space for the wear leveling - the manufacturer already does that, but only reserves a small amount of space on the cheaper SSDs (say ~10%, while the "server-grade" SSDs may have ~25% of unallocated space for this purpose). So by allocating only, say, 75% of the drive to the filesystem, it may last longer.

> Have a look at the SMART data for your drives; there should be some
> metric called "wear level count" or similar that gives some
> indication. I wouldn't be surprised if you find that your broken
> drives had failures not related to wear level.

My experience with mSATA drives is rather bad - the SMART data is rather unreliable, and most of them don't even have power-loss protection (which you need for a database, although a UPS may help a bit here). But maybe that has changed recently.

> If you're on Linux, use smartctl.
>
> Also, as others have pointed out, 8.4 is out of support, so consider
> upgrading.

+1 to this

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 04/23/15 13:07, Job wrote:
> Are there some suggestions with SSD drives?

There are ways to make the writes less frequent, both at the database and OS level. We don't know what your PostgreSQL config looks like, but making the checkpoints less frequent and tuning the kernel/mount options may help a lot.

> Putting the DB into RAM and backing up periodically to disk is a valid solutions?

Well, that depends on your requirements. You may lose the changes since the last backup.

> Or, is storing indexes on a ram drive possible?

No, not really.

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
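A minimal sketch of the checkpoint knobs in question, using the 8.4 parameter names (the values are illustrative, not recommendations):

# postgresql.conf - fewer, more spread-out checkpoints
checkpoint_segments = 32             # default 3: allow more WAL between checkpoints
checkpoint_timeout = 30min           # default 5min
checkpoint_completion_target = 0.9   # smear checkpoint writes over the interval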
Hello, thank you first of all for your wonderful help!

Tomas, regarding:

> There are ways to make the writes less frequent, both at the database
> and OS level. We don't know what your PostgreSQL config looks like,
> but making the checkpoints less frequent and tuning the kernel/mount
> options may help a lot.

We can make checkpoints less frequent; on the kernel side, which options do you suggest tuning to minimize disk writes?

We have a table, about 500 MB, that is updated and rewritten every day. When a machine updates, the table is truncated and then re-populated with pg_bulk. But I think we write heavily when importing the new data tables. That is why we thought of putting some tables into a RAM drive...

Thank you,
Francesco
> We have a table, about 500 MB, that is updated and rewritten every day.
> When a machine updates, the table is truncated and then re-populated with pg_bulk.
> But I think we write heavily when importing the new data tables.
So this is static data that you update once per day from some external source? It seems like a good candidate to put into RAM, if you have enough. Make sure your RAM disk is not backed by swap, though.
On 04/23/15 15:36, Job wrote:
> We can make checkpoints less frequent; on the kernel side, which
> options do you suggest tuning to minimize disk writes?

You may make the pdflush configuration less aggressive, but that may not help with your workload. You should use TRIM (or run fstrim regularly), move /tmp onto a tmpfs, and not put swap on the SSD.

> We have a table, about 500 MB, that is updated and rewritten every day.
> When a machine updates, the table is truncated and then re-populated
> with pg_bulk. But I think we write heavily when importing the new data
> tables.

In that case the checkpoint optimizations or kernel tuning probably won't help much. But if you can easily recreate the database, and it fits into RAM, then you can just place it on a tmpfs.

> That is why we thought of putting some tables into a RAM drive...

Well, technically you can do that, but don't expect the database to work after a crash or a reboot. You might keep a snapshot of the database (e.g. using pg_basebackup), and use it to 'seed' the database after a server restart. But don't expect the database to start without a tablespace that got lost because it was placed on a tmpfs or something.

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
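A sketch of those OS-side pieces (paths and values are illustrative; adapt them to your distribution):

# /etc/sysctl.conf - less aggressive dirty-page writeback
vm.dirty_background_ratio = 5
vm.dirty_ratio = 40

# /etc/fstab - noatime on the SSD filesystem, /tmp on tmpfs
/dev/sda1  /var/lib/pgsql  ext4   noatime    0 2
tmpfs      /tmp            tmpfs  size=512m  0 0

# crontab - periodic TRIM instead of the 'discard' mount option
@weekly /sbin/fstrim /var/lib/pgsql

Note that tmpfs pages can be swapped out, so keep swap off such a machine (or use ramfs) if that matters.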
- Another configuration parameter to look into is effective_io_concurrency. For SSDs we typically set it to one I/O per channel of the controller card, not including the RAID parity drives. If you decrease this value, PostgreSQL will not try to initiate as many parallel I/O operations.
http://www.postgresql.org/docs/8.4/static/runtime-config-resource.html
- A strategy Instagram used is to (manually) vertically partition tables that have some, but not all, columns updated frequently. When PostgreSQL updates a data value in a column, it writes a new copy of the entire row and marks the original row as garbage. If you have tables with many rows where most columns are updated infrequently (Instagram's example was a user-information table where "last login" was updated very frequently but other information about the user rarely changed), you can split the frequently updated columns into a separate table to reduce I/O. Note, however, that PostgreSQL has no built-in support for vertical partitioning, so to employ this technique you would need to partition manually and update your SQL code accordingly (see the sketch after this list).
- Also check your WAL and logging settings to make sure you aren't writing more to them than you need to.
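As referenced above, a minimal sketch of the manual vertical split (all table and column names are hypothetical):

-- original wide table: users(id, name, email, last_login), where only
-- last_login changes often; split the hot column into its own table
CREATE TABLE users (
    id    integer PRIMARY KEY,
    name  text,
    email text
);

CREATE TABLE user_last_login (
    user_id    integer PRIMARY KEY REFERENCES users (id),
    last_login timestamp with time zone
);

-- a login now rewrites one narrow row instead of the whole user row
UPDATE user_last_login SET last_login = now() WHERE user_id = 42;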
On 4/23/15 8:36 AM, Job wrote:
> Hello, thank you first of all for your wonderful help!
>
> Tomas, regarding:
>
>> There are ways to make the writes less frequent, both at the database
>> and OS level. We don't know what your PostgreSQL config looks like,
>> but making the checkpoints less frequent and tuning the kernel/mount
>> options may help a lot.
>
> We can make checkpoints less frequent; on the kernel side, which
> options do you suggest tuning to minimize disk writes?
Certainly disable atime updates if you haven't already.
Having a long checkpoint period is somewhat similar to running in a RAM disk and doing periodic backups.

BTW, if you want to try using backups, I recommend you set up actual PITR archiving to the SSD. That will write data sequentially and in larger chunks, which should help the SSD better deal with the writes. This will give you more control over how much data you lose during a crash. Though, if all you do is a single large update once a day, you're probably better off just taking a backup right after the update.

I would also look at the backup size and recovery time of pg_dump vs. PITR or a filesystem snapshot; it could be significantly smaller. It might take longer to restore, though.
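A sketch of that PITR archiving setup in postgresql.conf (the archive path is a placeholder; both parameters exist on 8.4):

archive_mode = on
archive_command = 'test ! -f /mnt/ssd/wal_archive/%f && cp %p /mnt/ssd/wal_archive/%f'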
BTW, if you go the RAM disk route you should turn off fsync; there's no point in the extra calls to the kernel. Only do that if the ENTIRE database is in a RAM disk, though.

> We have a table, about 500 MB, that is updated and rewritten every day.
> When a machine updates, the table is truncated and then re-populated with pg_bulk.
> But I think we write heavily when importing the new data tables.
That depends on how much data has actually changed. If most of the data in the table is changed then truncate and load will be your best bet. OTOH if relatively little of the data has changed you'll probably get much better results by loading the data into a loading table and then updating changed data, deleting data that shouldn't be there anymore, and inserting new data.
You definitely want the loading table not to be on the SSD, and not to be WAL-logged. On 8.4 that means putting it in a tablespace on a RAM disk, or using a true temporary table (CREATE TEMP TABLE), which is not logged, though I don't know how well that will work with pg_bulk. (UNLOGGED tables, which give the same effect on a regular table, only arrived in PostgreSQL 9.1, so they aren't available on 8.4.)
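A sketch of that load-and-merge pattern, assuming a hypothetical target(id, val) table and CSV input (on 8.4 a temp table stands in for an unlogged one):

-- temp table: not WAL-logged, dropped automatically at session end
CREATE TEMP TABLE staging (LIKE target);
COPY staging FROM '/tmp/new_data.csv' WITH CSV;

-- apply only the differences to the real table
UPDATE target t
   SET val = s.val
  FROM staging s
 WHERE s.id = t.id AND s.val IS DISTINCT FROM t.val;

DELETE FROM target WHERE id NOT IN (SELECT id FROM staging);

INSERT INTO target
SELECT s.* FROM staging s
 WHERE NOT EXISTS (SELECT 1 FROM target t WHERE t.id = s.id);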
You also need to consider the indexes. First, make absolutely certain you need all of them; get rid of every one that isn't required. Second, you need to test the amount of data that's written during an update with the indexes in place *and doing a subsequent VACUUM*, compared to dropping all the indexes and re-creating them.
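And for the daily truncate-and-reload itself, the drop-and-recreate variant is a sketch like this (index and table names hypothetical):

DROP INDEX IF EXISTS target_val_idx;   -- IF EXISTS works on 8.4 (since 8.2)
TRUNCATE target;
-- ... bulk load here (COPY or pg_bulk) ...
CREATE INDEX target_val_idx ON target (val);
ANALYZE target;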
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com