Thread: SSD filesystem aligned to DBMS

SSD filesystem aligned to DBMS

From
Neto pr
Date:
Hi all

Sorry, I'm not sure this question is appropriate for this list, but I need to prepare the file system on an SSD in the way that was recommended to me, so that the SSD works optimally. The disk is a Samsung 850 Evo, 500 GB, SATA III 6Gb/s:
http://www.samsung.com/semiconductor/minisite/ssd/product/consumer/850evo/

Someone on the list told me that the partition should be aligned to 3072 rather than the default 2048, so that it starts on an erase block boundary, and that the fs block size should be 8 kB.

Can you give me a hint as to which program I could use to do this? I have already used fdisk, but I do not know how to do this with it. I am using Debian 8 (Jessie) 64-bit with an ext4 file system.

If you prefer, reply to me directly, since the subject is not about PostgreSQL itself: netoprbr9@gmail.com

Best Regards
Neto

Re: SSD filesystem aligned to DBMS

From
Scott Marlowe
Date:
On Tue, Jan 16, 2018 at 7:47 AM, Neto pr <netoprbr9@gmail.com> wrote:
> Hi all
>
> Sorry, I'm not sure this question is appropriate for this list, but I
> need to prepare the file system on an SSD in the way that was
> recommended to me, so that the SSD works optimally. The disk is a
> Samsung 850 Evo, 500 GB, SATA III 6Gb/s:
> http://www.samsung.com/semiconductor/minisite/ssd/product/consumer/850evo/
>
> Someone on the list told me that the partition should be aligned to
> 3072 rather than the default 2048, so that it starts on an erase block
> boundary, and that the fs block size should be 8 kB.
>
> Can you give me a hint as to which program I could use to do this? I
> have already used fdisk, but I do not know how to do this with it. I am
> using Debian 8 (Jessie) 64-bit with an ext4 file system.

fdisk is pretty old and can't handle larger disks. You can get a fair
bit of control over the process with parted, but it takes some getting
used to. As far as I know, Linux's ext4 has a maximum block size of
4k. I can't imagine alignment matters to SSDs, and I would take any
such advice with a large grain of salt. If I had questions about
performance, I'd test it to see. I'm willing to bet a couple bucks
it makes ZERO difference.

>
> If you prefer, reply to me directly, since the subject is not about
> PostgreSQL itself: netoprbr9@gmail.com

No, this affects everybody who uses SSDs, so let's keep it on list if we can.


Re: SSD filesystem aligned to DBMS

From
Michael Loftis
Date:

On Tue, Jan 16, 2018 at 08:02 Scott Marlowe <scott.marlowe@gmail.com> wrote:

> fdisk is pretty old and can't handle larger disks. You can get a fair
> bit of control over the process with parted, but it takes some getting
> used to. As far as I know, Linux's ext4 has a maximum block size of
> 4k. I can't imagine alignment matters to SSDs, and I would take any
> such advice with a large grain of salt. If I had questions about
> performance, I'd test it to see. I'm willing to bet a couple bucks
> it makes ZERO difference.

Alignment definitely makes a difference for writes. It can also make a difference for random reads, since the underlying read may not line up with the hardware. Add in read-ahead (at the drive or OS level) and you’re reading far more data from the drive than the OS asks for.

Stupidly, many SSD manufacturers don’t publish this information, but it shows up in benchmarks.

Another potential difference here with SAS vs SATA is the maximum queue depth supported by the protocol and drive.

SSD drives also do internal housekeeping tasks for wear leveling on writing.

I’ve seen SSD drives benchmark at 80-90 MB/s for sequential reads or writes; change the alignment, and you’ll get 400+ MB/s on the same drive for sequential reads (changing nothing else).





--

"Genius might be described as a supreme capacity for getting its possessors
into trouble of all kinds."
-- Samuel Butler

Re: SSD filesystem aligned to DBMS

From
Neto pr
Date:


2018-01-16 8:50 GMT-08:00 Michael Loftis <mloftis@wgops.com>:


> Alignment definitely makes a difference for writes. It can also make a
> difference for random reads, since the underlying read may not line up
> with the hardware. Add in read-ahead (at the drive or OS level) and
> you’re reading far more data from the drive than the OS asks for.
>
> Stupidly, many SSD manufacturers don’t publish this information, but it
> shows up in benchmarks.
>
> Another potential difference here with SAS vs SATA is the maximum queue
> depth supported by the protocol and drive.
>
> SSD drives also do internal housekeeping tasks for wear leveling on writing.
>
> I’ve seen SSD drives benchmark at 80-90 MB/s for sequential reads or
> writes; change the alignment, and you’ll get 400+ MB/s on the same drive
> for sequential reads (changing nothing else).




Hi all,

While searching, I found that in the past, proper alignment required manual calculation and intervention when partitioning. Many of the common partitioning tools now handle partition alignment automatically.

For example, on an already partitioned disk, you can use parted (https://wiki.archlinux.org/index.php/GNU_Parted#Check_alignment) to verify the alignment of a partition on a device in Linux. I ran this example on my Samsung SSD 850 Evo 500GB; see below:

-------BEGIN PARTED TOOL ---------------------------------------------------------------------------------
root@hp2ml110deb:parted /dev/sdb
(parted) print list                                                      
Model: ATA Samsung SSD 850 (scsi)
Disk /dev/sdb: 500GB
Sector size (logical/physical): 512B/512B
Partition Table: loop
Disk Flags:

Number  Start  End    Size   File system  Flags
 1      0.00B  500GB  500GB  ext4

Model: ATA MB1000GCWCV (scsi)
Disk /dev/sda: 1000GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags:

Number  Start   End     Size    File system     Name  Flags
 1      1049kB  538MB   537MB   fat32                 boot, esp
 2      538MB   992GB   991GB   ext4
 3      992GB   1000GB  8319MB  linux-swap(v1)

(parted) select /dev/sdb
Using /dev/sdb
(parted) align-check
alignment type(min/opt)  [optimal]/minimal? opt                          
Partition number? 1                                                      
1 aligned
(parted)      
---------------------------- END --------------------------------------------------
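
As a rough illustration of what such a check amounts to, the sketch below tests whether a partition's starting byte offset is a multiple of a chosen alignment. The helper name is_aligned and the 1 MiB default are assumptions made for illustration, and the 1049kB start shown for /dev/sda1 is assumed to be a rounded display of 1,048,576 bytes. parted bases its own check on values reported by the device, so this is not its implementation.

```python
MIB = 1024 * 1024


def is_aligned(start_bytes: int, alignment_bytes: int = MIB) -> bool:
    """Return True if a partition starting at start_bytes sits on an
    alignment_bytes boundary (hypothetical helper, not part of parted)."""
    if alignment_bytes <= 0:
        raise ValueError("alignment must be positive")
    return start_bytes % alignment_bytes == 0


# /dev/sdb1 above starts at 0.00B (file system on the whole device).
sdb1_aligned = is_aligned(0)
# /dev/sda1 above starts at 1049kB, assumed here to mean 1,048,576 bytes.
sda1_aligned = is_aligned(1_048_576)
# A start at sector 63 with 512-byte sectors is not 1 MiB aligned.
legacy_aligned = is_aligned(63 * 512)

print(sdb1_aligned, sda1_aligned, legacy_aligned)  # True True False
```

Note that a start of 0.00B, as on /dev/sdb here, passes any such check trivially: "Partition Table: loop" means the file system sits directly on the device, with no partition table in front of it.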

Regards
Neto



 



Re: SSD filesystem aligned to DBMS

From
George Neuner
Date:
On Tue, 16 Jan 2018 16:50:28 +0000, Michael Loftis <mloftis@wgops.com>
wrote:

>Alignment definitely makes a difference for writes. It can also make a
>difference for random reads, since the underlying read may not line up
>with the hardware. Add in read-ahead (at the drive or OS level) and
>you’re reading far more data from the drive than the OS asks for.

Best performance will be when the filesystem block size matches the
SSD's writeable *data* block size.  The SSD also has a separate erase
sector size which is some (large) multiple of the data block size.


<background>
Recall that an SSD doesn't overwrite existing data blocks.  When you
update a file, the updates are written out to *new* "clean" data
blocks, and the file's block index is updated to reflect the new
structure.  

The old data blocks are marked "free+dirty".  They must be erased
(become "free+clean") before reuse.  Depending on the drive size, the
SSD's erase sectors may be anywhere from 64MB..512MB in size, and so a
single erase sector will hold many individually writeable data blocks.

When an erase sector is cleaned, ALL the data blocks it contains are
erased.  If any still contain good data, they must be relocated before
the erase can be done.
</background>
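
A toy model of the background above, as a minimal sketch: the state names ("valid", "dirty", "clean") and the eight-block erase sector are made up for illustration, and a real drive's firmware is far more involved. It only shows that blocks still holding good data must be relocated before the whole erase sector can be erased.

```python
from typing import List


def relocations_needed(erase_sector: List[str]) -> int:
    """Count the blocks that still hold good data and therefore must be
    copied elsewhere before the whole erase sector can be erased."""
    return sum(1 for state in erase_sector if state == "valid")


def erase(erase_sector: List[str]) -> List[str]:
    """Erasing resets every block in the sector to free+clean."""
    return ["clean"] * len(erase_sector)


# "dirty" = free+dirty (stale data), "clean" = free+clean, "valid" = good data.
sector = ["valid", "dirty", "dirty", "valid", "dirty", "clean", "dirty", "dirty"]

moved = relocations_needed(sector)  # good blocks to relocate first
sector = erase(sector)              # then the whole sector is erased at once

print(moved, sector.count("clean"))  # 2 8
```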


You don't want your filesystem block to be smaller than the SSD data
block, because then you are subject to *unnecessary* write
amplification: the drive controller has to read/modify/write a whole
data block to change any part of it.

But, conversely, filesystem blocks that are larger than the SSD write
block typically are not a problem because ... unless you do something
really stupid [with really low level code] ... the large filesystem
blocks will end up being an exact multiple of the data block size.
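
The arithmetic behind the two paragraphs above can be sketched as follows. The 8 KiB data block size is an assumption for illustration (taken from the 8 kB figure mentioned earlier in the thread, not from a datasheet), and the model simply assumes the drive rewrites every data block that a host write touches.

```python
def flash_bytes_written(offset: int, length: int, data_block: int) -> int:
    """Bytes the drive must program for a host write of `length` bytes at
    byte `offset`, assuming it rewrites every data block the write touches."""
    if length <= 0 or data_block <= 0:
        raise ValueError("length and data_block must be positive")
    first = offset // data_block
    last = (offset + length - 1) // data_block
    return (last - first + 1) * data_block


def write_amplification(offset: int, length: int, data_block: int) -> float:
    """Ratio of bytes programmed on flash to bytes the host asked to write."""
    return flash_bytes_written(offset, length, data_block) / length


DATA_BLOCK = 8192  # assumed SSD data block size, for illustration only

small = write_amplification(0, 4096, DATA_BLOCK)       # fs block smaller than data block
exact = write_amplification(0, 8192, DATA_BLOCK)       # fs block matches data block
multiple = write_amplification(0, 16384, DATA_BLOCK)   # fs block is an exact multiple
shifted = write_amplification(4096, 8192, DATA_BLOCK)  # right size, but misaligned

print(small, exact, multiple, shifted)  # 2.0 1.0 1.0 2.0
```

The last case shows why size alone is not enough: a block of the right size that straddles two data blocks costs as much as one that is too small.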


Much of the literature re: alignment actually is related to the erase
sectors rather than the data blocks and is targeted at embedded
systems that are not using conventional filesystems but rather are
accessing the raw SSD.

You do want your partitions to start on erase sector boundaries, but
that usually is trivial to do.
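
A minimal sketch of that calculation, assuming 512-byte logical sectors (as parted reports in this thread) and a 1.5 MiB erase sector. The 1.5 MiB value is only what the "3072 rather than 2048" advice would imply (3072 x 512 bytes); it is not a published figure for this drive, and next_aligned_sector is a hypothetical helper.

```python
SECTOR = 512  # logical sector size reported by parted in this thread


def next_aligned_sector(start_sector: int, erase_bytes: int,
                        sector_bytes: int = SECTOR) -> int:
    """Round start_sector up to the next erase sector boundary."""
    if erase_bytes <= 0 or erase_bytes % sector_bytes != 0:
        raise ValueError("erase size must be a positive whole number of sectors")
    per_erase = erase_bytes // sector_bytes
    # Ceiling division using integers only.
    return -(-start_sector // per_erase) * per_erase


ERASE = 1536 * 1024  # assumed 1.5 MiB erase sector, implied by the 3072 figure

print(next_aligned_sector(2048, ERASE))  # 3072: the default start gets moved up
print(next_aligned_sector(3072, ERASE))  # 3072: already on a boundary
print(next_aligned_sector(0, ERASE))     # 0: whole-device file system
```

parted accepts start positions given in sectors (for example 3072s), so a value computed this way can be used directly when creating the partition.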


>Stupidly, many SSD manufacturers don’t publish this information, but it
>shows up in benchmarks.

Yes.  The advice to match your filesystem to the data block size is
not often given.


>Another potential difference here with SAS vs SATA is the maximum queue
>depth supported by the protocol and drive.

Yes. The interface, and how it is configured, matters greatly.


>SSD drives also do internal housekeeping tasks for wear leveling on writing.

The biggest of which is always writing to a new location.  Enterprise-
grade SSDs sometimes perform erases ahead of time during idle periods,
but cheap drives often wait until the free+dirty space is about to be
reused.


>I’ve seen SSD drives benchmark at 80-90 MB/s for sequential reads or
>writes; change the alignment, and you’ll get 400+ MB/s on the same drive
>for sequential reads (changing nothing else).
>
>A specific example
>https://www.servethehome.com/ssd-alignment-quickly-benchmark-ssd/

I believe you have seen it, but if the read performance changed that
drastically, then the controller/driver was doing something awfully
stupid ... e.g., re-reading the same data block for each filesystem
block it contains.


YMMV.
George