Onni,
First, this looks like a solution (a fixed number and type of disks) in search of a problem. It's better to
consider what the right mix of disks is for your application and server.
To choose the best physical layout, you need to know the logical access patterns to your data. When you access
your data, are you doing random lookups, index scans, table scans, or computing aggregates? Which queries are most
important, and what response times are you targeting?
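If you're not sure what your access patterns look like, the cumulative statistics views can give a rough picture of which tables are scanned most; a minimal sketch:

    -- Approximate scan activity per table (pg_stat_user_tables is built in)
    SELECT relname,
           seq_scan,   -- number of sequential (full table) scans
           idx_scan,   -- number of index scans
           n_live_tup  -- estimated live rows
    FROM pg_stat_user_tables
    ORDER BY coalesce(seq_scan, 0) + coalesce(idx_scan, 0) DESC
    LIMIT 10;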
Once you have an idea of your logical access patterns, you map them to your physical layout.
Postgres tablespaces are useful here, as they let you place tables and indexes on different physical devices. For example,
it may be better to put small, frequently used tables on the SSDs, or just put frequently used indexes on the SSDs.
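A minimal sketch of that, assuming the SSD pair is mounted at /mnt/ssd and the HDD array at /mnt/hdd (the paths, tablespace names, and table/index names below are hypothetical):

    -- Directories must already exist and be owned by the postgres OS user
    CREATE TABLESPACE fast_ssd LOCATION '/mnt/ssd/pgdata';
    CREATE TABLESPACE bulk_hdd LOCATION '/mnt/hdd/pgdata';

    -- Keep a hot table on the SSDs
    CREATE TABLE recent_events (
        id         bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
        created_at timestamptz NOT NULL DEFAULT now(),
        payload    jsonb
    ) TABLESPACE fast_ssd;

    -- Move a frequently used index of an existing table onto the SSDs
    ALTER INDEX old_events_created_at_idx SET TABLESPACE fast_ssd;

    -- Push cold archival data to the HDD pool
    ALTER TABLE old_events SET TABLESPACE bulk_hdd;

Keep in mind that ALTER ... SET TABLESPACE physically rewrites the relation and holds a lock while it does, so moving very large tables is itself a heavy operation you'd want to plan for.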
There are many factors to consider when planning a physical storage layout. There could be a need for AI apps to do
just that!
-Tim
> On 10/16/2024 7:06 AM PDT Onni Hakala <onni@fyff.ee> wrote:
>
>
> Hey,
>
> I have a large dataset of > 100TB, which would be very expensive to store solely on SSDs.
>
> I have access to a server which has 2x 3.84TB NVMe SSDs and a large array of HDDs (8x 22TB).
>
> Most of the data that I have in my dataset is very rarely accessed and is stored only for archival purposes.
>
> What would be the de facto way to use both SSDs and HDDs together, where commonly used data would be fast
> to access and old data would eventually be stored only in compressed form on the HDDs?
>
> I was initially looking into building a zpool using ZFS with raidz3 and zstd compression for my HDDs, but I'm unsure how
> to add the SSDs into this equation. I thought this is probably a common scenario and wanted to ask for opinions
> here.
>
> Thanks in advance,
> Onni Hakala