> - Pg doesn't know the erase block sizes or positions. It can't group
> writes up by erase block except by hoping that, within a given file,
> writing in page order will get the blocks to the disk in roughly
> erase-block order. So your write caching isn't going to do anywhere near
> as good a job as the SSD's can.
>
Okay, I see. We cannot query the erase block size from an SSD. :-(
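
Just to make the problem concrete, here is a rough Python sketch of the
grouping Pg would have to do if it somehow did know the erase block
size. The 512 kB erase block size is purely an assumption (the drive
does not report it), the function name is made up for illustration, and
the sketch pretends that file offsets map directly onto the drive's
physical layout, which the filesystem and the drive's own remapping do
not guarantee. Only the 8 kB page size is Pg's real default.

```python
from collections import defaultdict

PAGE_SIZE = 8 * 1024           # PostgreSQL's default block size
ERASE_BLOCK_SIZE = 512 * 1024  # ASSUMPTION: the drive does not tell us this


def group_by_erase_block(dirty_pages, erase_block_size=ERASE_BLOCK_SIZE):
    """Hypothetical helper: map erase block index -> sorted dirty pages.

    Assumes page N of the file starts at byte N * PAGE_SIZE on the drive,
    which is not something Pg can actually rely on.
    """
    pages_per_block = erase_block_size // PAGE_SIZE
    groups = defaultdict(list)
    for page in sorted(dirty_pages):
        groups[page // pages_per_block].append(page)
    return dict(groups)


if __name__ == "__main__":
    # Dirty pages of one file, in arbitrary order.
    for block, pages in group_by_erase_block([0, 63, 64, 130, 5]).items():
        print(block, pages)
```

Without the real erase block size and alignment, the grouping above is
only a guess, so the drive's own cache would still do a better job.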
>> I don't think that any SSD drive has more than some
>> megabytes of write cache.
>>
>
> The big, lots-of-$$ ones have HUGE battery backed caches for exactly
> this reason.
>
Heh, this is why they are so expensive. :-)
>> The same amount of write cache could easily be
>> implemented in OS memory, and then Pg would always know what hit the disk.
>>
>
> Really? How does Pg know what order the SSD writes things out from its
> cache?
>
I get the point. We cannot implement an efficient write cache without
knowing much more about how that particular drive works.
So... the only solution that works well is to have much more RAM for
read cache, and much more RAM for write cache inside the RAID controller
(with a BBU).
Thank you,
Laszlo