On 2021-05-02 14:49:44 +0100, Pól Ua Laoínecháin wrote:
> While perusing the interweb, I stumbled on this very interesting blog
> post from TiDB.
>
>
> https://pingcap.com/blog/tikv-and-spdk-pushing-the-limits-of-storage-performance
>
> It talks about the Storage Performance Development Kit (SPDK) (spdk.io).

This certainly sounds interesting. Without reading up on the details,
however, I notice that it implements a file system in user space, which
means that the kernel cannot enforce any permissions. This may be ok for
a database (which typically runs as a single OS user anyway), but it is
something to consider. It looks somewhat similar to Oracle's "raw device
tablespaces" of the 1980s. By the time I got involved in database
programming in the late 1990s these were considered obsolete (negligible
performance advantage, but a hassle for the DBA). Maybe with NVMe SSDs
and persistent memory like Intel Optane it is time to revisit that idea.

There are less intrusive possibilities, though: Linux has recently
(kernel 5.1 - oh, that is already 2 years old) acquired a new async I/O
API named io_uring, which avoids most of the system call overhead by
passing requests and completions through ring buffers shared between
the application and the kernel. I haven't played around with it myself,
but some blog posts report quite substantial performance improvements,
in some cases approaching the theoretical limits of the (very fast)
SSDs used.

> Will this have any implications for PostgreSQL, given that it is a db
> that compiles/runs on a large number of systems - or can subsystems
> such as this be integrated/included for those chips which support it?
> This particular SDK appears to be Intel specific, but if one chip
> manufacturer can do it, can't they all (eventually)?

I don't think the chipset makes much of a difference. It's an open
source library written in C - it can almost certainly be recompiled for
ARM or whatever. It will almost certainly be Linux-specific, though.
It's probably possible to write something similar for Windows, macOS,
FreeBSD, etc., but I have no idea how hard that may be (maybe you just
have to change a few system calls - maybe you have to rewrite 80 percent
of it).

More important for PostgreSQL is whether something like this can be
incorporated without changing the overall architecture: If you just have
to change a handful of functions performing low-level I/O, it may be
worthwhile even if only a few systems (Linux systems where the DBA is
willing to set up devices for direct access from user space) benefit
from it. If it means rewriting large parts of PostgreSQL, and some
platforms then cannot be supported at all or only at reduced
performance, this is not an option.
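
To make the "handful of functions" case a bit more concrete, here is a
minimal sketch in C of what I have in mind. All names in it (io_backend,
select_backend, roundtrip) are made up for illustration; this is not how
PostgreSQL structures its I/O code, and no real io_uring or SPDK backend
is included. The only assumption is a POSIX system with pread/pwrite.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical I/O backend table; these names are invented for this
 * sketch and are not taken from PostgreSQL, SPDK or liburing. */
typedef struct io_backend {
    const char *name;
    ssize_t (*read_at)(int fd, void *buf, size_t len, off_t off);
    ssize_t (*write_at)(int fd, const void *buf, size_t len, off_t off);
} io_backend;

/* Portable fallback: plain POSIX positional I/O. */
static ssize_t posix_read_at(int fd, void *buf, size_t len, off_t off)
{
    return pread(fd, buf, len, off);
}

static ssize_t posix_write_at(int fd, const void *buf, size_t len, off_t off)
{
    return pwrite(fd, buf, len, off);
}

static const io_backend posix_backend = {
    "posix", posix_read_at, posix_write_at
};

/* A platform-specific backend (io_uring, SPDK, ...) would be chosen here
 * on systems that support it. This sketch only ever returns the
 * fallback, so every platform keeps working. */
static const io_backend *select_backend(void)
{
    return &posix_backend;
}

/* Stand-in for higher-level code: writes a string to a scratch file and
 * reads it back, touching storage only through the backend table.
 * Returns 0 on success, -1 on any error. */
static int roundtrip(const io_backend *io, const char *path,
                     const char *data, char *out, size_t out_size)
{
    size_t len = strlen(data) + 1;  /* include the terminating NUL */
    int fd, ok;

    assert(io != NULL);
    if (len > out_size)
        return -1;
    fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    ok = io->write_at(fd, data, len, 0) == (ssize_t) len
      && io->read_at(fd, out, len, 0) == (ssize_t) len;
    close(fd);
    unlink(path);
    return ok ? 0 : -1;
}
```

If the higher layers only ever go through such a table, a faster backend
is an optional extra for the platforms that have one. If they call the
operating system directly all over the place, you are in the "rewrite
large parts" case.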

        hp

-- 
   _  | Peter J. Holzer    | Story must make more sense than reality.
|_|_) |                    |
| |   | hjp@hjp.at         |    -- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |       challenge!"