Thread: Is this the future of I/O for the RDBMS?
Hi all,

Kind of a followup to my question: "PostgreSQL, Asynchronous I/O, Buffered I/O and why did fsync-gate not affect Oracle or MySQL?", I have another. The blog below kinda got me thinking about all of this.

I have an interest in NewSQL distributed systems - in particular CockroachDB, TiDB and YugaByte. Their architectures are fairly similar - hardly surprising since they are all (partially) F/LOSS clones of the Google Spanner/F1 systems. They use an underlying KV store, put an SQL interface above that, and use a Raft (or other) consensus algorithm to coordinate and whatnot.

While perusing the interweb, I stumbled on this very interesting blog post from TiDB.

https://pingcap.com/blog/tikv-and-spdk-pushing-the-limits-of-storage-performance

It talks about the Storage Performance Development Kit (SPDK) (spdk.io). The blog appears to think this system is a panacea and manna from heaven rolled into one:

> This solution solves the four problems we mentioned earlier: it removes the syscall overhead, uses data structures and caching algorithms more suitable for databases and NVMe disks, and simplifies file system logging.

Given that much/most database processing is I/O bound, I'm just wondering if this system is all it's cracked up to be, or are we at the "Mass Media Hype Begins" stage, and am I about to jump off the cliff edge of the "Peak of Inflated Expectations" and fall headlong into the "Trough of Disillusionment"? (see: https://en.wikipedia.org/wiki/Hype_cycle).

Will this have any implications for PostgreSQL, given that it is a db that compiles/runs on a large number of systems - or can subsystems such as this be integrated/included for those chips which support it? This particular SDK appears to be Intel specific, but if one chip manufacturer can do it, can't they all (eventually)?

If this isn't the appropriate forum for discussing these matters, then please indicate a suitable forum.

TIA and rgs,

Pól Ua...
On 2021-05-02 14:49:44 +0100, Pól Ua Laoínecháin wrote:
> While perusing the interweb, I stumbled on this very interesting blog
> post from TiDB.
>
> https://pingcap.com/blog/tikv-and-spdk-pushing-the-limits-of-storage-performance
>
> It talks about the Storage Performance Development Kit (SPDK) (spdk.io).

This certainly sounds interesting. Without reading up on the details, however, I notice that it implements a file system in user space, which means that the kernel cannot enforce any permissions. This may be ok for a database (which typically runs as a single OS user anyway), but it is something to consider.

It looks somewhat similar to Oracle's "raw device tablespaces" of the 1980s. By the time I got involved in database programming in the late 1990s these were considered obsolete (negligible performance advantage, but a hassle for the DBA). Maybe with NVMe SSDs and persistent memory like Intel Optane it is time to revisit that idea.

There are less intrusive possibilities, though: Linux has recently (kernel 5.1 - oh, that is already 2 years old) acquired a new async I/O API named io_uring, which eliminates most of the system call overhead. I haven't played around with it myself, but some blog posts report quite substantial performance improvements, in some cases approaching the theoretical limits of the (very fast) SSDs used.

> Will this have any implications for PostgreSQL, given that it is a db
> that compiles/runs on a large number of systems - or can subsystems
> such as this be integrated/included for those chips which support it?
> This particular SDK appears to be Intel specific, but if one chip
> manufacturer can do it, can't they all (eventually)?

I don't think the chipset makes much of a difference. It's an open source library written in C - it can almost certainly be recompiled for ARM or whatever. It will almost certainly be Linux-specific, though. It's probably possible to write something similar for Windows, MacOS, FreeBSD, etc., but I have no idea how hard that may be (maybe you just have to change a few system calls - maybe you have to rewrite 80 percent of it).

More important for PostgreSQL is whether something like this can be incorporated without changing the overall architecture: If you just have to change a handful of functions performing low-level I/O, it may be worthwhile even if only a few systems (Linux systems where the DBA is willing to set up devices for direct access from user space) benefit from it. If it means rewriting large parts of postgres, and then some platforms cannot be supported at all or only at reduced performance, this is not an option.

        hp

-- 
   _  | Peter J. Holzer    | Story must make more sense than reality.
|_|_) |                    |
| |   | hjp@hjp.at         |    -- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |       challenge!"
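
The point above about system call overhead can be made concrete with a small experiment. The following is only a rough sketch in Python, not io_uring and not SPDK: it reads the same file with ordinary blocking `os.pread()` calls at different chunk sizes, so that the number of kernel round trips changes while the amount of data stays the same. The helper names `read_in_chunks` and `make_test_file` are made up for this illustration, and it assumes a Unix-like system where `os.pread` is available.

```python
import os
import tempfile
import time


def read_in_chunks(path, chunk_size):
    """Read a file front to back with os.pread().

    Returns (bytes_read, syscalls_issued, elapsed_seconds).
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        total = calls = offset = 0
        start = time.perf_counter()
        while True:
            buf = os.pread(fd, chunk_size, offset)  # one system call per iteration
            calls += 1
            if not buf:  # an empty result means end of file
                break
            total += len(buf)
            offset += len(buf)
        return total, calls, time.perf_counter() - start
    finally:
        os.close(fd)


def make_test_file(size):
    """Create a temporary file of `size` random bytes and return its path."""
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(os.urandom(size))
        return f.name


if __name__ == "__main__":
    path = make_test_file(4 * 1024 * 1024)
    try:
        for chunk in (512, 4096, 1024 * 1024):
            total, calls, secs = read_in_chunks(path, chunk)
            print(f"chunk={chunk:>8}  syscalls={calls:>5}  time={secs * 1e3:.2f} ms")
    finally:
        os.remove(path)
```

Because the freshly written file is most likely served from the page cache, any timing difference between the runs mostly reflects per-call overhead rather than device speed. That per-call cost is what interfaces such as io_uring (batched submission through shared rings) and SPDK (user-space drivers) try to avoid; the actual numbers will vary by machine and say nothing definite about either of them.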
Hi again Peter (and thanks again for your input)

> > https://pingcap.com/blog/tikv-and-spdk-pushing-the-limits-of-storage-performance
> > It talks about the Storage Performance Development Kit (SPDK) (spdk.io).

> It looks somewhat similar to Oracle's "raw device tablespaces"

Run far and run fast... Never worked with it but have vague memories of senior colleagues who did... some are still in therapy :-)

> Maybe with NVME SSDs
> and persistent memory like Intel Optane it is time to revisit that idea.

Plus ça change... just goes to show that there's rarely anything truly new in ICT...

> There are less intrusive possibilities, though: Linux has recently
> (kernel 5.1 - oh, that is already 2 years old) acquired a new async I/O
> API named io_uring,

I found this

https://thenewstack.io/how-io_uring-and-ebpf-will-revolutionize-programming-in-linux/

and a few other bits and pieces - really interesting stuff! I had read the term io_uring but hadn't appreciated what it was about - it does add another layer of complexity to my future study of this area - Linux I/O and db I/O (esp. PG) and how to tie it all together.

> More important for PostgreSQL is whether something like this can be
> incorporated without changing the overall architecture:

The one major architectural criticism that I regularly read about PG is that it uses a process per connection rather than threads:

https://rbranson.medium.com/10-things-i-hate-about-postgresql-20dbab8c2791
#5: Process-Per-Connection = Pain at Scale

I appreciate that the architecture can't be changed for every shiny new toy that comes along. However, it's frequently interesting to look at underlying assumptions and check to see if they're still valid.

Kind regards & thanks again.

Pól...

> hp