Re: Database Kernels and O_DIRECT - Mailing list pgsql-hackers
From | James Rogers |
---|---|
Subject | Re: Database Kernels and O_DIRECT |
Date | |
Msg-id | 1066161543.20750.95.camel@localhost.localdomain |
In response to |
[Linus Torvalds |
Responses |
Re: Database Kernels and O_DIRECT
|
List | pgsql-hackers |
On Sun, 2003-10-12 at 15:13, Greg Stark wrote:
> There's an interesting thread on linux-kernel right now about O_DIRECT and the
> kernel i/o APIs databases need. I noticed a connection between what they were
> discussing and the earlier discussions here and the pining for an interface to
> avoid having vacuum preempt other disk i/o.
>
> Someone from Oracle is on there explaining what Oracle's needs are. Perhaps
> someone more knowledgable than myself could explain what would most help
> postgres in this area.

There is an important difference between Oracle and Postgres that makes discussions of this complicated because the assumptions are different. Oracle runs on top of a database kernel, whereas Postgres does not. In the former case, it is very useful and conducive to better performance to have O_DIRECT and direct control of the I/O in general -- the more, the better. In the latter case (e.g. Postgres), it is more of a nuisance and difficult to exploit well.

The point of having a database kernel underneath the DBMS is two-fold.

First, it improves portability by acting as an operating system abstraction layer, replacing OS kernel services with its own equivalents (which may map to any number of mechanisms underneath). It is the reason Oracle is easily supported on so many operating systems; to port to a new OS, they only have to modify the database kernel, and they probably have a highly portable generic version to start with that they can then optimize for a given platform at their leisure. All the rest of Oracle's code only has to compile against and run on the virtual operating system that is their database kernel.

Second, where possible, the database kernel bypasses the OS kernel internally (e.g. O_DIRECT) and implements its own versions of the OS kernel services that are highly-tuned for database purposes. This often has significant performance benefits.
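As a rough illustration of the two points above (not code from Oracle or Postgres), the file layer of a database kernel might look something like the C sketch below. Every name in it (`dbk_file`, `dbk_open`, `dbk_alloc_block`, `dbk_read_block`, `dbk_write_block`, `DBK_BLOCK_SIZE`) is hypothetical, and the 4096-byte block size is an assumption. It tries O_DIRECT where the platform offers it and falls back to ordinary buffered I/O otherwise, so the calling code never sees the difference.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* exposes O_DIRECT on Linux/glibc */
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define DBK_BLOCK_SIZE 4096    /* assumed block size and buffer alignment */

/* Hypothetical handle: the only file abstraction the DBMS code would see. */
typedef struct {
    int fd;
    int direct;                /* 1 = OS page cache bypassed, 0 = buffered */
} dbk_file;

/* Open a data file, preferring O_DIRECT and falling back to buffered I/O. */
int dbk_open(dbk_file *f, const char *path)
{
    assert(f != NULL && path != NULL);
    f->fd = -1;
    f->direct = 0;
#ifdef O_DIRECT
    f->fd = open(path, O_RDWR | O_CREAT | O_DIRECT, 0600);
    if (f->fd >= 0)
        f->direct = 1;
#endif
    if (f->fd < 0)
        f->fd = open(path, O_RDWR | O_CREAT, 0600);
    return f->fd >= 0 ? 0 : -1;
}

/* Switch an open file back to buffered mode if direct I/O is rejected. */
static int dbk_drop_direct(dbk_file *f)
{
#ifdef O_DIRECT
    int flags = fcntl(f->fd, F_GETFL);
    if (f->direct && flags >= 0 &&
        fcntl(f->fd, F_SETFL, flags & ~O_DIRECT) == 0) {
        f->direct = 0;
        return 0;
    }
#else
    (void)f;
#endif
    return -1;
}

/* Allocate a zeroed, block-aligned buffer; O_DIRECT needs aligned memory. */
void *dbk_alloc_block(void)
{
    void *buf = NULL;
    if (posix_memalign(&buf, DBK_BLOCK_SIZE, DBK_BLOCK_SIZE) != 0)
        return NULL;
    memset(buf, 0, DBK_BLOCK_SIZE);
    return buf;
}

/* Write one whole block at the given block number. */
int dbk_write_block(dbk_file *f, long blockno, const void *buf)
{
    off_t off = (off_t)blockno * DBK_BLOCK_SIZE;
    ssize_t n = pwrite(f->fd, buf, DBK_BLOCK_SIZE, off);
    if (n < 0 && errno == EINVAL && dbk_drop_direct(f) == 0)
        n = pwrite(f->fd, buf, DBK_BLOCK_SIZE, off);   /* retry buffered */
    return n == DBK_BLOCK_SIZE ? 0 : -1;
}

/* Read one whole block at the given block number. */
int dbk_read_block(dbk_file *f, long blockno, void *buf)
{
    off_t off = (off_t)blockno * DBK_BLOCK_SIZE;
    ssize_t n = pread(f->fd, buf, DBK_BLOCK_SIZE, off);
    if (n < 0 && errno == EINVAL && dbk_drop_direct(f) == 0)
        n = pread(f->fd, buf, DBK_BLOCK_SIZE, off);    /* retry buffered */
    return n == DBK_BLOCK_SIZE ? 0 : -1;
}

/* Close the file and mark the handle as unused. */
void dbk_close(dbk_file *f)
{
    if (f->fd >= 0)
        close(f->fd);
    f->fd = -1;
}
```

In this sketch the rest of the DBMS would call only dbk_open() and the block routines, so a port to another OS would mean rewriting just these few functions. A real database kernel would of course also cover caching, scheduling, locking and so on.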
While it kind of looks like an OS on top of an OS, well-written database kernels often tend to exist almost parallel to the system kernel in certain respects, only using the system kernel where it is convenient or for future capabilities that have been stubbed out in the database kernel. Writing DBMS code to a database kernel almost always produces a more scalable system than writing to portable OS APIs because it eliminates the "lowest common denominator" effect.

Having a database kernel isn't really important unless you are a performance junkie or have to address really scalable database systems. Some more advanced DBMS features are easier to implement on a database kernel as a pragmatic concern, because the system model being implemented for is more database friendly. It lets the database take advantage of the more advanced features and optimizations of whatever operating system it is running on without the vast majority of the DBMS code base being aware of these significant differences.

I'd like to see Postgres move to a database kernel eventually for a lot of reasons, but it would be a relatively significant change. Maybe v8? :-)

Cheers,

-James Rogers
 jamesr@best.com