Thread: RAW I/O device
A project for raw I/O devices now exists on Linux
(http://oss.sgi.com/projects/rawio/). Is there any plan (even for the far
future) to support raw devices in PgSQL? (The TODO is silent on this.)

                                                Karel

----------------------------------------------------------------------
Karel Zak <zakkr@zf.jcu.cz>              http://home.zf.jcu.cz/~zakkr/

Docs:         http://docs.linux.cz                  (big docs archive)
Kim Project:  http://home.zf.jcu.cz/~zakkr/kim/      (process manager)
FTP:          ftp://ftp2.zf.jcu.cz/users/zakkr/     (C/ncurses/PgSQL)
----------------------------------------------------------------------
> A project for raw I/O devices now exists on Linux
> (http://oss.sgi.com/projects/rawio/). Is there any plan (even for the far
> future) to support raw devices in PgSQL? (The TODO is silent on this.)

Up to now we have kept the storage manager overhead in the system.
Actually, there is no way to tell which storage manager to use
for a particular table/index, so everything goes to the default,
which is the magnetic disk one that uses separate files for
each relation.

There was a discussion about simplifying it, but the
consensus was to leave it as is because it is the base for a
tablespace and/or raw device manager.

AFAIK, no one is working on it, so it must be really FAR
future. But the plan is still alive.

Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#========================================= wieck@debis.com (Jan Wieck) #
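
To illustrate what "telling which storage manager to use for a particular table/index" could look like, here is a minimal sketch in Python. It is not PostgreSQL code: every class, method, and path below is hypothetical and exists only for illustration. It shows a switch that sends every relation to a default (magnetic disk) manager unless a per-relation choice has been made, which is the capability the message says is missing.

```python
from typing import Dict


class StorageManager:
    """Hypothetical interface: maps a relation name to where it is stored."""
    name = "abstract"

    def location(self, relation: str) -> str:
        raise NotImplementedError


class MagneticDiskManager(StorageManager):
    """Default manager: each relation lives in its own file."""
    name = "magnetic disk"

    def __init__(self, data_dir: str) -> None:
        self.data_dir = data_dir

    def location(self, relation: str) -> str:
        return f"{self.data_dir}/{relation}"


class RawDeviceManager(StorageManager):
    """Hypothetical manager: all relations share one raw device."""
    name = "raw device"

    def __init__(self, device: str) -> None:
        self.device = device

    def location(self, relation: str) -> str:
        return f"{self.device}:{relation}"


class StorageSwitch:
    """Picks a manager per relation, falling back to the default."""

    def __init__(self, default: StorageManager) -> None:
        self.default = default
        self.overrides: Dict[str, StorageManager] = {}

    def assign(self, relation: str, manager: StorageManager) -> None:
        # This per-relation choice is the piece the message says does not exist.
        self.overrides[relation] = manager

    def manager_for(self, relation: str) -> StorageManager:
        return self.overrides.get(relation, self.default)


# Example paths and relation names are made up for the illustration.
switch = StorageSwitch(MagneticDiskManager("/data/base"))
switch.assign("big_table", RawDeviceManager("/dev/raw1"))
```

With no override, `switch.manager_for("small_table")` falls back to the magnetic disk manager, which mirrors the behaviour described above where everything goes to the default.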
On Mon, 6 Dec 1999, Jan Wieck wrote:

> > A project for raw I/O devices now exists on Linux
> > (http://oss.sgi.com/projects/rawio/). Is there any plan (even for the far
> > future) to support raw devices in PgSQL? (The TODO is silent on this.)
>
> Up to now we have kept the storage manager overhead in the system.
> Actually, there is no way to tell which storage manager to use
> for a particular table/index, so everything goes to the default,
> which is the magnetic disk one that uses separate files for
> each relation.
>
> There was a discussion about simplifying it, but the
> consensus was to leave it as is because it is the base for a
> tablespace and/or raw device manager.
>
> AFAIK, no one is working on it, so it must be really FAR
> future. But the plan is still alive.

I raise the question because, with raw devices, the Linux kernel is
opening a new way to a faster and better database engine. I know (and agree)
that it is not a priority for the next year(s?). But it is interesting, and
it is probably good to keep it in mind during development, and not to write
(in the future) features that close off this good path.

                                                Karel
I may be out of place on this one, but remember that postgres runs on
other systems besides linux. Making the DB work w/ one FS (and writing the
storage code for it) seems pointless if we are still stuck on normal FSs
on other machines.

- merlin

> > A project for raw I/O devices now exists on Linux
> > (http://oss.sgi.com/projects/rawio/). Is there any plan (even for the far
> > future) to support raw devices in PgSQL? (The TODO is silent on this.)
>
> Up to now we have kept the storage manager overhead in the system.
> Actually, there is no way to tell which storage manager to use
> for a particular table/index, so everything goes to the default,
> which is the magnetic disk one that uses separate files for
> each relation.
>
> There was a discussion about simplifying it, but the
> consensus was to leave it as is because it is the base for a
> tablespace and/or raw device manager.
>
> AFAIK, no one is working on it, so it must be really FAR
> future. But the plan is still alive.

------++++======++++------
Smith Computer Lab Administrator,
Case Western Reserve University
bap@scl.cwru.edu | 216.368.5066 | http://home.cwru.edu/~bap
------++++======++++------
> other systems besides linux. Making the DB work w/ one FS (and writing the
> storage code for it) seems pointless if we are still stuck on normal FSs
> on other machines.

Yes, of course, but storing databases directly on a RAW device - not
through the filesystem - is one feature of modern DB engines...

--------------------------------------------------------------------------
Ing. Pavel Janousek (PaJaSoft)                  FoNet, spol. s r. o.
Software development, network administration,   Anenska 11, 602 00 Brno
Unix, Web, Y2K
E-mail: mailto:Janousek@FoNet.Cz                Tel.: +420 5 4324 4749
SMS:    mailto:P.Janousek@SMS.Paegas.Cz         Fax.: +420 5 4324 4751
WWW:    http://WWW.FoNet.Cz/                    E-mail: mailto:Info@FoNet.Cz
--------------------------------------------------------------------------
On Mon, 6 Dec 1999, Ing. Pavel PaJaSoft Janousek wrote:

> > other systems besides linux. Making the DB work w/ one FS (and writing the
> > storage code for it) seems pointless if we are still stuck on normal FSs
> > on other machines.
>
> Yes, of course, but storing databases directly on a RAW device - not
> through the filesystem - is one feature of modern DB engines...

Actually, Oracle has been moving *away* from this...more recent versions
of Oracle recommend using the Operating System file systems, since, in
most cases, the Operating System does a better job, and it's too difficult
to have Oracle itself optimize internally for all the different variants
that it supports....

At work, we use Oracle extensively, and I sat down one day last year with
our Oracle DBA to discuss exactly this...and, if I recall correctly, it
was prompted by a similar thread here...

If Linux is providing an Interface into the RAW file system, then this may
change things for Oracle, since it wouldn't have to "learn" all the
different OSs, as long as the API is the same across them all...and, my
experience with Linux is that the API for Linux will most likely be
different than everyone else's *roll eyes*

Marc G. Fournier                   ICQ#7615664               IRC Nick: Scrappy
Systems Administrator @ hub.org
primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org
The Hermit Hacker writes:

> If Linux is providing an Interface into the RAW file system, then this may
> change things for Oracle, since it wouldn't have to "learn" all the
> different OSs, as long as the API is the same across them all...and, my
> experience with Linux is that the API for Linux will most likely be
> different than everyone else's *roll eyes*

The system admin side is slightly different from others: raw devices
have their own major number and devices, /dev/rawN, and you "bind" a
raw device to any existing block device /dev/blockdev by doing

    # raw /dev/rawN /dev/blockdev

However, the DBA (or software) side isn't any different: provided you
access /dev/rawN *only* in sector chunks (i.e. multiples of 512 bytes
that are 512-byte aligned), the software doesn't care whether it's
/dev/rawN or an ordinary block device. If PostgreSQL can guarantee (or
be tweaked/enhanced to guarantee) that it only ever reads/writes in
multiples of 512-byte chunks and never does anything "weird" (truncates,
file-specific ioctls, mmap, needing O_CREAT to start with, etc.), then it
should be perfectly happy when presented with a /dev/rawN instead of an
ordinary file.

--Malcolm

--
Malcolm Beattie <mbeattie@sable.ox.ac.uk>
Unix Systems Programmer
Oxford University Computing Services
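
The sector-chunk rule described in this message can be sketched as a simple check. This is only an illustration in Python, not PostgreSQL code: the helper names are made up here, and the 8192-byte page size is an assumption chosen to match PostgreSQL's default block size. The point is that a request is acceptable for a raw device only if both its offset and its length are multiples of 512 bytes.

```python
SECTOR_SIZE = 512  # sector size named in the message above


def is_sector_aligned(offset: int, length: int, sector: int = SECTOR_SIZE) -> bool:
    """Return True if the request starts on a sector boundary and
    covers a whole, non-zero number of sectors."""
    return (
        offset >= 0
        and length > 0
        and offset % sector == 0
        and length % sector == 0
    )


def round_up_to_sector(length: int, sector: int = SECTOR_SIZE) -> int:
    """Round a byte count up to the next multiple of the sector size."""
    return -(-length // sector) * sector


PAGE_SIZE = 8192  # assumed page size for this illustration


def page_request(page_no: int) -> tuple:
    """Return the (offset, length) of a whole-page read or write."""
    return page_no * PAGE_SIZE, PAGE_SIZE
```

Because 8192 is a multiple of 512, any whole-page request built by `page_request` passes `is_sector_aligned`, while a request such as offset 100 with length 512 does not. The other conditions listed in the message (no truncates, no mmap, and so on) are not covered by this check.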
Sorry for the previous note...I'd tried stopping and restarting
postmaster and that didn't help, so I posted - this web site
gets low but steady use and was suddenly acting weird on me.

Anyway, a quick look at the source made it clear that Postgres
wasn't at fault, as the node's a T_Query and apply_RIR_view
clearly handles it. So, I rebooted linux and it works fine now.

Sorry to bother folks...

- Don Baccus, Portland OR <dhogaza@pacifier.com>
  Nature photos, on-line guides, Pacific Northwest
  Rare Bird Alert Service and other goodies at
  http://donb.photo.net.
> I raise the question because, with raw devices, the Linux kernel is
> opening a new way to a faster and better database engine. I know (and agree)
> that it is not a priority for the next year(s?). But it is interesting, and
> it is probably good to keep it in mind during development, and not to write
> (in the future) features that close off this good path.

I would be very surprised to see any significant difference between raw and
filesystem i/o on modern file systems, and I am sorry, but Linux ext2
does not count as modern.

--
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
> On Mon, 6 Dec 1999, Ing. Pavel PaJaSoft Janousek wrote:
>
> > > other systems besides linux. Making the DB work w/ one FS (and writing the
> > > storage code for it) seems pointless if we are still stuck on normal FSs
> > > on other machines.
> >
> > Yes, of course, but storing databases directly on a RAW device - not
> > through the filesystem - is one feature of modern DB engines...
>
> Actually, Oracle has been moving *away* from this...more recent versions
> of Oracle recommend using the Operating System file systems, since, in
> most cases, the Operating System does a better job, and it's too difficult
> to have Oracle itself optimize internally for all the different variants
> that it supports....

Ding, ding, ding. Give that man a cigar.

--
  Bruce Momjian <maillist@candle.pha.pa.us>
On Mon, 6 Dec 1999, Bruce Momjian wrote:

> > I raise the question because, with raw devices, the Linux kernel is
> > opening a new way to a faster and better database engine. I know (and agree)
> > that it is not a priority for the next year(s?). But it is interesting, and
> > it is probably good to keep it in mind during development, and not to write
> > (in the future) features that close off this good path.
>
> I would be very surprised to see any significant difference between raw and
> filesystem i/o on modern file systems, and I am sorry, but Linux ext2
> does not count as modern.

Yes. The limitations and shortcomings of ext2 are an open secret, and
using it as the argument for raw would be a crazy idea.

On a raw device, a specific data organization can be implemented (specific
to the DB's demands). The advantage of raw is that its organization does not
have to be universal. Raw is not only about the filesystem: this feature
moves full control from the OS kernel to the DB (for example data caching:
the kernel has no information about how/why/what to remove from the cache,
but the DB has this information, etc.).

                                                Karel