AW: Re: New Linux xfs/reiser file systems - Mailing list pgsql-hackers

From Zeugswetter Andreas SB
Subject AW: Re: New Linux xfs/reiser file systems
Date
Msg-id 11C1E6749A55D411A9670001FA6879633682B1@sdexcsrv1.f000.d0188.sd.spardat.at
Responses Re: AW: Re: New Linux xfs/reiser file systems  (Giles Lean <giles@nemeton.com.au>)
List pgsql-hackers
> > I think it's worth noting that Oracle has been petitioning the
> > kernel developers for better raw device support: in other words,
> > the ability to write directly to the hard disk and bypassing the
> > filesystem all together.   
> 
> But there could be other reasons why Oracle would want to do 
> raw stuff.

The reasons are: 
1. Most Unixen now have shared (between several machines) raw devices. Oracle needs this for their shared-everything
Parallel Server. Only two Unixen that I know of have shared filesystems (IBM GPFS and Sun Veritas), and both are rather new.
 
2. The allocation time for raw devices is far better (near instantaneous) than creating preallocated files in a filesystem.
Providing 1 TB of raw devices is a task of minutes; creating 1 TB of filesystems with preallocated 2 GB files is a task of
hours at best.
 
3. Absolute control over writes and page location (you don't want interleaved pages).
4. Efficient use of buffer memory. The usual use of filesystems buffers the disk pages twice: one copy in the db buffer
pool, one in the OS file cache.
 
5. Async raw IO. Most Unixes provide async IO on raw devices; only some provide async IO on filesystem files. Async
IO has two advantages: CPU work can be done while waiting for IO, and IO can complete within one OS timeslice (20 us). This
is possible with modern disk systems that have large caches.
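
To put rough numbers on point 2: preallocating files means writing every block once, so the time is bounded by
sequential write throughput. The sketch below is only a back-of-the-envelope estimate; the 50 MB/s throughput is an
assumed figure for illustration, not a measurement, and the function name is made up for this example.

```python
# Back-of-the-envelope estimate for point 2.
# The write throughput passed in below is an ASSUMED figure, not a measurement.
TIB = 1024 ** 4
GIB = 1024 ** 3
MIB = 1024 ** 2


def preallocation_estimate(total_bytes, file_bytes, write_bytes_per_sec):
    """Return (number of files, hours needed to write every block once)."""
    n_files = -(-total_bytes // file_bytes)  # ceiling division
    hours = total_bytes / write_bytes_per_sec / 3600
    return n_files, hours


if __name__ == "__main__":
    # 1 TB split into preallocated 2 GB files, at an assumed 50 MB/s.
    n_files, hours = preallocation_estimate(1 * TIB, 2 * GIB, 50 * MIB)
    print(f"{n_files} files, about {hours:.1f} hours")  # 512 files, about 5.8 hours
```

Creating raw devices writes no data blocks at all, which is why that side of the comparison is a matter of minutes.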
 

Andreas

