Re: O_DIRECT support for Windows - Mailing list pgsql-patches

From Chuck McDevitt
Subject Re: O_DIRECT support for Windows
Msg-id EB48EBF3B239E948AC1E3F3780CF8F88018BB495@MI8NYCMAIL02.Mi8.com
In response to O_DIRECT support for Windows  (ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp>)
List pgsql-patches
BTW:  Here is how the current FAT/FAT32 source code, fsctrl.c (which I
have via Microsoft's IFS kit), checks the boot sector for validity:

    //
    //  Enforce some sanity on the sector size (easy check)
    //

    } else if ((Bpb.BytesPerSector !=  128) &&
               (Bpb.BytesPerSector !=  256) &&
               (Bpb.BytesPerSector !=  512) &&
               (Bpb.BytesPerSector != 1024) &&
               (Bpb.BytesPerSector != 2048) &&
               (Bpb.BytesPerSector != 4096)) {

        Result = FALSE;


So, sector size has to be <= 4096.

I happen to know from a friend who works with the NTFS source that it
requires sector size <= page size (4096 bytes on x86).

So on both file systems, sectors can't be larger than 4096 bytes.

Also, see http://support.microsoft.com/kb/923332
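
The FAT check quoted above accepts exactly the powers of two from 128
through 4096.  As a minimal sketch, the same test can be written as
"power of two, within range".  The function name below is hypothetical
and is not taken from fsctrl.c:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from fsctrl.c): accepts the same values as the
 * quoted check, i.e. 128, 256, 512, 1024, 2048 and 4096.
 */
bool fat_sector_size_is_valid(uint32_t bytes_per_sector)
{
    /* A power of two has exactly one bit set. */
    bool is_power_of_two = bytes_per_sector != 0 &&
        (bytes_per_sector & (bytes_per_sector - 1)) == 0;

    return is_power_of_two &&
           bytes_per_sector >= 128 &&
           bytes_per_sector <= 4096;
}
```

With this form, 8192 is rejected by the upper bound and 768 is rejected
because it is not a power of two.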


-----Original Message-----
From: pgsql-patches-owner@postgresql.org
[mailto:pgsql-patches-owner@postgresql.org] On Behalf Of Chuck McDevitt
Sent: Tuesday, January 16, 2007 11:09 PM
To: Takayuki Tsunakawa; Magnus Hagander
Cc: ITAGAKI Takahiro; pgsql-patches@postgresql.org
Subject: Re: [pgsql-patches] O_DIRECT support for Windows

People seem to be confusing sector size and cluster size.

Microsoft Windows assumes sectors are 8k or less on hard drives (99% are
512 bytes).

Cluster size is the allocation unit.  On Windows, this can be 512 bytes to
256k (max 64k with 512 byte sectors).
NTFS (which I think we need) is limited to 64k, last I looked.

On RAID devices, the allocation unit might actually be larger, but
usually the *sector* size of these devices is still 8k or less (usually,
they mimic the 512 byte sector size, because too much software assumes
this).


Non-buffered I/Os don't need to be cluster boundary aligned, only sector
aligned.

And that restriction is only for certain drivers and devices.  Many
don't enforce the restriction.
But to be safe, sector alignment is best, because there are some drivers
that care.
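
A rough sketch of what "sector aligned" means for a non-buffered request,
assuming the sector size is a nonzero power of two.  All names here are
hypothetical.  The buffer-address check is an extra assumption based on
the Windows documentation for FILE_FLAG_NO_BUFFERING, not something
stated above:

```c
#include <stdbool.h>
#include <stdint.h>

/* True when value is a multiple of sector_size (a nonzero power of two). */
bool is_sector_aligned(uint64_t value, uint32_t sector_size)
{
    return (value & ((uint64_t) sector_size - 1)) == 0;
}

/* Round length up to the next multiple of sector_size. */
uint64_t round_up_to_sector(uint64_t length, uint32_t sector_size)
{
    uint64_t mask = (uint64_t) sector_size - 1;

    return (length + mask) & ~mask;
}

/*
 * To be safe with drivers that care, require the file offset, the
 * transfer length and the buffer address to all be sector multiples.
 * Cluster size plays no part in the check.
 */
bool request_is_sector_aligned(uint64_t offset, uint64_t length,
                               const void *buffer, uint32_t sector_size)
{
    return is_sector_aligned(offset, sector_size) &&
           is_sector_aligned(length, sector_size) &&
           is_sector_aligned((uint64_t) (uintptr_t) buffer, sector_size);
}
```

For example, an 8k write at an 8k-aligned offset passes this check for
512 byte and 4k sectors alike, regardless of the volume's cluster size.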




-----Original Message-----
From: pgsql-patches-owner@postgresql.org
[mailto:pgsql-patches-owner@postgresql.org] On Behalf Of Takayuki
Tsunakawa
Sent: Tuesday, January 16, 2007 4:53 PM
To: Magnus Hagander
Cc: ITAGAKI Takahiro; pgsql-patches@postgresql.org
Subject: Re: [pgsql-patches] O_DIRECT support for Windows

Hello, Magnus-san, Itagaki-san

From: "Magnus Hagander" <magnus@hagander.net>
>> I think many people can benefit from Itagaki-san's proposal, and
>> NO_BUFFERING should be default.  Isn't it very rare that disks with
>> sector size larger than 8KB are used?
>
> Definitely very rare.
>
>
>> Providing a way (such as
>> wal_sync_method) to avoid NO_BUFFERING is sufficient for people in
>> rare environments.  Or, by determining the sector size with
>> GetDiskFreeSpaceEx(), we could auto-switch to not using NO_BUFFERING
>> when the sector size is larger than 8KB.
>
> I think the second one is better.

Thank you for agreeing.  Then, I hope Itagaki-san's patch will be
accepted once the following changes are added to it and a performance
report is provided.

1. On Windows, O_DIRECT (and O_SYNC?) is the default for WAL.
2. When the server starts, automatically switch to not using O_DIRECT
if the sector size is larger than 8KB.
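
As a rough sketch of the startup check in item 2 (not code from the
patch): the names WAL_BLOCK_SIZE, wal_direct_io_allowed and
volume_sector_size are hypothetical.  The Windows part uses
GetDiskFreeSpace rather than GetDiskFreeSpaceEx, because the former is
the call that reports bytes per sector; the Ex variant returns only byte
counts.

```c
#include <stdbool.h>
#include <stdint.h>

#ifdef _WIN32
#include <windows.h>
#endif

/* Hypothetical constant: the 8KB unit discussed in this thread. */
#define WAL_BLOCK_SIZE 8192u

/*
 * Decide whether non-buffered WAL I/O can be used for a given sector
 * size.  A sector size of 0 means "unknown", so fall back to buffered
 * I/O in that case.
 */
bool wal_direct_io_allowed(uint32_t sector_size)
{
    return sector_size != 0 && sector_size <= WAL_BLOCK_SIZE;
}

#ifdef _WIN32
/*
 * Return the bytes per sector of the volume whose root directory is
 * root_path (for example "C:\\"), or 0 if the query fails.
 */
uint32_t volume_sector_size(const char *root_path)
{
    DWORD sectors_per_cluster;
    DWORD bytes_per_sector;
    DWORD free_clusters;
    DWORD total_clusters;

    if (!GetDiskFreeSpaceA(root_path, &sectors_per_cluster,
                           &bytes_per_sector, &free_clusters,
                           &total_clusters))
        return 0;

    return (uint32_t) bytes_per_sector;
}
#endif
```

At startup the server would call something like
wal_direct_io_allowed(volume_sector_size(root)) once for the volume
holding the WAL, and silently keep buffered I/O when it returns false.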



> A quick google shows some inconclusive results :-) But look at, for
> example:
>
> http://groups.google.se/group/microsoft.public.sqlserver.server/tree/browse_frm/thread/d3288d3b43338b47/ff5e825dd02faff4?rnum=1&hl=en&q=ntfs+sector+size&_done=%2Fgroup%2Fmicrosoft.public.sqlserver.server%2Fbrowse_frm%2Fthread%2Fd3288d3b43338b47%2Fff5e825dd02faff4%3Ftvc%3D1%26q%3Dntfs+sector+size%26hl%3Den%26#doc_4556b64132b3baa7
>
> This seems to indicate that *Windows* supports sector sizes >4K, but
> SQL Server doesn't. But again, it could be a mixup between cluster and
> sector size...

This is interesting.  I've never seen systems with a sector size
larger than 4KB, either.  On IBM zSeries (which is a mainframe running
Linux), DASD (direct access storage device) is usually used as a
hard disk.  The sector size of DASD is 4KB.  So, the current
implementation of PostgreSQL, which assumes a sector size of at most
8KB, is sufficient in practice.
When PostgreSQL encounters a device with a larger sector size than it
supports, one option is to report a clear error message, as SQL Server
does.  However, as you say, auto-switching to not using NO_BUFFERING is
kinder to users.







---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings





