Thread: O_DIRECT support for Windows
The attached patch defines O_DIRECT ourselves on Windows and maps it to
FILE_FLAG_NO_BUFFERING.

This makes our support consistent between Windows and the other OSes that
have O_DIRECT. Also, the following source comment seems to say that we
should do so:
| handle other flags? (eg FILE_FLAG_NO_BUFFERING/FILE_FLAG_WRITE_THROUGH)

Is this worth doing? Do we need more performance reports for the change?

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center
Attachment
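The mapping described in the opening message can be sketched roughly as follows. This is only an illustration, not the attached patch: the SKETCH_O_* bit values and the function name are hypothetical, and only the two FILE_FLAG_* values are taken from the Windows SDK headers.

```c
#include <assert.h>
#include <stdint.h>

/* CreateFile() attribute flags; values as published in the Windows SDK headers. */
#define SKETCH_FILE_FLAG_WRITE_THROUGH 0x80000000u
#define SKETCH_FILE_FLAG_NO_BUFFERING  0x20000000u

/* Hypothetical open()-style flag bits, chosen only for this sketch. */
#define SKETCH_O_DSYNC  0x0100u
#define SKETCH_O_DIRECT 0x0200u

/* Translate open()-style sync flags into CreateFile() attribute flags. */
uint32_t
sketch_win32_file_flags(uint32_t open_flags)
{
    uint32_t attrs = 0;

    if (open_flags & SKETCH_O_DSYNC)
        attrs |= SKETCH_FILE_FLAG_WRITE_THROUGH;  /* mapping the thread says already exists */
    if (open_flags & SKETCH_O_DIRECT)
        attrs |= SKETCH_FILE_FLAG_NO_BUFFERING;   /* mapping proposed by the patch */
    return attrs;
}
```

With these assumed bit values, passing SKETCH_O_DIRECT alone yields only the no-buffering attribute, and passing both bits yields both attributes.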
On Mon, Jan 15, 2007 at 05:36:09PM +0900, ITAGAKI Takahiro wrote:
> The attached is a patch to define O_DIRECT by ourselves on Windows,
> and to map O_DIRECT to FILE_FLAG_NO_BUFFERING.
>
> There will be a consistency in our support between Windows and other OSes
> that have O_DIRECT. Also, there is the following comment that says, I read,
> we should do so.
> | handle other flags? (eg FILE_FLAG_NO_BUFFERING/FILE_FLAG_WRITE_THROUGH)
>
> Is this worth doing? Do we need more performance reports for the change?

IIRC we've discussed this before at some point, and I think we came to the
conclusion that we shouldn't do it. However, things may have changed :-)

FILE_FLAG_NO_BUFFERING requires that *all* I/O follows these rules:
* File access must begin at offsets that are integer multiples of the
  volume sector size.
* File access must be for a number of bytes that is an integer multiple
  of the volume sector size.
* Buffer addresses for read and write operations must be sector aligned.

I was under the impression that our code can in no way guarantee this,
especially given that a typical NTFS drive can have anything from 512 to
4096 bytes per sector if you use the GUI to format it, and larger sizes
than that when you use some SAN tools to do it.

(btw, we already map O_DSYNC to FILE_FLAG_WRITE_THROUGH)

//Magnus
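The three rules listed above can be checked for a single request along these lines. This is a minimal sketch, assuming the sector size is a power of two; the function names are made up for illustration and nothing here is taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True when n is a positive power of two. */
bool
sketch_is_pow2(uint64_t n)
{
    return n != 0 && (n & (n - 1)) == 0;
}

/*
 * Check one I/O request against the three FILE_FLAG_NO_BUFFERING rules:
 * file offset, byte count and buffer address must all be multiples of
 * the sector size. Assumes sector_size is a power of two.
 */
bool
sketch_unbuffered_io_ok(uint64_t offset, size_t nbytes,
                        const void *buf, uint32_t sector_size)
{
    uint64_t mask;

    assert(sketch_is_pow2(sector_size));
    mask = (uint64_t) sector_size - 1;

    return (offset & mask) == 0
        && ((uint64_t) nbytes & mask) == 0
        && ((uint64_t) (uintptr_t) buf & mask) == 0;
}
```

An 8192-byte write at offset 8192 from a buffer at an 8192-aligned address passes for 512-byte sectors, but the same request fails if the sector size were 16384.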
Magnus Hagander <magnus@hagander.net> wrote:

> FILE_FLAG_NO_BUFFERING requires that *all* I/O follows:
> * File access must begin at offsets that are integer multiples of the
>   volume sector size.
> * File access must be for number of bytes that are integer multiples of
>   the volume sector size.
> * Buffer addresses for read and write operations must be sector aligned.
>
> I was under the impression that our code can in no way guarantee this.
> Especially given that a typical NTFS drive can have anything from 512 to
> 4096 bytes if you use the GUI to format it, and larger sizes than that
> when you use some SAN tools to do it.

Do you mean there are drives that have a sector size larger than 8kB?
We already align the xlog buffer to ALIGNOF_XLOG_BUFFER (typically
8192 bytes). But if there are such drives, using FILE_FLAG_NO_BUFFERING
is harmful!

> (btw, we already map O_DSYNC to FILE_FLAG_WRITE_THROUGH)

Yes, so the comment is not consistent with the source code... And if we
had reached the conclusion you mention, the comment should say "We are
not willing to use FILE_FLAG_NO_BUFFERING".

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center
ITAGAKI Takahiro wrote:
> Magnus Hagander <magnus@hagander.net> wrote:
>
>> FILE_FLAG_NO_BUFFERING requires that *all* I/O follows:
>> * File access must begin at offsets that are integer multiples of the
>>   volume sector size.
>> * File access must be for number of bytes that are integer multiples of
>>   the volume sector size.
>> * Buffer addresses for read and write operations must be sector aligned.
>>
>> I was under the impression that our code can in no way guarantee this.
>> Especially given that a typical NTFS drive can have anything from 512 to
>> 4096 bytes if you use the GUI to format it, and larger sizes than that
>> when you use some SAN tools to do it.
>
> Do you mean there are drives that have larger sector size than 8kB?
> We've already put the xlog buffer along the alignment of
> ALIGNOF_XLOG_BUFFER (typically 8192 bytes).
> But if there are such drives, using FILE_FLAG_NO_BUFFERING is harmful!

Yes. I have heard this can happen with certain SAN drives. I haven't
seen it myself, and I can't seem to find a reference right now :-) But I
do recall having read about the need to check the sector size and
specifically align to it, because some drives do have that problem.

//Magnus
From: "Magnus Hagander" <magnus@hagander.net>
> ITAGAKI Takahiro wrote:
>> Do you mean there are drives that have larger sector size than 8kB?
>> We've already put the xlog buffer along the alignment of
>> ALIGNOF_XLOG_BUFFER (typically 8192 bytes).
>> But if there are such drives, using FILE_FLAG_NO_BUFFERING is harmful!
>
> Yes. I have heard this can happen with certain SAN drives. I haven't
> seen it myself, and I can't seem to find a reference right now :-) But I
> do recall having read about the need to check the sector size and
> specifically align it, because some do have that problem.

I think many people can benefit from Itagaki-san's proposal, and
NO_BUFFERING should be the default. Isn't it very rare that disks with a
sector size larger than 8KB are used? Providing a way (such as
wal_sync_method) to avoid NO_BUFFERING is sufficient for people in rare
environments. Or, by determining the sector size with
GetDiskFreeSpaceEx(), we could auto-switch to not using NO_BUFFERING
when the sector size is larger than 8KB.
# I wonder whether GetDiskFreeSpaceEx() tells us the right sector size
# configured by SAN tools.

And I wonder whether Microsoft anticipates sector sizes larger than 4KB
at all, and whether NTFS works with them. The following paragraph
appears in the CreateFile page:

One way to align buffers on integer multiples of the volume sector size
is to use VirtualAlloc to allocate the buffers. It allocates memory that
is aligned on addresses that are integer multiples of the operating
system's memory page size. Because both memory page and volume sector
sizes are powers of 2, this memory is also aligned on addresses that are
integer multiples of a volume sector size. Memory pages are 4-8 KB in
size; sectors are 512 bytes (hard disks) or 2048 bytes (CD), and
therefore, volume sectors can never be larger than memory pages.
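The power-of-two argument in the CreateFile paragraph quoted above can be made concrete with a small check. It is a sketch only: the function name is hypothetical, and the 8192 figure is simply the buffer alignment mentioned earlier in the thread. An address that is a multiple of 8192 is a multiple of every smaller power-of-two sector size, and the property breaks as soon as the sector size exceeds the alignment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Buffer alignment quoted in the thread ("typically 8192 bytes"). */
#define SKETCH_BUFFER_ALIGN 8192u

/*
 * Return true if addr is a multiple of every power-of-two sector size
 * from 512 bytes up to and including max_sector.
 */
bool
sketch_addr_aligned_for_sectors_up_to(uintptr_t addr, uint32_t max_sector)
{
    uint32_t s;

    /* s wraps to 0 after 2^31, which ends the loop. */
    for (s = 512; s != 0 && s <= max_sector; s *= 2)
    {
        if (addr % s != 0)
            return false;
    }
    return true;
}
```

For an address equal to SKETCH_BUFFER_ALIGN the check holds up to 8192-byte sectors and fails for 16384-byte sectors, which is exactly the case the thread worries about.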
On Tue, Jan 16, 2007 at 10:59:11AM +0900, Takayuki Tsunakawa wrote:
> From: "Magnus Hagander" <magnus@hagander.net>
> > ITAGAKI Takahiro wrote:
> >> Do you mean there are drives that have larger sector size than 8kB?
> >> We've already put the xlog buffer along the alignment of
> >> ALIGNOF_XLOG_BUFFER (typically 8192 bytes).
> >> But if there are such drives, using FILE_FLAG_NO_BUFFERING is harmful!
> >
> > Yes. I have heard this can happen with certain SAN drives. I haven't
> > seen it myself, and I can't seem to find a reference right now :-) But I
> > do recall having read about the need to check the sector size and
> > specifically align it, because some do have that problem.
>
> I think many people can benefit from Itagaki-san's proposal, and
> NO_BUFFERING should be default. Isn't it very rare that disks with
> sector size larger than 8KB are used?

Definitely very rare.

> Providing a way (such as
> wal_sync_method) to avoid NO_BUFFERING is sufficient for people in
> rare environments. Or, by determining the sector size with
> GetDiskFreeSpaceEx(), we could auto-switch to not using NO_BUFFERING
> when the sector size is larger than 8KB.

I think the second one is better.

> I wonder whether GetDiskFreeSpaceEx() tells us the right sector size
> configured by SAN tools.

It should. If it doesn't, then there are likely to be other issues.

> And I wonder if Microsoft assumes a sector size larger than 4KB and
> NTFS works. The following paragraph appears in the CreateFile page:
>
> One way to align buffers on integer multiples of the volume sector
> size is to use VirtualAlloc to allocate the buffers. It allocates
> memory that is aligned on addresses that are integer multiples of the
> operating system's memory page size. Because both memory page and
> volume sector sizes are powers of 2, this memory is also aligned on
> addresses that are integer multiples of a volume sector size. Memory
> pages are 4-8 KB in size; sectors are 512 bytes (hard disks) or 2048
> bytes (CD), and therefore, volume sectors can never be larger than
> memory pages.

Good question. Again, I have no firsthand info about systems with >4K
sectors. Obviously you have 2K sectors on CDs, but that doesn't really
apply to us because we don't run with our files on CD at all...

It *could* be that someone mixed up sector size and NTFS block size
(which is definitely supported up to 64K/block at least).

A quick google shows some inconclusive results :-) But look at for
example:
http://groups.google.se/group/microsoft.public.sqlserver.server/tree/browse_frm/thread/d3288d3b43338b47/ff5e825dd02faff4?rnum=1&hl=en&q=ntfs+sector+size&_done=%2Fgroup%2Fmicrosoft.public.sqlserver.server%2Fbrowse_frm%2Fthread%2Fd3288d3b43338b47%2Fff5e825dd02faff4%3Ftvc%3D1%26q%3Dntfs+sector+size%26hl%3Den%26#doc_4556b64132b3baa7

This seems to indicate that *Windows* supports sector sizes >4K, but SQL
Server doesn't. But again, it could be a mixup between cluster and
sector size...

//Magnus
Hello, Magnus-san, Itagaki-san

From: "Magnus Hagander" <magnus@hagander.net>
>> I think many people can benefit from Itagaki-san's proposal, and
>> NO_BUFFERING should be default. Isn't it very rare that disks with
>> sector size larger than 8KB are used?
>
> Definitely very rare.
>
>> Providing a way (such as
>> wal_sync_method) to avoid NO_BUFFERING is sufficient for people in
>> rare environments. Or, by determining the sector size with
>> GetDiskFreeSpaceEx(), we could auto-switch to not using NO_BUFFERING
>> when the sector size is larger than 8KB.
>
> I think the second one is better.

Thank you for agreeing. Then, I hope Itagaki-san's patch will be
accepted once the following changes are added to the patch and some
performance report is delivered.

1. On Windows, O_DIRECT (and O_SYNC?) is the default for WAL.
2. Auto-switch to not using O_DIRECT if the sector size is larger than
   8KB when the server starts.

> A quick google shows some inconclusive results :-) But look at for
> example:
> http://groups.google.se/group/microsoft.public.sqlserver.server/tree/browse_frm/thread/d3288d3b43338b47/ff5e825dd02faff4?rnum=1&hl=en&q=ntfs+sector+size&_done=%2Fgroup%2Fmicrosoft.public.sqlserver.server%2Fbrowse_frm%2Fthread%2Fd3288d3b43338b47%2Fff5e825dd02faff4%3Ftvc%3D1%26q%3Dntfs+sector+size%26hl%3Den%26#doc_4556b64132b3baa7
>
> This seems to indicate that *Windows* supports sector sizes >4K, but SQL
> Server doesn't. But again, it could be a mixup between cluster and
> sector size...

This is interesting. I've never seen systems with a sector size larger
than 4KB, either. On IBM zSeries (which is a mainframe running Linux),
DASD (direct access storage device) is usually used as a hard disk. The
sector size of DASD is 4KB. So, the current implementation of
PostgreSQL, which assumes sectors of 8KB or less, is practically
sufficient.

Delivering an intuitive error message like SQL Server is one way to
handle devices with a larger sector size than is supported. However, as
you say, auto-switching to not using NO_BUFFERING is kinder to users.
People seem to be confusing sector size and cluster size.

Microsoft Windows assumes sectors are 8k or less on hard drives (99% are
512 bytes).

Cluster size is the allocation unit. On Windows, this can be 512 bytes
to 256k (max 64k with 512-byte sectors).
NTFS (which I think we need) is limited to 64k, last I looked.

On RAID devices, the allocation unit might actually be larger, but
usually the *sector* size of these devices is still 8k or less (usually,
they mimic the 512-byte sector size, because too much software assumes
this).

Non-buffered I/Os don't need to be cluster boundary aligned, only sector
aligned.

And that restriction is only for certain drivers and devices. Many
don't enforce the restriction.
But to be safe, sector alignment is best, because there are some drivers
that care.

-----Original Message-----
From: pgsql-patches-owner@postgresql.org [mailto:pgsql-patches-owner@postgresql.org] On Behalf Of Takayuki Tsunakawa
Sent: Tuesday, January 16, 2007 4:53 PM
To: Magnus Hagander
Cc: ITAGAKI Takahiro; pgsql-patches@postgresql.org
Subject: Re: [pgsql-patches] O_DIRECT support for Windows

Hello, Magnus-san, Itagaki-san

From: "Magnus Hagander" <magnus@hagander.net>
>> I think many people can benefit from Itagaki-san's proposal, and
>> NO_BUFFERING should be default. Isn't it very rare that disks with
>> sector size larger than 8KB are used?
>
> Definitely very rare.
>
>> Providing a way (such as
>> wal_sync_method) to avoid NO_BUFFERING is sufficient for people in
>> rare environments. Or, by determining the sector size with
>> GetDiskFreeSpaceEx(), we could auto-switch to not using NO_BUFFERING
>> when the sector size is larger than 8KB.
>
> I think the second one is better.

Thank you for agreeing. Then, I hope Itagaki-san's patch will be
accepted when the following treatments are added to the patch and some
performance report is delivered.

1. On Windows, O_DIRECT (and O_SYNC?) is default for WAL.
2. Auto-switch to not using O_DIRECT if the sector size is larger than
   8KB when the server starts.

> A quick google shows some inconclusive results :-) But look at for
> example:
> http://groups.google.se/group/microsoft.public.sqlserver.server/tree/browse_frm/thread/d3288d3b43338b47/ff5e825dd02faff4?rnum=1&hl=en&q=ntfs+sector+size&_done=%2Fgroup%2Fmicrosoft.public.sqlserver.server%2Fbrowse_frm%2Fthread%2Fd3288d3b43338b47%2Fff5e825dd02faff4%3Ftvc%3D1%26q%3Dntfs+sector+size%26hl%3Den%26#doc_4556b64132b3baa7
>
> This seems to indicate that *Windows* supports sector sizes >4K, but SQL
> Server doesn't. But again, it could be a mixup between cluster and
> sector size...

This is interesting. I've never seen systems with a sector size larger
than 4KB, either. On IBM zSeries (which is a mainframe running Linux),
DASD (direct access storage device) is usually used as a hard disk. The
sector size of DASD is 4KB. So, the current implementation of
PostgreSQL which assumes 8KB sector size is practically sufficient.

Delivering an intuitive error message like SQL Server is one way when
PostgreSQL encounters devices with a larger sector size than is
supported. However, as you say, auto-switching to not using
NO_BUFFERING is kinder to users.

---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings
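The sector/cluster distinction drawn at the top of this message can be illustrated with a little arithmetic. This is a sketch with hypothetical function names; the two inputs correspond to the "bytes per sector" and "sectors per cluster" figures a volume reports. The cluster size is their product, while the unbuffered-I/O alignment rule only involves the sector size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Cluster (allocation unit) size in bytes, derived from the two volume figures. */
uint32_t
sketch_cluster_bytes(uint32_t bytes_per_sector, uint32_t sectors_per_cluster)
{
    return bytes_per_sector * sectors_per_cluster;
}

/* Unbuffered I/O needs sector alignment only; the cluster size plays no part. */
bool
sketch_offset_sector_aligned(uint64_t offset, uint32_t bytes_per_sector)
{
    assert(bytes_per_sector != 0);
    return offset % bytes_per_sector == 0;
}
```

With 512-byte sectors and 8 sectors per cluster, the cluster is 4096 bytes. Offset 4608 is not on a cluster boundary, yet it is sector aligned and therefore acceptable under the rule described in the message.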
> People seem to be confusing sector size and cluster size.
>
> Microsoft Windows assumes sectors are 8k or less on hard drives (99% are
> 512 bytes).

Do you have any doc ref for this? I believe you, but I've been searching
for docs on that and found nothing.

> Cluster size is the allocation unit. On windows, this can be 512 to
> 256k (max 64k with 512 byte sectors).
> NTFS (which I think we need) is limited to 64k, last I looked.

Correct.

> On RAID devices, the allocation unit might actually be larger, but
> usually the *sector* size of these devices is still 8k or less (usually,
> they mimic the 512 byte sector size, because too much software assumes
> this)

"Usually" being the thing that might require an extra check.

> And that restriction is only for certain drivers and devices. Many
> don't enforce the restriction.
> But to be safe, sector alignment is best, because there are some drivers
> that care.

Exactly. So we need that check to be sure, don't we?

/Magnus
> >> I think many people can benefit from Itagaki-san's proposal, and
> >> NO_BUFFERING should be default. Isn't it very rare that disks with
> >> sector size larger than 8KB are used?
> >
> > Definitely very rare.
> >
> >> Providing a way (such as
> >> wal_sync_method) to avoid NO_BUFFERING is sufficient for people in
> >> rare environments. Or, by determining the sector size with
> >> GetDiskFreeSpaceEx(), we could auto-switch to not using NO_BUFFERING
> >> when the sector size is larger than 8KB.
> >
> > I think the second one is better.
>
> Thank you for agreeing. Then, I hope Itagaki-san's patch will be
> accepted when the following treatments are added to the patch and some
> performance report is delivered.

I would think so, but we definitely need to see some numbers that show
that it helps.

> > A quick google shows some inconclusive results :-) But look at for
> > example:
> > http://groups.google.se/group/microsoft.public.sqlserver.server/tree/browse_frm/thread/d3288d3b43338b47/ff5e825dd02faff4?rnum=1&hl=en&q=ntfs+sector+size&_done=%2Fgroup%2Fmicrosoft.public.sqlserver.server%2Fbrowse_frm%2Fthread%2Fd3288d3b43338b47%2Fff5e825dd02faff4%3Ftvc%3D1%26q%3Dntfs+sector+size%26hl%3Den%26#doc_4556b64132b3baa7
> >
> > This seems to indicate that *Windows* supports sector sizes >4K, but SQL
> > Server doesn't. But again, it could be a mixup between cluster and
> > sector size...
>
> This is interesting. I've never seen systems with a sector size
> larger than 4KB, either. On IBM zSeries (which is a mainframe running
> Linux), DASD (direct access storage device) is usually used as a
> hard disk. The sector size of DASD is 4KB. So, the current
> implementation of PostgreSQL which assumes 8KB sector size is
> practically sufficient.
> Delivering an intuitive error message like SQL Server is one way when
> PostgreSQL encounters devices with a larger sector size than is
> supported. However, as you say, auto-switching to not using
> NO_BUFFERING is kinder to users.

I think both. Auto-switch, but log a warning to the user that this was
done.

/Magnus
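The "auto-switch but log a warning" idea could look roughly like the sketch below. Everything in it is hypothetical: the function name, the message texts, and the choice to pass the sector size in as a plain parameter. As far as I know, the bytes-per-sector figure on Windows is reported by GetDiskFreeSpace() rather than GetDiskFreeSpaceEx(), so how the value is obtained is deliberately left out here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return NULL when unbuffered I/O is usable for the given sector size and
 * WAL buffer alignment; otherwise return the text of a warning that the
 * caller would log before falling back to buffered I/O.
 */
const char *
sketch_no_buffering_fallback_reason(uint32_t sector_size, uint32_t buffer_align)
{
    if (sector_size == 0)
        return "sector size could not be determined; using buffered I/O";
    if (sector_size > buffer_align)
        return "sector size is larger than the WAL buffer alignment; using buffered I/O";
    if (buffer_align % sector_size != 0)
        return "WAL buffer alignment is not a multiple of the sector size; using buffered I/O";
    return NULL;
}
```

A caller would clear the no-buffering flag and emit the returned text as a warning whenever the result is not NULL. With the 8192-byte alignment discussed in the thread, 512-byte and 8192-byte sectors return NULL, while 16384-byte sectors return a warning.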
BTW: From the current FAT/FAT32 source code fsctrl.c (which I have via
Microsoft's IFS kit), when it checks the boot sector for validity:

    //
    //  Enforce some sanity on the sector size (easy check)
    //

    } else if ((Bpb.BytesPerSector !=  128) &&
               (Bpb.BytesPerSector !=  256) &&
               (Bpb.BytesPerSector !=  512) &&
               (Bpb.BytesPerSector != 1024) &&
               (Bpb.BytesPerSector != 2048) &&
               (Bpb.BytesPerSector != 4096)) {

        Result = FALSE;

So, sector size has to be <= 4096.

I happen to know from a friend who works with the NTFS source that it
requires sector size <= page size. So on both file systems, sectors
can't be more than 4096 bytes.

Also, see http://support.microsoft.com/kb/923332

-----Original Message-----
From: pgsql-patches-owner@postgresql.org [mailto:pgsql-patches-owner@postgresql.org] On Behalf Of Chuck McDevitt
Sent: Tuesday, January 16, 2007 11:09 PM
To: Takayuki Tsunakawa; Magnus Hagander
Cc: ITAGAKI Takahiro; pgsql-patches@postgresql.org
Subject: Re: [pgsql-patches] O_DIRECT support for Windows

People seem to be confusing sector size and cluster size.

Microsoft Windows assumes sectors are 8k or less on hard drives (99% are
512 bytes).

Cluster size is the allocation unit. On windows, this can be 512 to
256k (max 64k with 512 byte sectors).
NTFS (which I think we need) is limited to 64k, last I looked.

On RAID devices, the allocation unit might actually be larger, but
usually the *sector* size of these devices is still 8k or less (usually,
they mimic the 512 byte sector size, because too much software assumes
this)

Non-buffered I/Os don't need to be cluster boundary aligned, only sector
aligned.

And that restriction is only for certain drivers and devices. Many
don't enforce the restriction.
But to be safe, sector alignment is best, because there are some drivers
that care.
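The sector-size check quoted from fsctrl.c at the top of this message accepts exactly the powers of two from 128 to 4096. A compact equivalent can be sketched as below; this is an illustration with a made-up function name, not Microsoft's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Accept the same values as the quoted chain of comparisons:
 * 128, 256, 512, 1024, 2048 and 4096.
 */
bool
sketch_fat_sector_size_ok(uint32_t bytes_per_sector)
{
    return bytes_per_sector >= 128
        && bytes_per_sector <= 4096
        && (bytes_per_sector & (bytes_per_sector - 1)) == 0;
}
```

Under this check a 4096-byte sector is accepted and an 8192-byte sector is rejected, which matches the conclusion drawn in the message that FAT sectors cannot exceed 4096 bytes.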
So, do we want this patch? Are we OK on WIN32 alignment issues?

---------------------------------------------------------------------------

ITAGAKI Takahiro wrote:
> The attached is a patch to define O_DIRECT by ourselves on Windows,
> and to map O_DIRECT to FILE_FLAG_NO_BUFFERING.
>
> There will be a consistency in our support between Windows and other OSes
> that have O_DIRECT. Also, there is the following comment that says, I read,
> we should do so.
> | handle other flags? (eg FILE_FLAG_NO_BUFFERING/FILE_FLAG_WRITE_THROUGH)
>
> Is this worth doing? Do we need more performance reports for the change?
>
> Regards,
> ---
> ITAGAKI Takahiro
> NTT Open Source Software Center

[ Attachment, skipping... ]

--
  Bruce Momjian   bruce@momjian.us
  EnterpriseDB    http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
We're ok with the alignment issues provided there is code added to
reject O_DIRECT if the sector size is too large.

We also said we need to see some performance numbers on the effect of
the patch before it goes in.

//Magnus

Bruce Momjian wrote:
> So, do we want this patch? Are we OK on WIN32 alignment issues?
Are there any performance numbers on this?

---------------------------------------------------------------------------

ITAGAKI Takahiro wrote:
> The attached is a patch to define O_DIRECT by ourselves on Windows,
> and to map O_DIRECT to FILE_FLAG_NO_BUFFERING.
>
> There will be a consistency in our support between Windows and other OSes
> that have O_DIRECT. Also, there is the following comment that says, I read,
> we should do so.
> | handle other flags? (eg FILE_FLAG_NO_BUFFERING/FILE_FLAG_WRITE_THROUGH)
>
> Is this worth doing? Do we need more performance reports for the change?
>
> Regards,
> ---
> ITAGAKI Takahiro
> NTT Open Source Software Center

[ Attachment, skipping... ]

--
  Bruce Momjian  <bruce@momjian.us>          http://momjian.us
  EnterpriseDB                               http://www.enterprisedb.com
Magnus, where are we on this?

---------------------------------------------------------------------------

Magnus Hagander wrote:
> We're ok with the alignment issues provided there is code added to
> reject O_DIRECT if the sector size is too large.
>
> We also said we need to see some performance numbers on the effect of
> the patch before it goes in.
>
> //Magnus

--
  Bruce Momjian  <bruce@momjian.us>          http://momjian.us
  EnterpriseDB                               http://www.enterprisedb.com
IIRC, we're still waiting for performance numbers showing there exists a
win from this patch.

//Magnus

Bruce Momjian wrote:
> Magnus, where are on this?
Your patch has been added to the PostgreSQL unapplied patches list at:

	http://momjian.postgresql.org/cgi-bin/pgpatches

It will be applied as soon as one of the PostgreSQL committers reviews
and approves it.

---------------------------------------------------------------------------

ITAGAKI Takahiro wrote:
> The attached is a patch to define O_DIRECT by ourselves on Windows,
> and to map O_DIRECT to FILE_FLAG_NO_BUFFERING.
>
> There will be a consistency in our support between Windows and other OSes
> that have O_DIRECT. Also, there is the following comment that says, I read,
> we should do so.
> | handle other flags? (eg FILE_FLAG_NO_BUFFERING/FILE_FLAG_WRITE_THROUGH)
>
> Is this worth doing? Do we need more performance reports for the change?
>
> Regards,
> ---
> ITAGAKI Takahiro
> NTT Open Source Software Center

[ Attachment, skipping... ]

--
  Bruce Momjian  <bruce@momjian.us>          http://momjian.us
  EnterpriseDB                               http://www.enterprisedb.com
I've done some further looking around at this, and I've been unable to
find any references to disk systems with sector size > 8192 bytes (which
is the alignment of the buffers, per XLOG_BLCKSZ, at least by default).

So I'll commit this fairly simple patch, and we'll revert it or add
runtime checks later if we find out that such systems exist somewhere.

//Magnus

On Mon, Apr 02, 2007 at 04:39:00PM -0400, Bruce Momjian wrote:
>
> Your patch has been added to the PostgreSQL unapplied patches list at:
>
> 	http://momjian.postgresql.org/cgi-bin/pgpatches
>
> It will be applied as soon as one of the PostgreSQL committers reviews
> and approves it.
Patch applied by Magnus.

---------------------------------------------------------------------------

ITAGAKI Takahiro wrote:
> The attached is a patch to define O_DIRECT by ourselves on Windows,
> and to map O_DIRECT to FILE_FLAG_NO_BUFFERING.
>
> There will be a consistency in our support between Windows and other OSes
> that have O_DIRECT. Also, there is the following comment that says, I read,
> we should do so.
> | handle other flags? (eg FILE_FLAG_NO_BUFFERING/FILE_FLAG_WRITE_THROUGH)
>
> Is this worth doing? Do we need more performance reports for the change?
>
> Regards,
> ---
> ITAGAKI Takahiro
> NTT Open Source Software Center

[ Attachment, skipping... ]

--
  Bruce Momjian  <bruce@momjian.us>          http://momjian.us
  EnterpriseDB                               http://www.enterprisedb.com