Thread: Re: fsync vs open_sync
> > There is also the fact that NTFS is a very slow filesystem, and Linux is
> > a lot better than Windows for everything disk, caching and IO related. Try
> > to copy some files in NTFS and in ReiserFS...
>
> I'm not so sure I would agree with such a blanket generalization. I find
> NTFS to be very fast, my main complaint is fragmentation issues...I bet
> NTFS is better than ext3 at most things (I do agree with you about the
> cache, though).

Ok, you were right. I made some tests and NTFS is just not very good in the general case. I've seen some benchmarks for Reiser4 that are just amazing.

Merlin
>> > There is also the fact that NTFS is a very slow filesystem, and Linux is
>> > a lot better than Windows for everything disk, caching and IO related. Try
>> > to copy some files in NTFS and in ReiserFS...
>>
>> I'm not so sure I would agree with such a blanket generalization. I find
>> NTFS to be very fast, my main complaint is fragmentation issues...I bet
>> NTFS is better than ext3 at most things (I do agree with you about the
>> cache, though).
>
> Ok, you were right. I made some tests and NTFS is just not very good in
> the general case. I've seen some benchmarks for Reiser4 that are just
> amazing.

As a matter of fact I was again amazed today.

I was looking into a way to cache database queries for a website (not yet) written in Python. The purpose was to cache long queries like those used to render forum pages (the typical slow query: selecting from a big table where records are rather random, with LIMIT used to cut the result into pages). I wanted to save a serialized (Python pickled) representation of the data to disk to avoid reissuing the query every time.

In the end it took about 1 ms to load or save the data for a page with 40 posts... then I wondered, how long does it take just to read or write the file?

Test machine: ReiserFS 3.6, Athlon XP 2.5G+, 512 MB DDR400, 7200 RPM IDE drive with 8 MB cache. This would be considered a very underpowered server...

22 KB files, 1000 of them:
    open(), read(), close()  : 10,000 files/s
    open(), write(), close() :  4,000 files/s

This is quite far from database FS activity, but it's still amazing, although the disk doesn't even get used. Which is what I like in Linux. You can write 10000 files in one second and the HDD is still idle... then when it decides to flush, it all goes to disk in one burst.
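For anyone who wants to repeat the measurement above on their own box, here is a minimal sketch of that kind of test. It is not the script that produced the numbers quoted; the function name `bench_small_files`, the file naming and the use of a temporary directory are assumptions made for illustration. No fsync is issued, so it mostly measures the OS cache, which is exactly the point being made.

```python
import os
import tempfile
import time


def bench_small_files(directory, count=1000, size=22 * 1024):
    """Time open/write/close and open/read/close over `count` small files.

    Returns (writes_per_second, reads_per_second). Nothing is fsync'ed,
    so the result reflects page-cache speed rather than disk speed.
    """
    payload = b"x" * size
    # Hypothetical file names, chosen only for this sketch.
    paths = [os.path.join(directory, f"page_{i}.bin") for i in range(count)]

    start = time.perf_counter()
    for path in paths:
        with open(path, "wb") as f:
            f.write(payload)
    write_elapsed = time.perf_counter() - start

    start = time.perf_counter()
    total = 0
    for path in paths:
        with open(path, "rb") as f:
            total += len(f.read())
    read_elapsed = time.perf_counter() - start

    # Sanity check: every byte written was read back.
    assert total == count * size
    return count / write_elapsed, count / read_elapsed


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        writes, reads = bench_small_files(tmp)
        print(f"write: {writes:.0f} files/s, read: {reads:.0f} files/s")
```

The read pass runs right after the write pass, so the files are almost certainly still cached; results on a cold cache would look very different.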
I did make benchmarks some time ago and found that what sets Linux apart from Windows in terms of filesystems is:

- Very high performance filesystems like ReiserFS. This is the obvious part; although with a huuuuge amount of data in small files accessed randomly, ReiserFS is faster but not 10x, maybe something like 2x NTFS. I trust Reiser4 to offer better performance, but not right now. Also ReiserFS lacks a defragmenter, and it gets slower after 1-2 years (compared to 1-2 weeks with NTFS this is still not that bad, but I'd like to defragment and I can't). Reiser4 will apparently fix that with a background defragger etc.

- Caching. Linux disk caching is amazing. When copying a large file to the same disk on Windows, the drive head swaps a lot, like the OS can't decide between reading and writing. Linux, on the other hand, reads and writes in large chunks and loses a lot less time seeking. Even when reading two files at the same time, Linux reads ahead in large chunks (very little performance loss) whereas Windows seeks a lot. The read-ahead and write-back thus make it a lot faster than 2x NTFS for everyday tasks like copying files, backing up, making archives, grepping, serving files, etc...

My Windows box was able to saturate a 100 Mbit/s Ethernet link while serving one large FTP file on the LAN (not that impressive, it's only 10 MB/s hey!). However, when several simultaneous clients were trying to download different files which were not in the disk cache, all hell broke loose: lots of seeking, and bandwidth dropped to 30 Mbit/s. Not enough read-ahead...

The Linux box, serving FTP, with half the RAM (256 MB), had no problem pushing the 100 Mbit/s with something like 10 simultaneous connections. The amusing part is that I could not use the Windows box to test it because it would choke at such a "high" IO concurrency (writing 10 MBytes/s to several files at once, my god).
Of course the files which had been downloaded to the Windows box were cut into as many fragments as the number of disk seeks during the download... several hundred fragments each... my god...

What amazes me is that it must just be some parameter somewhere, and the Microsoft guys probably could have easily changed the read-ahead thresholds and time between seeks when in a multitasking environment, but they didn't. Why? Thus people are forced to buy 10000 RPM SCSI drives for their LAN servers when an IDE RAID, used with Linux, could push nearly a Gigabit...

For databases, this is different, as we're concerned about large files and fsync() times... but it seems reiserfs still wins over ext3 so...

About NTFS vs ext3: ext3 dies if you put a lot of files in the same directory. It's fast but still outperformed by reiser.

I saw XFS fry eight 7-harddisk RAID bays. The computer was rebooted with the Reset button a few times because a faulty SCSI cable in the eighth RAID bay was making it hang. The other 7 bays had no problem. When it went back up, all the bays were in mayhem. xfs_repair just vomited over itself and we got plenty of files with random data in them. Fortunately there was a catalog of files with their checksums so at least we could know which files were okay. Have you tried restoring that amount of data from a backup? Now maybe this was just bad luck and crap hardware, but I still won't touch XFS. Amazing performance on large files though...

I've had my computers shut down violently by power failures and no reiserfs problems so far. NTFS is very crash proof too. My Windows machine bluescreens twice a day and still no data loss ;)

Upside: a junkyard UPS with dead batteries, powered with two brand new 12V car batteries, costs 70 euro and powers a computer for more than 5 hours...
Downside:
- it's ugly (I hide it under my desk)
- you "borrow" a battery to start your friend's car, and just five minutes later, the UPS wants to test itself, discovers it has no more batteries, and switches everything off... argh.

Good evening...
Pierre-Frédéric Caillaud wrote:

> 22 KB files, 1000 of them :
> open(), read(), close() : 10,000 files/s
> open(), write(), close() : 4,000 files/s
>
> This is quite far from database FS activity, but it's still
> amazing, although the disk doesn't even get used. Which is what I like
> in Linux. You can write 10000 files in one second and the HDD is still
> idle... then when it decides to flush it all goes to disk in one burst.

You cannot trust your data in this.

> I've had my computers shutdown violently by power failures and no
> reiserfs problems so far. NTFS is very crash proof too. My windows
> machine bluescreens twice a day and still no data loss ;)

If you have the BSOD twice a day then you have a broken driver or broken HW. CPU overclocked?

I understood from your email that you are a Windows hater; try to post something here:
http://ihatelinux.blogspot.com/ :-)

Regards
Gaetano Mendola
Another possibly useless datapoint on this thread for anyone who's curious ... open_sync absolutely stinks over NFS at least on Linux. :)
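If you want your own datapoint for a given mount (NFS or local), the two strategies in the subject line can be timed roughly like this. This is only a sketch under assumptions: the helper name `timed_writes`, the 8 kB block size and the block count are made up for illustration, and `os.O_SYNC` is not available on every platform, so the code falls back to 0 (plain buffered writes) where it is missing.

```python
import os
import time

# os.O_SYNC is missing on some platforms (e.g. Windows); 0 means "no extra flag".
O_SYNC = getattr(os, "O_SYNC", 0)


def timed_writes(path, blocks=50, block_size=8192, mode="fsync"):
    """Write `blocks` blocks to `path` and return the elapsed seconds.

    mode="fsync":     ordinary open(), then os.fsync() after every write
    mode="open_sync": open with O_SYNC so each write() is itself synchronous
    """
    if mode not in ("fsync", "open_sync"):
        raise ValueError(f"unknown mode: {mode!r}")

    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC
    if mode == "open_sync":
        flags |= O_SYNC

    data = b"\0" * block_size
    fd = os.open(path, flags, 0o600)
    try:
        start = time.perf_counter()
        for _ in range(blocks):
            os.write(fd, data)
            if mode == "fsync":
                os.fsync(fd)  # force this block to stable storage
        return time.perf_counter() - start
    finally:
        os.close(fd)
```

Point `path` at a file on the filesystem you care about and compare `timed_writes(path, mode="fsync")` with `timed_writes(path, mode="open_sync")`. Drives with write caching enabled may report much faster times than they can really guarantee.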
The world rejoiced as merlin.moncure@rcsonline.com ("Merlin Moncure") wrote:
> Ok, you were right. I made some tests and NTFS is just not very
> good in the general case. I've seen some benchmarks for Reiser4
> that are just amazing.

Reiser4 has been sounding real interesting.

The killer problem is thus:

  "We must caution that just as Linux 2.6 is not yet as stable as
   Linux 2.4, it will also be some substantial time before V4 is as
   stable as V3."

In practice, there's a further problem.

We have some systems at work we need to connect to EMC disk arrays; that's something that isn't supported by EMC unless you're using a whole set of pieces that are "officially supported."

RHAT doesn't want to talk to you about support for anything other than ext3.

I'm not sure what all SuSE supports; they're about the only other Linux vendor that EMC would support, and I don't expect that Reiser4 yet fits into the "supportable" category :-(.

The upshot is that we'd only consider using stuff like Reiser4 on "toy" systems, and, quite frankly, that means that they'll have "toy" disk as opposed to the good stuff :-(.

And frankly, we're too busy with issues nearer to our hearts than testing out ReiserFS. :-(
--
output = ("cbbrowne" "@" "cbbrowne.com")
http://cbbrowne.com/info/emacs.html
"Linux! Guerrilla Unix Development Venimus, Vidimus, Dolavimus."
-- <mah@ka4ybr.com> Mark A. Horton KA4YBR
On Sat, 2004-09-04 at 23:47 -0400, Christopher Browne wrote:
> The world rejoiced as merlin.moncure@rcsonline.com ("Merlin Moncure") wrote:
> > Ok, you were right. I made some tests and NTFS is just not very
> > good in the general case. I've seen some benchmarks for Reiser4
> > that are just amazing.
>
> Reiser4 has been sounding real interesting.
>

Are these independent benchmarks, or the benchmarketing at namesys.com? Note that the APPEND, MODIFY, and OVERWRITE phases have been turned off on the mongo tests and the other tests have been set to a lexical (non default for mongo) mode. I've done some mongo benchmarking myself and reiser4 loses to ext3 (data=ordered) in the excluded tests. APPEND phase performance is absolutely *horrible*. So they just turned off the phases in which reiser4 lost and published the remaining results as proof that "reiser4 is the fastest filesystem".

See: http://marc.theaimsgroup.com/?l=reiserfs&m=109363302000856

-Steve Bergman
Christopher Browne wrote:

> I'm not sure what all SuSE supports; they're about the only other Linux
> vendor that EMC would support, and I don't expect that Reiser4 yet
> fits into the "supportable" category :-(.

I use quite a bit of SuSE, and although I don't know their official position on Reiser file systems, I do know that it is the default when installing, so I'd suggest you might check into it.

--
Until later, Geoffrey
Registered Linux User #108567
AT&T Certified UNIX System Programmer - 1995
Were you upset by my message? I'll try to clarify.

> I understood from your email that you are a Windows hater

Well, no, not really. I use Windows every day and it has its strengths. I still don't think the average (non-geek) person can really use Linux as a desktop OS. The problem I have with Windows is that I think it could be made much faster, without too much effort (mainly some tweaking in the disk IO field), but Microsoft doesn't do it. Why? I can't understand this.

>> in Linux. You can write 10000 files in one second and the HDD is still
>> idle... then when it decides to flush it all goes to disk in one burst.
>
> You can not trust your data in this.

That's why I mentioned that it did not relate to database type performance. If the computer crashes while writing these files, some may be partially written, some not at all, some okay... the only certainty is about filesystem integrity. But it's exactly the same on all journaling filesystems (including NTFS). Thus, with equal reliability, the faster wins. Maybe, with Reiser4, we will see real filesystem transactions and maybe this will translate into higher postgres performance...

>> I've had my computers shutdown violently by power failures and no
>> reiserfs problems so far. NTFS is very crash proof too. My windows
>> machine bluescreens twice a day and still no data loss ;)
>
> If you have the BSOD twice a day then you have a broken driver or broken
> HW. CPU overclocked ?

I think this machine has crap hardware. In fact this example was to emphasize the reliability of NTFS: it is indeed remarkable that no data loss occurs even on such a crap machine. I know Windows has got quite reliable now.
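To make the "partially written" point above concrete: when an application does care that a cache file is either completely there or not there at all, the usual workaround without filesystem transactions is to write a temporary file, fsync it, and rename it over the target. The sketch below assumes a pickle-based page cache like the one described earlier in the thread; the function name `save_atomically` is hypothetical, and the directory fsync is only attempted where `os.O_DIRECTORY` exists.

```python
import os
import pickle
import tempfile


def save_atomically(path, obj):
    """Pickle `obj` to `path` so readers never see a half-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must live in the same directory (same filesystem),
    # otherwise the final rename would not be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            pickle.dump(obj, f)
            f.flush()             # push Python's buffer to the OS
            os.fsync(f.fileno())  # push the OS cache to the disk
        os.replace(tmp_path, path)  # atomic swap on POSIX filesystems
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise

    # On POSIX, also sync the directory so the rename itself survives a crash.
    if hasattr(os, "O_DIRECTORY"):
        dir_fd = os.open(directory, os.O_RDONLY | os.O_DIRECTORY)
        try:
            os.fsync(dir_fd)
        finally:
            os.close(dir_fd)
```

The price is the fsync: each save now waits for the disk, so the thousands-of-files-per-second figures no longer apply. For a throwaway query cache that trade is probably not worth it; for data you cannot regenerate it usually is.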
I trust ReiserFS 3. I wouldn't trust Reiser4 for maybe another 1-2 years.

On Sun, 05 Sep 2004 07:41:29 -0400, Geoffrey <esoteric@3times25.net> wrote:

> Christopher Browne wrote:
>
>> I'm not sure what all SuSE supports; they're about the only other Linux
>> vendor that EMC would support, and I don't expect that Reiser4 yet
>> fits into the "supportable" category :-(.
>
> I use quite a bit of SuSE, and although I don't know their official
> position on Reiser file systems, I do know that it is the default when
> installing, so I'd suggest you might check into it.
On Sun, Sep 05, 2004 at 12:16:42AM -0500, Steve Bergman wrote:
> On Sat, 2004-09-04 at 23:47 -0400, Christopher Browne wrote:
> > The world rejoiced as merlin.moncure@rcsonline.com ("Merlin Moncure") wrote:
> > > Ok, you were right. I made some tests and NTFS is just not very
> > > good in the general case. I've seen some benchmarks for Reiser4
> > > that are just amazing.
> >
> > Reiser4 has been sounding real interesting.
> >
>
> Are these independent benchmarks, or the benchmarketing at namesys.com?
> Note that the APPEND, MODIFY, and OVERWRITE phases have been turned off
> on the mongo tests and the other tests have been set to a lexical (non
> default for mongo) mode. I've done some mongo benchmarking myself and
> reiser4 loses to ext3 (data=ordered) in the excluded tests. APPEND
> phase performance is absolutely *horrible*. So they just turned off the
> phases in which reiser4 lost and published the remaining results as
> proof that "reiser4 is the fastest filesystem".
>
> See: http://marc.theaimsgroup.com/?l=reiserfs&m=109363302000856
>
> -Steve Bergman

Reiser4 also isn't optimized for lots of fsyncs (unless it's been done recently). I believe they mention fsync performance in their release notes. I've seen this dramatically hurt performance with our OLTP workload.

--
Mark Wong - - markw@osdl.org
Open Source Development Lab Inc - A non-profit corporation
12725 SW Millikan Way - Suite 400 - Beaverton, OR 97005
(503) 626-2455 x 32 (office)
(503) 626-2436 (fax)
http://developer.osdl.org/markw/