Thread: Win32 Powerfail testing - results
Well the results are finally in. Hopefully we can concentrate on putting
them right, rather than having a round of "told you so's" :-)

I modified the test program slightly to improve the consistency checks.
The updated version is attached.

Regards, Dave.

System
======

Gigabyte GA-6VTXD Motherboard
Dual 1GHz PIII Processors
1Gb Non-ECC RAM
Fujitsu MPG3240AH IDE Disk Drive

Enhanced IDE Performance disabled in the BIOS.

Test
====

Test program run from a separate machine.
20 Tests per OS.
Powerfail randomly applied.

Windows 2000 Testing
====================

Write back cache on IDE disk disabled.
Clean installation of Windows 2000 Server with Service Pack 3

Run | Errors Detected
=============================================================
01  | None
02  | None
03  | None
04  | None
05  | None
06  | None
07  | COUNT CHECK - Duplicate or missing rows detected (10262)!!
08  | None
09  | DISTINCT CHECK - Duplicate or missing rows detected (9893)!!
    | COUNT CHECK - Duplicate or missing rows detected (9893)!!
10  | None
11  | None
12  | None
13  | None
14  | COUNT CHECK - Duplicate or missing rows detected (10024)!!
15  | None
16  | None
17  | None
18  | None
19  | None
20  | None

Linux Testing
=============

Clean installation of Slackware Linux 8.1 on ext3
Kernel 2.4.18

Run | Errors Detected
=============================================================
01  | None
02  | None
03  | None
04  | None
05  | None
06  | None
07  | None
08  | None
09  | None
10  | None
11  | None
12  | None
13  | None
14  | None
15  | None
16  | None
17  | None
18  | None
19  | None
20  | None
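The test program's source is not reproduced in the thread text, so the following is only a rough sketch of how checks like the COUNT and DISTINCT checks named in the results might be written. It uses Python's sqlite3 module as a stand-in for PostgreSQL and libpq. The table name `powerfail_test`, the column `id`, the expected row count, and the function name `check_consistency` are all assumptions made for illustration, not details taken from the attached program.

```python
import sqlite3

# Assumed value: the thread never states how many rows the test table holds.
EXPECTED_ROWS = 10000


def check_consistency(conn, expected=EXPECTED_ROWS):
    """Return a list of error messages; an empty list corresponds to 'None'."""
    errors = []

    # Hypothetical table and column names (powerfail_test, id).
    distinct = conn.execute(
        "SELECT count(DISTINCT id) FROM powerfail_test"
    ).fetchone()[0]
    total = conn.execute("SELECT count(*) FROM powerfail_test").fetchone()[0]

    # Fewer distinct ids than expected would mean some rows have gone missing.
    if distinct != expected:
        errors.append(
            f"DISTINCT CHECK - Duplicate or missing rows detected ({distinct})!!"
        )
    # A total that differs from the expected figure would mean rows were
    # lost or duplicated.
    if total != expected:
        errors.append(
            f"COUNT CHECK - Duplicate or missing rows detected ({total})!!"
        )
    return errors
```

Under this reading, a COUNT failure on its own would point to duplicated rows with every distinct value still present, while both checks failing with the same figure would point to missing rows. Whether the real program draws the line the same way is not stated in the thread.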
On Mon, 3 Feb 2003, Dave Page wrote:

> Well the results are finally in. Hopefully we can concentrate on putting
> them right, rather than having a round of "told you so's" :-)
>
> I modified the test program slightly to improve the consistency checks.
> The updated version is attached.

[...]

>
> Run | Errors Detected
> =============================================================
> 07  | COUNT CHECK - Duplicate or missing rows detected (10262)!!
> 09  | DISTINCT CHECK - Duplicate or missing rows detected (9893)!!
>     | COUNT CHECK - Duplicate or missing rows detected (9893)!!
> 14  | COUNT CHECK - Duplicate or missing rows detected (10024)!!

Out of curiosity, what was required to return things to normal again?

Vince.
--
Fast, inexpensive internet service 56k and beyond! http://www.pop4.net/
http://www.meanstreamradio.com http://www.unknown-artists.com
Internet radio: It's not file sharing, it's just radio.
> I modified the test program slightly to improve the consistency checks.
> The updated version is attached.

For curiosity's sake, I've compiled it and am running it on FreeBSD with
soft-updates enabled.

A few variable declarations needed to be bumped up to the top of their
respective functions.

Any chance of tossing in a periodic VACUUM, or would that throw off the
results?

--
Rod Taylor <rbt@rbt.ca>

PGP Key: http://www.rbt.ca/rbtpub.asc
Vince Vielhaber allegedly said:
> On Mon, 3 Feb 2003, Dave Page wrote:
>
>> Run | Errors Detected
>> =============================================================
>> 07  | COUNT CHECK - Duplicate or missing rows detected (10262)!!
>> 09  | DISTINCT CHECK - Duplicate or missing rows detected (9893)!!
>>     | COUNT CHECK - Duplicate or missing rows detected (9893)!!
>> 14  | COUNT CHECK - Duplicate or missing rows detected (10024)!!
>
> Out of curiosity, what was required to return things to normal
> again?

I ran the test app in reset mode, which drops the table, then re-creates
it and populates it with fresh data. I thought it best to drop first to
eliminate possible problems with corrupt but invisible tuples (if such a
thing could have occurred).

Regards, Dave.
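The reset sequence described above (drop, re-create, repopulate) can be sketched roughly as follows. This is not the attached program: it uses Python's sqlite3 module as a stand-in for PostgreSQL and libpq, and the function name `reset`, the table name `powerfail_test`, its columns, and the row count are hypothetical.

```python
import sqlite3


def reset(conn, rows=10000):
    """Drop the test table, re-create it, and load fresh data.

    The table name, columns, and default row count are assumptions;
    the thread does not give the real schema.
    """
    # Drop first so that nothing left behind by a failed run survives.
    conn.execute("DROP TABLE IF EXISTS powerfail_test")
    conn.execute("CREATE TABLE powerfail_test (id INTEGER, payload TEXT)")
    # Repopulate with one row per id, 0 .. rows - 1.
    conn.executemany(
        "INSERT INTO powerfail_test (id, payload) VALUES (?, ?)",
        ((i, f"row {i}") for i in range(rows)),
    )
    conn.commit()
```

Dropping the table rather than deleting its rows matches the reasoning given in the message: the old storage is discarded outright instead of being reused.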
Rod Taylor allegedly said:
>> I modified the test program slightly to improve the consistency
>> checks. The updated version is attached.
>
> For curiosity's sake, I've compiled it and am running it on FreeBSD with
> soft-updates enabled.
>
> A few variable declarations needed to be bumped up to the top of their
> respective functions.

I've been doing a fair bit of C++ recently...

> Any chance of tossing in a periodic VACUUM, or would that throw off the
> results?

Dunno, Tom could best answer that, but a *complete guess*, based on
piecing together tidbits of how it all works from various threads here,
would be that it would merely increase the time period during which a
powerfail would be unlikely to cause duplicate rows. The reasoning is
that vacuum would be messing with tuples that are already dead.

Please correct me if I'm wrong :-)

Regards, Dave.

"Dave Page" <dpage@vale-housing.co.uk> writes:
> Rod Taylor allegedly said:
>> Any chance of tossing in a periodic VACUUM, or would that throw off the
>> results?

> Dunno, Tom could best answer that, but a *complete guess* based on piecing
> together tidbits of how it all works from various threads here, would be
> that it would merely increase the time period during which a powerfail
> would be unlikely to cause duplicate rows. Reasoning for this is that
> vacuum would be messing with tuples that are already dead.

I think it'd be interesting to try it both ways. VACUUM might throw in
new failure modes. I'm not sure if it could mask the failure mode you
already found.

			regards, tom lane
> -----Original Message-----
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
> Sent: 03 February 2003 21:52
> To: Dave Page
> Cc: rbt@rbt.ca; pgsql-hackers@postgresql.org
> Subject: Re: [HACKERS] Win32 Powerfail testing - results
>
>
> "Dave Page" <dpage@vale-housing.co.uk> writes:
> > Rod Taylor allegedly said:
> >> Any chance of tossing in a periodic VACUUM, or would that throw off
> >> the results?
>
> > Dunno, Tom could best answer that, but a *complete guess* based on
> > piecing together tidbits of how it all works from various threads
> > here, would be that it would merely increase the time period during
> > which a powerfail would be unlikely to cause duplicate rows.
> > Reasoning for this is that vacuum would be messing with tuples that
> > are already dead.
>
> I think it'd be interesting to try it both ways. VACUUM
> might throw in new failure modes. I'm not sure if it could
> mask the failure mode you already found.

OK, I'll bung Win2K back on the test box tomorrow. Any preference as to
the type of vacuum? I assume a full vacuum would be most likely to cause
problems. I'll add the vacuum after the commit...

Regards, Dave.
Dave Page wrote on Mon, 03.02.2003 at 18:51:
> Well the results are finally in. Hopefully we can concentrate on putting
> them right, rather than having a round of "told you so's" :-)
>
> I modified the test program slightly to improve the consistency checks.
> The updated version is attached.
>
> Regards, Dave.
>
> System
> ======
>
> Gigabyte GA-6VTXD Motherboard
> Dual 1GHz PIII Processors
> 1Gb Non-ECC RAM
> Fujitsu MPG3240AH IDE Disk Drive
>
> Enhanced IDE Performance disabled in the BIOS.
>
> Test
> ====
>
> Test program run from a separate machine.
> 20 Tests per OS.
> Powerfail randomly applied.

Your hardware should also be able to run Postgres on BeOS

http://www.bebits.com/app/2752

Being the only non-unix "port" before/besides win32, it could be an
interesting exercise.

You should be able to get an installable BeOS itself from SourceForge

http://sourceforge.net/projects/crux/

> Windows 2000 Testing
> ====================

Is this NTFS?

Any possibility of trying the same tests with SCSI disks?

> Write back cache on IDE disk disabled.
> Clean installation of Windows 2000 Server with Service Pack 3
>
> Run | Errors Detected
> =============================================================
> 01  | None
> 02  | None
> 03  | None
> 04  | None
> 05  | None
> 06  | None
> 07  | COUNT CHECK - Duplicate or missing rows detected (10262)!!
> 08  | None
> 09  | DISTINCT CHECK - Duplicate or missing rows detected (9893)!!

I remember having problems with UNIQUE columns having duplicate values a
few versions back on Linux-ext2-IDE. Could this be the same problem, or
must it be something completely different?

>     | COUNT CHECK - Duplicate or missing rows detected (9893)!!
> 10  | None
> 11  | None
> 12  | None
> 13  | None
> 14  | COUNT CHECK - Duplicate or missing rows detected (10024)!!
> 15  | None
> 16  | None
> 17  | None
> 18  | None
> 19  | None
> 20  | None
>
> Linux Testing
> =============
>
> Clean installation of Slackware Linux 8.1 on ext3
> Kernel 2.4.18
>
> Run | Errors Detected
> =============================================================
> 01  | None
> ...
> 20  | None

BTW, are the tests portable enough to run also on MSSQL, Oracle and DB2?

I know that you can't publish exact results, but perhaps something like
the GreatBridge results - the one that runs only on Win32 did so-and-so,
the one that has 'i' at the end of its version number did this, and the
one whose name consists of two letters and a number did that?

--
Hannu Krosing <hannu@tm.ee>
> -----Original Message-----
> From: Hannu Krosing [mailto:hannu@tm.ee]
> Sent: 03 February 2003 22:30
> To: Dave Page
> Cc: PostgreSQL Hackers; Katie Ward
> Subject: Re: [HACKERS] Win32 Powerfail testing - results
>
>
> Your hardware should also be able to run Postgres on BeOS
>
> http://www.bebits.com/app/2752
>
> Being the only non-unix "port" before/besides win32, it could
> be an interesting exercise.

One that will have to go untested, I'm afraid. These tests take a fair
while, and you know how many pies I've got my fingers in right now just
on this project, never mind my paying gig and Uni!!

> > Windows 2000 Testing
> > ====================
>
> Is this NTFS?

Yes.

> Any possibility of trying the same tests with SCSI disks?

Depends on my time. I have a couple of 29160's and some Seagate Cheetah
X15's knocking about.

>
> I remember having problems with UNIQUE columns having
> duplicate values a few versions back on Linux-ext2-IDE. Could
> this be the same problem, or must it be something completely
> different?

Pass. I don't know the details of your problem, or how Peerdirect have
handled the IO. If I'm honest, I'm probably not experienced enough in
that sort of thing to know what's going wrong anyway :-(

>
> BTW, are the tests portable enough to run also on MSSQL,
> Oracle and DB2?

Well, I posted the source. If you pull out the libpq stuff then I guess
so. I only have DB2 and MSSQL here though (and they both fall over at
will anyway). Again though, I can't really spend time testing them just
for interest's sake (not at present anyway).

Regards, Dave.