Thread: ext3
Hello,

Gabor Szima asked us to translate the letter below.

"I read that ext3 writeback mode is recommended for PostgreSQL. I made some tests.

                                 data=ordered    data=writeback
----------------------------------------------------------------------
restoredb:                       2m16.790s       1m42.367s
UPDATE <tbl1> (17krows):         9.289s          7.147s
UPDATE <tbl1> (17krows) (2.):    10.480s         3.778s
VACUUM ANALYZE <tbl1>:           9.364s          0.986s !
VACUUM FULL <tbl1>:              16.071s         2.575s
REINDEX TABLE <tbl1>:            3.815s          1.886s
----------------------------------------------------------------------

It's seductive.
However, I made some crash tests too. I updated 4 tables simultaneously and repeatedly for 10 to 120s, then powered off the machine (not with the reset button; I just pulled out the cable).

SEQ  RECOVERY-WARNINGS  VACUUM
-------------------------------
01:  1650               OK     (WARNING: invalid page header in block 769 of relation "18800"; zeroing out page)
02:  3                  FATAL  (ERROR: could not access status of transaction 37814272)
                               (DETAIL: could not open file "/data/pgdata/pg_clog/0024": No such file or directory)
-------------------------------

I stopped my tests at this point because this is not fit for production use. The database was corrupted.

With ordered mode I got this:

ext3-noatime,data=ordered:

SEQ  RECOVERY-WARNINGS  VACUUM
------------------------------
01:  0                  OK
02:  0                  OK
03:  0                  OK
04:  0                  W,OK   (relation "<tbl>" page 398 is uninitialized --- fixing)
05:  0                  OK
06:  0                  OK
07:  0                  W,OK   (relation "<tbl>" page 911 is uninitialized --- fixing)
08:  0                  OK
09:  0                  OK
10:  0                  OK
------------------------------

I think that writeback mode first records the data then the inode, and the ordered mode does it in reverse order. I also think that the postgres log requires the inode to be recorded correctly; data loss is handled by the WAL.

AMD XP2000, 512MB RAM, PostgreSQL 7.4.6 (i686), linux-2.4.28, gcc-3.3.5, Adaptec 29160, WD Enterprise 4360 (SCSI, SCA-80)

I ran mkfs and initdb before every test, and I repeated the tests in reverse order too. No quake3 ran in the background.

-Sygma"

Sorry for my English.

Mage
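
To put the timing table above into relative terms, the durations can be converted to seconds and divided. This is only a rough sketch: the figures are copied from the letter, while the helper names `parse_duration` and `speedup` are made up for illustration, and single runs like these say nothing about variance.

```python
import re

# Timings copied from the letter: (data=ordered, data=writeback).
TIMINGS = {
    "restoredb": ("2m16.790s", "1m42.367s"),
    "UPDATE <tbl1> (17krows)": ("9.289s", "7.147s"),
    "UPDATE <tbl1> (17krows) (2.)": ("10.480s", "3.778s"),
    "VACUUM ANALYZE <tbl1>": ("9.364s", "0.986s"),
    "VACUUM FULL <tbl1>": ("16.071s", "2.575s"),
    "REINDEX TABLE <tbl1>": ("3.815s", "1.886s"),
}

# Matches the `time`-style strings used in the table, e.g. "2m16.790s" or "9.289s".
_DURATION = re.compile(r"^(?:(\d+)m)?(\d+(?:\.\d+)?)s$")


def parse_duration(text: str) -> float:
    """Convert a string such as '2m16.790s' into seconds."""
    match = _DURATION.match(text)
    if match is None:
        raise ValueError(f"unrecognised duration: {text!r}")
    minutes, seconds = match.groups()
    return int(minutes or 0) * 60 + float(seconds)


def speedup(ordered: str, writeback: str) -> float:
    """How many times faster the writeback run was than the ordered run."""
    return parse_duration(ordered) / parse_duration(writeback)


if __name__ == "__main__":
    for name, (ordered, writeback) in TIMINGS.items():
        print(f"{name:30s} {speedup(ordered, writeback):5.2f}x")
```

The ratios only describe speed; the crash tests in the same letter are what decide whether the faster mode is usable.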
On Mon, 17 Jan 2005 20:00:46 +0100, Mage <mage@mage.hu> wrote:
> However I made some crash-tests too. Updated 4 tables simultaneously and
> recurring for 10 to 120s, then powered off the machine (without the
> reset button. i just pulled out the cable).

That's an excellent way to fry your PSU and damage your hardware.

--
L. Friedman netllama@gmail.com
LlamaLand http://netllama.linux-sxs.org
I recommend you don't use ext3 for any database:
http://seclists.org/lists/linux-kernel/2005/Jan/0641.html

Apparently it's still buggy.

Regards,
tzahi.
> Gabor Szima asked us to translate the letter below.
>
> "I read that ext3 writeback mode is recommended for PostgreSQL. I made
> some tests.
>
>                                  data=ordered    data=writeback
> ----------------------------------------------------------------------
> restoredb:                       2m16.790s       1m42.367s
> UPDATE <tbl1> (17krows):         9.289s          7.147s
> UPDATE <tbl1> (17krows) (2.):    10.480s         3.778s
> VACUUM ANALYZE <tbl1>:           9.364s          0.986s !
> VACUUM FULL <tbl1>:              16.071s         2.575s
> REINDEX TABLE <tbl1>:            3.815s          1.886s
> ----------------------------------------------------------------------

Hum. You might as well run it with fsync disabled for that extra thrill ;)

You could try the same with reiserfs (hint, hint).
Hi all

I don't understand exactly what these two lines mean.

INFO: free space map: 490 relations, 13541 pages stored; 34480 total pages needed
DETAIL: Allocated FSM size: 1000 relations + 20000 pages = 178 kB shared memory

Thanks in advance
Conni
Tzahi Fadida wrote:
> I recommend you don't use ext3 for any database:
> http://seclists.org/lists/linux-kernel/2005/Jan/0641.html
>
> apparently its still buggy.

So what is the recommended fs under Linux? I don't need the best speed/throughput, but I prefer not to use ext2 due to its long fsck time. I also tend to avoid reiser3; it has given us a lot of grief in the past. XFS?

Regards,
dave
On Tue, 18 Jan 2005, David Garamond wrote:
> So what is the recommended fs under Linux? I don't need the best
> speed/throughput, but I prefer not to use ext2 due to long fsck time. I
> also tend to avoid reiser3, it has given us many griefs in the past. XFS?

dave,

I have no large databases here, but I run reiserfs on most systems and ext3 on my notebook. Over the past couple of years I've had no problems with either. I've read of folks who don't like one or the other, but locally I know of no one who's experienced any negative situations with either.

Rich

--
Dr. Richard B. Shepard, President
Applied Ecosystem Services, Inc. (TM)
<http://www.appl-ecosys.com> Voice: 503-667-4517 Fax: 503-667-8863
David Garamond wrote:
> Tzahi Fadida wrote:
>
>> I recommend you don't use ext3 for any database:
>> http://seclists.org/lists/linux-kernel/2005/Jan/0641.html
>>
>> apparently its still buggy.
>
> So what is the recommended fs under Linux? I don't need the best
> speed/throughput, but I prefer not to use ext2 due to long fsck time.
> I also tend to avoid reiser3, it has given us many griefs in the past.
> XFS?

We have had success with XFS and JFS. XFS seems a little better supported.

Sincerely,
Joshua D. Drake

--
Command Prompt, Inc., home of Mammoth PostgreSQL - S/ODBC and S/JDBC
Postgresql support, programming shared hosting and dedicated hosting.
+1-503-667-4564 - jd@commandprompt.com - http://www.commandprompt.com
I typically use XFS when given the choice.

On Jan 17, 2005, at 7:52 PM, Rich Shepard wrote:
> On Tue, 18 Jan 2005, David Garamond wrote:
>
>> So what is the recommended fs under Linux? I don't need the best
>> speed/throughput, but I prefer not to use ext2 due to long fsck time.
>> I also tend to avoid reiser3, it has given us many griefs in the past.
>> XFS?

Frank D. Engel, Jr. <fde101@fjrhome.net>
On Mon, 17 Jan 2005 16:54:45 -0800, Joshua D. Drake <jd@commandprompt.com> wrote:
> David Garamond wrote:
> > So what is the recommended fs under Linux? I don't need the best
> > speed/throughput, but I prefer not to use ext2 due to long fsck time.
> > I also tend to avoid reiser3, it has given us many griefs in the past.
> > XFS?
>
> We have had success with XFS and JFS. XFS seems a little better supported.

I'll 2nd (or 3rd?) that vote for XFS. It's been rock solid for my servers.

--
L. Friedman netllama@gmail.com
LlamaLand http://netllama.linux-sxs.org
On Tue, 2005-01-18 at 07:43 +0700, David Garamond wrote:
> Tzahi Fadida wrote:
> > I recommend you don't use ext3 for any database:
> > http://seclists.org/lists/linux-kernel/2005/Jan/0641.html
> >
> > apparently its still buggy.
>
> So what is the recommended fs under Linux? I don't need the best
> speed/throughput, but I prefer not to use ext2 due to long fsck time. I

Wouldn't ext2 also allow the possibility of a missing file? Even though postgres does WAL, couldn't ext2 forget a file or not record that a new file has been created?

In other words, does PostgreSQL assume that the filesystem at least journals the metadata?

Regards,
Jeff Davis
You may also want to test data=journal for ext3. Most of the time this is slower, but for databases with logging and for mail servers it can be faster.

Mage wrote:
> "I read that ext3 writeback mode is recommended for PostgreSQL. I made
> some tests.
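
When comparing data=ordered, data=writeback and data=journal, it helps to confirm which mode a filesystem is really mounted with before trusting a benchmark. The sketch below parses text in the `/proc/mounts` layout (device, mount point, type, options). It assumes the kernel lists the `data=` option there, which may not hold on every kernel version; the helper name `ext3_data_mode` and the sample line are hypothetical.

```python
from typing import Optional


def ext3_data_mode(mounts_text: str, mount_point: str) -> Optional[str]:
    """Return the data= journaling mode listed for an ext3 mount point.

    Returns None when no matching ext3 mount is found or when the
    options do not list a data= mode (ext3 then uses its default, ordered).
    """
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        _device, target, fstype, options = fields[:4]
        if target != mount_point or fstype != "ext3":
            continue
        for option in options.split(","):
            if option.startswith("data="):
                return option[len("data="):]
        return None
    return None


if __name__ == "__main__":
    # Hypothetical sample in /proc/mounts layout; on a real Linux system you
    # would read the text from /proc/mounts instead.
    sample = "/dev/sda1 /data ext3 rw,noatime,data=writeback 0 0\n"
    print(ext3_data_mode(sample, "/data"))
```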
Jeff Davis <jdavis-pgsql@empires.org> writes:
> In other words, does PostgreSQL assume that the filesystem at least
> journals the metadata?

Postgres assumes that the filesystem can take care of itself, which we define as not losing or corrupting successfully-fsynced data. The original BSD filesystem designs met this requirement without any journal; they were just careful about the order in which things got forced to disk. It appears that ext3 may not be able to meet this requirement even with a journal :-(.

But in theory a metadata journal should be sufficient. Journaling data writes is redundant, unless maybe the filesystem substitutes that for the ordinary idea of fsync().

regards, tom lane
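
The requirement stated above, that successfully fsynced data is neither lost nor corrupted, can be illustrated from the application side. The sketch below is not PostgreSQL's code. It is a minimal Python illustration, assuming a POSIX system, of the write-then-fsync contract, with an extra fsync on the parent directory for the newly-created-file case Jeff asked about. `durable_write` is a hypothetical helper name.

```python
import os
import tempfile


def durable_write(path: str, payload: bytes) -> None:
    """Write payload to path and return only after fsync() has succeeded."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        view = memoryview(payload)
        while len(view) > 0:
            written = os.write(fd, view)  # may write fewer bytes than asked
            view = view[written:]
        os.fsync(fd)  # ask the kernel to force the file data to stable storage
    finally:
        os.close(fd)

    # Assumption: POSIX semantics. Syncing the parent directory asks the
    # filesystem to make the new directory entry durable as well.
    dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "segment")  # hypothetical file name
        durable_write(target, b"committed record\n")
        with open(target, "rb") as fh:
            print(fh.read())
```

Once `durable_write` returns, the caller is relying on exactly the guarantee described above; whether the bytes survive a pulled power cable then depends on the filesystem and the disk honouring fsync().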
Hi,

On Tuesday, 18.01.2005 at 07:43 +0700, David Garamond wrote:
> Tzahi Fadida wrote:
> > I recommend you don't use ext3 for any database:
> > http://seclists.org/lists/linux-kernel/2005/Jan/0641.html
> >
> > apparently its still buggy.
>
> So what is the recommended fs under Linux? I don't need the best
> speed/throughput, but I prefer not to use ext2 due to long fsck time. I
> also tend to avoid reiser3, it has given us many griefs in the past. XFS?

From my experience, reiser3 dies if the hardware dies, e.g. if your disk starts trashing blocks. So when you have trustworthy hardware (a good RAID level), reiserfs works very well.

I've not yet tested XFS on faulty disks. But on RAID it works very well, and it is somewhat optimized for larger files, as tables and indices can be.

HTH
Tino
On Monday, 17.01.2005 at 17:47 -0800, Jeff Davis wrote:
> On Tue, 2005-01-18 at 07:43 +0700, David Garamond wrote:
> > So what is the recommended fs under Linux? I don't need the best
> > speed/throughput, but I prefer not to use ext2 due to long fsck time. I
>
> Wouldn't ext2 also allow the possibility of a missing file? Even though
> postgres does WAL, couldn't ext2 forget a file or not record that a new
> file has been created?
>
> In other words, does PostgreSQL assume that the filesystem at least
> journals the metadata?

Well, postgres requires (1) that no data that has already been written and sync()ed gets lost, and (2) that the filesystem is in a consistent state, which it needs in order to work at all.

To ensure (2), ext2 must run fsck, which takes a considerable amount of time on large partitions.

Regards
Tino
I think that INFO gives you information about your current usage and that DETAIL tells you what is currently set in your configuration.

In this example, the relations setting looks sufficient (490 relations in use against 1000 allocated), but the 34480 total pages needed is larger than the 20000 pages allocated. When a value in INFO is larger than the matching value in DETAIL, you would want to consider increasing max_fsm_relations or max_fsm_pages in postgresql.conf; here that would be max_fsm_pages.

-tfo

--
Thomas F. O'Connell
Co-Founder, Information Architect
Sitening, LLC
http://www.sitening.com/
110 30th Avenue North, Suite 6
Nashville, TN 37203-6320
615-260-0005

On Jan 17, 2005, at 5:14 PM, Cornelia Boenigk wrote:
> I don't understand what these two lines exactly mean.
>
> INFO: free space map: 490 relations, 13541 pages stored; 34480 total pages needed
> DETAIL: Allocated FSM size: 1000 relations + 20000 pages = 178 kB shared memory
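
The INFO-versus-DETAIL comparison described in the reply above can be sketched as a small script. It assumes the two lines have exactly the wording quoted in this thread, and it compares "total pages needed" with the allocated page count, which is one reading of that output rather than an official rule. The function name `check_fsm` is hypothetical.

```python
import re

# Patterns written against the two lines quoted in this thread.
INFO_RE = re.compile(
    r"free space map: (\d+) relations, (\d+) pages stored; (\d+) total pages needed"
)
DETAIL_RE = re.compile(r"Allocated FSM size: (\d+) relations \+ (\d+) pages")


def check_fsm(info_line: str, detail_line: str) -> dict:
    """Compare free space map usage (INFO) with the allocated size (DETAIL)."""
    info = INFO_RE.search(info_line)
    detail = DETAIL_RE.search(detail_line)
    if info is None or detail is None:
        raise ValueError("lines do not match the expected VACUUM output")
    relations_used, _pages_stored, pages_needed = map(int, info.groups())
    max_relations, max_pages = map(int, detail.groups())
    return {
        "relations_ok": relations_used <= max_relations,
        "pages_ok": pages_needed <= max_pages,
        "relations": (relations_used, max_relations),
        "pages": (pages_needed, max_pages),
    }


if __name__ == "__main__":
    info = ("INFO: free space map: 490 relations, 13541 pages stored; "
            "34480 total pages needed")
    detail = ("DETAIL: Allocated FSM size: 1000 relations + 20000 pages "
              "= 178 kB shared memory")
    print(check_fsm(info, detail))
```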