Thread: PostgreSQL on XFS experiences?
Dear list,

We are using PostgreSQL with the database and xlogs on (separate) XFS volumes under Linux 2.4.25. We are simply curious to hear your experiences with this combination, if you are using it. In only two days of heavy activity, we've already been able to corrupt one database. We've also seen XFS panic because of inconsistent in-memory metadata. Frankly we don't have the highest confidence.

Anybody out there also using PostgreSQL over XFS with success or otherwise?

-jwb
On Thu, 26 Feb 2004, Jeffrey W. Baker wrote:
> We are using PostgreSQL with the database and xlogs on (separate) XFS
> volumes under Linux 2.4.25. We are simply curious to hear your
> experiences with this combination, if you are using it. In only two
> days of heavy activity, we've already been able to corrupt one
> database. We've also seen XFS panic because of inconsistent in-memory
> metadata. Frankly we don't have the highest confidence.

I am afraid that xfs in that kernel or your hardware is buggy (probably RAM). A 24h run of memtest86 wouldn't be bad.

Since PostgreSQL uses the operating system's calls for file operations as any other program does, it's most probably no PostgreSQL issue.
On Thu, 2004-02-26 at 11:46, Holger Marzen wrote:
> On Thu, 26 Feb 2004, Jeffrey W. Baker wrote:
>
> > We are using PostgreSQL with the database and xlogs on (separate) XFS
> > volumes under Linux 2.4.25. We are simply curious to hear your
> > experiences with this combination, if you are using it. In only two
> > days of heavy activity, we've already been able to corrupt one
> > database. We've also seen XFS panic because of inconsistent in-memory
> > metadata. Frankly we don't have the highest confidence.
>
> I am afraid that xfs in that kernel or your hardware is buggy (probably
> RAM). A 24h run of memtest86 wouldn't be bad.
>
> Since PostgreSQL uses the operating system's calls for file operations
> as any other program does, it's most probably no PostgreSQL issue.

I don't see why not. PostgreSQL could easily have a bug that swaps a buffer somewhere, resulting in a corrupt table. That we see this only on the INSERT path and not the COMMIT path also seems to point towards Pg.

Anyway, you didn't mention XFS. Do you have experience using it beneath Postgres?
Jeffrey W. Baker wrote:
> On Thu, 2004-02-26 at 11:46, Holger Marzen wrote:
>
>>On Thu, 26 Feb 2004, Jeffrey W. Baker wrote:
>>
>>>We are using PostgreSQL with the database and xlogs on (separate) XFS
>>>volumes under Linux 2.4.25. We are simply curious to hear your
>>>experiences with this combination, if you are using it. In only two
>>>days of heavy activity, we've already been able to corrupt one
>>>database. We've also seen XFS panic because of inconsistent in-memory
>>>metadata. Frankly we don't have the highest confidence.
>>
>>I am afraid that xfs in that kernel or your hardware is buggy (probably
>>RAM). A 24h run of memtest86 wouldn't be bad.
>>
>>Since PostgreSQL uses the operating system's calls for file operations
>>as any other program does, it's most probably no PostgreSQL issue.
>
> I don't see why not.

Because Postgres is in use on systems all over the place without problems.

> PostgreSQL could easily have a bug that swaps a
> buffer somewhere, resulting in a corrupt table.

Not easily. If such a bug existed, I would think that someone else would have complained about it before now. I have 3 Postgres servers under varying levels of usage - all are rock solid reliable, with uptimes easily over 30 days without problems. These crashes you describe are unlikely to be coming from Postgres.

> That we see this only
> on the INSERT path and not the COMMIT path also seems to point towards
> Pg.

Not really. Have you used something like bonnie++ to test XFS for reliability? Have you run memtest86 as was suggested? In my experience, faulty hardware is far more common than what you are suggesting.

> Anyway, you didn't mention XFS. Do you have experience using it beneath
> Postgres?

Personally, I have tested it for its performance capabilities, but never deployed it long-term. However, in my experience, it works reliably.

--
Bill Moran
Potential Technologies
http://www.potentialtech.com
"Jeffrey W. Baker" <jwbaker@acm.org> writes: > We've also seen XFS panic because of inconsistent in-memory > metadata. From this fact alone we may deduce with absolute confidence that either your hardware or your filesystem implementation are faulty. There is no combination of operations which any application, buggy or not, that can cause a properly written XFS driver on error-free hardware to become corrupted. -- Brandon Craig Rhodes http://www.rhodesmill.org/brandon Georgia Tech brandon@oit.gatech.edu
On Thursday 26 February 2004 12:09 pm, Jeffrey W. Baker wrote:
> On Thu, 2004-02-26 at 11:46, Holger Marzen wrote:
> > On Thu, 26 Feb 2004, Jeffrey W. Baker wrote:
> > > We are using PostgreSQL with the database and xlogs on
> > > (separate) XFS volumes under Linux 2.4.25. We are simply
> > > curious to hear your experiences with this combination, if you
> > > are using it. In only two days of heavy activity, we've
> > > already been able to corrupt one database. We've also seen XFS
> > > panic because of inconsistent in-memory metadata. Frankly we
> > > don't have the highest confidence.
> >
> > I am afraid that xfs in that kernel or your hardware is buggy
> > (probably RAM). A 24h run of memtest86 wouldn't be bad.
> >
> > Since PostgreSQL uses the operating system's calls for file
> > operations as any other program does, it's most probably no
> > PostgreSQL issue.
>
> I don't see why not. PostgreSQL could easily have a bug that swaps
> a buffer somewhere, resulting in a corrupt table. That we see this
> only on the INSERT path and not the COMMIT path also seems to point
> towards Pg.
>
> Anyway, you didn't mention XFS. Do you have experience using it
> beneath Postgres?

I do. It's great. One of my PostgreSQL machines uses XFS with everything (data, xlogs, etc.) on the same XFS partition. Machine uptime: 247 days. PostgreSQL uptime 74 days (restarted while testing reconfiguration options). PostgreSQL crashes: 0.

As the good doc says, "when you hear hoofbeats think horses, not zebras." Don't chase some hypothetical theoretical possible problem when it's known that bad RAM causes problems, especially when your error message seems to point toward RAM trouble.

I give all my new machines a good test with memtest86 or the newer memtest86+ before installation. Just two weeks ago I got a machine that looked good after the first couple of passes but when I went to do a final check and install it a week and 1000 passes later it reported 6 errors in one memory location - a frustrating random crash just waiting to happen.

Cheers,
Steve