Thread: failures on machines using jfs

failures on machines using jfs

From
Andrew Sullivan
Date:
Hi all,

Chris Browne (one of my colleagues here) has posted some tests in the
past indicating that jfs may be the fastest filesystem for Postgres
use on Linux.

We have lately had a couple of cases where machines either locked up,
slowed down to the point of complete unusability, or died completely
while using jfs.  We are _not_ sure that jfs is in fact the culprit.
In one case, a kernel panic appeared to be referring to the jfs
kernel module, but I can't be sure as I lost the output immediately
thereafter.  Yesterday, we had a problem of data corruption on a
failed jfs volume.

None of this is to say that jfs is in fact to blame, nor even that,
if it is, it does not have something to do with the age of our
installations, &c. (these are all RH 8).  In fact, I suspect hardware
in both cases.  But I thought I'd mention it just in case other
people are seeing strange behaviour, on the principle of "better
safe than sorry."

A
--
----
Andrew Sullivan                         204-4141 Yonge Street
Afilias Canada                        Toronto, Ontario Canada
<andrew@libertyrms.info>                              M2P 2A8
                                         +1 416 646 3304 x110


Re: failures on machines using jfs

From
Josh Berkus
Date:
Andrew,

> None of this is to say that jfs is in fact to blame, nor even that,
> if it is, it does not have something to do with the age of our
> installations, &c. (these are all RH 8).  In fact, I suspect hardware
> in both cases.  But I thought I'd mention it just in case other
> people are seeing strange behaviour, on the principle of "better
> safe than sorry."

Always useful.  Actually, I just fielded a report on IRC of poor I/O
utilization with XFS during checkpointing.  I'm not sure whether the problem
is XFS or PostgreSQL, but the fact that XFS (alone among filesystems) does its
own cache management instead of using the kernel cache makes me suspicious.

--
-Josh Berkus
 Aglio Database Solutions
 San Francisco


Re: failures on machines using jfs

From
Robert Creager
Date:
When grilled further on (Wed, 7 Jan 2004 18:06:08 -0500),
Andrew Sullivan <andrew@libertyrms.info> confessed:

>
> We have lately had a couple of cases where machines either locked up,
> slowed down to the point of complete unusability, or died completely
> while using jfs.  We are _not_ sure that jfs is in fact the culprit.
> In one case, a kernel panic appeared to be referring to the jfs
> kernel module, but I can't be sure as I lost the output immediately
> thereafter.  Yesterday, we had a problem of data corruption on a
> failed jfs volume.
>
> None of this is to say that jfs is in fact to blame, nor even that,
> if it is, it does not have something to do with the age of our
> installations, &c. (these are all RH 8).  In fact, I suspect hardware
> in both cases.  But I thought I'd mention it just in case other
> people are seeing strange behaviour, on the principle of "better
> safe than sorry."
>

Interestingly enough, I'm using JFS on a new SCSI disk with Mandrake 9.1 and
was having similar problems.  I was generating heavy disk usage through database
work and astronomical data reductions.  My machine (dual AMD) would suddenly
hang.  No new jobs would run; they just increased the load until I rebooted
the machine.

I solved my problems by creating a 128MB ram disk (using EXT2) for the temp
data produced by my reduction runs.

I believe JFS was to blame, not hardware, but you never know...

Cheers,
Rob

--
 20:22:27 up 12 days, 10:13,  4 users,  load average: 2.00, 2.01, 2.03


Re: failures on machines using jfs

From
Christopher Browne
Date:
Robert_Creager@LogicalChaos.org (Robert Creager) writes:
> When grilled further on (Wed, 7 Jan 2004 18:06:08 -0500),
> Andrew Sullivan <andrew@libertyrms.info> confessed:
>
>> We have lately had a couple of cases where machines either locked
>> up, slowed down to the point of complete unusability, or died
>> completely while using jfs.  We are _not_ sure that jfs is in fact
>> the culprit.  In one case, a kernel panic appeared to be referring
>> to the jfs kernel module, but I can't be sure as I lost the output
>> immediately thereafter.  Yesterday, we had a problem of data
>> corruption on a failed jfs volume.
>>
>> None of this is to say that jfs is in fact to blame, nor even that,
>> if it is, it does not have something to do with the age of our
>> installations, &c. (these are all RH 8).  In fact, I suspect
>> hardware in both cases.  But I thought I'd mention it just in case
>> other people are seeing strange behaviour, on the principle of
>> "better safe than sorry."
>
> Interestingly enough, I'm using JFS on a new scsi disk with Mandrake
> 9.1 and was having similar problems.  I was generating heavy disk
> usage through database and astronomical data reductions.  My machine
> (dual AMD) would suddenly hang.  No new jobs would run, just
> increase the load, until I rebooted the machine.
>
> I solved my problems by creating a 128MB ram disk (using EXT2) for
> the temp data produced by my reduction runs.
>
> I believe JFS was to blame, not hardware, but you never know...

Interesting.

The set of concurrent factors that "consistently" came together when
this happened for us was:

 1.  Heavy DB updates taking place on JFS filesystems;

 2.  SMP (we suspected Xeon hyperthreading as a possible factor, but
     shut it off and still saw the same problem...)

 3.  The third factor, which appeared to be a catalyst, was copying,
     via scp, a file > 2GB in size onto the system.

The third piece was a particularly interesting aspect; the file would
get copied over successfully, and the scp process would hang (to the
point of "kill -9" being unable to touch it) immediately thereafter.

At that point, processes on the system that were accessing files on
the hung-up filesystem were locked, also unkillable by "kill -9."
That's certainly consistent with JFS being at the root of the problem,
whether it was the cause or not...
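
A process that shrugs off "kill -9" is usually sitting in uninterruptible
sleep (state "D"), typically waiting on I/O. On Linux, one way to list such
processes is to read the state field of /proc/<pid>/stat. The following is
only a rough sketch: the function names are made up for illustration, and it
assumes a Linux-style /proc. It shows which processes are stuck, not why.

```python
import os


def parse_stat_state(stat_line):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split on the LAST ')' rather than on whitespace.
    left, _, rest = stat_line.rpartition(")")
    pid_text, _, comm = left.partition("(")
    return int(pid_text), comm, rest.split()[0]


def uninterruptible_processes(proc_root="/proc"):
    """List (pid, comm) for every process currently in state 'D'."""
    if not os.path.isdir(proc_root):
        return []  # not a Linux-style /proc; nothing to report
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "meminfo"
        try:
            with open(os.path.join(proc_root, entry, "stat")) as handle:
                pid, comm, state = parse_stat_state(handle.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited (or was unreadable) mid-scan
        if state == "D":
            found.append((pid, comm))
    return found


if __name__ == "__main__":
    for pid, comm in uninterruptible_processes():
        print(pid, comm)
```

If the hung scp and the backends touching the same volume all show up here
at once, that fits the picture of everything blocking inside one filesystem,
though it still does not say whether the filesystem or the hardware under it
is at fault.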
--
let name="cbbrowne" and tld="libertyrms.info" in String.concat "@" [name;tld];;
<http://dev6.int.libertyrms.com/>
Christopher Browne
(416) 646 3304 x124 (land)

Re: failures on machines using jfs

From
"Spiegelberg, Greg"
Date:
It would seem we're experiencing something similar with our scratch
volume (JFS mounted with noatime).  It is still much faster than in our
experiments with ext2, ext3, and reiserfs, but occasionally during
large loads it will hiccup for a couple of seconds.  No crashes yet.

I'm reluctant to switch back to any other file system because the
data import used to take a little over 1.5 hours and now takes just
under 20 minutes, and we haven't crashed yet.

For future reference:

 RedHat 7.3 w/2.4.18-18.7smp
 PostgreSQL 7.3.3 from source
 jfsutils 1.0.17-1
 Dual PIII Intel 1.4GHz & 2GB ECC
 Internal disk: 2xU160 SCSI, mirrored, location of our JFS file system
 External disk: Qlogic 2310 attached to FC-SW @2Gbps with ext3 on those LUNs

Greg



**********************************************************************
This email and any files transmitted with it are confidential and
intended solely for the use of the individual or entity to whom they
are addressed. If you have received this email in error please notify
the system manager.

This footnote also confirms that this email message has been swept by
MIMEsweeper for the presence of computer viruses.

www.mimesweeper.com
**********************************************************************


Re: failures on machines using jfs

From
Tom Lane
Date:
"Spiegelberg, Greg" <gspiegelberg@cranel.com> writes:
>  PostgreSQL 7.3.3 from source

*Please* update to 7.3.4 or 7.3.5 before you get bitten by the
WAL-page-boundary bug ...

            regards, tom lane

Re: failures on machines using jfs

From
Hannu Krosing
Date:
Spiegelberg, Greg wrote on Sun, 11.01.2004 at 18:21:
> It would seem we're experiencing something similar with our scratch
> volume (JFS mounted with noatime).

Which files/directories do you keep on the "scratch" volume?

All postgres files or just some (WAL, tmp)?

-------------
Hannu


Re: failures on machines using jfs

From
Greg Spiegelberg
Date:
Hannu Krosing wrote:
> Spiegelberg, Greg wrote on Sun, 11.01.2004 at 18:21:
>
>>It would seem we're experiencing something similar with our scratch
>>volume (JFS mounted with noatime).
>
>
> Which files/directories do you keep on "scratch" volume ?
>
> All postgres files or just some (WAL, tmp) ?

No Postgres files are kept in scratch, only the files being loaded
into the database via COPY or lo_import.

My WAL logs are kept on a separate ext3 file system.

Greg

--
Greg Spiegelberg
  Sr. Product Development Engineer
  Cranel, Incorporated.
  Phone: 614.318.4314
  Fax:   614.431.8388
  Email: gspiegelberg@Cranel.com
Cranel. Technology. Integrity. Focus.






Re: failures on machines using jfs

From
Hannu Krosing
Date:
Greg Spiegelberg wrote on Mon, 12.01.2004 at 19:03:
> Hannu Krosing wrote:
> > Spiegelberg, Greg wrote on Sun, 11.01.2004 at 18:21:
> >
> >>It would seem we're experiencing something similar with our scratch
> >>volume (JFS mounted with noatime).
> >
> >
> > Which files/directories do you keep on "scratch" volume ?
> >
> > All postgres files or just some (WAL, tmp) ?
>
> No Postgres files are kept in scratch only the files being loaded
> into the database via COPY or lo_import.

Then the speedup does not make any sense!

Is reading from a jfs filesystem also 5 times faster than reading from
ext3?

The only explanation I can offer for filling the database from a jfs
volume being so much faster is some strange filesystem cache interaction.
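
One way to answer the read-speed question directly would be to time a plain
sequential read of the same input file from each volume. The sketch below is
a hypothetical helper, not anything used by the posters: it reads a file
front to back and reports the byte count and elapsed time. A second run over
the same file will mostly measure the page cache rather than the disk, so
each measurement should use a file that has not been read recently.

```python
import sys
import time


def sequential_read_rate(path, block_size=1024 * 1024):
    """Read `path` front to back; return (bytes_read, elapsed_seconds)."""
    total = 0
    start = time.perf_counter()
    with open(path, "rb") as handle:
        while True:
            block = handle.read(block_size)
            if not block:
                break  # end of file
            total += len(block)
    return total, time.perf_counter() - start


if __name__ == "__main__":
    # Usage: pass one file per volume to compare, e.g. a copy of the same
    # load file placed on the jfs scratch volume and on an ext3 volume.
    for path in sys.argv[1:]:
        size, seconds = sequential_read_rate(path)
        rate = size / seconds / 1e6 if seconds > 0 else float("inf")
        print(f"{path}: {size} bytes in {seconds:.2f}s ({rate:.1f} MB/s)")
```

If the two volumes read at about the same rate, the faster import has to
come from somewhere other than raw read speed.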

----------------
Hannu



Re: failures on machines using jfs

From
Bill Moran
Date:
Hannu Krosing wrote:
> Greg Spiegelberg wrote on Mon, 12.01.2004 at 19:03:
>
>>Hannu Krosing wrote:
>>
>>>Spiegelberg, Greg wrote on Sun, 11.01.2004 at 18:21:
>>>
>>>
>>>>It would seem we're experiencing something similar with our scratch
>>>>volume (JFS mounted with noatime).
>>>
>>>
>>>Which files/directories do you keep on "scratch" volume ?
>>>
>>>All postgres files or just some (WAL, tmp) ?
>>
>>No Postgres files are kept in scratch only the files being loaded
>>into the database via COPY or lo_import.
>
>
> then the speedup does not make any sense !
>
> Is reading from jfs filesystem also 5 times faster than reading from
> ext3 ?
>
> The only explanation I can give to filling database from jfs volume to
> be so much faster could be some strange filesystem cache interactions.

http://www.potentialtech.com/wmoran/postgresql.php

--
Bill Moran
Potential Technologies
http://www.potentialtech.com


Re: failures on machines using jfs

From
Greg Spiegelberg
Date:
Hannu Krosing wrote:
> Greg Spiegelberg wrote on Mon, 12.01.2004 at 19:03:
>
>>Hannu Krosing wrote:
>>
>>>Spiegelberg, Greg wrote on Sun, 11.01.2004 at 18:21:
>>>
>>>
>>>>It would seem we're experiencing something similar with our scratch
>>>>volume (JFS mounted with noatime).
>>>
>>>
>>>Which files/directories do you keep on "scratch" volume ?
>>>
>>>All postgres files or just some (WAL, tmp) ?
>>
>>No Postgres files are kept in scratch only the files being loaded
>>into the database via COPY or lo_import.
>
>
> then the speedup does not make any sense !

We do a lot of preprocessing before the data gets loaded.  It's that
process that experiences the hiccups I mentioned.

--
Greg Spiegelberg
  Sr. Product Development Engineer
  Cranel, Incorporated.
  Phone: 614.318.4314
  Fax:   614.431.8388
  Email: gspiegelberg@Cranel.com
Cranel. Technology. Integrity. Focus.



