Thread: Re: [GENERAL] Large databases, performance

Re: [GENERAL] Large databases, performance

From

"Shridhar Daithankar"

Date:

03 October 2002, 11:56:13

On 3 Oct 2002 at 19:33, Shridhar Daithankar wrote:

> On 3 Oct 2002 at 13:56, Nigel J. Andrews wrote:
> > It's one hell of a DB you're building. I'm sure I'm not the only one interested
> > so to satisfy those of us who are nosey: can you say what the application is?
> >
> > I'm sure we'll all understand if it's not possible for you to mention such
> > information.
>
> Well, I can't tell everything, but some things I can.
>
> 1) This is a system that does not have online capability yet. This is an
> attempt to provide one.
>
> 2) The goal is to avoid costs like licensing Oracle. I am sure this would make
> a great example for OSDB advocacy, whichever database wins.
>
> 3) The database size estimates I gave earlier, i.e. 9 billion tuples/900GB data
> size, are for a fixed window. The data is generated from some real time systems.
> You can imagine the rate.

Read that as a fixed time window.

>
> 4) Furthermore, there are timing restrictions attached to it: 5K inserts/sec and
> 4800 queries per hour with a response time of 10 sec. each. It's this aspect that
> has forced us into partitioning.
>
> And contrary to my earlier information, this is going to be a live system
> rather than a backup one. A better win for PostgreSQL. I hope it makes it.
>
> And BTW, all these results were on reiserfs. We didn't find much difference
> in write performance between them, so we stuck with reiserfs. And of course we
> got the latest hot shot Mandrake9 with 2.4.19-16, which really made a difference
> over RHL7.2.

Well, we were comparing ext3 vs. reiserfs. I don't remember the journalling
mode of ext3, but we did a 10 GB write test. Besides, converting the RAID from
RAID-5 to RAID-0 might have had something to do with it.

There was a discussion on hackers some time back about which file system is
better. I hope this adds something to it.
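
The figures quoted above (9 billion tuples/900GB, 5K inserts/sec, 4800 queries
per hour at 10 sec. each) can be turned into a rough back-of-the-envelope
estimate. This is only a sketch: it assumes "900GB" means 900 * 10**9 bytes,
that the 5K inserts/sec rate is sustained for the whole window, and that every
query takes the full 10 seconds. None of these assumptions is stated in the
thread, and the variable names are made up for illustration.

```python
# Back-of-the-envelope arithmetic from the figures quoted in this thread.
# Assumptions (not stated in the thread): decimal gigabytes, a sustained
# insert rate, and every query using its full 10 second budget.

TOTAL_TUPLES = 9_000_000_000      # 9 billion tuples in the fixed time window
TOTAL_BYTES = 900 * 10**9         # 900GB, assuming decimal gigabytes
INSERTS_PER_SEC = 5_000           # 5K inserts/sec
QUERIES_PER_HOUR = 4_800          # 4800 queries per hour
RESPONSE_TIME_SEC = 10            # response time budget per query

# Average on-disk size per tuple implied by the size estimate.
bytes_per_tuple = TOTAL_BYTES / TOTAL_TUPLES

# How long it takes to accumulate all tuples at the sustained insert rate.
window_days = TOTAL_TUPLES / INSERTS_PER_SEC / 86_400

# Raw data volume written per second (ignores indexes, WAL and other overhead).
write_mb_per_sec = INSERTS_PER_SEC * bytes_per_tuple / 10**6

# Little's law: average queries in flight = arrival rate * time in system.
queries_per_sec = QUERIES_PER_HOUR / 3600
concurrent_queries = queries_per_sec * RESPONSE_TIME_SEC

print(f"bytes per tuple:        {bytes_per_tuple:.0f}")
print(f"window length (days):   {window_days:.1f}")
print(f"raw write rate (MB/s):  {write_mb_per_sec:.2f}")
print(f"queries in flight:      {concurrent_queries:.1f}")
```

Under these assumptions the raw data rate is modest; the harder constraints are
more likely the per-row insert overhead and the number of queries in flight at
once.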


Bye
 Shridhar

--
    "What terrible way to die."    "There are no good ways."
        -- Sulu and Kirk, "That Which Survives", stardate unknown

Re: [GENERAL] Large databases, performance

From

Greg Copeland

Date:

03 October 2002, 12:23:42

On Thu, 2002-10-03 at 10:56, Shridhar Daithankar wrote:
> Well, we were comparing ext3 vs. reiserfs. I don't remember the journalling
> mode of ext3, but we did a 10 GB write test. Besides, converting the RAID from
> RAID-5 to RAID-0 might have had something to do with it.
>
> There was a discussion on hackers some time back about which file system is
> better. I hope this adds something to it.

Hmm.  Reiserfs' claim to fame is its low latency with many, many small
files and that it's journaled.  I've never seen anyone comment about it
being considered an extremely fast file system in a general computing
context, nor have I seen anyone even hint at it as a file system for use in
heavy I/O databases.  This is why Reiserfs is popular with news and
squid cache servers, as it's almost an ideal fit.  That is, tons of small
files or directories contained within a single directory.  As such, I'm
very surprised that reiserfs is even in the running for your comparison.

Might I point you toward XFS, JFS, or ext3?  As I understand it, XFS
and JFS are going to be your preferred file systems for this type of
application, with XFS in the lead as its tool suite is very rich and
robust.  I'm actually lacking JFS experience but from what I've read,
it's a notch or two back from XFS in robustness (assuming we are talking
Linux here).  Feel free to read and play to find out for yourself.  I'd
recommend that you start playing with XFS to see how the others
compare.  After all, XFS' specific claim to fame is high throughput w/
low latency on large and very large files.  Furthermore, they even have
a real time mechanism that you can further play with to see how it
affects your throughput and/or latencies.
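
A minimal sketch of how such a comparison could be run, assuming each
candidate file system is mounted at its own directory on otherwise identical
hardware. The function name, block size, and the use of a single fsync at the
end are illustrative choices, not the test that was actually run in this
thread. A database workload also involves random I/O and frequent syncs, so a
sequential write number is only a first data point.

```python
import os
import tempfile
import time


def sequential_write_test(directory, total_bytes, block_size=1024 * 1024):
    """Write total_bytes of zeros to a scratch file in `directory`.

    Returns throughput in MB/s (decimal megabytes), measured up to and
    including the final fsync so that cached writes are not counted as done.
    """
    block = b"\0" * block_size
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as f:
            remaining = total_bytes
            while remaining > 0:
                chunk = block[: min(block_size, remaining)]
                f.write(chunk)
                remaining -= len(chunk)
            f.flush()
            os.fsync(f.fileno())  # force the data to disk before stopping the clock
        elapsed = time.perf_counter() - start
    finally:
        os.remove(path)  # clean up the scratch file
    return total_bytes / max(elapsed, 1e-9) / 10**6


if __name__ == "__main__":
    # Small demo size; a real comparison would use a size well beyond RAM.
    rate = sequential_write_test(tempfile.gettempdir(), 16 * 1024 * 1024)
    print(f"{rate:.1f} MB/s")
```

Running the same call against a directory on each mounted file system, with
the same RAID level underneath, gives numbers that can be compared directly.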

Greg

Re: [GENERAL] Large databases, performance

From

"Shridhar Daithankar"

Date:

03 October 2002, 12:29:47

On 3 Oct 2002 at 11:23, Greg Copeland wrote:

> On Thu, 2002-10-03 at 10:56, Shridhar Daithankar wrote:
> > Well, we were comparing ext3 vs. reiserfs. I don't remember the journalling
> > mode of ext3, but we did a 10 GB write test. Besides, converting the RAID from
> > RAID-5 to RAID-0 might have had something to do with it.
> >
> > There was a discussion on hackers some time back about which file system is
> > better. I hope this adds something to it.
>
>
> Hmm.  Reiserfs' claim to fame is its low latency with many, many small
> files and that it's journaled.  I've never seen anyone comment about it
> being considered an extremely fast file system in a general computing
> context, nor have I seen anyone even hint at it as a file system for use in
> heavy I/O databases.  This is why Reiserfs is popular with news and
> squid cache servers, as it's almost an ideal fit.  That is, tons of small
> files or directories contained within a single directory.  As such, I'm
> very surprised that reiserfs is even in the running for your comparison.
>
> Might I point you toward XFS, JFS, or ext3?  As I understand it, XFS
> and JFS are going to be your preferred file systems for this type of
> application, with XFS in the lead as its tool suite is very rich and
> robust.  I'm actually lacking JFS experience but from what I've read,
> it's a notch or two back from XFS in robustness (assuming we are talking
> Linux here).  Feel free to read and play to find out for yourself.  I'd
> recommend that you start playing with XFS to see how the others
> compare.  After all, XFS' specific claim to fame is high throughput w/
> low latency on large and very large files.  Furthermore, they even have
> a real time mechanism that you can further play with to see how it
> affects your throughput and/or latencies.

I will try that once we are through with the tests we have at hand.

Bye
 Shridhar

--
    "The combination of a number of things to make existence worthwhile."
    "Yes, the philosophy of 'none,' meaning 'all.'"
        -- Spock and Lincoln, "The Savage Curtain", stardate 5906.4

Re: [GENERAL] Large databases, performance

From

Curt Sampson

Date:

06 October 2002, 22:27:06

On Thu, 3 Oct 2002, Shridhar Daithankar wrote:

> Well, we were comparing ext3 vs. reiserfs. I don't remember the journalling
> mode of ext3, but we did a 10 GB write test. Besides, converting the RAID from
> RAID-5 to RAID-0 might have had something to do with it.

That will have a massive, massive effect on performance. Depending on
your RAID subsystem, you can expect RAID-0 to be between two and twenty
times as fast for writes as RAID-5.

If you compared one filesystem on RAID-5 and another on RAID-0,
your results are likely not at all indicative of file system
performance.
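
As a rough illustration of why the RAID level matters so much, here is a
sketch of the usual textbook model for small random writes: RAID-0 needs one
physical I/O per logical write, while RAID-5 needs four (read old data, read
old parity, write new data, write new parity). The disk count and per-disk
IOPS below are hypothetical, and the model ignores controller caches and
full-stripe writes, which is part of why the real ratio depends so heavily on
the RAID subsystem.

```python
def small_write_iops(disks, iops_per_disk, level):
    """Rough aggregate IOPS for small random writes on an array.

    Textbook model only: it ignores write-back caches, full-stripe writes,
    and controller overhead.
    """
    if level == "raid0":
        penalty = 1  # one physical write per logical write
    elif level == "raid5":
        penalty = 4  # read data, read parity, write data, write parity
    else:
        raise ValueError(f"unsupported RAID level: {level}")
    return disks * iops_per_disk / penalty


if __name__ == "__main__":
    # Hypothetical array: 4 disks at 100 random IOPS each.
    r0 = small_write_iops(4, 100, "raid0")
    r5 = small_write_iops(4, 100, "raid5")
    print(f"RAID-0: {r0:.0f} IOPS, RAID-5: {r5:.0f} IOPS, ratio {r0 / r5:.0f}x")
```

Even this simple model gives a 4x gap for the same disks, which is enough to
swamp any difference between two file systems.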

Note that I've redirected followups to the pgsql-performance list.
Avoiding cross-posting would be nice, since I am getting lots of
duplicate messages these days.

cjs
--
Curt Sampson  <cjs@cynic.net>   +81 90 7737 2974   http://www.netbsd.org
    Don't you know, in this new Dark Age, we're all light.  --XTC

cross-posts (was Re: [GENERAL] Large databases, performance)

From

Tom Lane

Date:

06 October 2002, 23:20:43

Curt Sampson <cjs@cynic.net> writes:
> ... Avoiding cross-posting would be nice, since I am getting lots of
> duplicate messages these days.

Cross-posting is a fact of life, and in fact encouraged, on the pg
lists.  I suggest adapting.  Try sending
    set all unique your-email-address
to the PG majordomo server; this sets you up to get only one copy
of each cross-posted message.

            regards, tom lane

Re: cross-posts (was Re: [GENERAL] Large databases,

From

Larry Rosenman

Date:

07 October 2002, 07:51:15

On Sun, 2002-10-06 at 22:20, Tom Lane wrote:
> Curt Sampson <cjs@cynic.net> writes:
> > ... Avoiding cross-posting would be nice, since I am getting lots of
> > duplicate messages these days.
>
> Cross-posting is a fact of life, and in fact encouraged, on the pg
> lists.  I suggest adapting.  Try sending
>     set all unique your-email-address
> to the PG majordomo server; this sets you up to get only one copy
> of each cross-posted message.
That doesn't seem to work any more:

>>>> set all unique ler@lerctr.org
**** The "all" mailing list is not supported at
**** PostgreSQL User Support Lists.

What do I need to send now?

Marc?


--
Larry Rosenman                     http://www.lerctr.org/~ler
Phone: +1 972-414-9812                 E-Mail: ler@lerctr.org
US Mail: 1905 Steamboat Springs Drive, Garland, TX 75044-6749

Re: cross-posts (was Re: [GENERAL] Large databases,

From

"Michael Paesold"

Date:

07 October 2002, 07:59:07

> On Sun, 2002-10-06 at 22:20, Tom Lane wrote:
> > Curt Sampson <cjs@cynic.net> writes:
> > > ... Avoiding cross-posting would be nice, since I am getting lots of
> > > duplicate messages these days.
> >
> > Cross-posting is a fact of life, and in fact encouraged, on the pg
> > lists.  I suggest adapting.  Try sending
> > set all unique your-email-address
> > to the PG majordomo server; this sets you up to get only one copy
> > of each cross-posted message.
> That doesn't seem to work any more:
>
> >>>> set all unique ler@lerctr.org
> **** The "all" mailing list is not supported at
> **** PostgreSQL User Support Lists.
>
> What do I need to send now?
>
> Marc?

It is:
set ALL unique your-email

If you also don't want to get emails that have already been cc'd to you, you
can use:

set ALL eliminatecc your-email

For a full list of set options, send:

help set

to majordomo.
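
For anyone scripting this, a minimal sketch that builds such a request as an
email message. It assumes the commands go in the message body, one per line.
The function name and both addresses are placeholders made up for this
example; the real majordomo address is not given in this thread, and actually
sending the message is left out.

```python
from email.message import EmailMessage


def majordomo_request(subscriber, majordomo_address,
                      options=("unique", "eliminatecc")):
    """Build a message with one 'set ALL <option> <address>' line per option.

    Note the capitalized ALL, as in the commands quoted above.
    """
    msg = EmailMessage()
    msg["From"] = subscriber
    msg["To"] = majordomo_address
    lines = [f"set ALL {option} {subscriber}" for option in options]
    msg.set_content("\n".join(lines) + "\n")
    return msg


if __name__ == "__main__":
    # Both addresses are hypothetical placeholders.
    request = majordomo_request("user@example.org", "majordomo@example.org")
    print(request.get_content())
```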

Regards,
Michael Paesold

Re: cross-posts (was Re: [GENERAL] Large databases,

From

Larry Rosenman

Date:

07 October 2002, 08:04:47

On Mon, 2002-10-07 at 07:01, Michael Paesold wrote:
> > On Sun, 2002-10-06 at 22:20, Tom Lane wrote:
> > > Curt Sampson <cjs@cynic.net> writes:
> > > > ... Avoiding cross-posting would be nice, since I am getting lots of
> > > > duplicate messages these days.
> > >
> > > Cross-posting is a fact of life, and in fact encouraged, on the pg
> > > lists.  I suggest adapting.  Try sending
> > > set all unique your-email-address
> > > to the PG majordomo server; this sets you up to get only one copy
> > > of each cross-posted message.
> > That doesn't seem to work any more:
> >
> > >>>> set all unique ler@lerctr.org
> > **** The "all" mailing list is not supported at
> > **** PostgreSQL User Support Lists.
> >
> > What do I need to send now?
> >
> > Marc?
>
> it is:
> set ALL unique your-email
>
> if you also don't want to get emails that have already been cc'd to you, you
> can use:
>
> set ALL eliminatecc your-email
>
> for a full list of set options send:
>
> help set
>
> to majordomo.
Thanks.  That worked great.  (I use Mailman, and didn't realize the ALL
needed to be capitalized.)

LER


--
Larry Rosenman                     http://www.lerctr.org/~ler
Phone: +1 972-414-9812                 E-Mail: ler@lerctr.org
US Mail: 1905 Steamboat Springs Drive, Garland, TX 75044-6749