Re: RAID arrays and performance - Mailing list pgsql-performance

From Matthew
Subject Re: RAID arrays and performance
Msg-id Pine.LNX.4.58.0712041637260.3731@aragorn.flymine.org
In response to Re: RAID arrays and performance  (Mark Mielke <mark@mark.mielke.cc>)
Responses Re: RAID arrays and performance
List pgsql-performance
On Tue, 4 Dec 2007, Mark Mielke wrote:
> So much excitement and zeal - refreshing to see. And yet, no numbers! :-)

What sort of numbers did you want to see?

> You describe a new asynchronous I/O system to map integers to Java
> objects above. Why would you write this? Have you tried BerkeleyDB or
> BerkeleyDB JE? BerkeleyDB with BDB as a backend or their full Java
> backend gives you a Java persistence API that will allow you to map Java
> objects (including integers) to other Java objects.

Looked at all those. Didn't like their performance characteristics, or
their interfaces. They simply aren't designed for the workload that we
want to put on them.

> If it came to a bet, I would bet that their research and
> tuning over several years, and many people, would beat your initial
> implementation, asynchronous I/O or not.

Quite possibly. However, there's the possibility that it wouldn't. And I
can improve it - initial implementations often suck.

> Asynchronous I/O is no more a magic bullet than threading. It requires a
> lot of work to get it right, and if one gets it wrong, it can be slower
> than the regular I/O or single threaded scenarios. Both look sexy on
> paper, neither may be the solution to your problem. Or they may be. We
> wouldn't know without numbers.

Okay, numbers. About eight years ago I wrote the shell of a filesystem
implementation, concentrating on performance and integrity. It absolutely
whooped ext2 for both read and write speed - especially metadata write
speed.  Anything up to 60 times as fast. Writing a load of metadata took
599.52 seconds on ext2 and 10.54 seconds on my system. Listing it back
(presumably from cache) took 1.92 seconds on ext2 and 0.22 seconds on my
system. No silly caching tricks that sacrifice integrity.
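
As a quick check on the "up to 60 times" figure, the ratios implied by the
timings above can be worked out directly. This is only an arithmetic
sketch: the helper name `speedup` is made up for illustration and is not
part of the original system, and the only inputs are the four timings
quoted in this message.

```python
def speedup(baseline_seconds: float, candidate_seconds: float) -> float:
    """Return how many times faster the candidate was than the baseline."""
    return baseline_seconds / candidate_seconds


# Timings quoted in the message (seconds): ext2 first, the custom system second.
write_speedup = speedup(599.52, 10.54)  # metadata write
list_speedup = speedup(1.92, 0.22)      # listing it back

print(f"metadata write: {write_speedup:.1f}x faster")
print(f"listing:        {list_speedup:.1f}x faster")
```

The write ratio comes out just under 57, which fits "anything up to 60
times"; the listing ratio is a little under 9.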

It's a pity I got nowhere near finishing that system - just enough to
prove my point and get a degree. Looking back on it, there are massive
ways in which it's rubbish and should be improved. It was an initial
implementation.  I didn't have reiserfs, jfs, or xfs available at that
time, but it would have been really interesting to compare. This is the
system I would have based my indexing thing on.

Matthew

--
Anyone who goes to a psychiatrist ought to have his head examined.
