Re: RAID arrays and performance - Mailing list pgsql-performance

From Mark Mielke
Subject Re: RAID arrays and performance
Date
Msg-id 4755810E.4000707@mark.mielke.cc
In response to Re: RAID arrays and performance  (Matthew <matthew@flymine.org>)
Responses Re: RAID arrays and performance
Re: RAID arrays and performance
List pgsql-performance
Matthew wrote:
> On Tue, 4 Dec 2007, Gregory Stark wrote:
>> Fwiw, what made you bring up this topic now? You're the second person in about
>> two days to bring up precisely this issue and it was an issue I had been
>> planning to bring up on -hackers as it was.
>
> I only just joined the performance mailing list to talk about R-trees. I
> would probably have brought it up earlier if I had been here earlier.
> However, we're thinking of buying this large machine, and that reminded
> me.
>
> I have been biting at the bit for my bosses to allow me time to write an
> indexing system for transient data - a lookup table backed by disc,
> looking up from an integer to get an object, native in Java. Our system
> often needs to fetch a list of a thousand different objects by a key like
> that, and Postgres just doesn't do that exact thing fast. I was going to
> implement it with full asynchronous IO, to do that particular job very
> fast, so I have done a reasonable amount of research into the topic. In
> Java, that is. It would add a little bit more performance for our system.
> That wouldn't cover us - we still need to do complex queries with the same
> problem, and that'll have to stay in Postgres

So much excitement and zeal - refreshing to see. And yet, no numbers! :-)

Above, you describe a new asynchronous I/O system for mapping integers to Java objects. Why would you write this? Have you tried BerkeleyDB or BerkeleyDB JE? BerkeleyDB, with either the native BDB library or the pure-Java JE as its backend, gives you a Java persistence API that lets you map Java objects (including integers) to other Java objects. It uses generated Java bytecode rather than reflection to store and lock your objects. If it came to a bet, I would bet that their research and tuning, by many people over several years, would beat your initial implementation, asynchronous I/O or not.

Asynchronous I/O is no more a magic bullet than threading. It takes a lot of work to get right, and if one gets it wrong, it can be slower than regular I/O or a single-threaded design. Both look sexy on paper, but neither may be the solution to your problem. Or they may be. We wouldn't know without numbers.
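
To make "numbers" concrete, here is a rough sketch of the kind of measurement I mean: fetch a thousand records by integer key from a disc-backed table, once with blocking reads and once with asynchronous reads, and time both. Everything in it is an assumption for illustration: the class name LookupBench, the fixed 8-byte record layout, and the use of java.nio's AsynchronousFileChannel (which needs Java 7 or later and on many JVMs is backed by a thread pool rather than kernel-level asynchronous I/O). It is not a description of your system or of how BerkeleyDB works.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousFileChannel;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;
import java.util.concurrent.Future;

public class LookupBench {
    // Hypothetical layout: one 8-byte long per key, key k at offset k * RECORD.
    static final int RECORD = Long.BYTES;

    // Build a temporary table of n records; the stored value is k * 31.
    public static Path buildTable(int n) throws IOException {
        Path file = Files.createTempFile("lookup", ".bin");
        ByteBuffer buf = ByteBuffer.allocate(n * RECORD);
        for (int k = 0; k < n; k++) buf.putLong(k * 31L);
        buf.flip();
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.WRITE)) {
            while (buf.hasRemaining()) ch.write(buf);
        }
        return file;
    }

    // Baseline: one blocking positional read per key, issued in sequence.
    public static long[] readSync(Path file, int[] keys) throws IOException {
        long[] out = new long[keys.length];
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            ByteBuffer buf = ByteBuffer.allocate(RECORD);
            for (int i = 0; i < keys.length; i++) {
                buf.clear();
                long pos = (long) keys[i] * RECORD;
                while (buf.hasRemaining()) {
                    int got = ch.read(buf, pos + buf.position());
                    if (got < 0) throw new IOException("key out of range: " + keys[i]);
                }
                out[i] = buf.getLong(0);
            }
        }
        return out;
    }

    // Asynchronous variant: submit every read first, then collect the results.
    public static long[] readAsync(Path file, int[] keys) throws Exception {
        long[] out = new long[keys.length];
        try (AsynchronousFileChannel ch =
                 AsynchronousFileChannel.open(file, StandardOpenOption.READ)) {
            List<ByteBuffer> bufs = new ArrayList<>();
            List<Future<Integer>> pending = new ArrayList<>();
            for (int key : keys) {
                ByteBuffer buf = ByteBuffer.allocate(RECORD);
                bufs.add(buf);
                pending.add(ch.read(buf, (long) key * RECORD));
            }
            for (int i = 0; i < keys.length; i++) {
                // A sketch: treat a short read as an error instead of retrying.
                if (pending.get(i).get() != RECORD)
                    throw new IOException("short read for key " + keys[i]);
                out[i] = bufs.get(i).getLong(0);
            }
        }
        return out;
    }

    public static void main(String[] args) throws Exception {
        int n = 1_000_000;
        Path file = buildTable(n);
        int[] keys = new Random(42).ints(1000, 0, n).toArray();
        long t0 = System.nanoTime();
        long[] a = readSync(file, keys);
        long t1 = System.nanoTime();
        long[] b = readAsync(file, keys);
        long t2 = System.nanoTime();
        System.out.printf("sync:  %d us%n", (t1 - t0) / 1000);
        System.out.printf("async: %d us%n", (t2 - t1) / 1000);
        System.out.println("results match: " + Arrays.equals(a, b));
        Files.delete(file);
    }
}
```

As written, the file has just been created and will sit in the operating system's cache, so the timings say little about real disc behaviour. A meaningful run would use a table much larger than RAM on the actual RAID array, and would repeat each measurement several times.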

Cheers,
mark

-- 
Mark Mielke <mark@mielke.cc>
