On Wed, 13 Aug 2003, Christopher Browne wrote:
> Sounds like what you were forced to do was to do TWO things:
>
> 1. Switch from raw disk to cooked files, and
> 2. Switch from a fibre array to a RAID array
>
> You're attributing the 5-6x slowdown to 1., when it seems likely that
> 2. is a far more significant multiple.
>
True.
> <flame on>
> Sure, and I'm sure the PG developers hardly know _anything_ about
> implementing databases, either.
> <flame off>
Oh, I know they are good at it. I deal a lot with Informix and PG, and if I
could I'd bring Tom, Bruce, Joe, etc. out for a beer, as I'm *constantly*
fighting Informix while our PG box just sits there merrily churning away.
(And god bless "explain analyze" - Informix's version is basically boolean:
"I will use an index" or "I will use a seq scan". It doesn't even tell you
which index!)
> You raise a good point vis-a-vis the thought of spawning multiple
> readers; that could conceivably be a useful approach to improve
> performance for very large queries. If you could "stripe" the tables
> in some manner so they could be doled out to multiple worker
> processes, that could indeed provide some benefits. If there are
> three workers, they might round-robin to grab successive pages from
> the table to do their work, and then end with a merge step.
The way Informix does this is twofold:
1. It handles the raw disks itself, so it knows where table data is.
2. It can "partition" (fragment) tables in a number of ways: round robin,
concatenation or expression. (Expression is nifty: it allows you to use a
basic "where" clause to decide where to put data.) For example:
create table foo (
    a int,
    b int,
    c int
) fragment by expression
    c > 0 and c < 100 in dbspace1,
    c > 100 and c < 200 in dbspace2;
that kind of thing.
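To make the placement rules concrete, here is a rough sketch in Python of
how a row could be routed to a dbspace under the expression and round-robin
schemes. This is only an illustration of the idea, not how Informix
implements it; the function names and the dict-based row are made up, and
the predicates just mirror the example above.

```python
from itertools import count

# Predicates mirror the example above; the dbspace names come from it too.
EXPRESSION_FRAGMENTS = [
    ("dbspace1", lambda row: 0 < row["c"] < 100),
    ("dbspace2", lambda row: 100 < row["c"] < 200),
]


def place_by_expression(row, fragments=EXPRESSION_FRAGMENTS):
    """Return the first dbspace whose predicate accepts the row, else None."""
    for dbspace, predicate in fragments:
        if predicate(row):
            return dbspace
    return None  # no fragment accepts this row


def make_round_robin(dbspaces):
    """Return a placement function that rotates through dbspaces,
    ignoring the row contents entirely."""
    counter = count()

    def place(_row):
        return dbspaces[next(counter) % len(dbspaces)]

    return place
```

The payoff of the expression scheme is that a query with "where c < 100"
only needs to look at dbspace1, while round robin just spreads the I/O
evenly.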
And yeah, I would not expect to see this in PG for a long time. Without
threading it would be rather difficult to implement, but who knows what the
future will bring us.
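For what it's worth, the round-robin reader idea quoted above can be
sketched in a few lines of Python. This is a toy under loud assumptions:
"pages" are just in-memory lists of rows, threads stand in for worker
processes, and the names scan_pages and parallel_scan are invented here.
It says nothing about how a real backend would coordinate workers.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor


def scan_pages(pages, worker_id, n_workers, predicate):
    """Scan every n_workers-th page, starting at worker_id, and return
    the matching rows in sorted order."""
    matches = []
    for page in pages[worker_id::n_workers]:
        matches.extend(row for row in page if predicate(row))
    return sorted(matches)


def parallel_scan(pages, predicate, n_workers=3):
    """Dole pages out round robin to n_workers, then merge the results."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        futures = [
            pool.submit(scan_pages, pages, w, n_workers, predicate)
            for w in range(n_workers)
        ]
        partials = [f.result() for f in futures]
    # Merge step: combine the per-worker sorted runs into one sorted list.
    return list(heapq.merge(*partials))
```

With three workers, worker 0 takes pages 0, 3, 6, ..., worker 1 takes
pages 1, 4, 7, ..., and so on, which is the round-robin split described in
the quoted text.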
--
Jeff Trout <jeff@jefftrout.com>
http://www.jefftrout.com/
http://www.stuarthamm.net/