Re: VLDB Features - Mailing list pgsql-hackers

From Gregory Stark
Subject Re: VLDB Features
Date
Msg-id 87abof30m1.fsf@oxford.xeocode.com
In response to Re: VLDB Features  (Josh Berkus <josh@agliodbs.com>)
Responses Re: VLDB Features  (Markus Schiltknecht <markus@bluegap.ch>)
List pgsql-hackers
"Josh Berkus" <josh@agliodbs.com> writes:

> Markus,
>
>> > Parallel Query
>>
>> Uh.. this only makes sense in a distributed database, no? I've thought
>> about parallel querying on top of Postgres-R. Does it make sense
>> implementing some form of parallel querying apart from the distribution
>> or replication engine?

Yes, but not for the reasons Josh describes.

> I'd say implementing a separate I/O worker would be the first step towards 
> this; if we could avoid doing I/O in the same process/thread where we're 
> doing row parsing it would speed up large scans by 100%.  I know Oracle does 
> this, and their large-table-I/O is 30-40% faster than ours despite having 
> less efficient storage.

Oracle uses Direct I/O, so they need the reader and writer threads to avoid
blocking on I/O all the time. We count on the OS doing readahead and buffering
our writes so we don't have to. Direct I/O and the need for some way to do
asynchronous reads and writes are directly tied.

Where parallel query is useful is when you have queries that involve a
substantial amount of CPU resources, especially if you have a very fast I/O
system which can saturate the bandwidth to a single CPU.

So, for example, if you have a merge join which requires sorting both sides of
the join, you could easily have subprocesses handle those sorts, allowing you
to bring two processors to bear on the problem instead of being limited to a
single processor.
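
To make that concrete, here is a rough sketch in Python rather than anything
resembling backend code. It assumes a POSIX system (it uses the "fork" start
method), rows represented as (key, value) tuples, and values that are
comparable when keys tie. The names merge_join and parallel_sort_merge_join
are made up for illustration. Each input is sorted in its own worker process,
and the parent then does an ordinary merge join over the two sorted streams.

```python
import multiprocessing
from concurrent.futures import ProcessPoolExecutor


def merge_join(left_sorted, right_sorted):
    """Join two lists of (key, value) rows that are already sorted by key."""
    out = []
    i = j = 0
    while i < len(left_sorted) and j < len(right_sorted):
        lkey, rkey = left_sorted[i][0], right_sorted[j][0]
        if lkey < rkey:
            i += 1
        elif lkey > rkey:
            j += 1
        else:
            # Locate the run of right-hand rows sharing this key.
            j_end = j
            while j_end < len(right_sorted) and right_sorted[j_end][0] == lkey:
                j_end += 1
            # Pair every left-hand row with that key against the whole run.
            while i < len(left_sorted) and left_sorted[i][0] == lkey:
                for k in range(j, j_end):
                    out.append((lkey, left_sorted[i][1], right_sorted[k][1]))
                i += 1
            j = j_end
    return out


def parallel_sort_merge_join(left, right):
    """Sort each side in a separate process, then merge join in the parent."""
    # Assumption: POSIX only; the "fork" start method is not available on Windows.
    ctx = multiprocessing.get_context("fork")
    with ProcessPoolExecutor(max_workers=2, mp_context=ctx) as pool:
        # The two sorts are independent, so they can run on two CPUs at once.
        left_future = pool.submit(sorted, left)
        right_future = pool.submit(sorted, right)
        return merge_join(left_future.result(), right_future.result())
```

Note that the merge step itself still runs in one process here; only the two
sorts, which are the CPU-heavy part in this scenario, are farmed out.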

On Oracle, Parallel Query goes great with partitioned tables. Their query
planner will almost always turn the partition scans into parallel scans and
use separate processors to scan different partitions.
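
The general shape of a partition-wise parallel scan can be sketched the same
way. This is not how Oracle's planner works internally, just an illustration
under simplifying assumptions: each partition is an in-memory list of
numbers, the "scan" is a plain sum, a POSIX system is assumed for the "fork"
start method, and parallel_partition_sum is a made-up name. Each worker
process scans one partition at a time and the parent combines the partial
results.

```python
import multiprocessing
from concurrent.futures import ProcessPoolExecutor


def parallel_partition_sum(partitions, max_workers=2):
    """Sum every partition in a worker process, then combine the partial sums."""
    # Assumption: POSIX only; the "fork" start method is not available on Windows.
    ctx = multiprocessing.get_context("fork")
    with ProcessPoolExecutor(max_workers=max_workers, mp_context=ctx) as pool:
        # One task per partition; tasks are spread across the worker processes.
        partial_sums = list(pool.map(sum, partitions))
    # The final combine step is cheap and stays in the parent.
    return sum(partial_sums)
```

For example, parallel_partition_sum([[1, 2, 3], [10, 20], [100]]) scans the
three partitions on up to two processors and returns the combined total.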

-- 
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com
  Ask me about EnterpriseDB's Slony Replication support!

