josh@agliodbs.com (Josh Berkus) writes:
>> The other advantage (which I hinted to above) with raw disks is being able
>> to optimize queries to take advantage of it. Informix is multithreaded
>> and it will spawn off multiple "readers" to do say, a seq scan (and merge
>> the results at the end).
>
> I like this idea. Has it ever been discussed for PostgreSQL? Hmmm
> .... we'd need to see some tests demonstrating that this approach
> was still a technical advantage given the improvements in RAID and
> FS technology since Informix was designed.

Ah, but this approach isn't so much an I/O optimization as it is a CPU
optimization.

If you have some sort of join against a big table, and do a lot of
processing on each component, there might be CPU benefits from the
split:

  create table customers (
    id customer_id,
    name character varying  -- and other fields
  );  -- And we're a phone company with 8 million of them...

  create table customer_status (
    customer_id customer_id,
    status status_code
  );

  create table customer_address (
    customer_id customer_id,
    address_info...
  );

And then we run:

  select c.id, sum(cs.status), address_label(c.id), balance(c.id)
    from customers c, customer_status cs
   where cs.customer_id = c.id
   group by c.id;

We know there's going to be a SEQ SCAN against customers, because
that's the big table.
If I wanted to finish the query as fast as possible, as things stand
now, and had 4 CPUs, I would run 4 concurrent queries, for 4 ranges of
customers.
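
As a rough sketch of that manual approach: assuming customer ids are integers spread fairly evenly between a known minimum and maximum (the post does not say so), the four ranges could be worked out as below. The helper names `split_ranges` and `range_queries`, the id bounds, and the simplified query text are all made up for illustration; each generated statement would be run on its own connection.

```python
def split_ranges(lo, hi, parts):
    """Split the inclusive id range [lo, hi] into `parts` contiguous
    half-open ranges [start, end) of near-equal size."""
    total = hi - lo + 1
    base, extra = divmod(total, parts)
    ranges = []
    start = lo
    for i in range(parts):
        # The first `extra` ranges absorb one leftover id each.
        size = base + (1 if i < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


def range_queries(lo, hi, parts):
    """Build one statement per range (hypothetical, simplified query)."""
    template = (
        "select c.id, address_label(c.id), balance(c.id) "
        "from customers c where c.id >= {} and c.id < {};"
    )
    return [template.format(s, e) for s, e in split_ranges(lo, hi, parts)]


if __name__ == "__main__":
    # Assumed bounds: ids 1 through 8,000,000, one range per CPU.
    for q in range_queries(1, 8_000_000, 4):
        print(q)
```

If the ids are not evenly distributed, equal-width ranges will give the four sessions unequal amounts of work, which is one reason doing the split by hand is awkward.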

The Really Cool approach would be for PostgreSQL to dole out customers
across four processors, perhaps throwing a page at a time at each CPU,
where each process would quasi-independently build up its own
result set.
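
To make the page-at-a-time idea concrete, here is a toy sketch of the work distribution only. It is not how PostgreSQL works or would implement this: `parallel_scan` and `process_row` are hypothetical names, pages are plain lists, and Python threads stand in for backend processes (in CPython they would not actually deliver the CPU speedup; the point is the hand-out-and-merge pattern).

```python
import queue
import threading


def parallel_scan(pages, process_row, workers=4):
    """Hand out pages one at a time to `workers` threads. Each worker
    builds its own partial result list; the lists are merged at the end."""
    work = queue.Queue()
    for page in pages:
        work.put(page)

    # One private result list per worker, so no locking is needed on results.
    partials = [[] for _ in range(workers)]

    def worker(slot):
        while True:
            try:
                page = work.get_nowait()  # grab the next unclaimed page
            except queue.Empty:
                return  # no pages left
            partials[slot].extend(process_row(row) for row in page)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # Merge step: row order depends on scheduling, so callers should not
    # rely on it.
    return [row for part in partials for row in part]


if __name__ == "__main__":
    pages = [[1, 2, 3], [4, 5], [6]]
    print(sorted(parallel_scan(pages, lambda r: r * 2)))
```

Because pages are claimed from a shared queue rather than assigned up front, a worker that draws cheap rows simply takes more pages, which avoids the load imbalance that fixed id ranges can suffer from.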
--
let name="cbbrowne" and tld="libertyrms.info" in String.concat "@" [name;tld];;
<http://dev6.int.libertyrms.com/>
Christopher Browne
(416) 646 3304 x124 (land)