Hello Gregory,
Gregory Stark wrote:
> Oracle is using Direct I/O so they need the reader and writer threads to avoid
> blocking on i/o all the time. We count on the OS doing readahead and buffering
> our writes so we don't have to. Direct I/O and needing some way to do
> asynchronous writes and reads are directly tied.
Yeah, except in cases where we can tell ahead non-sequential reads.
Which admittedly doesn't come up too frequently and can probably be
handled with posix_fadvice - as you are currently testing.
> Where Parallel query is useful is when you have queries that involve a
> substantial amount of cpu resources, especially if you have a very fast I/O
> system which can saturate the bandwidth to a single cpu.
Full ACK, the very same applies to parallel querying on shared-nothing
clusters. Those can help if the bandwidth to all processing cores
together becomes the bottleneck (and the resulting data is relatively
small compared to the input data).
For example, Sun's UltraSparc T2 features only 8 PCIe lanes for those 8
cores, so you end up with 250 MiB/sec per core or about 32 MiB/sec per
thread on average. To be fair: their 10 Gig Ethernet ports don't go via
PCIe, so you get an additional 2x 1 GiB/sec for the complete chip. And
memory bandwidth looks a lot better: Sun claims 60+ GiB/sec, leaving
almost 8 GiB/sec per core or 1 GiB/sec per thread.
If my calculations for Intel are correct, a Quad Xeon with a 1.33 GHz
FSB has around 21 GiB/sec throughput to main memory, giving 5 GiB/sec
per core. (Why are these numbers so hard to find? It looks like Intel
deliberately obfuscates them with FSB MHz or Giga-transactions per sec
and the like.)
Regards
Markus