Re: Parallel Seq Scan - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Parallel Seq Scan
Date
Msg-id 30549.1422459647@sss.pgh.pa.us
In response to Re: Parallel Seq Scan  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Parallel Seq Scan
List pgsql-hackers
Robert Haas <robertmhaas@gmail.com> writes:
> The problem here, as I see it, is that we're flying blind.  If there's
> just one spindle, I think it's got to be right to read the relation
> sequentially.  But if there are multiple spindles, it might not be,
> but it seems hard to predict what we should do.  We don't know what
> the RAID chunk size is or how many spindles there are, so any guess as
> to how to chunk up the relation and divide up the work between workers
> is just a shot in the dark.

I thought the proposal to chunk on the basis of "each worker processes
one 1GB-sized segment" should work all right.  The kernel should see that
as sequential reads of different files, issued by different processes;
and if it can't figure out how to process that efficiently then it's a
very sad excuse for a kernel.
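
To make the segment-per-worker idea concrete, here is a minimal sketch of how such a division might look. It is not the patch under discussion. It assumes the default 8kB block size and 1GB segment files, and the function name assign_segments and the round-robin hand-out are hypothetical choices made only for illustration.

```python
# Sketch only: divide a relation into 1GB segments and hand them to workers.
# Assumes PostgreSQL's default build settings (8kB blocks, 1GB segment files).
BLOCK_SIZE = 8192                                   # bytes per block
SEGMENT_BYTES = 1 << 30                             # bytes per segment file
BLOCKS_PER_SEGMENT = SEGMENT_BYTES // BLOCK_SIZE    # 131072 blocks


def assign_segments(total_blocks, n_workers):
    """Hypothetical helper: map each worker to the segments it would scan.

    Returns {worker_id: [(segment_no, first_block, end_block), ...]},
    where end_block is exclusive. Segments are dealt out round-robin,
    which is an assumption, not something the proposal specifies.
    """
    if n_workers < 1:
        raise ValueError("need at least one worker")
    # Ceiling division: the last segment may be only partly filled.
    n_segments = -(-total_blocks // BLOCKS_PER_SEGMENT)
    plan = {worker: [] for worker in range(n_workers)}
    for seg in range(n_segments):
        first = seg * BLOCKS_PER_SEGMENT
        end = min(first + BLOCKS_PER_SEGMENT, total_blocks)
        plan[seg % n_workers].append((seg, first, end))
    return plan
```

Each worker then reads its own segment files front to back, so the kernel sees independent sequential streams on different files rather than one interleaved access pattern.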

You are right that trying to do any detailed I/O scheduling by ourselves
is a doomed exercise.  For better or worse, we have kept ourselves at
sufficient remove from the hardware that we can't possibly do that
successfully.
        regards, tom lane


