Re: Parallel sequential scans - Mailing list pgsql-general

From Tom Lane
Subject Re: Parallel sequential scans
Date
Msg-id 28745.1143182659@sss.pgh.pa.us
In response to Parallel sequential scans  (Steve Atkins <steve@blighty.com>)
Responses Re: Parallel sequential scans  ("Jim C. Nasby" <jnasby@pervasive.com>)
List pgsql-general
Steve Atkins <steve@blighty.com> writes:
> I'm doing some reporting-type work with PG, with the vast
> majority of queries hitting upwards of 25% of the table, so
> being executed as seq scans.
> ...
> It would be really nice to be able to do all the work with a
> single pass over the table, executing all the queries in
> parallel in that pass. They're pretty simple queries, mostly,
> just some aggregates and a simple where clause.

> There are some fairly obvious ways to merge multiple
> queries to do that at a SQL level - converting each query
> into a function and passing each row from a select * to
> each of the functions would be one of the less ugly.

> Or I could fire off all the queries simultaneously and hope
> they stay in close-enough lockstep through a single pass
> through the table to be able to share most of the IO.

I have not tried this sort of thing, but right offhand I like the second
alternative.  The "hope" is more well-founded than you seem to think:
whichever process is currently ahead will be slowed by requesting I/O,
while processes that are behind will find the pages they need already in
shared buffers.  You should definitely see just one read of each table
page as the parallel scans advance, assuming you don't have an
unreasonably small number of buffers.
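
A toy model of that argument, for illustration only. It is not PostgreSQL code: the function name, the cost parameters, and the LRU replacement policy are all assumptions made for this sketch. Several scans start at staggered times, a page that misses the shared cache costs a slow physical read, and a page already in the cache is cheap, so scans that start late catch up with the leader and then ride along behind it.

```python
import heapq
from collections import OrderedDict


def simulate_scans(num_pages, start_times, cache_pages, miss_cost=10, hit_cost=1):
    """Count physical page reads for concurrent sequential scans (toy model).

    All names and costs here are hypothetical; the cache is a plain LRU,
    which is a simplification of a real buffer manager.
    """
    cache = OrderedDict()  # page number -> time at which the page is available
    physical_reads = 0
    # Each event: (time the scan asks for its next page, scan id, page number).
    events = [(start, scan_id, 0) for scan_id, start in enumerate(start_times)]
    heapq.heapify(events)

    while events:
        now, scan_id, page = heapq.heappop(events)
        if page in cache:
            # Hit: wait for an in-flight read if necessary, then process cheaply.
            cache.move_to_end(page)
            done = max(now, cache[page]) + hit_cost
        else:
            # Miss: this scan pays for the physical read and is slowed down.
            physical_reads += 1
            done = now + miss_cost
            cache[page] = done
            if len(cache) > cache_pages:
                cache.popitem(last=False)  # evict the least recently used page
        if page + 1 < num_pages:
            heapq.heappush(events, (done, scan_id, page + 1))

    return physical_reads


if __name__ == "__main__":
    # Three scans over a 1000-page table, started 0, 50 and 120 time units in.
    print(simulate_scans(1000, [0, 50, 120], cache_pages=64))
    print(simulate_scans(1000, [0, 50, 120], cache_pages=2))
```

In this model the 64-page cache lets the late scans catch up, so each page is read only once, while the 2-page cache is the "unreasonably small number of buffers" case and every scan reads every page itself.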

Another reason, if you have more than one CPU in your machine, is that
multiple processes can make use of multiple CPUs, whereas the
one-fancy-query approach doesn't parallelize (at least not without
Bizgres or some such).

And lastly, you can just try it without sweating hard to convert the
queries ;-).  So try it and let us know how it goes.
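
One possible way to fire the queries off together is sketched below. The helper name run_together and the run_query callable are hypothetical. run_query is assumed to open its own database connection, run one SQL string, and return the result, so that each query gets its own backend process. The barrier only makes the queries start at nearly the same moment; nothing here keeps the scans in lockstep afterwards.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def run_together(queries, run_query):
    """Start all queries at (nearly) the same moment; return results in order.

    run_query is a caller-supplied function (hypothetical here) that takes
    one SQL string, runs it on its own connection, and returns the result.
    """
    if not queries:
        return []

    start_line = threading.Barrier(len(queries))

    def worker(sql):
        start_line.wait()  # block until every worker thread is ready
        return run_query(sql)

    # One thread per query, so that all of them can reach the barrier.
    with ThreadPoolExecutor(max_workers=len(queries)) as pool:
        return list(pool.map(worker, queries))
```

Comparing the wall-clock time of this against running the same queries one after another would show whether the I/O is really being shared.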

            regards, tom lane
