Re: more parallel query documentation - Mailing list pgsql-hackers

From Jim Nasby
Subject Re: more parallel query documentation
Date
Msg-id 57114BE9.7020707@BlueTreble.com
In response to more parallel query documentation  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On 4/14/16 10:02 PM, Robert Haas wrote:
> As previously threatened, I have written some user documentation for
> parallel query.  I put it up here:

Yay! Definitely needed to be written. :)

There should be a section that summarizes the parallel machinery. I 
think the most important points are that separate processes are spun up, 
that they're limited by max_worker_processes and max_parallel_degree, 
and that shared memory queues are used to move data, results and errors 
between a regular backend (controlling backend?) and its workers. The 
first section kind of alludes to this, but it doesn't actually explain 
any of it. I think it's OK for the very first section to be a *brief* 
tl;dr summary on the basics of turning the feature on, but after that 
laying down groundwork knowledge will make the rest of the page much 
clearer.
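
To make the summary above concrete, here is a toy model of the machinery, not PostgreSQL code. It assumes the number of workers a query actually gets is the smaller of max_parallel_degree and whatever slots are still free under max_worker_processes. Threads stand in for the separate worker processes and a queue.Queue stands in for the shared memory queues, purely to keep the sketch self-contained. The names workers_to_launch and run_leader are made up for illustration.

```python
import queue
import threading


def workers_to_launch(max_parallel_degree, free_worker_slots):
    """Assumed rule: a query gets at most max_parallel_degree workers,
    and never more than the slots still free under max_worker_processes."""
    return max(0, min(max_parallel_degree, free_worker_slots))


def run_leader(chunks, max_parallel_degree, free_worker_slots):
    """Sum all values in `chunks`; return (total, workers_used)."""
    n = workers_to_launch(max_parallel_degree, free_worker_slots)
    if n == 0:
        # No workers available: the leader does all of the work itself.
        return sum(sum(c) for c in chunks), 0

    results = queue.Queue()  # stand-in for a shared memory queue

    def worker(my_chunks):
        # Each worker sends back either a partial result or an error.
        try:
            results.put(("result", sum(sum(c) for c in my_chunks)))
        except Exception as exc:
            results.put(("error", exc))

    # Hand out chunks round-robin, one slice per worker.
    threads = [
        threading.Thread(target=worker, args=(chunks[i::n],))
        for i in range(n)
    ]
    for t in threads:
        t.start()

    # The leader reads exactly one message per worker from the queue.
    messages = [results.get() for _ in threads]
    for t in threads:
        t.join()

    total = 0
    for kind, payload in messages:
        if kind == "error":
            raise payload  # errors travel back to the leader as well
        total += payload
    return total, n
```

Calling run_leader(chunks, 4, 0) takes the zero-worker path, where the leader still produces the full result on its own.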

I think the parts that talk about "parallel plan executed with no 
workers" are confusing... it almost sounds like the query won't be 
executed at all. It'd be better to say something like "executed in a single 
process" or "executed with no parallelism" or similar. Maybe the real 
issue is we need to pick a clear term for a non-parallel query and stick 
with it. I would also expand the different scenarios into bullets and 
explain why parallelism isn't used, like you did right above that. (I 
think it's great that you explained *why* parallel plans wouldn't be 
generated instead of just listing conditions.)

When describing SeqScan, it would be good to clarify whether 
effective_io_concurrency has an effect. (For that matter, does 
effective_io_concurrency interact with any of the other parallel settings?)

"Functions must be marked PARALLEL UNSAFE ..., or make persistent 
changes to settings." What would be a non-persistent change? SET LOCAL? 
(This is another case where it'd be good if we decided on specific 
terminology and referenced the definition from the page.)
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


