Re: assessing parallel-safety - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: assessing parallel-safety |
Date | |
Msg-id | CA+TgmoZH6Z7_MsFUz1RX5d8yBSCiO8j2PzTqwQ6g6KfObRsjrw@mail.gmail.com Whole thread Raw |
In response to | Re: assessing parallel-safety (Noah Misch <noah@leadboat.com>) |
Responses |
Re: assessing parallel-safety
|
List | pgsql-hackers |
On Sat, Feb 14, 2015 at 12:09 AM, Noah Misch <noah@leadboat.com> wrote: >> Tom also argued that (1) trying to assess parallel-safety before >> preprocess_expressions() was doomed to fail, because >> preprocess_expressions() can additional function calls via, at least, >> inlining and default argument insertion and (2) >> preprocess_expressions() can't be moved earlier than without changing >> the semantics. I'm not sure if he's right, but those are sobering >> conclusions. Andres pointed out to me via IM that inlining is >> dismissable here; if inlining introduces a parallel-unsafe construct, >> the inlined function was mislabeled to begin with, and the user has >> earned the error message they get. Default argument insertion might >> not be dismissable although the practical risks seem low. > > All implementation difficulties being equal, I would opt to check for parallel > safety after inserting default arguments and before inlining. Checking before > inlining reveals the mislabeling every time instead of revealing it only when > inline_function() gives up. Timing of the parallel safety check relative to > default argument insertion matters less. Remember, the risk is merely that a > user will find cause to remove a parallel-safe marking where he/she expected > the system to deduce parallel unsafety. If implementation difficulties lead > to some other timings, that won't bother me. Fair. For a first cut at this, I've implemented the approach you suggested before: just do an extra pass over the whole query tree, looking for parallel-unsafe functions. The attached patch implements that, via a new proparallel system catalog column. A few notes: - For testing purposes, I set this up so that the executor activates parallel mode when running any query that seems to be parallel-safe. It seems nearly certain we'd only want to do this when a parallel plan was actually selected, but this behavior is awfully useful for testing, and revealed a number of bugs in the parallel-safety markings. parallel-mode-v6.patch + this patch passes make check-world. - This patch doesn't include syntax to let the user set the proparallel flag on new objects. Presumably we could add PARALLEL { SAFE | RESTRICTED | UNSAFE } to CREATE/ALTER FUNCTION, but I haven't done that yet. This is enough to assess whether the approach is fundamentally workable, and to check what the performance ramifications may be. - The patch includes a script, fixpgproc.pl, which I used to insert and update the parallel markings on the system catalogs. That is, of course, not intended for commit, but it's mighty useful for experimentation. - parallel-mode-v6.patch adds a new restriction for a function to be considered parallel-safe: it must not acquire any heavyweight locks that it does not also release. This allows scans of system catalogs and the scan relation, but generally rules out scanning other things. To compensate, this patch marks more stuff parallel-unsafe than the list I sent previously. I think that's OK for now, but this is a pretty onerous restriction which I think we will want to try to relax at least to the extent of allowing a function to be parallel-safe if it acquires AccessShareLock on additional relations. See the README in the other patch for why I imposed this restriction. Thoughts? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Attachment
pgsql-hackers by date: