Thread: When is it safe to mark a function PARALLEL SAFE?
According to the documentation:

    "Functions and aggregates must be marked PARALLEL UNSAFE if they write to
    the database, access sequences, change the transaction state even
    temporarily (e.g. a PL/pgSQL function which establishes an EXCEPTION block
    to catch errors), or make persistent changes to settings."

If a LANGUAGE C function calls ereport(ERROR, ...), does that qualify as a potential change to the transaction state that requires it to be marked PARALLEL UNSAFE? If an error is raised in one parallel worker, does this cause the other parallel workers to be immediately terminated?

How about a C function f(x) that calls out to an external system and returns a text value? If f(x) is altered on the external system, it might return a slightly different answer for some x. Let's say that for some x it returns "one" instead of "1", and we happen to know that users don't care if it returns "one" or "1". If someone were to declare f(x) to be PARALLEL SAFE, what's the worst that could happen?

-----
Jim Finnerty, AWS, Amazon Aurora PostgreSQL
Jim Finnerty <jfinnert@amazon.com> writes:
> According to the documentation:
> "Functions and aggregates must be marked PARALLEL UNSAFE if they write to
> the database, access sequences, change the transaction state even
> temporarily (e.g. a PL/pgSQL function which establishes an EXCEPTION block
> to catch errors), or make persistent changes to settings."

I believe the reason for the EXCEPTION-block restriction is that plpgsql implements such a block by establishing a subtransaction, and we don't allow subtransactions in workers. That's probably just an implementation restriction that could be lifted with a little work, much more easily than the general prohibition on writing to the DB could be. (Obviously, the subtransaction would still be restricted from DB writes.)

The "persistent change to settings" rule is there not because such a change would fail, but because it wouldn't be persistent --- the GUC change would only be visible inside the particular worker.

> If a LANGUAGE C function calls ereport(ERROR, ...), does that qualify as a
> potential change to the transaction state that requires it to be marked
> PARALLEL UNSAFE?

No. It would certainly be impractical to have a rule that you can't throw errors in workers.

> If an error is raised in one parallel worker, does this
> cause the other parallel workers to be immediately terminated?

I think not, though I didn't work on that code. The error will be reported to the parent backend, which will cause it to fail the query ... but I think it just waits for the other worker children to exit first. That's not something to rely on, of course. Even if we don't make an attempt to cancel the other workers today, we probably will in future. But the cancel attempt would certainly be asynchronous, so I'm not sure how "immediate" you are worried about it being.

> How about a C function f(x) that calls out to an external system and returns
> a text value? If f(x) is altered on the external system, it might return a
> slightly different answer for some x. Let's say that for some x it returns
> "one" instead of "1", and we happen to know that users don't care if it
> returns "one" or "1". If someone were to declare f(x) to be PARALLEL SAFE,
> what's the worst that could happen?

Well, this isn't so much about whether the function is parallel safe as whether it is marked volatile or not; as you describe it, it would potentially give time-varying results even in a non-parallel query. Such a function should be marked volatile to avoid strange behavior, i.e. the optimizer making invalid assumptions.

AFAIK "parallel safe" and "non-volatile" are more or less independent restrictions, though someone might correct me. A function that writes to the DB must be considered both volatile and parallel unsafe, but if it doesn't do that, then I think it could have any combination of these properties.

			regards, tom lane
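For concreteness, here is a minimal LANGUAGE C sketch of the ereport() case discussed above. The function name, the error it raises, and the CREATE FUNCTION declaration are illustrative rather than from the thread; the point is only that raising an error does not disqualify a function from being marked PARALLEL SAFE:

#include "postgres.h"
#include "fmgr.h"

PG_MODULE_MAGIC;

PG_FUNCTION_INFO_V1(checked_div);

/*
 * Hypothetical declaration:
 *
 *   CREATE FUNCTION checked_div(int, int) RETURNS int
 *     AS 'MODULE_PATHNAME', 'checked_div'
 *     LANGUAGE C STRICT IMMUTABLE PARALLEL SAFE;
 */
Datum
checked_div(PG_FUNCTION_ARGS)
{
    int32   numerator = PG_GETARG_INT32(0);
    int32   denominator = PG_GETARG_INT32(1);

    if (denominator == 0)
        ereport(ERROR,
                (errcode(ERRCODE_DIVISION_BY_ZERO),
                 errmsg("checked_div: division by zero")));

    PG_RETURN_INT32(numerator / denominator);
}

If this error fires inside a parallel worker, the worker reports it to the leader, which then fails the whole query, exactly as described above.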
On Sun, Sep 8, 2019 at 12:27 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > If an error is raised in one parallel worker, does this
> > cause the other parallel workers to be immediately terminated?
>
> I think not, though I didn't work on that code. The error will
> be reported to the parent backend, which will cause it to fail
> the query ... but I think it just waits for the other worker
> children to exit first. That's not something to rely on, of course.
> Even if we don't make an attempt to cancel the other workers today,
> we probably will in future. But the cancel attempt would certainly
> be asynchronous, so I'm not sure how "immediate" you are worried
> about it being.

If workers call CHECK_FOR_INTERRUPTS() frequently, which they should, then to users it should appear as if raising an error in one worker kills everything immediately, or almost immediately. For example, if a parallel CREATE INDEX has a worker that raises a unique violation error, that must work in a way that at least *appears* very similar to what the user would get with a serial CREATE INDEX. (The worst that can happen is that they'll very occasionally get two unique violation errors instead of one, or something like that.)

That said, there are theoretical failures where it could take rather a long time for the parent/leader to get the memo -- see WaitForParallelWorkersToAttach() and its caller. (Note that anything using a Gather node is subject to the same kind of failure that WaitForParallelWorkersToAttach() handles, even though it won't call that function itself.) These failures (e.g. fork() failure) are generally assumed to be rare to non-existent, though.

Users would surely be upset if parallel queries could not be cancelled almost immediately; if that happened with any regularity, somebody would have complained by now. As Tom said, it's hard to give a useful answer without more context -- how do you define "immediate"?

-- 
Peter Geoghegan
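To illustrate the pattern Peter describes, here is a hypothetical per-row loop as it might appear in worker-executed C code; process_one_row() is an invented stand-in for the real work:

#include "postgres.h"
#include "miscadmin.h"      /* declares CHECK_FOR_INTERRUPTS() */

/* Hypothetical per-row work function. */
extern void process_one_row(int rownum);

void
process_rows(int nrows)
{
    int     i;

    for (i = 0; i < nrows; i++)
    {
        /*
         * Service any pending cancel request here -- e.g. one initiated
         * by the leader after another worker raised an error -- rather
         * than letting the loop run to completion.
         */
        CHECK_FOR_INTERRUPTS();

        process_one_row(i);
    }
}

Because each iteration services pending interrupts, a cancel that follows an error in another worker is observed within roughly one row's worth of work.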
The scenario is a STABLE function that calls out to an AWS service that charges micro-pennies per row, so periodically calling CHECK_FOR_INTERRUPTS() should keep charges from accumulating for a statement that has already failed in another parallel worker. I think it's safe to declare such functions PARALLEL SAFE.

Thank you!

-----
Jim Finnerty, AWS, Amazon Aurora PostgreSQL
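A minimal sketch of such a function, under the assumptions stated above: external_service_call() is a hypothetical stand-in for the billable AWS call, and the declaration in the comment is illustrative. The CHECK_FOR_INTERRUPTS() before the call is what stops charges once the statement has already failed elsewhere:

#include "postgres.h"
#include "fmgr.h"
#include "miscadmin.h"          /* CHECK_FOR_INTERRUPTS() */
#include "utils/builtins.h"     /* cstring_to_text() */

PG_MODULE_MAGIC;

/* Hypothetical stand-in for the per-row billable call to the external service. */
extern char *external_service_call(int32 x);

PG_FUNCTION_INFO_V1(f);

/*
 * Hypothetical declaration:
 *
 *   CREATE FUNCTION f(int) RETURNS text
 *     AS 'MODULE_PATHNAME', 'f'
 *     LANGUAGE C STRICT STABLE PARALLEL SAFE;
 */
Datum
f(PG_FUNCTION_ARGS)
{
    int32   x = PG_GETARG_INT32(0);
    char   *answer;

    /*
     * Bail out before incurring another billable call if this statement
     * has already been cancelled, e.g. because another parallel worker
     * raised an error.
     */
    CHECK_FOR_INTERRUPTS();

    answer = external_service_call(x);
    PG_RETURN_TEXT_P(cstring_to_text(answer));
}

Per Tom's caveat, if the drift between "1" and "one" ever did matter, VOLATILE rather than STABLE would be the appropriate marking.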