Re: assessing parallel-safety - Mailing list pgsql-hackers

From Robert Haas
Subject Re: assessing parallel-safety
Date
Msg-id CA+TgmoaMH0akw30V5WHidNTeLgd9OBrYv1bmwp9htXeYUr7uZA@mail.gmail.com
In response to Re: assessing parallel-safety  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Wed, Feb 11, 2015 at 3:21 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> I think we may want a dedicated parallel-safe property for functions
> rather than piggybacking on provolatile ...

I went through the current contents of pg_proc and tried to assess how
much parallel-unsafe stuff we've got.  I think there are actually
three categories of things: (1) functions that can be called in
parallel mode either in the worker or in the leader ("parallel safe"),
(2) functions that can be called in parallel mode in the leader, but
not in the worker ("parallel restricted"), and (3) functions that
cannot be called in parallel mode at all ("parallel unsafe").  On a
first read-through, the number of things that looked to be anything
other than parallel-safe was fairly small; many of these could be
made parallel-safe with more work, but it's unlikely to be worth the
effort.
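
To make the three categories concrete, here is a minimal Python sketch.
The names Hazard and callable_in_parallel_mode are hypothetical and do
not come from PostgreSQL.  It assumes "parallel restricted" means
leader-only, which is the reading the per-function reasons in this
message rely on (for example, "the worker has a different PID").

```python
from enum import Enum


class Hazard(Enum):
    """Hypothetical labels for the three categories described above."""
    SAFE = "parallel safe"
    RESTRICTED = "parallel restricted"
    UNSAFE = "parallel unsafe"


def callable_in_parallel_mode(hazard: Hazard, in_worker: bool) -> bool:
    """Return True if a function with this label may run in this process
    while the backend is in parallel mode.

    Assumption: RESTRICTED means "leader only".
    """
    if hazard is Hazard.SAFE:
        return True               # fine in the leader and in any worker
    if hazard is Hazard.RESTRICTED:
        return not in_worker      # leader only
    return False                  # UNSAFE: never while in parallel mode


if __name__ == "__main__":
    for h in Hazard:
        print(h.value,
              "| leader:", callable_in_parallel_mode(h, in_worker=False),
              "| worker:", callable_in_parallel_mode(h, in_worker=True))
```

Nothing here says how the label would be stored or assessed; it only
spells out which process may call what.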

current_query() - Restricted because debug_query_string is not copied.
lo_open(), lo_close(), loread(), lowrite(), and other large object
functions - Restricted because large object state is not shared.
age(xid) - Restricted because it uses a transaction-lifespan cache
which is not shared.
now() - Restricted because transaction start timestamp is not copied.
statement_timestamp() - Restricted because statement start timestamp
is not copied.
pg_conf_load_time() - Restricted because PgReloadTime is not copied.
nextval(), currval() - Restricted because sequence-related state is not shared.
setval() - Unsafe because no data can be written in parallel mode.
random(), setseed() - Restricted because random seed state is not
shared. (We could alternatively treat setseed() as unsafe and random()
as restricted only in sessions where setseed() has previously been
called, and otherwise safe.)
pg_stat_get_* - Restricted because there's no guarantee the value
would be the same in the parallel worker as in the leader.
pg_backend_pid() - Restricted because the worker has a different PID.
set_config() - Unsafe because GUC state must remain synchronized.
pg_my_temp_schema() - Restricted because temporary namespaces aren't
shared with parallel workers.
pg_export_snapshot() - Restricted because the worker will go away quickly.
pg_prepared_statement(), pg_cursor() - Restricted because the prepared
statements and cursors are not synchronized with the worker.
pg_listening_channels() - Restricted because listening channels are
not synchronized with the worker.
pg*advisory*lock*() - Restricted because advisory lock state is not
shared with workers - and even if it were, the semantics would be hard
to reason about.
txid_current() - Unsafe because it might attempt XID assignment.
pg_logical_slot*() - Unsafe because they do all kinds of crazy stuff.
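
One way to picture how such labels might be used: record a few entries
from the list above in a table and take the most restrictive label
among the functions a query calls.  This is only a sketch in Python.
HAZARDS and query_hazard are hypothetical names, only exactly named
functions from the list are included (the wildcard families are left
out), and treating unlisted names as safe is an assumption that at best
fits built-in functions, not user-written ones.

```python
from enum import IntEnum


class Hazard(IntEnum):
    """Hypothetical labels, ordered from least to most restrictive."""
    SAFE = 0
    RESTRICTED = 1
    UNSAFE = 2


# A subset of the classifications given in the list above.
HAZARDS = {
    "current_query": Hazard.RESTRICTED,
    "now": Hazard.RESTRICTED,
    "statement_timestamp": Hazard.RESTRICTED,
    "pg_conf_load_time": Hazard.RESTRICTED,
    "nextval": Hazard.RESTRICTED,
    "currval": Hazard.RESTRICTED,
    "setval": Hazard.UNSAFE,
    "random": Hazard.RESTRICTED,
    "setseed": Hazard.RESTRICTED,
    "pg_backend_pid": Hazard.RESTRICTED,
    "set_config": Hazard.UNSAFE,
    "pg_my_temp_schema": Hazard.RESTRICTED,
    "pg_export_snapshot": Hazard.RESTRICTED,
    "pg_listening_channels": Hazard.RESTRICTED,
    "txid_current": Hazard.UNSAFE,
}


def query_hazard(function_names):
    """Return the most restrictive label among the called functions.

    Assumption: a name missing from HAZARDS is treated as SAFE.
    """
    return max(
        (HAZARDS.get(name, Hazard.SAFE) for name in function_names),
        default=Hazard.SAFE,
    )


if __name__ == "__main__":
    print(query_hazard(["lower", "now"]).name)
    print(query_hazard(["now", "setval"]).name)
```

Taking the maximum means a single unsafe call makes the whole query
unsafe, while a query that only calls restricted functions is still
allowed in parallel mode as long as those calls stay in the leader.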

That's not a lot, and very little of it is anything you'd care about
parallelizing anyway.  I expect that the incidence of user-written
parallel-unsafe functions will be considerably higher.  I'm not sure
if this impacts the decision about how to design the facility for
assessing parallel-safety or not, but I thought it was worth sharing.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


