Thread: [HACKERS] Write operations in parallel mode

[HACKERS] Write operations in parallel mode

From
Antonin Houska
Date:
Now that dynamic shared memory hash table has been committed
(8c0d7bafad36434cb08ac2c78e69ae72c194ca20) I wonder if it's still a big deal
to remove restrictions like this in (e.g. heap_update()):

        /*
         * Forbid this during a parallel operation, lest it allocate a combocid.
         * Other workers might need that combocid for visibility checks, and we
         * have no provision for broadcasting it to them.
         */
        if (IsInParallelMode())
                ereport(ERROR,
                                (errcode(ERRCODE_INVALID_TRANSACTION_STATE),
                                 errmsg("cannot update tuples during a parallel operation")));



--
Antonin Houska
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de, http://www.cybertec.at



Re: [HACKERS] Write operations in parallel mode

From
Robert Haas
Date:
On Mon, Aug 28, 2017 at 12:23 PM, Antonin Houska <ah@cybertec.at> wrote:
> Now that dynamic shared memory hash table has been committed
> (8c0d7bafad36434cb08ac2c78e69ae72c194ca20) I wonder if it's still a big deal
> to remove restrictions like this in (e.g. heap_update()):
>
>         /*
>          * Forbid this during a parallel operation, lest it allocate a combocid.
>          * Other workers might need that combocid for visibility checks, and we
>          * have no provision for broadcasting it to them.
>          */
>         if (IsInParallelMode())
>                 ereport(ERROR,
>                                 (errcode(ERRCODE_INVALID_TRANSACTION_STATE),
>                                  errmsg("cannot update tuples during a
>         parallel operation")));

Well, it certainly doesn't solve that problem directly, but I think it
is useful infrastructure that lends itself to some kind of a solution
to that problem.  There are other problems, too, like group locking
vs. the relation extension lock, though Masahiko-san posted a patch
for that IIRC, and also group locking vs. GIN page locks, and also
proper interlocking of updates.  I think that the order in which we
should tackle things is probably something like:

1. Make inserts parallel-restricted rather than parallel-unsafe - i.e.
allow inserts but only in the leader.  This allows plans like Insert
-> Gather -> whatever.  I'm not sure there's a lot to do here other
than allow such plans to be generated.

2. Make updates and deletes parallel-restricted rather than
parallel-unsafe - i.e. allow updates and deletes but only in the
leader.  This similarly would allow Update -> Gather -> whatever and
Delete -> Gather -> whatever.  For this, you'd need a shared combo CID
hash so that workers can learn about new combo CIDs created by the
leader.

3. Make inserts parallel-safe rather than parallel-restricted.  This
allows plans like Gather -> Insert -> whatever, which is way better
than Insert -> Gather -> whatever.  If there's no RETURNING, the
Gather isn't actually gathering anything.  This requires sorting out
some group locking issues, but not tuple locking.

4. Make updates and deletes parallel-safe rather than
parallel-restricted.  Now tuple locking has to be sorted out.
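
To make the "shared combo CID hash" from step 2 a bit more concrete, here is a minimal, self-contained sketch of the idea. It is not PostgreSQL code: the names SharedComboCidTable, shared_combocid_get, shared_combocid_lookup and MAX_COMBO_CIDS are hypothetical, a plain fixed-size array stands in for a real hash table in dynamic shared memory, and all locking is left out. It only illustrates the division of labor: the leader maps a (cmin, cmax) pair to a combo CID, and a worker resolves a combo CID back to that pair.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

typedef uint32_t CommandId;

#define MAX_COMBO_CIDS 64		/* hypothetical fixed capacity */

/* The array index of an entry serves as the combo CID. */
typedef struct ComboCidEntry
{
	CommandId	cmin;
	CommandId	cmax;
} ComboCidEntry;

/*
 * Stand-in for a table that would live in dynamic shared memory.
 * Assumption: a real version would need a lock around every access.
 */
typedef struct SharedComboCidTable
{
	int			used;
	ComboCidEntry entries[MAX_COMBO_CIDS];
} SharedComboCidTable;

/*
 * Leader side: return the combo CID for (cmin, cmax), allocating a new
 * one if the pair has not been seen yet.  Returns -1 if the table is full.
 */
static int
shared_combocid_get(SharedComboCidTable *t, CommandId cmin, CommandId cmax)
{
	for (int i = 0; i < t->used; i++)
	{
		if (t->entries[i].cmin == cmin && t->entries[i].cmax == cmax)
			return i;
	}
	if (t->used >= MAX_COMBO_CIDS)
		return -1;
	t->entries[t->used].cmin = cmin;
	t->entries[t->used].cmax = cmax;
	return t->used++;
}

/*
 * Worker side: resolve a combo CID created by the leader.
 * Returns 1 and fills *cmin / *cmax on success, 0 if the ID is unknown.
 */
static int
shared_combocid_lookup(const SharedComboCidTable *t, int combocid,
					   CommandId *cmin, CommandId *cmax)
{
	if (combocid < 0 || combocid >= t->used)
		return 0;
	*cmin = t->entries[combocid].cmin;
	*cmax = t->entries[combocid].cmax;
	return 1;
}

/* Small walk-through: the leader allocates, a worker resolves. */
int
combocid_demo(void)
{
	SharedComboCidTable table = {0};
	CommandId	cmin;
	CommandId	cmax;
	int			combocid;

	combocid = shared_combocid_get(&table, 3, 8);	/* leader */
	assert(combocid == 0);
	assert(shared_combocid_get(&table, 3, 8) == combocid);	/* pair is reused */
	assert(shared_combocid_lookup(&table, combocid, &cmin, &cmax));	/* worker */
	printf("combo CID %d -> cmin %u, cmax %u\n",
		   combocid, (unsigned) cmin, (unsigned) cmax);
	return 0;
}
```

Calling combocid_demo() from a main() exercises both sides. The linear search and the fixed capacity are simplifications for readability; the point is only that once the mapping lives somewhere every participant can read, workers no longer need the leader to broadcast each new combo CID.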

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company