Thread: pgsql: Add a new GUC and a reloption to enable inserts in parallel-mode

pgsql: Add a new GUC and a reloption to enable inserts in parallel-mode

From: Amit Kapila
Add a new GUC and a reloption to enable inserts in parallel-mode.

Commit 05c8482f7f added the implementation of parallel SELECT for
"INSERT INTO ... SELECT ...", which may incur non-negligible overhead
from the additional parallel-safety checks it performs, even when, in
the end, those checks determine that parallelism can't be used. This is
normally only a problem when the target table has a large number of
partitions.

A new GUC option "enable_parallel_insert" is added to allow inserts in
parallel mode. The default is on.

In addition to the GUC option, the user may want a mechanism to allow
inserts in parallel mode with finer granularity at the table level. The
new table option "parallel_insert_enabled" allows this. The default is
true.
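As a sketch of how the two knobs described above would be used (the
statements are illustrative; `measurement` is a hypothetical table):

```sql
-- Session-level: skip the extra parallel-safety checks entirely.
SET enable_parallel_insert = off;

-- Table-level: disable parallel-mode inserts for one table only,
-- e.g. a partitioned table with many partitions where the checks
-- are expensive.
ALTER TABLE measurement SET (parallel_insert_enabled = false);

-- Both default to on/true, so "INSERT INTO ... SELECT ..." remains
-- eligible for a parallel SELECT plan unless one of them is turned off.
```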

Author: "Hou, Zhijie"
Reviewed-by: Greg Nancarrow, Amit Langote, Takayuki Tsunakawa, Amit Kapila
Discussion: https://postgr.es/m/CAA4eK1K-cW7svLC2D7DHoGHxdAdg3P37BLgebqBOC2ZLc9a6QQ%40mail.gmail.com
Discussion: https://postgr.es/m/CAJcOf-cXnB5cnMKqWEp2E2z7Mvcd04iLVmV=qpFJrR3AcrTS3g@mail.gmail.com

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/c8f78b616167bf8e24bc5dc69112c37755ed3058

Modified Files
--------------
doc/src/sgml/config.sgml                      | 23 +++++++++++
doc/src/sgml/ref/alter_table.sgml             |  3 +-
doc/src/sgml/ref/create_table.sgml            | 26 +++++++++++++
src/backend/access/common/reloptions.c        | 25 +++++++++---
src/backend/optimizer/path/costsize.c         |  2 +
src/backend/optimizer/util/clauses.c          | 34 ++++++++++++++--
src/backend/utils/misc/guc.c                  | 10 +++++
src/backend/utils/misc/postgresql.conf.sample |  1 +
src/bin/psql/tab-complete.c                   |  1 +
src/include/optimizer/cost.h                  |  1 +
src/include/utils/rel.h                       | 25 ++++++++++++
src/test/regress/expected/insert_parallel.out | 56 ++++++++++++++++++++++++++-
src/test/regress/expected/sysviews.out        |  3 +-
src/test/regress/sql/insert_parallel.sql      | 44 ++++++++++++++++++++-
src/tools/pgindent/typedefs.list              |  1 +
15 files changed, 240 insertions(+), 15 deletions(-)


Re: pgsql: Add a new GUC and a reloption to enable inserts in parallel-mode

From: Robert Haas
On Wed, Mar 17, 2021 at 10:14 PM Amit Kapila <akapila@postgresql.org> wrote:
> Add a new GUC and a reloption to enable inserts in parallel-mode.
>
> Commit 05c8482f7f added the implementation of parallel SELECT for
> "INSERT INTO ... SELECT ..." which may incur non-negligible overhead in
> the additional parallel-safety checks that it performs, even when, in the
> end, those checks determine that parallelism can't be used. This is
> normally only ever a problem in the case of when the target table has a
> large number of partitions.
>
> A new GUC option "enable_parallel_insert" is added, to allow insert in
> parallel-mode. The default is on.
>
> In addition to the GUC option, the user may want a mechanism to allow
> inserts in parallel-mode with finer granularity at table level. The new
> table option "parallel_insert_enabled" allows this. The default is true.

I find this fairly ugly. If you can't make the cost of checking
whether parallelism is safe low enough that you don't need a setting
for this, then I think perhaps you shouldn't have the feature at all.
In other words, I propose that you revert both this and 05c8482f7f and
come back when you have a better design that doesn't introduce so much
overhead.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Robert Haas <robertmhaas@gmail.com> writes:
> I find this fairly ugly. If you can't make the cost of checking
> whether parallelism is safe low enough that you don't need a setting
> for this, then I think perhaps you shouldn't have the feature at all.
> In other words, I propose that you revert both this and 05c8482f7f and
> come back when you have a better design that doesn't introduce so much
> overhead.

I'm +1 on that idea for a completely independent reason: 05c8482f7f
is currently the easy winner for "scariest patch of the v14 cycle".
I don't have any faith in it, and so I'm very concerned that it went
in so late in the dev cycle.  It'd be better to push some successor
design early in the v15 cycle, when we'll have more time to catch
problems.

            regards, tom lane



Re: pgsql: Add a new GUC and a reloption to enable inserts in parallel-mode

From: Amit Kapila
On Mon, Mar 22, 2021 at 8:03 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Robert Haas <robertmhaas@gmail.com> writes:
> > I find this fairly ugly. If you can't make the cost of checking
> > whether parallelism is safe low enough that you don't need a setting
> > for this, then I think perhaps you shouldn't have the feature at all.
> > In other words, I propose that you revert both this and 05c8482f7f and
> > come back when you have a better design that doesn't introduce so much
> > overhead.
>
> I'm +1 on that idea for a completely independent reason: 05c8482f7f
> is currently the easy winner for "scariest patch of the v14 cycle".
> I don't have any faith in it, and so I'm very concerned that it went
> in so late in the dev cycle.  It'd be better to push some successor
> design early in the v15 cycle, when we'll have more time to catch
> problems.
>

Okay, I can revert this work to avoid that risk, but it would be great
if you could share your thoughts on what alternative design you have in
mind, and/or how it should be better implemented. Regarding the
performance overhead: it arises for partitioned tables with a large
number of partitions, and then only when the data to insert is not that
much or there is a parallel-unsafe clause on one of the partitions.
Now, how else can we determine parallel-safety without checking each of
the partitions? We do have other partition-related GUCs
(enable_partition*) to avoid partition-related overhead, so why is it
so bad to have a GUC here (maybe the naming of the GUC/reloption is not
good)?

-- 
With Regards,
Amit Kapila.



RE: pgsql: Add a new GUC and a reloption to enable inserts in parallel-mode

From: "tsunakawa.takay@fujitsu.com"
From: Amit Kapila <amit.kapila16@gmail.com>
> Okay, I can revert this work to avoid that risk but it would be great
> if you can share your thoughts on what alternative design do you have
> in mind, and or how it should be better implemented? Regarding
> performance overhead, it is for partitioned tables with a large number
> of partitions, and that too when the data to insert is not that much
> or there is parallel-unsafe clause on one of the partitions. Now, how
> else can we determine the parallel-safety without checking each of the
> partitions? We do have other partition-related gucs
> (enable_partition*) to avoid the partitions-related overhead so why is
> it so bad to have guc here (maybe the naming of guc/reloption is not
> good)?
> 

I'd love to hear even rough ideas from Robert-san.  I guess something like...:

* Basically, we can assume or hope that all partitions have the same
parallel safety as the root partitioned table.  That is, it'd be very
rare that child partitions have different indexes, constraints, and
triggers.  So, we can just check the root table during planning to
decide if we want to run parallel processing.

* When the executor opens a target child partition, it checks its
parallel safety only once for it.  If any target partition turns out to
be parallel unsafe, have the parallel leader do the insertion somehow.

TBH, I don't think the parameter is so ugly.  At least, it's not worse
than Oracle or SQL Server.





    Regards
Takayuki Tsunakawa


Re: pgsql: Add a new GUC and a reloption to enable inserts in parallel-mode

From: Andres Freund
Hi,

On 2021-03-22 08:47:47 -0400, Robert Haas wrote:
> I find this fairly ugly. If you can't make the cost of checking
> whether parallelism is safe low enough that you don't need a setting
> for this, then I think perhaps you shouldn't have the feature at all.
> In other words, I propose that you revert both this and 05c8482f7f and
> come back when you have a better design that doesn't introduce so much
> overhead.

I started out wondering whether some of the caching David Rowley has been
working on to reduce the overhead of the result cache planning shows a
path for how to make parts of this cheaper.
https://postgr.es/m/CAApHDvpw5qG4vTb%2BhwhmAJRid6QF%2BR9vvtOhX2HN2Yy9%2Bw20sw%40mail.gmail.com

But looking more, several of the checks just seem wrong to me.

target_rel_index_max_parallel_hazard() deparses index expressions from
scratch? With code like

+        index_rel = index_open(index_oid, lockmode);
...
+    index_oid_list = RelationGetIndexList(rel);
+    foreach(lc, index_oid_list)
...
+        ii_Expressions = RelationGetIndexExpressions(index_rel);
...

+                    Assert(index_expr_item != NULL);
+                    if (index_expr_item == NULL)    /* shouldn't happen */
+                    {
+                        elog(WARNING, "too few entries in indexprs list");
+                        context->max_hazard = PROPARALLEL_UNSAFE;
+                        found_max_hazard = true;
+                        break;
+                    }

Brrr.

Shouldn't we have this in IndexOptInfo already? But also, why on earth
is that WARNING branch a good idea?


+static bool
+target_rel_domain_max_parallel_hazard(Oid typid, max_parallel_hazard_context *context)
...
+    scan = systable_beginscan(con_rel, ConstraintTypidIndexId, true,
+                              NULL, 1, key);
+
+    while (HeapTupleIsValid((tup = systable_getnext(scan))))

There's plenty other code in the planner that needs to know about
domains. This stuff is precisely why the typecache exists.


The pattern to rebuild information that we already have cached elsewhere
seems to repeat all over this patch.


This seems not even close to committable.

Greetings,

Andres Freund



Re: pgsql: Add a new GUC and a reloption to enable inserts in parallel-mode

From: Andres Freund
Hi,

On 2021-03-23 09:01:13 +0530, Amit Kapila wrote:
> Okay, I can revert this work to avoid that risk but it would be great
> if you can share your thoughts on what alternative design do you have
> in mind, and or how it should be better implemented? Regarding
> performance overhead, it is for partitioned tables with a large number
> of partitions, and that too when the data to insert is not that much
> or there is parallel-unsafe clause on one of the partitions. Now, how
> else can we determine the parallel-safety without checking each of the
> partitions?

You cache it. And actually use information that is already
cached. There's a huge difference between doing expensive stuff once
every now and then, and doing it over and over and over. That's why we
have relcache, syscache, typecache etc.


> We do have other partition-related gucs (enable_partition*) to avoid
> the partitions-related overhead so why is it so bad to have guc here
> (maybe the naming of guc/reloption is not good)?

Most of those seem more about risk-reduction. And they don't have
reloptions - which seems to be the only way this feature can
realistically be used.  The one halfway comparable GUC is
enable_partitionwise_join - while it'd be great to not need it (or at
least not default to off), it implies some legitimately computationally
hard work that can't easily be cached.

Greetings,

Andres Freund



Re: pgsql: Add a new GUC and a reloption to enable inserts in parallel-mode

From: Robert Haas
On Tue, Mar 23, 2021 at 3:13 PM Andres Freund <andres@anarazel.de> wrote:
> You cache it.

Yeah, exactly. I don't think it's super-easy to understand exactly how
to make that work well for something like this. It would be easy
enough to set a flag in the relcache whose value is computed the first
time we need it and is then consulted every time after that, and you
just invalidate it based on sinval messages. But, if you go with that
design, you've got a big problem: now an insert has to lock all the
tables in the partitioning hierarchy to decide whether it can run in
parallel or not, and we do not want that. We want to be able to just
lock the partitions we need, so really, we want to be able to test for
parallel-safety without requiring a relation lock, or only requiring
it on the partitioned table itself and not all the partitions.
However, that introduces a race condition, because if you ever check
the value of the flag without a lock on the relation, then you can't
rely on sinval to blow away the cached state. I don't have a good
solution to that problem in mind right now, because I haven't really
devoted much time to thinking about it, but I think it might be
possible to solve it with more thought.

But if I had thought about it and had not come up with anything better
than what you committed here, I would have committed nothing, and I
think that's what you should have done, too. This patch is full of
grotty hacks. Just to take one example, consider the code that forces
a transaction ID assignment prior to the operation. You *could* have
solved this problem by introducing new machinery to make it safe to
assign an XID in parallel mode. Then, we'd have a fundamental new
capability that we currently lack. Instead, you just force-assigned an
XID before entering parallel mode. That's not building any new
capability; that's just hacking around the lack of a capability to
make something work, while ignoring the disadvantages of doing it that
way, namely that sometimes an XID will be assigned for no purpose.

Likewise, the XXX comment you added to max_parallel_hazard_walker
claims that some of the code introduced there is to compensate for an
unspecified bug in the rewriter. I'm a bit skeptical that the comment
is correct, and there's no way to find out because the comment doesn't
say what the bug supposedly is, but let's just say for the sake of
argument that it's true. Well, you *could* have fixed the bug, but
instead you hacked around it, and in a relatively expensive way that
affects every query with a CTE in it whether it can benefit from this
patch or not. That's not a responsible way of maintaining the core
PostgreSQL code.

I also agree with Andres's criticism of the code in
target_rel_index_max_parallel_hazard entirely. It's completely
unacceptable to be doing index_open() here. If you don't understand
the design of the planner well enough to know why that's not OK, then
you shouldn't be committing patches like this.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: pgsql: Add a new GUC and a reloption to enable inserts in parallel-mode

From: Alvaro Herrera
On 2021-Mar-23, Robert Haas wrote:

> Likewise, the XXX comment you added to max_parallel_hazard_walker
> claims that some of the code introduced there is to compensate for an
> unspecified bug in the rewriter. I'm a bit skeptical that the comment
> is correct, and there's no way to find out because the comment doesn't
> say what the bug supposedly is, but let's just say for the sake of
> argument that it's true. Well, you *could* have fixed the bug, but
> instead you hacked around it, and in a relatively expensive way that
> affects every query with a CTE in it whether it can benefit from this
> patch or not. That's not a responsible way of maintaining the core
> PostgreSQL code.

I think the CTE bug is this one:

https://www.postgresql.org/message-id/flat/CAJcOf-fAdj=nDKMsRhQzndm-O13NY4dL6xGcEvdX5Xvbbi0V7g@mail.gmail.com

while I can't disagree with the overall conclusion that it seems safer
to revert parallel INSERT/SELECT given the number of alleged problems,
it is true that this bug exists, and has gone unfixed.

-- 
Álvaro Herrera       Valdivia, Chile



Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> On 2021-Mar-23, Robert Haas wrote:
>> Likewise, the XXX comment you added to max_parallel_hazard_walker
>> claims that some of the code introduced there is to compensate for an
>> unspecified bug in the rewriter.

> I think the CTE bug is this one:
> https://www.postgresql.org/message-id/flat/CAJcOf-fAdj=nDKMsRhQzndm-O13NY4dL6xGcEvdX5Xvbbi0V7g@mail.gmail.com
> while I can't disagree with the overall conclusion that it seems safer
> to revert parallel INSERT/SELECT given the number of alleged problems,
> it is true that this bug exists, and has gone unfixed.

Yeah, because it's not totally clear whether we want to fix it by
disallowing the case, or by significantly changing the way the
rewriter works.  It'd be good if some of the folks on this thread
weighed in on that choice.

(Having said that, another complaint about this particular comment
is that it doesn't mention how the planner should be changed once
the rewriter is fixed.  Is it sufficient to delete the stanza below
the comment?  Just looking at it, I wonder whether
max_parallel_hazard_walker is now being invoked on portions of the
Query tree that we'd not have to traverse at all given a rewriter fix.)

            regards, tom lane



Re: pgsql: Add a new GUC and a reloption to enable inserts in parallel-mode

From: Amit Kapila
On Wed, Mar 24, 2021 at 2:00 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>
> On 2021-Mar-23, Robert Haas wrote:
>
> > Likewise, the XXX comment you added to max_parallel_hazard_walker
> > claims that some of the code introduced there is to compensate for an
> > unspecified bug in the rewriter. I'm a bit skeptical that the comment
> > is correct, and there's no way to find out because the comment doesn't
> > say what the bug supposedly is, but let's just say for the sake of
> > argument that it's true. Well, you *could* have fixed the bug, but
> > instead you hacked around it, and in a relatively expensive way that
> > affects every query with a CTE in it whether it can benefit from this
> > patch or not. That's not a responsible way of maintaining the core
> > PostgreSQL code.
>
> I think the CTE bug is this one:
>
> https://www.postgresql.org/message-id/flat/CAJcOf-fAdj=nDKMsRhQzndm-O13NY4dL6xGcEvdX5Xvbbi0V7g@mail.gmail.com
>
> while I can't disagree with the overall conclusion that it seems safer
> to revert parallel INSERT/SELECT given the number of alleged problems,
>

Agreed. I'll revert the patch and respond to some of the points here
with my thoughts.

-- 
With Regards,
Amit Kapila.



Re: pgsql: Add a new GUC and a reloption to enable inserts in parallel-mode

From: Amit Kapila
On Wed, Mar 24, 2021 at 2:42 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> > On 2021-Mar-23, Robert Haas wrote:
> >> Likewise, the XXX comment you added to max_parallel_hazard_walker
> >> claims that some of the code introduced there is to compensate for an
> >> unspecified bug in the rewriter.
>
> > I think the CTE bug is this one:
> > https://www.postgresql.org/message-id/flat/CAJcOf-fAdj=nDKMsRhQzndm-O13NY4dL6xGcEvdX5Xvbbi0V7g@mail.gmail.com
> > while I can't disagree with the overall conclusion that it seems safer
> > to revert parallel INSERT/SELECT given the number of alleged problems,
> > it is true that this bug exists, and has gone unfixed.
>
> Yeah, because it's not totally clear whether we want to fix it by
> disallowing the case, or by significantly changing the way the
> rewriter works.  It'd be good if some of the folks on this thread
> weighed in on that choice.
>
> (Having said that, another complaint about this particular comment
> is that it doesn't mention how the planner should be changed once
> the rewriter is fixed.  Is it sufficient to delete the stanza below
> the comment?
>

Yes.

> Just looking at it, I wonder whether
> max_parallel_hazard_walker is now being invoked on portions of the
> Query tree that we'd not have to traverse at all given a rewriter fix.)
>

I remember that I debugged that part and the information was available,
so I decided to add a check based on it. I just went with your analysis
in the CTE bug discussion that fixing it there was not worthwhile;
OTOH, here the information was available, so I thought it might be a
good compromise. Still, I or someone else should have weighed in on
that choice.

-- 
With Regards,
Amit Kapila.



RE: pgsql: Add a new GUC and a reloption to enable inserts in parallel-mode

From: "tsunakawa.takay@fujitsu.com"
From: Robert Haas <robertmhaas@gmail.com>
> Yeah, exactly. I don't think it's super-easy to understand exactly how
> to make that work well for something like this. It would be easy
> enough to set a flag in the relcache whose value is computed the first
> time we need it and is then consulted every time after that, and you
> just invalidate it based on sinval messages. But, if you go with that
> design, you've got a big problem: now an insert has to lock all the
> tables in the partitioning hierarchy to decide whether it can run in
> parallel or not, and we do not want that. We want to be able to just
> lock the partitions we need, so really, we want to be able to test for
> parallel-safety without requiring a relation lock, or only requiring
> it on the partitioned table itself and not all the partitions.
> However, that introduces a race condition, because if you ever check
> the value of the flag without a lock on the relation, then you can't
> rely on sinval to blow away the cached state. I don't have a good
> solution to that problem in mind right now, because I haven't really
> devoted much time to thinking about it, but I think it might be
> possible to solve it with more thought.

One problem with caching the result is that the first access in each
session still has to experience the slow processing.  Some demanding
customers of our proprietary database, which is not based on Postgres,
have requested that we eliminate even the overhead associated with the
first access, and we have provided features for them.  As for the data
file, users can use pg_prewarm.  But what can we recommend users do in
this case?  Maybe the logon trigger feature, which is marked ready for
committer in PG 14, can be used to let users execute likely queries at
session start (or when establishing a connection pool), but I feel it's
inconvenient.


> But if I had thought about it and had not come up with anything better
> than what you committed here, I would have committed nothing, and I
> think that's what you should have done, too. This patch is full of
> grotty hacks. Just to take one example, consider the code that forces
> a transaction ID assignment prior to the operation. You *could* have
> solved this problem by introducing new machinery to make it safe to
> assign an XID in parallel mode. Then, we'd have a fundamental new
> capability that we currently lack. Instead, you just force-assigned an
> XID before entering parallel mode. That's not building any new
> capability; that's just hacking around the lack of a capability to
> make something work, while ignoring the disadvantages of doing it that
> way, namely that sometimes an XID will be assigned for no purpose.

Regarding the forced XID assignment that you picked out, I didn't think
it's so grotty.  Yes, in fact, I felt it's a bit unclean.  But it's
only a single line of code.  With a single line of code, we can provide
great value to users.  Why don't we go for it?  As discussed in the
thread, the XID is wasted only when the source data is empty, which is
impractical given that the user wants to load much data, probably for
ETL.

(I'm afraid "grotty" may be too strong a word considering the CoC
statement "We encourage thoughtful, constructive discussion of the
software and this community, their current state, and possible
directions for development.  The focus of our discussions should be the
code and related technology, community projects, and infrastructure.")


> Likewise, the XXX comment you added to max_parallel_hazard_walker
> claims that some of the code introduced there is to compensate for an
> unspecified bug in the rewriter. I'm a bit skeptical that the comment
> is correct, and there's no way to find out because the comment doesn't
> say what the bug supposedly is, but let's just say for the sake of
> argument that it's true. Well, you *could* have fixed the bug, but
> instead you hacked around it, and in a relatively expensive way that
> affects every query with a CTE in it whether it can benefit from this
> patch or not. That's not a responsible way of maintaining the core
> PostgreSQL code.

It'd be too sad if we had to be bothered by an existing bug and give up
an attractive feature.  Would adding more explanation in the comment be
OK?  Anyway, I think we can treat this as a separate issue.


> I also agree with Andres's criticism of the code in
> target_rel_index_max_parallel_hazard entirely. It's completely
> unacceptable to be doing index_open() here. If you don't understand
> the design of the planner well enough to know why that's not OK, then
> you shouldn't be committing patches like this.

This sounds like something to address.  I have to learn...


    Regards
Takayuki Tsunakawa


Re: pgsql: Add a new GUC and a reloption to enable inserts in parallel-mode

From: Greg Nancarrow
On Wed, Mar 24, 2021 at 5:44 AM Andres Freund <andres@anarazel.de> wrote:
>
>
> But looking more, several of the checks just seem wrong to me.
>
> target_rel_index_max_parallel_hazard() deparses index expressions from
> scratch? With code like
>
> +        index_rel = index_open(index_oid, lockmode);
> ...
> +    index_oid_list = RelationGetIndexList(rel);
> +    foreach(lc, index_oid_list)
> ...
> +        ii_Expressions = RelationGetIndexExpressions(index_rel);
> ...
>
> +                    Assert(index_expr_item != NULL);
> +                    if (index_expr_item == NULL)    /* shouldn't happen */
> +                    {
> +                        elog(WARNING, "too few entries in indexprs list");
> +                        context->max_hazard = PROPARALLEL_UNSAFE;
> +                        found_max_hazard = true;
> +                        break;
> +                    }
>
> Brrr.
>
> Shouldn't we have this in IndexOptInfo already?

The additional parallel-safety checks are (at least currently) invoked
as part of max_parallel_hazard(), which is called early on in the
planner, so I don't believe that IndexOptInfo/RelOptInfo has been
built at this point.

> But also, why on earth
> is that WARNING branch a good idea?
>

I think there are about half a dozen other places in the Postgres code
where the same check is done, in which case it calls elog(ERROR,...).
Is it a better strategy to immediately exit the backend with an error
in this case, like the other cases do?
My take on it was that if this "should never happen" condition is ever
encountered, let it back out of the parallel-safety checks and
error-out in the place it normally (currently) would.

>
> +static bool
> +target_rel_domain_max_parallel_hazard(Oid typid, max_parallel_hazard_context *context)
> ...
> +    scan = systable_beginscan(con_rel, ConstraintTypidIndexId, true,
> +                              NULL, 1, key);
> +
> +    while (HeapTupleIsValid((tup = systable_getnext(scan))))
>
> There's plenty other code in the planner that needs to know about
> domains. This stuff is precisely why the typecache exists.
>
>

OK, fair comment.


Regards,
Greg Nancarrow
Fujitsu Australia



Re: pgsql: Add a new GUC and a reloption to enable inserts in parallel-mode

From: Andres Freund
Hi,

On 2021-03-24 14:42:44 +1100, Greg Nancarrow wrote:
> On Wed, Mar 24, 2021 at 5:44 AM Andres Freund <andres@anarazel.de> wrote:
> > Shouldn't we have this in IndexOptInfo already?
> 
> The additional parallel-safety checks are (at least currently) invoked
> as part of max_parallel_hazard(), which is called early on in the
> planner, so I don't believe that IndexOptInfo/RelOptInfo has been
> built at this point.

Then that's something you need to redesign, not duplicate the effort.


> > But also, why on earth
> > is that WARNING branch a good idea?

> I think there are about half a dozen other places in the Postgres code
> where the same check is done, in which case it calls elog(ERROR,...).
> Is it a better strategy to immediately exit the backend with an error
> in this case, like the other cases do?

Yes.


> My take on it was that if this "should never happen" condition is ever
> encountered, let it back out of the parallel-safety checks and
> error-out in the place it normally (currently) would.

What advantage does that have? You'll get a bunch of WARNINGs before the
ERROR later in optimize, differences between assert-non/assert builds,
more complicated code flow, longer untested code.

Greetings,

Andres Freund



Robert Haas <robertmhaas@gmail.com> writes:
> On Tue, Mar 23, 2021 at 3:13 PM Andres Freund <andres@anarazel.de> wrote:
>> You cache it.

> Yeah, exactly. I don't think it's super-easy to understand exactly how
> to make that work well for something like this. It would be easy
> enough to set a flag in the relcache whose value is computed the first
> time we need it and is then consulted every time after that, and you
> just invalidate it based on sinval messages. But, if you go with that
> design, you've got a big problem: now an insert has to lock all the
> tables in the partitioning hierarchy to decide whether it can run in
> parallel or not, and we do not want that.

Possibly-crazy late-night idea ahead:

IIUC, we need to know a global property of a partitioning hierarchy:
is every trigger, CHECK constraint, etc that might be run by an INSERT
parallel-safe?  What we see here is that reverse-engineering that
property every time we need to know it is just too expensive, even
with use of our available caching methods.

How about a declarative approach instead?  That is, if a user would
like parallelized inserts into a partitioned table, she must declare
the table parallel-safe with some suitable annotation.  Then, checking
the property during DML is next door to free, and instead we have to think
about whether and how to enforce that the marking is valid during DDL.

I don't honestly see a real cheap way to enforce such a property.
For instance, if someone does ALTER FUNCTION to remove a function's
parallel-safe marking, we can't really run around and verify that the
function is not used in any CHECK constraint.  (Aside from the cost,
there would be race conditions.)

But maybe we don't have to enforce it exactly.  It could be on the
user's head that the marking is accurate.  We could prevent any
really bad misbehavior by having parallel workers error out if they
see they've been asked to execute a non-parallel-safe function.

Or there are probably other ways to slice it up.  But I think some
outside-the-box thinking might be helpful here.

            regards, tom lane
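
One way the declarative approach above could surface to users, as a
purely hypothetical sketch (no such option exists; the name
`parallel_dml_safe` and the table/function names are invented for
illustration):

```sql
-- The user asserts that every trigger, CHECK constraint, index
-- expression, etc. reachable from this partitioned table is
-- parallel safe; the planner would then trust the marking instead
-- of walking all partitions.
ALTER TABLE measurement SET (parallel_dml_safe = true);

-- Enforcement stays cheap and approximate: a parallel worker would
-- error out at runtime if it were asked to execute a function that
-- is actually marked PARALLEL UNSAFE, e.g. after:
ALTER FUNCTION measurement_audit_trg() PARALLEL UNSAFE;
```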



Re: pgsql: Add a new GUC and a reloption to enable inserts in parallel-mode

From: Amit Kapila
On Wed, Mar 24, 2021 at 8:28 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Mar 24, 2021 at 2:00 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> >
> > On 2021-Mar-23, Robert Haas wrote:
> >
> > > Likewise, the XXX comment you added to max_parallel_hazard_walker
> > > claims that some of the code introduced there is to compensate for an
> > > unspecified bug in the rewriter. I'm a bit skeptical that the comment
> > > is correct, and there's no way to find out because the comment doesn't
> > > say what the bug supposedly is, but let's just say for the sake of
> > > argument that it's true. Well, you *could* have fixed the bug, but
> > > instead you hacked around it, and in a relatively expensive way that
> > > affects every query with a CTE in it whether it can benefit from this
> > > patch or not. That's not a responsible way of maintaining the core
> > > PostgreSQL code.
> >
> > I think the CTE bug is this one:
> >
> > https://www.postgresql.org/message-id/flat/CAJcOf-fAdj=nDKMsRhQzndm-O13NY4dL6xGcEvdX5Xvbbi0V7g@mail.gmail.com
> >
> > while I can't disagree with the overall conclusion that it seems safer
> > to revert parallel INSERT/SELECT given the number of alleged problems,
> >
>
> Agreed. I'll revert the patch and respond to some of the points here
> with my thoughts.
>

BTW, I think the revert needs a catversion bump, as commit
c5be48f092016b1caf597b2e21d588b56c88a23e changed the catalog. I hope
that is fine, but if not, do let me know.

-- 
With Regards,
Amit Kapila.



Re: pgsql: Add a new GUC and a reloption to enable inserts in parallel-mode

From
Andres Freund
Date:
Hi,

On 2021-03-23 16:04:59 -0400, Robert Haas wrote:
> It would be easy enough to set a flag in the relcache whose value is
> computed the first time we need it and is then consulted every time
> after that, and you just invalidate it based on sinval messages. But,
> if you go with that design, you've got a big problem: now an insert
> has to lock all the tables in the partitioning hierarchy to decide
> whether it can run in parallel or not, and we do not want that. We
> want to be able to just lock the partitions we need, so really, we
> want to be able to test for parallel-safety without requiring a
> relation lock, or only requiring it on the partitioned table itself
> and not all the partitions.

That seems like an argument for a pg_class attribute about parallel
safety of the current relation *and* its children. It'd definitely mean
recursing higher in the partition tree during DDL if the action on a
child partition causes the safety to flip.


> But if I had thought about it and had not come up with anything better
> than what you committed here, I would have committed nothing, and I
> think that's what you should have done, too.

I agree with that.


> Just to take one example, consider the code that forces a transaction
> ID assignment prior to the operation. You *could* have solved this
> problem by introducing new machinery to make it safe to assign an XID
> in parallel mode. Then, we'd have a fundamental new capability that we
> currently lack. Instead, you just force-assigned an XID before
> entering parallel mode. That's not building any new capability; that's
> just hacking around the lack of a capability to make something work,
> while ignoring the disadvantages of doing it that way, namely that
> sometimes an XID will be assigned for no purpose.

Although this specific hack doesn't seem too terrible to me. If you
execute a parallel insert the likelihood to end up not needing an xid is
pretty low. Implementing it concurrently does seem like it'd end up
needing another lwlock nested around xid assignment, or some more
complicated scheme with already holding XidGenLock or retries. But maybe
I'm missing an easy solution here.

Greetings,

Andres Freund



Amit Kapila <amit.kapila16@gmail.com> writes:
> BTW, I think the revert needs a catversion bump, as commit
> c5be48f092016b1caf597b2e21d588b56c88a23e changed the catalog. I hope
> that is fine, but if not, do let me know.

Yeah, just advance catversion another time.

            regards, tom lane



RE: pgsql: Add a new GUC and a reloption to enable inserts in parallel-mode

From
"tsunakawa.takay@fujitsu.com"
Date:
From: Tom Lane <tgl@sss.pgh.pa.us>
> Possibly-crazy late-night idea ahead:
>
> IIUC, we need to know a global property of a partitioning hierarchy:
> is every trigger, CHECK constraint, etc that might be run by an INSERT
> parallel-safe?  What we see here is that reverse-engineering that
> property every time we need to know it is just too expensive, even
> with use of our available caching methods.
>
> How about a declarative approach instead?  That is, if a user would
> like parallelized inserts into a partitioned table, she must declare
> the table parallel-safe with some suitable annotation.  Then, checking
> the property during DML is next door to free, and instead we have to think
> about whether and how to enforce that the marking is valid during DDL.
>
> I don't honestly see a real cheap way to enforce such a property.
> For instance, if someone does ALTER FUNCTION to remove a function's
> parallel-safe marking, we can't really run around and verify that the
> function is not used in any CHECK constraint.  (Aside from the cost,
> there would be race conditions.)

I had thought of a similar idea, quoted below, though I was most reluctant to adopt it.


https://www.postgresql.org/message-id/TYAPR01MB29907AE025B60A1C2CA5F08DFEA70%40TYAPR01MB2990.jpnprd01.prod.outlook.com
--------------------------------------------------
(3) Record the parallel safety in system catalog
Add a column like relparallel in pg_class that indicates the parallel safety of the relation.  The planner just checks
the value instead of doing heavy work for every SQL statement.  That column's value is modified whenever a relation
alteration is made that affects the parallel safety, such as adding a domain column or a CHECK constraint.  In case of
a partitioned relation, the parallel safety attributes of all its descendant relations are merged.  For example, if a
partition becomes parallel-unsafe, the ascendant partitioned tables also become parallel-unsafe.



But... developing such code would be burdensome and bug-prone?
--------------------------------------------------


> But maybe we don't have to enforce it exactly.  It could be on the
> user's head that the marking is accurate.  We could prevent any
> really bad misbehavior by having parallel workers error out if they
> see they've been asked to execute a non-parallel-safe function.

I'm wondering if we can do as I mentioned yesterday: the parallel worker delegates the work to the parallel leader
when the target relation or related functions are not parallel-safe.


    Regards
Takayuki     Tsunakawa





Re: pgsql: Add a new GUC and a reloption to enable inserts in parallel-mode

From
Robert Haas
Date:
On Tue, Mar 23, 2021 at 11:13 PM tsunakawa.takay@fujitsu.com
<tsunakawa.takay@fujitsu.com> wrote:
> One problem with caching the result is that the first access in each session has to experience the slow processing.
> Some demanding customers of our proprietary database, which is not based on Postgres, have requested eliminating even
> the overhead associated with the first access, and we have provided features for them.  As for the data file, users
> can use pg_prewarm.  But what can we recommend users do in this case?  Maybe the logon trigger feature, which is
> ready for committer in PG 14, could be used to let users execute likely queries at session start (or when
> establishing a connection pool), but I feel it's inconvenient.

Well, I don't mind if somebody thinks up an even better solution.

> Regarding the picked xid assignment, I didn't think it's so grotty.  Yes, in fact, I felt it's a bit unclean.  But
> it's only a single line of code.  With a single line of code, we can provide great value to users.  Why don't we go
> for it?  As discussed in the thread, the xid is wasted only when the source data is empty, which is unlikely given
> that the user wants to load much data, probably for ETL.

The amount of code isn't the issue. I'd rather expend a little more
code and solve the problem in a better way.

> (I'm afraid "grotty" may be too strong a word, considering the CoC statement "We encourage thoughtful, constructive
> discussion of the software and this community, their current state, and possible directions for development. The
> focus of our discussions should be the code and related technology, community projects, and infrastructure.")

I did not mean to give offense, but I also don't think grotty is a
strong word. I consider it a pretty mild word.

> > Likewise, the XXX comment you added to max_parallel_hazard_walker
> > claims that some of the code introduced there is to compensate for an
> > unspecified bug in the rewriter. I'm a bit skeptical that the comment
> > is correct, and there's no way to find out because the comment doesn't
> > say what the bug supposedly is, but let's just say for the sake of
> > argument that it's true. Well, you *could* have fixed the bug, but
> > instead you hacked around it, and in a relatively expensive way that
> > affects every query with a CTE in it whether it can benefit from this
> > patch or not. That's not a responsible way of maintaining the core
> > PostgreSQL code.
>
> It'd be too sad if we had to be bothered by an existing bug and give up an attractive feature.  Would adding more
> explanation in the comment be OK?  Anyway, I think we can separate this issue.

I don't think I agree. These checks are adding a significant amount of
overhead, and one of the problems with this whole thing is that it
adds a lot of overhead.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: pgsql: Add a new GUC and a reloption to enable inserts in parallel-mode

From
Robert Haas
Date:
On Wed, Mar 24, 2021 at 12:30 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> How about a declarative approach instead?  That is, if a user would
> like parallelized inserts into a partitioned table, she must declare
> the table parallel-safe with some suitable annotation.  Then, checking
> the property during DML is next door to free, and instead we have to think
> about whether and how to enforce that the marking is valid during DDL.
>
> I don't honestly see a real cheap way to enforce such a property.
> For instance, if someone does ALTER FUNCTION to remove a function's
> parallel-safe marking, we can't really run around and verify that the
> function is not used in any CHECK constraint.  (Aside from the cost,
> there would be race conditions.)
>
> But maybe we don't have to enforce it exactly.  It could be on the
> user's head that the marking is accurate.  We could prevent any
> really bad misbehavior by having parallel workers error out if they
> see they've been asked to execute a non-parallel-safe function.
>
> Or there are probably other ways to slice it up.  But I think some
> outside-the-box thinking might be helpful here.

It's a possibility. It's a bit different from my decision to mark
functions as PARALLEL SAFE/RESTRICTED/UNSAFE because, if you wanted to
deduce that without an explicit marking, you'd need to solve the
halting problem, which I've heard is fairly difficult. In this case,
though, the problem is solvable with a linear-time algorithm. It's
possible that there is nothing better, but I'm not sure.

One idea would be to try to cache some state in shared memory. That
wouldn't work for an unbounded number of relations, at least not
unless we used DSA, but you could have a hash table with a
configurable number of slots and make the default big enough that it
would bother few people in practice. There might be some other details
about partitioned relations that would be useful to cache, too.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: pgsql: Add a new GUC and a reloption to enable inserts in parallel-mode

From
Robert Haas
Date:
On Wed, Mar 24, 2021 at 12:48 AM Andres Freund <andres@anarazel.de> wrote:
> That seems like an argument for a pg_class attribute about parallel
> safety of the current relation *and* its children. It'd definitely mean
> recursing higher in the partition tree during DDL if the action on a
> child partition causes the safety to flip.

That could be an issue, both because locking parents is not very nice
from a concurrency standpoint, and also because it introduces deadlock
hazards.

> Although this specific hack doesn't seem too terrible to me. If you
> execute a parallel insert the likelihood to end up not needing an xid is
> pretty low. Implementing it concurrently does seem like it'd end up
> needing another lwlock nested around xid assignment, or some more
> complicated scheme with already holding XidGenLock or retries. But maybe
> I'm missing an easy solution here.

I don't think you need to do anything that is known outside the group
of processes involved in the parallel query. I think you just need to
make sure that only one of them is trying to acquire an XID at a time,
and that all the others find out about it. I haven't thought too hard
about the timing: if one process acquires an XID for the transaction,
is it OK if the others do an arbitrary amount of work before they
realize that this has happened? Also, there's the problem that the
leader has the whole transaction stack and the workers don't, so the
recursive nature of XID acquisition is a problem. I suspect these are
all pretty solvable problems; I just haven't put in the energy. But,
it could also be that I'm missing something.

I think we should be trying, though. Some of the limitations of
parallel query are unavoidable: there's always going to be a certain
amount of parallel-unsafe stuff out there, and we just have to not use
parallelism in those cases. But, some things - and I think this is
probably one of them - are just limitations of the current
implementation, and we should be looking to fix those. If we just
accept that the infrastructure limitations are what they are and skate
around them to make a few more things work, we're going to get less
and less real improvement from every new project, and any future
infrastructure improvements that somebody does want to make are going
to have to deal with all the odd special cases we've introduced to get
those improvements.

Now, just to be clear, I'm completely in favor of incremental
improvements in cases where that can be done without too much
ugliness. There have been some great examples of that with parallel
query already, like the work David Rowley did to allow parallel
sequential scans to allocate multiple blocks at a time, or, looking
back further, his work on parallel aggregate. I'm not saying those
projects were easy; I know they were hard. But, they could basically
use the infrastructure that had been created by previous commits to
new stuff in a way that was reasonably localized. But, when Thomas
Munro worked on parallel hash join, he had to first invent a whole new
way of allocating memory. Trying to do that project without inventing
a new way to allocate memory would have been a really bad decision. It
would have boxed us into all sorts of unpleasant corners where a lot
of stuff didn't really work and finding any further improvements was
hard. But, because he did invent that new way, it can now be used for
lots of other things, and already has been. That infrastructure work
made future projects *easier*. And one of the big problems that I have
with this feature is that, as it seems to me right now, there wasn't
any real new infrastructure built, even though doing this right really
requires it.

I think there is room for debate about whether the particular thing
that I'm complaining about here is mandatory for this feature,
although I am strongly of the position that it should be tried. We might
consider forcing an XID assignment to be acceptable for a bulk-insert
case, but what happens when somebody wants to extend this to updates
and deletes? There's going to be a strong temptation to just double
down on the same design, and that's a very dangerous direction in my
opinion. But, even if you don't think that this *particular*
infrastructure improvement ought to be the job of this patch, I think
this patch clearly needed to do more infrastructure improvement than
it did. Your comments about the IndexOptInfo stuff are another
example, and the rewriter bug is a third.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



RE: pgsql: Add a new GUC and a reloption to enable inserts in parallel-mode

From
"tsunakawa.takay@fujitsu.com"
Date:
From: Robert Haas <robertmhaas@gmail.com>
> The amount of code isn't the issue. I'd rather expend a little more
> code and solve the problem in a better way.

Reading your reply to Andres-san, I feel sympathy with your attitude.  Maybe we should outline the (rough) design
first, discuss/guess its complexity, and ask for opinions on whether it's worth expending our effort on that complexity
or choosing an easier hack, considering the assumed use case and ensuring future extensibility so as not to prevent
smooth enhancements.

With that said, I think the easy hack this time is good, because parallel INSERT/UPDATE/DELETE will only be used
infrequently (compared to short OLTP transactions), for data migration and batch processing where the source data is
ample.  I can't yet think of a use case where we would want parallel workers to allocate an XID only when necessary.
 


> I did not mean to give offense, but I also don't think grotty is a
> strong word. I consider it a pretty mild word.

Thank you for telling me the impression of the word.  TBH, this was the first time I had encountered the word, and I
looked it up in an English-Japanese dictionary.  It didn't give a good impression, and "grotty" sounds somewhat similar
to "grotesque", so I got worried that Robert-san might be angry.
 


    Regards
Takayuki     Tsunakawa
                        


Re: pgsql: Add a new GUC and a reloption to enable inserts in parallel-mode

From
Amit Kapila
Date:
On Wed, Mar 24, 2021 at 5:45 PM Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Wed, Mar 24, 2021 at 12:30 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > How about a declarative approach instead?  That is, if a user would
> > like parallelized inserts into a partitioned table, she must declare
> > the table parallel-safe with some suitable annotation.  Then, checking
> > the property during DML is next door to free, and instead we have to think
> > about whether and how to enforce that the marking is valid during DDL.
> >
> > I don't honestly see a real cheap way to enforce such a property.
> > For instance, if someone does ALTER FUNCTION to remove a function's
> > parallel-safe marking, we can't really run around and verify that the
> > function is not used in any CHECK constraint.  (Aside from the cost,
> > there would be race conditions.)
> >
> > But maybe we don't have to enforce it exactly.  It could be on the
> > user's head that the marking is accurate.  We could prevent any
> > really bad misbehavior by having parallel workers error out if they
> > see they've been asked to execute a non-parallel-safe function.
> >

If we want to do something like this, then we might want to provide a
function is_dml_rel_parallel_safe(relid/relname) (a better name could
be chosen) where we check all the global properties of the relation
and report whether it is safe to perform DML on it, additionally
reporting the unsafe property if it is not. This will provide a way
for users to either update the parallel-safe property of a relation or
do something about the parallel-unsafe property of the relation.

> > Or there are probably other ways to slice it up.  But I think some
> > outside-the-box thinking might be helpful here.
>
> It's a possibility. It's a bit different from my decision to mark
> functions as PARALLEL SAFE/RESTRICTED/UNSAFE because, if you wanted to
> deduce that without an explicit marking, you'd need to solve the
> halting problem, which I've heard is fairly difficult. In this case,
> though, the problem is solvable with a linear-time algorithm. It's
> possible that there is nothing better, but I'm not sure.
>
> One idea would be to try to cache some state in shared memory. That
> wouldn't work for an unbounded number of relations, at least not
> unless we used DSA, but you could have a hash table with a
> configurable number of slots and make the default big enough that it
> would bother few people in practice. There might be some other details
> about partitioned relations that would be useful to cache, too.
>

Wouldn't we need to invalidate the hash entries as soon as something
parallel-unsafe is associated with them? If so, how is this better
than setting a flag in relcache?

-- 
With Regards,
Amit Kapila.



Re: pgsql: Add a new GUC and a reloption to enable inserts in parallel-mode

From
Robert Haas
Date:
On Thu, Mar 25, 2021 at 5:08 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> Wouldn't we need to invalidate the hash entries as soon as something
> parallel-unsafe is associated with them?

Yes.

> If so, how is this better
> than setting a flag in relcache?

You can't legally access a flag in the relcache without taking a
relation lock. If it's possible to avoid that requirement by doing
this some other way, it would be a big win. I'm not sure whether it is
or exactly what would be involved, but relying on the relcache/sinval
interaction surely won't work.

Also, it would address the concern Takayuki-san raised about having to
recompute this in each session. I don't think it would be worth doing
for that reason alone, but it could be a side benefit.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: pgsql: Add a new GUC and a reloption to enable inserts in parallel-mode

From
Amit Kapila
Date:
On Thu, Mar 25, 2021 at 11:45 PM Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Thu, Mar 25, 2021 at 5:08 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > Wouldn't we need to invalidate the hash entries as soon as something
> > parallel-unsafe is associated with them?
>
> Yes.
>
> > If so, how is this better
> > than setting a flag in relcache?
>
> You can't legally access a flag in the relcache without taking a
> relation lock.
>

But won't some form of lock be required for each rel entry in the hash
table as well, for the same duration as is required for the rel?
Because otherwise, while we are processing the statement or other
relations in the query, something parallel-unsafe could be attached to
the corresponding rel entry in the hash table. And, I feel some
concurrency bottleneck might occur because DDL/DML needs to access
this table at the same time.

-- 
With Regards,
Amit Kapila.



Re: pgsql: Add a new GUC and a reloption to enable inserts in parallel-mode

From
Amit Kapila
Date:
On Wed, Mar 24, 2021 at 12:14 AM Andres Freund <andres@anarazel.de> wrote:
>
> On 2021-03-22 08:47:47 -0400, Robert Haas wrote:
> > I find this fairly ugly. If you can't make the cost of checking
> > whether parallelism is safe low enough that you don't need a setting
> > for this, then I think perhaps you shouldn't have the feature at all.
> > In other words, I propose that you revert both this and 05c8482f7f and
> > come back when you have a better design that doesn't introduce so much
> > overhead.
>
> I started out wondering whether some of the caching David Rowley has been
> working on to reduce the overhead of the result cache planning shows a
> path for how to make parts of this cheaper.
> https://postgr.es/m/CAApHDvpw5qG4vTb%2BhwhmAJRid6QF%2BR9vvtOhX2HN2Yy9%2Bw20sw%40mail.gmail.com
>
> But looking more, several of the checks just seem wrong to me.
>
> target_rel_index_max_parallel_hazard() deparses index expressions from
> scratch? With code like
>
> +        index_rel = index_open(index_oid, lockmode);
> ...
> +    index_oid_list = RelationGetIndexList(rel);
> +    foreach(lc, index_oid_list)
> ...
> +        ii_Expressions = RelationGetIndexExpressions(index_rel);
> ...
>
> +                    Assert(index_expr_item != NULL);
> +                    if (index_expr_item == NULL)    /* shouldn't happen */
> +                    {
> +                        elog(WARNING, "too few entries in indexprs list");
> +                        context->max_hazard = PROPARALLEL_UNSAFE;
> +                        found_max_hazard = true;
> +                        break;
> +                    }
>
> Brrr.
>
> Shouldn't we have this in IndexOptInfo already?
>

No, because I think we don't build IndexOptInfo for target tables
(tables into which the insert is performed). It is only built for
tables being scanned. However, I think here we could cache
parallel-safety info in the index rel descriptor and use it next time.
That will avoid checking the expressions each time. Do you have
something else in mind?

-- 
With Regards,
Amit Kapila.



RE: pgsql: Add a new GUC and a reloption to enable inserts in parallel-mode

From
"tsunakawa.takay@fujitsu.com"
Date:
From: Robert Haas <robertmhaas@gmail.com>
> On Wed, Mar 24, 2021 at 12:48 AM Andres Freund <andres@anarazel.de>
> wrote:
> > Although this specific hack doesn't seem too terrible to me. If you
> > execute a parallel insert the likelihood to end up not needing an xid is
> > pretty low. Implementing it concurrently does seem like it'd end up
> > needing another lwlock nested around xid assignment, or some more
> > complicated scheme with already holding XidGenLock or retries. But maybe
> > I'm missing an easy solution here.
> 
> I don't think you need to do anything that is known outside the group
> of processes involved in the parallel query. I think you just need to
> make sure that only one of them is trying to acquire an XID at a time,
> and that all the others find out about it. I haven't thought too hard
> about the timing: if one process acquires an XID for the transaction,
> is it OK if the others do an arbitrary amount of work before they
> realize that this has happened? Also, there's the problem that the
> leader has the whole transaction stack and the workers don't, so the
> recursive nature of XID acquisition is a problem. I suspect these are
> all pretty solvable problems; I just haven't put in the energy. But,
> it could also be that I'm missing something.

It doesn't seem easy to make parallel workers allocate an XID and share it among the parallel processes.  When the DML
is run inside a deeply nested subtransaction and the parent transactions have not allocated their XIDs yet, the worker
needs to allocate the XIDs for its parents.  That indeterminate number of XIDs must be stored in shared memory.  The
stack of TransactionState structures must also be passed.  Also, TransactionIdIsCurrentTransactionId() uses an array,
ParallelCurrentXids, where parallel workers receive sub-committed XIDs from the leader.  This needs to be reconsidered.
 

Before that, I don't see the need for parallel workers to allocate the XID.  As the following Oracle manual says,
parallel DML will be used in data analytics and OLTP batch jobs.  There should be plenty of source data in those
scenarios.


When to Use Parallel DML

https://docs.oracle.com/en/database/oracle/oracle-database/21/vldbg/types-parallelism.html#GUID-18B2AF09-C548-48DE-A794-86224111549F
--------------------------------------------------
Several scenarios where parallel DML is used include:

Refreshing Tables in a Data Warehouse System

Creating Intermediate Summary Tables

Using Scoring Tables

Updating Historical Tables

Running Batch Jobs
--------------------------------------------------


I don't mean to say we want to use the easy hack because we want to be lazy.  I'd like to know whether we *really*
need the effort.  And I want PostgreSQL to provide great, competitive features as early as possible without messing up
the design and code.

For what kind of realistic, conceivable scenarios do we need a sophisticated XID assignment mechanism in parallel
workers?


Regards
Takayuki Tsunakawa



Re: pgsql: Add a new GUC and a reloption to enable inserts in parallel-mode

From
Amit Kapila
Date:
On Wed, Mar 24, 2021 at 6:18 PM Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Wed, Mar 24, 2021 at 12:48 AM Andres Freund <andres@anarazel.de> wrote:
>
> > Although this specific hack doesn't seem too terrible to me. If you
> > execute a parallel insert the likelihood to end up not needing an xid is
> > pretty low. Implementing it concurrently does seem like it'd end up
> > needing another lwlock nested around xid assignment, or some more
> > complicated scheme with already holding XidGenLock or retries. But maybe
> > I'm missing an easy solution here.
>
> I don't think you need to do anything that is known outside the group
> of processes involved in the parallel query. I think you just need to
> make sure that only one of them is trying to acquire an XID at a time,
> and that all the others find out about it. I haven't thought too hard
> about the timing: if one process acquires an XID for the transaction,
> is it OK if the others do an arbitrary amount of work before they
> realize that this has happened?
>

A naive question about this scheme: What if the worker that acquires
the XID writes some row, and another worker reads that row before it
gets to see the XID information? I think it won't treat such a row as
written by its own transaction. Won't such a scheme lead to different
behavior than serial inserts, or than when we have acquired the XID
before starting the parallel operation?

-- 
With Regards,
Amit Kapila.



Re: pgsql: Add a new GUC and a reloption to enable inserts in parallel-mode

From
Robert Haas
Date:
On Mon, Mar 29, 2021 at 6:01 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> A naive question about this scheme: What if the worker that acquires
> the XID writes some row, and another worker reads that row before it
> gets to see the XID information? I think it won't treat such a row as
> written by its own transaction. Won't such a scheme lead to different
> behavior than serial inserts, or than when we have acquired the XID
> before starting the parallel operation?

Well, this is the sort of thing that somebody would need to analyze as
part of implementing something like this, but off hand I don't quite
see what the problem is.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: pgsql: Add a new GUC and a reloption to enable inserts in parallel-mode

From
Robert Haas
Date:
On Thu, Mar 25, 2021 at 11:59 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> But won't some form of lock be required for each rel entry in the hash
> table as well, for the same duration as is required for the rel?
> Because otherwise, while we are processing the statement or other
> relations in the query, something parallel-unsafe could be attached to
> the corresponding rel entry in the hash table. And, I feel some
> concurrency bottleneck might occur because DDL/DML needs to access
> this table at the same time.

Hmm, yeah. I hadn't thought of that, but you're right: it's not good
enough to determine that there's no problem at the time we start the
query, because somebody could create a problem for a partition we
haven't yet locked before we get around to locking it. This is
something that really deserves a separate discussion rather than being
buried on a thread on -committers with a bunch of other topics, but in
my opinion we probably shouldn't choose to handle this by adding more
locking. The need to lock everything all the time any time we do
anything is part of what sucks in this whole area, and we don't want
to just keep propagating that. I don't know whether it would be OK to
just document that if you concurrently add parallel-unsafe stuff to a
table which is a partition while parallel inserts are going on, you
might cause the insert to error out, but I don't think it's a
completely crazy idea.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: pgsql: Add a new GUC and a reloption to enable inserts in parallel-mode

From
Amit Kapila
Date:
On Tue, Mar 30, 2021 at 12:59 AM Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Thu, Mar 25, 2021 at 11:59 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > But won't some form of lock be required for each rel entry in the hash
> > table as well, for the same duration as is required for the rel?
> > Because otherwise, while we are processing the statement or other
> > relations in the query, something parallel-unsafe could be attached to
> > the corresponding rel entry in the hash table. And, I feel some
> > concurrency bottleneck might occur because DDL/DML needs to access
> > this table at the same time.
>
> Hmm, yeah. I hadn't thought of that, but you're right: it's not good
> enough to determine that there's no problem at the time we start the
> query, because somebody could create a problem for a partition we
> haven't yet locked before we get around to locking it. This is
> something that really deserves a separate discussion rather than being
> buried on a thread on -committers with a bunch of other topics,
>

Agreed. We will summarize all the discussion and ideas, either by
starting a new thread (or threads) or on the existing thread on
-hackers. Some time back, I started a thread [1] intending to discuss
just this part of the puzzle, but at that time it didn't get as much
attention as it needed; I am hoping the situation will improve this
time.

> but in
> my opinion we probably shouldn't choose to handle this by adding more
> locking. The need to lock everything all the time any time we do
> anything is part of what sucks in this whole area, and we don't want
> to just keep propagating that. I don't know whether it would be OK to
> just document that if you concurrently add parallel-unsafe stuff to a
> table which is a partition while parallel inserts are going on, you
> might cause the insert to error out, but I don't think it's a
> completely crazy idea.
>

Yeah, this idea has merit, and to some extent the same is true for
what Tom, Andres, and Tsunakawa-San are proposing: keeping a flag in
pg_class in some way, and/or having a declarative syntax for
relation-level parallel-safety. Doing things along those lines might
remove the complexity of locking partitions, and maybe users also
won't mind occasionally getting an error because a parallel-unsafe
clause was concurrently added to one of the relations. I guess we can
mitigate users' concerns to some degree by providing a function like
is_dml_rel_parallel_safe(relid/relname) with which they can check all
the global properties of a relation and learn whether it is safe to
perform DML on it; additionally, we can report the unsafe property
when it is unsafe.

[1] - https://www.postgresql.org/message-id/CAA4eK1K-cW7svLC2D7DHoGHxdAdg3P37BLgebqBOC2ZLc9a6QQ%40mail.gmail.com
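A rough sketch of how the proposed is_dml_rel_parallel_safe() might behave, modeled in plain Python rather than in the server (the Relation shape, property names, and safety markings below are all invented for illustration; only the function name comes from the mail above):

```python
# Illustrative model (not PostgreSQL source) of the proposed
# is_dml_rel_parallel_safe() check: walk a relation's properties that
# matter for parallel DML and report the first unsafe one, if any.
from dataclasses import dataclass, field

PARALLEL_SAFE, PARALLEL_RESTRICTED, PARALLEL_UNSAFE = "s", "r", "u"

@dataclass
class Relation:
    name: str
    # (property description, parallel-safety marking) pairs; in the server
    # these would come from triggers, index expressions, column defaults,
    # and check constraints.
    properties: list = field(default_factory=list)
    partitions: list = field(default_factory=list)

def is_dml_rel_parallel_safe(rel):
    """Return (safe, reason), recursing into partitions the way the
    parallel-safety checks from commit 05c8482f7f must."""
    for desc, safety in rel.properties:
        if safety == PARALLEL_UNSAFE:
            return False, f"{rel.name}: {desc} is parallel unsafe"
    for part in rel.partitions:
        ok, why = is_dml_rel_parallel_safe(part)
        if not ok:
            return False, why
    return True, None

part = Relation("orders_2021",
                properties=[("trigger trg_audit()", PARALLEL_UNSAFE)])
root = Relation("orders",
                properties=[("default nextval()", PARALLEL_SAFE)],
                partitions=[part])
print(is_dml_rel_parallel_safe(root))
```

The recursion over partitions is also where the cost discussed in this thread comes from: with many partitions, every one must be visited even when the verdict ends up being "unsafe" because of the very first trigger.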

-- 
With Regards,
Amit Kapila.



Re: pgsql: Add a new GUC and a reloption to enable inserts in parallel-mode

From
Amit Kapila
Date:
On Tue, Mar 30, 2021 at 12:43 AM Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Mon, Mar 29, 2021 at 6:01 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > A naive question about this scheme: What if the worker that acquires
> > the XID writes some row and another worker reads that row before it
> > gets to see the XID information? I think it won't treat such a row as
> > written by its own transaction. Won't such a scheme lead to different
> > behavior than serial inserts, or than when we have acquired the XID
> > before starting the parallel operation?
>
> Well, this is the sort of thing that somebody would need to analyze as
> part of implementing something like this, but off hand I don't quite
> see what the problem is.
>

Say the worker (w-1) that acquires the XID (501) deletes the tuple
(CTID: 0, 2). Now, if another worker (w-2) reads that tuple (CTID: 0,
2), I think it would consider the tuple still visible to its snapshot,
whereas if w-2 knew that 501 is its own XID, it would have considered
the tuple deleted (not visible). If I am not missing anything here, I
think this can happen when multiple updates to the same row occur and
new row versions get added to a new page.

Even if there are no hazards with the new XID machinery, I am not sure
we want to treat it as a prerequisite for enabling inserts in the
parallel-SELECT work. We might want to focus on partitioning and other
caching-related issues when reworking this implementation, and
consider the new XID-sharing machinery for actual parallel
Inserts/Updates/Deletes if we think it is important for enabling those
parallel operations.

-- 
With Regards,
Amit Kapila.



Re: pgsql: Add a new GUC and a reloption to enable inserts in parallel-mode

From
Amit Kapila
Date:
On Fri, Mar 26, 2021 at 11:07 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Mar 24, 2021 at 12:14 AM Andres Freund <andres@anarazel.de> wrote:
> >
> >
> > Shouldn't we have this in IndexOptInfo already?
> >
>
> No, because I think we don't build IndexOptInfo for target tables
> (the tables into which the insert is performed); it is only built for
> tables being scanned. However, I think here we could cache the
> parallel-safety info in the index's rel descriptor and use it the
> next time. That would avoid checking the expressions each time.
>

On further investigation, it seems we can't rely on parallel-safety
information cached in the rel descriptor, because we don't invalidate
it if someone changes a function's property, say from safe to unsafe
or vice versa. We need to think of some other idea here; simply
caching it in rel descriptors won't work. I think we need to discuss
this topic further on -hackers.
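The invalidation hazard described above can be seen in a few lines (a Python stand-in, not server code; all names are invented for illustration): once a verdict is cached in the descriptor, a later ALTER FUNCTION ... PARALLEL UNSAFE goes unnoticed.

```python
# Why caching parallel-safety in the relation descriptor is unsafe without
# invalidation: the cached verdict survives a later change to a function's
# parallel-safety marking.

func_safety = {"trg_fn": "safe"}   # stand-in for pg_proc.proparallel
rel_cache = {}                     # stand-in for a cached flag in the rel descriptor

def rel_is_parallel_safe(rel, funcs):
    if rel not in rel_cache:       # computed once, never invalidated
        rel_cache[rel] = all(func_safety[f] == "safe" for f in funcs)
    return rel_cache[rel]

print(rel_is_parallel_safe("t1", ["trg_fn"]))   # True: cached as safe
func_safety["trg_fn"] = "unsafe"                # ALTER FUNCTION trg_fn PARALLEL UNSAFE
print(rel_is_parallel_safe("t1", ["trg_fn"]))   # still True -- stale cache
```

In the server, fixing this would require tying the cached flag to an invalidation mechanism that fires on changes to every function the relation depends on, which is exactly the complication the mail above says needs more discussion.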

-- 
With Regards,
Amit Kapila.



RE: pgsql: Add a new GUC and a reloption to enable inserts in parallel-mode

From
"tsunakawa.takay@fujitsu.com"
Date:
Hello Tom-san, Robert-san, Andres-san, others,


I've just started the following thread.

Parallel INSERT SELECT take 2
https://www.postgresql.org/message-id/TYAPR01MB29905A9AB82CC8BA50AB0F80FE709%40TYAPR01MB2990.jpnprd01.prod.outlook.com

I'm worried that this might not be the right time to ask, because you
might be busy organizing the beta release, but we'd appreciate your
comments and suggestions so that we can set out in a promising
direction. Unfortunately, we couldn't get your attention on this
difficult parallelism topic, and failed to commit it. We'd like to
achieve parallel DML in PG 15. I'd be grateful for your advice, so
that we don't lose time and effort going in the wrong direction.


Regards
Takayuki Tsunakawa


Re: pgsql: Add a new GUC and a reloption to enable inserts in parallel-mode

From
Robert Haas
Date:
On Mon, Mar 29, 2021 at 11:50 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> Say, the worker (w-1) that acquires the XID (501) deletes the tuple
> (CTID: 0, 2). Now, another worker (w-2) reads that tuple (CTID: 0, 2),
> I think it would consider that the tuple is still visible to its
> snapshot but if the w-2 knows that 501 is its own XID, it would have
> been considered it as (not-visible) deleted.

Not unless the command counter had been advanced, which shouldn't be
happening anyway.

I think it's probably fundamentally impossible to allow the command
counter to advance during a parallel operation. The whole point of the
command counter mechanism is that you finish one body of work, then
bump the command counter so that all the changes become visible before
you start doing the next set of things. That doesn't really make sense
in a parallel context, and not just because of any difficulty of
synchronizing the value. It's actually trying to make sure that we
finish X first and only then do Y, which is opposed to what
parallelism is all about.
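The command-counter point can be made concrete with a toy visibility rule (very loosely modeled on PostgreSQL's MVCC checks; the function, values, and XID 501 below are stand-ins for illustration): a transaction's own deletion becomes invisible only once the command counter has advanced past the deleting command, so as long as the counter stays frozen for the whole parallel operation, a worker that knows XID 501 is its own and one that does not reach the same answer.

```python
# Toy visibility rule for a deleted tuple, very loosely modeled on
# PostgreSQL's MVCC checks; names and numbers are invented.
def deleted_tuple_visible(xmax, cmax, my_xids, snapshot_curcid):
    """Is a tuple deleted by transaction `xmax` at command `cmax` still
    visible to a snapshot taken at command id `snapshot_curcid`?"""
    if xmax in my_xids:
        # Our own deletion is hidden from us only once the command
        # counter has advanced past the deleting command.
        return cmax >= snapshot_curcid
    return True  # deleter is in progress per our snapshot: ignore the deletion

CURCID = 0  # command counter stays frozen during the parallel operation
# Worker w-1 (holding XID 501) deleted the tuple at command id 0.
print(deleted_tuple_visible(501, 0, my_xids={501}, snapshot_curcid=CURCID))  # w-1: True
print(deleted_tuple_visible(501, 0, my_xids=set(), snapshot_curcid=CURCID))  # w-2: True
# Only if the command counter advanced would the answers diverge:
print(deleted_tuple_visible(501, 0, my_xids={501}, snapshot_curcid=1))       # False
```

Both workers see the tuple as visible while the counter is frozen, matching the argument that the hazard only appears if the command counter advances mid-operation.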

But I don't think there's anything more than incidental preventing us
from assigning an XID in a parallel operation. Contrary to what you
allege here, assigning an XID doesn't change the set of tuples we can
see - nor do any inserts, updates, or deletes it performs afterward,
until such time as the command counter is bumped. We would need to
make sure that two different participants in the parallel operation
don't both get an XID, because we don't want the transaction to end up
with two different XIDs. We want the second one to instead discover
the XID that the other one got. But that's not a definitional
question, like with the command counter. It's "just" a programming
problem.
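The "second one discovers the XID the first one got" coordination described above could look roughly like this sketch, with plain Python threads and a lock standing in for PostgreSQL's shared memory and real locking (all names here are invented; this is a shape of the idea, not server code):

```python
# Sketch: parallel participants race to fill a single shared XID slot;
# losers adopt the winner's XID instead of allocating their own.
import threading

class SharedXidSlot:
    def __init__(self):
        self._lock = threading.Lock()
        self._xid = None
        self._next = 501          # stand-in for the real XID allocator

    def get_or_assign_xid(self):
        with self._lock:          # shared memory + a real lock in the server
            if self._xid is None:
                self._xid = self._next   # first caller allocates the XID
            return self._xid             # everyone else discovers it

slot = SharedXidSlot()
results = []
threads = [threading.Thread(target=lambda: results.append(slot.get_or_assign_xid()))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(results)  # every participant ends up with the same single XID
```

This is what makes it "just" a programming problem: the slot guarantees the transaction ends up with exactly one XID no matter which participant reaches the allocator first.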

-- 
Robert Haas
EDB: http://www.enterprisedb.com