Thread: BUG #18314: PARALLEL UNSAFE function does not prevent parallel index build
BUG #18314: PARALLEL UNSAFE function does not prevent parallel index build
From
PG Bug reporting form
Date:
The following bug has been logged on the website: Bug reference: 18314 Logged by: Alexander Lakhin Email address: exclusion@gmail.com PostgreSQL version: 16.1 Operating system: Ubuntu 22.04 Description: When planner chooses to perform parallel index build, it doesn't check whether functions used in an index expression or predicate are parallel-safe. For example: CREATE FUNCTION f() RETURNS int IMMUTABLE PARALLEL UNSAFE AS $$ BEGIN RETURN 0; EXCEPTION WHEN OTHERS THEN RETURN 1; END$$ LANGUAGE plpgsql; CREATE TABLE t(i int); INSERT INTO t SELECT g FROM generate_series(1, 300000) g; CREATE INDEX ON t((i + f())); results in: ERROR: cannot start subtransactions during a parallel operation CONTEXT: PL/pgSQL function f() line 2 during statement block entry parallel worker Although with 200000 records, the index created with no error. Reproduced on REL_12_STABLE .. master.
Re: BUG #18314: PARALLEL UNSAFE function does not prevent parallel index build
From
Tender Wang
Date:
Thanks for reporting this issue.
I debug the process of creating index, and in plan_create_index_workers() func, which use the planner to
decide if use parallel building index.
But the index expression passed to is_parallel_safe() has already simplified by RelationGetIndexExpressions() func.
In this case, FunExpr node will be simplified to Const node. So the check in is_parallel_safe() will return true.
The goal of plan_create_index_workers() is to decide whether it's safe to build index parallelly. I think we just check the raw index
expression no need to simplify the index expression.
PG Bug reporting form <noreply@postgresql.org> 于2024年1月29日周一 19:02写道:
The following bug has been logged on the website:
Bug reference: 18314
Logged by: Alexander Lakhin
Email address: exclusion@gmail.com
PostgreSQL version: 16.1
Operating system: Ubuntu 22.04
Description:
When planner chooses to perform parallel index build, it doesn't
check whether functions used in an index expression or predicate are
parallel-safe. For example:
CREATE FUNCTION f() RETURNS int IMMUTABLE PARALLEL UNSAFE
AS $$
BEGIN
RETURN 0;
EXCEPTION WHEN OTHERS THEN
RETURN 1;
END$$ LANGUAGE plpgsql;
CREATE TABLE t(i int);
INSERT INTO t SELECT g FROM generate_series(1, 300000) g;
CREATE INDEX ON t((i + f()));
results in:
ERROR: cannot start subtransactions during a parallel operation
CONTEXT: PL/pgSQL function f() line 2 during statement block entry
parallel worker
Although with 200000 records, the index created with no error.
Reproduced on REL_12_STABLE .. master.
Tender Wang
OpenPie: https://en.openpie.com/Re: BUG #18314: PARALLEL UNSAFE function does not prevent parallel index build
From
Tender Wang
Date:
In pg document, I found this:
Also, a block containing an
EXCEPTION
clause effectively forms a subtransaction that can be rolled back without affecting the outer transaction.So it will report error for this case even though we replace PARALLEL UNSAFE with PARALLEL SAFE.
I think if PARALLEL UNSAFE is specified by users, PG internal should not choose to build index parallel, even if the function is
too simply that can be transform to Const node.
Attached patch will fix Alexander reported issue. I choose pass the raw index expression to is_parallel_safe().
The comments of RelationGetIndexExpressions() doesn't say it will return optimized expression. So I add a bool argument to RelationGetIndexExpressions().
I don't decide to write a new function like RelationGetIndexRawExpression. I think the RelationGetIndexExpressions() is enough after adding a bool argument to indicate
whether caller needing a raw index expression.
But if users specify PARALLEL SAFE, like I said before, it also reports error. But I think it is another thing. Maybe it is reasonable?
Tender Wang <tndrwang@gmail.com> 于2024年1月29日周一 22:36写道:
Thanks for reporting this issue.I debug the process of creating index, and in plan_create_index_workers() func, which use the planner todecide if use parallel building index.But the index expression passed to is_parallel_safe() has already simplified by RelationGetIndexExpressions() func.In this case, FunExpr node will be simplified to Const node. So the check in is_parallel_safe() will return true.The goal of plan_create_index_workers() is to decide whether it's safe to build index parallelly. I think we just check the raw indexexpression no need to simplify the index expression.PG Bug reporting form <noreply@postgresql.org> 于2024年1月29日周一 19:02写道:The following bug has been logged on the website:
Bug reference: 18314
Logged by: Alexander Lakhin
Email address: exclusion@gmail.com
PostgreSQL version: 16.1
Operating system: Ubuntu 22.04
Description:
When planner chooses to perform parallel index build, it doesn't
check whether functions used in an index expression or predicate are
parallel-safe. For example:
CREATE FUNCTION f() RETURNS int IMMUTABLE PARALLEL UNSAFE
AS $$
BEGIN
RETURN 0;
EXCEPTION WHEN OTHERS THEN
RETURN 1;
END$$ LANGUAGE plpgsql;
CREATE TABLE t(i int);
INSERT INTO t SELECT g FROM generate_series(1, 300000) g;
CREATE INDEX ON t((i + f()));
results in:
ERROR: cannot start subtransactions during a parallel operation
CONTEXT: PL/pgSQL function f() line 2 during statement block entry
parallel worker
Although with 200000 records, the index created with no error.
Reproduced on REL_12_STABLE .. master.--Tender WangOpenPie: https://en.openpie.com/
Tender Wang
OpenPie: https://en.openpie.com/Attachment
On Wed, Feb 7, 2024 at 7:54 PM Tender Wang <tndrwang@gmail.com> wrote: > > In pg document, I found this: > > Also, a block containing an EXCEPTION clause effectively forms a subtransaction that can be rolled back without affectingthe outer transaction. > > So it will report error for this case even though we replace PARALLEL UNSAFE with PARALLEL SAFE. > > I think if PARALLEL UNSAFE is specified by users, PG internal should not choose to build index parallel, even if the functionis > too simply that can be transform to Const node. > > Attached patch will fix Alexander reported issue. I choose pass the raw index expression to is_parallel_safe(). > The comments of RelationGetIndexExpressions() doesn't say it will return optimized expression. So I add a bool argumentto RelationGetIndexExpressions(). > I don't decide to write a new function like RelationGetIndexRawExpression. I think the RelationGetIndexExpressions() isenough after adding a bool argument to indicate > whether caller needing a raw index expression. > > But if users specify PARALLEL SAFE, like I said before, it also reports error. But I think it is another thing. Maybe itis reasonable? > /home/jian/Downloads/patches/0001-Fix-parallel-index-build-failed.patch:273: trailing whitespace. BEGIN warning: 1 line adds whitespace errors. +CREATE TABLE btree_para_bld(i int); +INSERT INTO btree_para_bld SELECT g FROM generate_series(1, 300000) g; +CREATE INDEX ON btree_para_bld((i + para_unsafe_f())); you don't need an insert of 300000 rows to test it. you can just CREATE TABLE btree_para_bld(i int); ALTER TABLE btree_para_bld SET (parallel_workers = 4); SET max_parallel_maintenance_workers TO 4; CREATE INDEX ON btree_para_bld(i) WHERE i > para_unsafe_f(); reset max_parallel_maintenance_workers; demo: https://dbfiddle.uk/HpLKN6zj
On Wed, Feb 7, 2024 at 11:43 PM jian he <jian.universality@gmail.com> wrote: > > On Wed, Feb 7, 2024 at 7:54 PM Tender Wang <tndrwang@gmail.com> wrote: > > > > In pg document, I found this: > > > > Also, a block containing an EXCEPTION clause effectively forms a subtransaction that can be rolled back without affectingthe outer transaction. > > > > So it will report error for this case even though we replace PARALLEL UNSAFE with PARALLEL SAFE. > > > > I think if PARALLEL UNSAFE is specified by users, PG internal should not choose to build index parallel, even if thefunction is > > too simply that can be transform to Const node. > > > > Attached patch will fix Alexander reported issue. I choose pass the raw index expression to is_parallel_safe(). > > The comments of RelationGetIndexExpressions() doesn't say it will return optimized expression. So I add a bool argumentto RelationGetIndexExpressions(). > > I don't decide to write a new function like RelationGetIndexRawExpression. I think the RelationGetIndexExpressions()is enough after adding a bool argument to indicate > > whether caller needing a raw index expression. > > > > But if users specify PARALLEL SAFE, like I said before, it also reports error. But I think it is another thing. Maybeit is reasonable? > > based on index_build's ereport(DEBUG1...) output information you patch disabled parallel index build for parallel safe function. after apply your patch: set log_min_messages to debug1; begin; drop table if exists btree_para_bld; CREATE TABLE btree_para_bld(i int); ALTER TABLE btree_para_bld SET (parallel_workers = 4); SET local max_parallel_maintenance_workers TO 4; CREATE or replace FUNCTION para_unsafe_f1() RETURNS int IMMUTABLE PARALLEL UNSAFE AS $$ BEGIN RETURN 0; END$$ LANGUAGE plpgsql; CREATE or replace FUNCTION para_safe() RETURNS int IMMUTABLE PARALLEL UNSAFE AS $$ BEGIN RETURN 0; END$$ LANGUAGE plpgsql; CREATE INDEX ON btree_para_bld((para_unsafe_f1())); CREATE INDEX ON btree_para_bld((para_safe())); commit; debug information: 2024-02-08 14:11:26.664 CST [33245] DEBUG: building index "btree_para_bld_para_unsafe_f1_idx" on table "btree_para_bld" serially 2024-02-08 14:11:26.664 CST [33245] DEBUG: index "btree_para_bld_para_unsafe_f1_idx" can safely use deduplication CREATE INDEX 2024-02-08 14:11:26.666 CST [33245] DEBUG: building index "btree_para_bld_para_safe_idx" on table "btree_para_bld" serially 2024-02-08 14:11:26.666 CST [33245] DEBUG: index "btree_para_bld_para_safe_idx" can safely use deduplication CREATE INDEX the problem entry point is_parallel_safe, i think.
On Thu, Feb 8, 2024 at 2:17 PM jian he <jian.universality@gmail.com> wrote: > > On Wed, Feb 7, 2024 at 11:43 PM jian he <jian.universality@gmail.com> wrote: > > > > On Wed, Feb 7, 2024 at 7:54 PM Tender Wang <tndrwang@gmail.com> wrote: > > > > > > In pg document, I found this: > > > > > > Also, a block containing an EXCEPTION clause effectively forms a subtransaction that can be rolled back without affectingthe outer transaction. > > > > > > So it will report error for this case even though we replace PARALLEL UNSAFE with PARALLEL SAFE. > > > > > > I think if PARALLEL UNSAFE is specified by users, PG internal should not choose to build index parallel, even if thefunction is > > > too simply that can be transform to Const node. > > > > > > Attached patch will fix Alexander reported issue. I choose pass the raw index expression to is_parallel_safe(). > > > The comments of RelationGetIndexExpressions() doesn't say it will return optimized expression. So I add a bool argumentto RelationGetIndexExpressions(). > > > I don't decide to write a new function like RelationGetIndexRawExpression. I think the RelationGetIndexExpressions()is enough after adding a bool argument to indicate > > > whether caller needing a raw index expression. > > > > > > But if users specify PARALLEL SAFE, like I said before, it also reports error. But I think it is another thing. Maybeit is reasonable? > > > I think it's reasonable. It's the function owner's responsibility to specify PARALLEL safety correctly. > after apply your patch: > set log_min_messages to debug1; > begin; > drop table if exists btree_para_bld; > CREATE TABLE btree_para_bld(i int); > ALTER TABLE btree_para_bld SET (parallel_workers = 4); > SET local max_parallel_maintenance_workers TO 4; > CREATE or replace FUNCTION para_unsafe_f1() RETURNS int IMMUTABLE > PARALLEL UNSAFE > AS $$ > BEGIN > RETURN 0; > END$$ LANGUAGE plpgsql; > > CREATE or replace FUNCTION para_safe() RETURNS int IMMUTABLE PARALLEL UNSAFE > AS $$ > BEGIN > RETURN 0; > END$$ LANGUAGE plpgsql; > CREATE INDEX ON btree_para_bld((para_unsafe_f1())); > CREATE INDEX ON btree_para_bld((para_safe())); > commit; > > debug information: > 2024-02-08 14:11:26.664 CST [33245] DEBUG: building index > "btree_para_bld_para_unsafe_f1_idx" on table "btree_para_bld" serially > 2024-02-08 14:11:26.664 CST [33245] DEBUG: index > "btree_para_bld_para_unsafe_f1_idx" can safely use deduplication > CREATE INDEX > 2024-02-08 14:11:26.666 CST [33245] DEBUG: building index > "btree_para_bld_para_safe_idx" on table "btree_para_bld" serially > 2024-02-08 14:11:26.666 CST [33245] DEBUG: index > "btree_para_bld_para_safe_idx" can safely use deduplication > CREATE INDEX > > the problem entry point is_parallel_safe, i think. I am sorry, this part was wrong. Your patch works perfectly! if (heap->rd_rel->relpersistence == RELPERSISTENCE_TEMP || !is_parallel_safe(root, (Node *) RelationGetIndexExpressions(index, false)) || !is_parallel_safe(root, (Node *) RelationGetIndexPredicate(index, false))) { parallel_workers = 0; goto done; } add more comments in above code would be perfect. As I mentioned before, your test case can be simplified.
Hi. one minor issue. -RelationGetIndexPredicate(Relation relation) +RelationGetIndexPredicate(Relation relation, bool need_modify) need_modify name may not be that good IMHO, but so far I can only come up with are "raw_expr", "is_raw".
Re: BUG #18314: PARALLEL UNSAFE function does not prevent parallel index build
From
Tender Wang
Date:
jian he <jian.universality@gmail.com> 于2024年2月27日周二 18:17写道:
Hi.
one minor issue.
-RelationGetIndexPredicate(Relation relation)
+RelationGetIndexPredicate(Relation relation, bool need_modify)
need_modify name may not be that good IMHO, but so far I can only come
up with are "raw_expr", "is_raw".
I'm sorry for the delayed response. Thanks for your reviewing, and the new v2 patch include your advices.
As no others reply to this patch. I would add it to the 2024-03 commitfest.
--
--
Tender Wang
OpenPie: https://en.openpie.com/Attachment
On Thu, Feb 29, 2024 at 3:08 PM Tender Wang <tndrwang@gmail.com> wrote: > > Hi jian he, > I'm sorry for the delayed response. Thanks for your reviewing, and the new v2 patch include your advices. > > As no others reply to this patch. I would add it to the 2024-03 commitfest. > -- ` if (heap->rd_rel->relpersistence == RELPERSISTENCE_TEMP || !is_parallel_safe(root, (Node *) RelationGetIndexExpressions(index, true)) || !is_parallel_safe(root, (Node *) RelationGetIndexPredicate(index, true))) { parallel_workers = 0; goto done; } ` I refactored the comments for the code above. Feel free to change it. I also toggled {RelationGetIndexExpressions, RelationGetIndexPredicate}'s second argument from false to true where it's false, true to false where it's true. I guess it's more intuitive to do ` if (get_raw_expr) return result; ` since at that time, the `result` is a raw expression. Overall it looks good.
Attachment
Re: BUG #18314: PARALLEL UNSAFE function does not prevent parallel index build
From
Michael Paquier
Date:
On Thu, Feb 29, 2024 at 05:17:41PM +0800, jian he wrote: > if (heap->rd_rel->relpersistence == RELPERSISTENCE_TEMP || > !is_parallel_safe(root, (Node *) RelationGetIndexExpressions(index, true)) || > !is_parallel_safe(root, (Node *) RelationGetIndexPredicate(index, true))) > { > parallel_workers = 0; > goto done; > } Interesting bug. > Overall it looks good. I disagree with this idea. Your patch is creating a shortcut in the relcache code to retrieve data about index predicates and expressions that make it bypass the flattening that would be done in eval_const_expressions(). Giving the option to bypass that in the relcache is a very bad idea, because flattening of the expressions for the planner is not an option: it must happen. So, I fear that your patch could cause bugs if we introduce new code that calls RelationGetIndexExpressions() with an incorrect option. Not documenting the new option is not acceptable either, with only one comment added at the top of plan_create_index_workers(). On top of that, your approach has two bugs: RelationGetIndexExpressions and RelationGetIndexPredicate would return the flattened information if available directly from the relcache, so even if you pass get_raw_expr=true you may finish with flattened expressions and/or predicates, facing the same issues. I think that the correct way to do that would be to get the information from the syscache when calculating the number of parallel workers to use for the parallel index builds, with SysCacheGetAttr(), for example. That would ensure that the expressions are not flattened, letting the relcache be. By the way, this reminds me of 7cce159349cc and its thread for a bug fix that I did for REINDEX CONCURRENTLY a couple of years ago. The mistake is mostly the same: https://www.postgresql.org/message-id/CA%2Bu7OA5Hp0ra235F3czPom_FyAd-3%2BXwSJmX95r1%2BsRPOJc9VQ%40mail.gmail.com -- Michael
Attachment
Re: BUG #18314: PARALLEL UNSAFE function does not prevent parallel index build
From
Tender Wang
Date:
Michael Paquier <michael@paquier.xyz> 于2024年3月5日周二 11:45写道:
On Thu, Feb 29, 2024 at 05:17:41PM +0800, jian he wrote:
> if (heap->rd_rel->relpersistence == RELPERSISTENCE_TEMP ||
> !is_parallel_safe(root, (Node *) RelationGetIndexExpressions(index, true)) ||
> !is_parallel_safe(root, (Node *) RelationGetIndexPredicate(index, true)))
> {
> parallel_workers = 0;
> goto done;
> }
Interesting bug.
> Overall it looks good.
I disagree with this idea. Your patch is creating a shortcut in the
relcache code to retrieve data about index predicates and expressions
that make it bypass the flattening that would be done in
eval_const_expressions(). Giving the option to bypass that in the
relcache is a very bad idea, because flattening of the expressions for
the planner is not an option: it must happen. So, I fear that your
patch could cause bugs if we introduce new code that calls
RelationGetIndexExpressions() with an incorrect option.
Not
documenting the new option is not acceptable either, with only one
comment added at the top of plan_create_index_workers().
Agreed! It is not friendly to others who maybe use these functions.
On top of that, your approach has two bugs:
RelationGetIndexExpressions and RelationGetIndexPredicate would return
the flattened information if available directly from the relcache, so
even if you pass get_raw_expr=true you may finish with flattened
expressions and/or predicates, facing the same issues.
Is that possible? In this case, it is CREATE INDEX, the relation in relcache should not have this index definition?
I try to create index on create table phase, but I failed. It looks like index on expression only be created using CREATE INDEX statement.
Is there something I need to learn?
I think that the correct way to do that would be to get the
information from the syscache when calculating the number of parallel
workers to use for the parallel index builds, with SysCacheGetAttr(),
for example. That would ensure that the expressions are not
flattened, letting the relcache be.
By the way, this reminds me of 7cce159349cc and its thread for a bug
fix that I did for REINDEX CONCURRENTLY a couple of years ago. The
mistake is mostly the same:
https://www.postgresql.org/message-id/CA%2Bu7OA5Hp0ra235F3czPom_FyAd-3%2BXwSJmX95r1%2BsRPOJc9VQ%40mail.gmail.com
In aboved link, I found that
"Even without that, changing the APIs of the functions you've touched
here seems pretty invasive." (Tom said that). It sounds sounds reasonable.--
Michael
Tender Wang
OpenPie: https://en.openpie.com/Re: BUG #18314: PARALLEL UNSAFE function does not prevent parallel index build
From
Tender Wang
Date:
Michael Paquier <michael@paquier.xyz> 于2024年3月5日周二 11:45写道:
On Thu, Feb 29, 2024 at 05:17:41PM +0800, jian he wrote:
> if (heap->rd_rel->relpersistence == RELPERSISTENCE_TEMP ||
> !is_parallel_safe(root, (Node *) RelationGetIndexExpressions(index, true)) ||
> !is_parallel_safe(root, (Node *) RelationGetIndexPredicate(index, true)))
> {
> parallel_workers = 0;
> goto done;
> }
Interesting bug.
> Overall it looks good.
I disagree with this idea. Your patch is creating a shortcut in the
relcache code to retrieve data about index predicates and expressions
that make it bypass the flattening that would be done in
eval_const_expressions(). Giving the option to bypass that in the
relcache is a very bad idea, because flattening of the expressions for
the planner is not an option: it must happen. So, I fear that your
patch could cause bugs if we introduce new code that calls
RelationGetIndexExpressions() with an incorrect option. Not
documenting the new option is not acceptable either, with only one
comment added at the top of plan_create_index_workers().
On top of that, your approach has two bugs:
RelationGetIndexExpressions and RelationGetIndexPredicate would return
the flattened information if available directly from the relcache, so
even if you pass get_raw_expr=true you may finish with flattened
expressions and/or predicates, facing the same issues.
Yeah, I have carefully considered what you said. It is a bad idea to change the RelationGetIndexExpressions().
Even though it can fix this thread issue, but if new code calls RelationGetIndexExpressions() and
want to get the raw index expression, so it pass get_raw_expr=tree, it maybe get optimized index expression, that could cause some bugs.
I think that the correct way to do that would be to get the
information from the syscache when calculating the number of parallel
workers to use for the parallel index builds, with SysCacheGetAttr(),
for example. That would ensure that the expressions are not
flattened, letting the relcache be.
The fix here is to pass the raw index expression to is_parallel_safe(). Getting the information from the syscache sournds reasonable.
By the way, this reminds me of 7cce159349cc and its thread for a bug
fix that I did for REINDEX CONCURRENTLY a couple of years ago. The
mistake is mostly the same:
https://www.postgresql.org/message-id/CA%2Bu7OA5Hp0ra235F3czPom_FyAd-3%2BXwSJmX95r1%2BsRPOJc9VQ%40mail.gmail.com
--
Michael
Tender Wang
OpenPie: https://en.openpie.com/Re: BUG #18314: PARALLEL UNSAFE function does not prevent parallel index build
From
Tender Wang
Date:
Michael Paquier <michael@paquier.xyz> 于2024年3月5日周二 11:45写道:
On Thu, Feb 29, 2024 at 05:17:41PM +0800, jian he wrote:
> if (heap->rd_rel->relpersistence == RELPERSISTENCE_TEMP ||
> !is_parallel_safe(root, (Node *) RelationGetIndexExpressions(index, true)) ||
> !is_parallel_safe(root, (Node *) RelationGetIndexPredicate(index, true)))
> {
> parallel_workers = 0;
> goto done;
> }
Interesting bug.
> Overall it looks good.
I disagree with this idea. Your patch is creating a shortcut in the
relcache code to retrieve data about index predicates and expressions
that make it bypass the flattening that would be done in
eval_const_expressions(). Giving the option to bypass that in the
relcache is a very bad idea, because flattening of the expressions for
the planner is not an option: it must happen. So, I fear that your
patch could cause bugs if we introduce new code that calls
RelationGetIndexExpressions() with an incorrect option. Not
documenting the new option is not acceptable either, with only one
comment added at the top of plan_create_index_workers().
On top of that, your approach has two bugs:
RelationGetIndexExpressions and RelationGetIndexPredicate would return
the flattened information if available directly from the relcache, so
even if you pass get_raw_expr=true you may finish with flattened
expressions and/or predicates, facing the same issues.
I think that the correct way to do that would be to get the
information from the syscache when calculating the number of parallel
workers to use for the parallel index builds, with SysCacheGetAttr(),
for example. That would ensure that the expressions are not
flattened, letting the relcache be.
I refactor the v2 version patch according to your advice.
Any thoughts?
By the way, this reminds me of 7cce159349cc and its thread for a bug
fix that I did for REINDEX CONCURRENTLY a couple of years ago. The
mistake is mostly the same:
https://www.postgresql.org/message-id/CA%2Bu7OA5Hp0ra235F3czPom_FyAd-3%2BXwSJmX95r1%2BsRPOJc9VQ%40mail.gmail.com
--
Michael
Tender Wang
OpenPie: https://en.openpie.com/Attachment
On Tue, Mar 5, 2024 at 7:49 PM Tender Wang <tndrwang@gmail.com> wrote: > >> >> On top of that, your approach has two bugs: >> RelationGetIndexExpressions and RelationGetIndexPredicate would return >> the flattened information if available directly from the relcache, so >> even if you pass get_raw_expr=true you may finish with flattened >> expressions and/or predicates, facing the same issues. >> >> I think that the correct way to do that would be to get the >> information from the syscache when calculating the number of parallel >> workers to use for the parallel index builds, with SysCacheGetAttr(), >> for example. That would ensure that the expressions are not >> flattened, letting the relcache be. > > > I refactor the v2 version patch according to your advice. > Any thoughts? in RelationGetIndexClause to, I think you can use the following to save a SearchSysCache1 cycle? if (relation->rd_indextuple == NULL || heap_attisnull(relation->rd_indextuple, Anum_pg_index_indexprs, NULL)) return NIL; or if (relation->rd_indextuple == NULL || heap_attisnull(relation->rd_indextuple, Anum_pg_index_indpred, NULL)) return NIL; main question would be why not two functions, like RelationGetIndexRawExpr(Relation relation), RelationGetIndexRawPred(Relation relation)
Re: BUG #18314: PARALLEL UNSAFE function does not prevent parallel index build
From
Michael Paquier
Date:
On Tue, Mar 05, 2024 at 09:22:08PM +0800, jian he wrote: > in RelationGetIndexClause to, I think you can use the following to > save a SearchSysCache1 cycle? > if (relation->rd_indextuple == NULL || > heap_attisnull(relation->rd_indextuple, Anum_pg_index_indexprs, NULL)) > return NIL; > > or > if (relation->rd_indextuple == NULL || > heap_attisnull(relation->rd_indextuple, Anum_pg_index_indpred, NULL)) > return NIL; Don't think so. The point is to not rely on the relcache at all to retrieve this information. > main question would be why not two functions, > like RelationGetIndexRawExpr(Relation relation), > RelationGetIndexRawPred(Relation relation) This comes down to if it is clean to have references to the catalog pg_index in the planner, which is not the case yet so my take is that two functions is much cleaner even if both return a List. Anyway, why do you insist in putting the new functions in relcache.c? I would suggest to move that to lsyscache.c instead, close to get_index_column_opclass where there are routines for the syscache of pg_index. It would be possible to reuse that in the reindex code, for example. The patch should add a comment in in plan_create_index_workers() explaining why we care about raw expressions and indexes rather than the relcache information. -- Michael
Attachment
Re: BUG #18314: PARALLEL UNSAFE function does not prevent parallel index build
From
Tender Wang
Date:
Michael Paquier <michael@paquier.xyz> 于2024年3月6日周三 06:45写道:
On Tue, Mar 05, 2024 at 09:22:08PM +0800, jian he wrote:
> in RelationGetIndexClause to, I think you can use the following to
> save a SearchSysCache1 cycle?
> if (relation->rd_indextuple == NULL ||
> heap_attisnull(relation->rd_indextuple, Anum_pg_index_indexprs, NULL))
> return NIL;
>
> or
> if (relation->rd_indextuple == NULL ||
> heap_attisnull(relation->rd_indextuple, Anum_pg_index_indpred, NULL))
> return NIL;
Don't think so. The point is to not rely on the relcache at all to
retrieve this information.
> main question would be why not two functions,
> like RelationGetIndexRawExpr(Relation relation),
> RelationGetIndexRawPred(Relation relation)
This comes down to if it is clean to have references to the catalog
pg_index in the planner, which is not the case yet so my take is that
two functions is much cleaner even if both return a List.
Anyway, why do you insist in putting the new functions in relcache.c?
I would suggest to move that to lsyscache.c instead, close to
get_index_column_opclass where there are routines for the syscache of
pg_index. It would be possible to reuse that in the reindex code, for
example.
Make sense.
Please review attached v4 patch.
The patch should add a comment in in plan_create_index_workers()
explaining why we care about raw expressions and indexes rather than
the relcache information.
Added
--
Michael
Tender Wang
OpenPie: https://en.openpie.com/Attachment
Re: BUG #18314: PARALLEL UNSAFE function does not prevent parallel index build
From
Tender Wang
Date:
Thanks for pushing the patch.
I found a typos:
/*
+ * get_index_expressions
+ *
+ * Given the index OID, its a List of its expressions or NIL if none.
+ */
+ * get_index_expressions
+ *
+ * Given the index OID, its a List of its expressions or NIL if none.
+ */
"its" should be "return"?
Tender Wang <tndrwang@gmail.com> 于2024年3月6日周三 12:30写道:
Michael Paquier <michael@paquier.xyz> 于2024年3月6日周三 06:45写道:On Tue, Mar 05, 2024 at 09:22:08PM +0800, jian he wrote:
> in RelationGetIndexClause to, I think you can use the following to
> save a SearchSysCache1 cycle?
> if (relation->rd_indextuple == NULL ||
> heap_attisnull(relation->rd_indextuple, Anum_pg_index_indexprs, NULL))
> return NIL;
>
> or
> if (relation->rd_indextuple == NULL ||
> heap_attisnull(relation->rd_indextuple, Anum_pg_index_indpred, NULL))
> return NIL;
Don't think so. The point is to not rely on the relcache at all to
retrieve this information.
> main question would be why not two functions,
> like RelationGetIndexRawExpr(Relation relation),
> RelationGetIndexRawPred(Relation relation)
This comes down to if it is clean to have references to the catalog
pg_index in the planner, which is not the case yet so my take is that
two functions is much cleaner even if both return a List.
Anyway, why do you insist in putting the new functions in relcache.c?
I would suggest to move that to lsyscache.c instead, close to
get_index_column_opclass where there are routines for the syscache of
pg_index. It would be possible to reuse that in the reindex code, for
example.Make sense.Please review attached v4 patch.
The patch should add a comment in in plan_create_index_workers()
explaining why we care about raw expressions and indexes rather than
the relcache information.Added--
Michael--Tender WangOpenPie: https://en.openpie.com/
Tender Wang
OpenPie: https://en.openpie.com/Tender Wang <tndrwang@gmail.com> writes: > Thanks for pushing the patch. My apologies for not having paid much attention to this thread earlier, but ... I don't like anything about the committed patch. It's a band-aid that doesn't really solve the problem: if I simply declare the function as PARALLEL SAFE, the failure comes right back. Of course you can argue that then I lied about the parallel safety of the function, but I think a far better answer is to make such functions actually parallel safe, which I have a patch for awaiting review in the current commitfest [1]. Moreover I think this breaks a fairly long-standing principle of the planner, namely that if we can inline or const-fold a function then its original markings no longer matter. People are likely to be surprised that that no longer applies to parallel-safety decisions, and even more surprised that it only doesn't apply in this one narrow context. I also believe that this has introduced new bugs, caused by passing is_parallel_safe() raw parse-analysis output rather than a tree that's been through eval_const_expressions. For example, that processing is responsible for inserting function default-argument expressions. If a default argument is parallel-unsafe, this coding would fail to notice that. (Compare commit 743ddafc7.) There might be other structures in an un-simplified tree that is_parallel_safe() isn't expecting and would spit up on. As far as HEAD goes, I think we should revert this and then look to get [1] committed. I'm unsure what to do in the back branches, but given the lack of prior complaints I suspect we could get away with just reverting and leaving the issue unfixed. (I wouldn't propose back-patching [1], it seems too risky.) If we want to do something about the more generic issue that maybe const-folding will succeed in the leader process but fail in workers, then I think the way to do that is to adjust parallel index build so that the processed expression trees are passed to the workers from the leader. That is how it's done with regular query trees, and index build is evidently taking a not-very-safe shortcut by not doing it like that. But I'm not excited enough about the mostly-theoretical hazard to put any work into that myself. Another line of thought is whether we should just forbid any parallel-unsafe functions in index expressions. If we continue to allow them, I think we have to reject parallelizing any query or DDL operation that touches the associated table at all, because there's too much risk of a worker trying to evaluate such an expression somewhere along the line. Again though, given the lack of complaints I'm not excited about imposing such a draconian policy. (The fact that functions are marked parallel unsafe by default means that if we did, we'd almost certainly break cases that are working fine for users today.) regards, tom lane [1] https://commitfest.postgresql.org/47/4776/
Re: BUG #18314: PARALLEL UNSAFE function does not prevent parallel index build
From
Michael Paquier
Date:
On Wed, Mar 06, 2024 at 12:28:11PM -0500, Tom Lane wrote: > As far as HEAD goes, I think we should revert this and then look > to get [1] committed. > > I'm unsure what to do in the back branches, > but given the lack of prior complaints I suspect we could get away > with just reverting and leaving the issue unfixed. Still the issue fixed here and the issue of the other thread are different? Relying on the relcache tricks the planner to spawn workers for nothing because they would just fail if the internals of the function are unsafe to run in the workers, and the function is marked as such. In some cases users like to use properties as hints to give to the planner about what to use or not. I'd like to believe (and perhaps that's a cute thought) that functions are marked as parallel-unsafe for a purpose, which is that they cannot run in workers as they rely on a shared state. Whether the internals of the functions are implemented correctly is up to the user. I mean, it's perfectly possible to mark a function as immutable while its internals are volatile, and I've seen people rely on that to push down clauses. > (I wouldn't propose back-patching [1], it seems too risky.) Agreed. Just looked at it and the change of blockState can be sometimes tricky to get right. There are a bunch of internal assumptions as far as I recall this area of the code. > If we want to do something about the more generic issue that > maybe const-folding will succeed in the leader process but > fail in workers, then I think the way to do that is to adjust > parallel index build so that the processed expression trees are > passed to the workers from the leader. That is how it's done > with regular query trees, and index build is evidently taking > a not-very-safe shortcut by not doing it like that. But I'm not > excited enough about the mostly-theoretical hazard to put any > work into that myself. You mean in the serialization and deserialization logic used when workers are spawned and then let the workers complain when they try to use such a function in a new path? That would be backpatchable, but feels rather a bit invasive. > Another line of thought is whether we should just forbid any > parallel-unsafe functions in index expressions. If we continue > to allow them, I think we have to reject parallelizing any > query or DDL operation that touches the associated table at all, > because there's too much risk of a worker trying to evaluate > such an expression somewhere along the line. Again though, > given the lack of complaints I'm not excited about imposing > such a draconian policy. (The fact that functions are marked > parallel unsafe by default means that if we did, we'd almost > certainly break cases that are working fine for users today.) Neither am I to restrict that moving forward. That could hurt existing use cases that were allowed, still were able to work. Saying that, seeing the lack of complaints and the fact that you're not comfortable with that, I have no objections to revert the change on all the branches. Will do so in a bit. -- Michael
Attachment
Michael Paquier <michael@paquier.xyz> writes: > Still the issue fixed here and the issue of the other thread are > different? Relying on the relcache tricks the planner to spawn > workers for nothing because they would just fail if the internals of > the function are unsafe to run in the workers, and the function is > marked as such. Sure, but the current patch just trades one set of errors for others. As I said, if we care about this problem then the right fix will involve serializing/deserializing already-const-folded expression trees instead of expecting that the workers can regenerate those. I don't personally think it's worth the trouble, especially not if we can get rid of the need for exception-using functions to be marked parallel unsafe. > In some cases users like to use properties as hints > to give to the planner about what to use or not. I'd like to believe > (and perhaps that's a cute thought) that functions are marked as > parallel-unsafe for a purpose, which is that they cannot run in > workers as they rely on a shared state. Whether the internals of the > functions are implemented correctly is up to the user. I mean, it's > perfectly possible to mark a function as immutable while its internals > are volatile, and I've seen people rely on that to push down clauses. It's the other direction that I'm concerned about: if we inline a SQL function and find that its innards are less volatile than the original marking, we are free to exploit that, and people will complain if we don't. Similarly, if the innards are more parallel safe than the original marking, we should be able to exploit that. regards, tom lane
Re: BUG #18314: PARALLEL UNSAFE function does not prevent parallel index build
From
Michael Paquier
Date:
On Wed, Mar 06, 2024 at 06:01:50PM -0500, Tom Lane wrote: > Sure, but the current patch just trades one set of errors for others. > As I said, if we care about this problem then the right fix will > involve serializing/deserializing already-const-folded expression > trees instead of expecting that the workers can regenerate those. > I don't personally think it's worth the trouble, especially not if > we can get rid of the need for exception-using functions to be marked > parallel unsafe. I don't know if that's worth tackling either. It sounds like this is just something we should do only if someone comes up with a good use case for it, outside of the subtransaction handling in the workers. Anyway, I've just reverted the change for now. -- Michael