Thread: [PATCH] BUG FIX: Core dump could happen when VACUUM FULL in standalone mode
Hi hackers,
I found that running VACUUM FULL in stand-alone mode causes a core dump.
The stack trace looks like this:
------------------------------------
backend> vacuum full
TRAP: FailedAssertion("IsUnderPostmaster", File: "dsm.c", Line: 439)
./postgres(ExceptionalCondition+0xac)[0x55c664c36913]
./postgres(dsm_create+0x3c)[0x55c664a79fee]
./postgres(GetSessionDsmHandle+0xdc)[0x55c6645f8296]
./postgres(InitializeParallelDSM+0xf9)[0x55c6646c59ca]
./postgres(+0x16bdef)[0x55c664692def]
./postgres(+0x16951e)[0x55c66469051e]
./postgres(btbuild+0xcc)[0x55c6646903ec]
./postgres(index_build+0x322)[0x55c664719749]
./postgres(reindex_index+0x2ee)[0x55c66471a765]
./postgres(reindex_relation+0x1e5)[0x55c66471acca]
./postgres(finish_heap_swap+0x118)[0x55c6647c8db1]
./postgres(+0x2a0848)[0x55c6647c7848]
./postgres(cluster_rel+0x34e)[0x55c6647c727f]
./postgres(+0x3414cf)[0x55c6648684cf]
./postgres(vacuum+0x4f3)[0x55c664866591]
./postgres(ExecVacuum+0x736)[0x55c66486609b]
./postgres(standard_ProcessUtility+0x840)[0x55c664ab74fc]
./postgres(ProcessUtility+0x131)[0x55c664ab6cb5]
./postgres(+0x58ea69)[0x55c664ab5a69]
./postgres(+0x58ec95)[0x55c664ab5c95]
./postgres(PortalRun+0x307)[0x55c664ab5184]
./postgres(+0x587ef6)[0x55c664aaeef6]
./postgres(PostgresMain+0x819)[0x55c664ab3271]
./postgres(main+0x2e1)[0x55c6648f9df5]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf1)[0x7f65376192e1]
./postgres(_start+0x2a)[0x55c6645df86a]
Aborted (core dumped)
------------------------------------
I think that when a table has a btree index, VACUUM FULL tries to rebuild the index in parallel.
This launches several workers, and each of them tries to request a dynamic shared memory segment, which is not allowed in stand-alone mode.
I think it is better not to build btree indexes in parallel in stand-alone mode. My patch is attached below.
Best Regards!
Yulin PEI
Attachment
Re: [PATCH] BUG FIX: Core dump could happen when VACUUM FULL in standalone mode
From
Masahiko Sawada
Date:
On Mon, Nov 30, 2020 at 5:45 PM Yulin PEI <ypeiae@connect.ust.hk> wrote:
Hi hackers,
I found that running VACUUM FULL in stand-alone mode causes a core dump.
The stack trace looks like this:
------------------------------------
backend> vacuum full
TRAP: FailedAssertion("IsUnderPostmaster", File: "dsm.c", Line: 439)
./postgres(ExceptionalCondition+0xac)[0x55c664c36913]
./postgres(dsm_create+0x3c)[0x55c664a79fee]
./postgres(GetSessionDsmHandle+0xdc)[0x55c6645f8296]
./postgres(InitializeParallelDSM+0xf9)[0x55c6646c59ca]
./postgres(+0x16bdef)[0x55c664692def]
./postgres(+0x16951e)[0x55c66469051e]
./postgres(btbuild+0xcc)[0x55c6646903ec]
./postgres(index_build+0x322)[0x55c664719749]
./postgres(reindex_index+0x2ee)[0x55c66471a765]
./postgres(reindex_relation+0x1e5)[0x55c66471acca]
./postgres(finish_heap_swap+0x118)[0x55c6647c8db1]
./postgres(+0x2a0848)[0x55c6647c7848]
./postgres(cluster_rel+0x34e)[0x55c6647c727f]
./postgres(+0x3414cf)[0x55c6648684cf]
./postgres(vacuum+0x4f3)[0x55c664866591]
./postgres(ExecVacuum+0x736)[0x55c66486609b]
./postgres(standard_ProcessUtility+0x840)[0x55c664ab74fc]
./postgres(ProcessUtility+0x131)[0x55c664ab6cb5]
./postgres(+0x58ea69)[0x55c664ab5a69]
./postgres(+0x58ec95)[0x55c664ab5c95]
./postgres(PortalRun+0x307)[0x55c664ab5184]
./postgres(+0x587ef6)[0x55c664aaeef6]
./postgres(PostgresMain+0x819)[0x55c664ab3271]
./postgres(main+0x2e1)[0x55c6648f9df5]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf1)[0x7f65376192e1]
./postgres(_start+0x2a)[0x55c6645df86a]
Aborted (core dumped)
------------------------------------
I think that when a table has a btree index, VACUUM FULL tries to rebuild the index in parallel.
This launches several workers, and each of them tries to request a dynamic shared memory segment, which is not allowed in stand-alone mode.
I think it is better not to build btree indexes in parallel in stand-alone mode. My patch is attached below.
Good catch. This is a bug in parallel index (btree) creation. I could reproduce this assertion failure on HEAD even with the CREATE INDEX command.
- indexRelation->rd_rel->relam == BTREE_AM_OID)
+ indexRelation->rd_rel->relam == BTREE_AM_OID && IsPostmasterEnvironment && !IsBackendStandAlone())
+#define IsBackendStandAlone() (!IsBootstrapProcessingMode() && !IsPostmasterEnvironment)
/*
* Auxiliary-process type identifiers. These used to be in bootstrap.h
* but it seems saner to have them here, with the ProcessingMode stuff.
I think we can use IsUnderPostmaster instead. If it's false we should not enable parallel index creation.
Regards,
Masahiko Sawada
EnterpriseDB: https://www.enterprisedb.com/
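The condition in the quoted patch can be checked mechanically. The sketch below is not PostgreSQL source: it passes the two flags as plain parameters (an assumption made for illustration) and copies only the shape of the `IsBackendStandAlone()` macro from the patch. It suggests that `IsPostmasterEnvironment && !IsBackendStandAlone()` reduces to `IsPostmasterEnvironment` alone, which is consistent with the remark that a single flag check should be enough.

```c
#include <assert.h>
#include <stdbool.h>

/* Same shape as the macro in the quoted patch, with the flags as parameters. */
static bool
is_backend_standalone(bool bootstrap_mode, bool postmaster_env)
{
	return !bootstrap_mode && !postmaster_env;
}

/* The extra condition the quoted patch appends to the btree test. */
static bool
patch_condition(bool bootstrap_mode, bool postmaster_env)
{
	return postmaster_env &&
		!is_backend_standalone(bootstrap_mode, postmaster_env);
}

/*
 * Walk the full truth table and assert that the patch condition always
 * equals postmaster_env, i.e. the stand-alone term adds nothing.
 */
static void
check_patch_condition_is_redundant(void)
{
	for (int b = 0; b <= 1; b++)
	{
		for (int p = 0; p <= 1; p++)
			assert(patch_condition(b != 0, p != 0) == (p != 0));
	}
}
```

Calling `check_patch_condition_is_redundant()` from a small test program passes for all four input combinations, since `p && (b || p)` is just `p`.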
Re: [PATCH] BUG FIX: Core dump could happen when VACUUM FULL in standalone mode
From
Yulin PEI
Date:
Yes, I agree, because IsNormalProcessingMode() means that the current process is not in bootstrap mode, and the postmaster process will not build an index.
My revised patch is attached.
From: Masahiko Sawada <sawada.mshk@gmail.com>
Sent: 2020-11-30 17:27
To: Yulin PEI <ypeiae@connect.ust.hk>
Cc: pgsql-hackers@lists.postgresql.org <pgsql-hackers@lists.postgresql.org>
Subject: Re: [PATCH] BUG FIX: Core dump could happen when VACUUM FULL in standalone mode
Attachment
Re: Re: [PATCH] BUG FIX: Core dump could happen when VACUUM FULL in standalone mode
From
Tom Lane
Date:
Yulin PEI <ypeiae@connect.ust.hk> writes:
> Yes, I agree because (IsNormalProcessingMode() ) means that current process is not in bootstrap mode and postmaster process will not build index.
> So my new modified patch is attached.
This is a good catch, but the proposed fix still seems pretty random
and unlike how it's done elsewhere. It seems to me that since
index_build() is relying on plan_create_index_workers() to assess
parallel safety, that's where to check IsUnderPostmaster. Moreover,
the existing code in compute_parallel_vacuum_workers (which gets
this right) associates the IsUnderPostmaster check with the initial
check on max_parallel_maintenance_workers. So I think that the
right fix is to adopt the compute_parallel_vacuum_workers coding
in plan_create_index_workers, and thereby create a model for future
uses of max_parallel_maintenance_workers to follow.
regards, tom lane

diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 247f7d4625..1a94b58f8b 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -6375,8 +6375,11 @@ plan_create_index_workers(Oid tableOid, Oid indexOid)
 	double		reltuples;
 	double		allvisfrac;
 
-	/* Return immediately when parallelism disabled */
-	if (max_parallel_maintenance_workers == 0)
+	/*
+	 * We don't allow performing parallel operation in standalone backend or
+	 * when parallelism is disabled.
+	 */
+	if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
 		return 0;
 
 	/* Set up largely-dummy planner state */
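
The effect of the guard in the diff above can be exercised in isolation. The following is a minimal sketch, not PostgreSQL source: every `toy_` name is a hypothetical stand-in, the two globals only imitate `IsUnderPostmaster` and `max_parallel_maintenance_workers`, and clamping the request to the limit is a simplification made for illustration. It shows that once the worker-count function returns 0 in a stand-alone backend, the caller never reaches the code that asserts it is running under a postmaster.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the two PostgreSQL globals named in the diff. */
static bool toy_is_under_postmaster = false;
static int	toy_max_parallel_maintenance_workers = 2;

/*
 * Toy worker-count decision: 0 means "build serially".  The first test
 * mirrors the condition added by the diff; the clamp is an assumption.
 */
static int
toy_plan_create_index_workers(int workers_wanted)
{
	if (!toy_is_under_postmaster || toy_max_parallel_maintenance_workers == 0)
		return 0;
	if (workers_wanted < 0)
		return 0;
	if (workers_wanted > toy_max_parallel_maintenance_workers)
		return toy_max_parallel_maintenance_workers;
	return workers_wanted;
}

/* Toy shared-memory setup: like the failing assertion, needs a postmaster. */
static void
toy_dsm_create(void)
{
	assert(toy_is_under_postmaster);
}

/* Toy index build: returns true if the parallel path was taken. */
static bool
toy_index_build(int workers_wanted)
{
	int			nworkers = toy_plan_create_index_workers(workers_wanted);

	if (nworkers > 0)
	{
		toy_dsm_create();		/* only reached when workers were granted */
		return true;
	}
	return false;				/* serial build, no shared memory needed */
}
```

With `toy_is_under_postmaster` set to false, `toy_index_build(4)` returns false without touching `toy_dsm_create()`. Setting it to true with a limit of 2 takes the parallel path with two workers.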
Re: Re: [PATCH] BUG FIX: Core dump could happen when VACUUM FULL in standalone mode
From
Masahiko Sawada
Date:
On Tue, Dec 1, 2020 at 3:54 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Yulin PEI <ypeiae@connect.ust.hk> writes:
> > Yes, I agree because (IsNormalProcessingMode() ) means that current process is not in bootstrap mode and postmaster process will not build index.
> > So my new modified patch is attached.
>
> This is a good catch, but the proposed fix still seems pretty random
> and unlike how it's done elsewhere. It seems to me that since
> index_build() is relying on plan_create_index_workers() to assess
> parallel safety, that's where to check IsUnderPostmaster. Moreover,
> the existing code in compute_parallel_vacuum_workers (which gets
> this right) associates the IsUnderPostmaster check with the initial
> check on max_parallel_maintenance_workers. So I think that the
> right fix is to adopt the compute_parallel_vacuum_workers coding
> in plan_create_index_workers, and thereby create a model for future
> uses of max_parallel_maintenance_workers to follow.
+1
Regards,
--
Masahiko Sawada
EnterpriseDB: https://www.enterprisedb.com/
Re: Re: [PATCH] BUG FIX: Core dump could happen when VACUUM FULL in standalone mode
From
Yulin PEI
Date:
I think you are right after reading code in compute_parallel_vacuum_workers() :)
From: Tom Lane <tgl@sss.pgh.pa.us>
Sent: 2020-12-01 2:54
To: Yulin PEI <ypeiae@connect.ust.hk>
Cc: Masahiko Sawada <sawada.mshk@gmail.com>; pgsql-hackers@lists.postgresql.org <pgsql-hackers@lists.postgresql.org>
Subject: Re: Re: [PATCH] BUG FIX: Core dump could happen when VACUUM FULL in standalone mode
Yulin PEI <ypeiae@connect.ust.hk> writes:
> Yes, I agree because (IsNormalProcessingMode() ) means that current process is not in bootstrap mode and postmaster process will not build index.
> So my new modified patch is attached.
This is a good catch, but the proposed fix still seems pretty random
and unlike how it's done elsewhere. It seems to me that since
index_build() is relying on plan_create_index_workers() to assess
parallel safety, that's where to check IsUnderPostmaster. Moreover,
the existing code in compute_parallel_vacuum_workers (which gets
this right) associates the IsUnderPostmaster check with the initial
check on max_parallel_maintenance_workers. So I think that the
right fix is to adopt the compute_parallel_vacuum_workers coding
in plan_create_index_workers, and thereby create a model for future
uses of max_parallel_maintenance_workers to follow.
regards, tom lane