[PATCH] BUG FIX: Core dump could happen when VACUUM FULL in standalone mode
Hi hackers,
I found that when running command vaccum full in stand-alone mode there will be a core dump.
The core stack looks like this:
------------------------------------
backend> vacuum full
TRAP: FailedAssertion("IsUnderPostmaster", File: "dsm.c", Line: 439)
./postgres(ExceptionalCondition+0xac)[0x55c664c36913]
./postgres(dsm_create+0x3c)[0x55c664a79fee]
./postgres(GetSessionDsmHandle+0xdc)[0x55c6645f8296]
./postgres(InitializeParallelDSM+0xf9)[0x55c6646c59ca]
./postgres(+0x16bdef)[0x55c664692def]
./postgres(+0x16951e)[0x55c66469051e]
./postgres(btbuild+0xcc)[0x55c6646903ec]
./postgres(index_build+0x322)[0x55c664719749]
./postgres(reindex_index+0x2ee)[0x55c66471a765]
./postgres(reindex_relation+0x1e5)[0x55c66471acca]
./postgres(finish_heap_swap+0x118)[0x55c6647c8db1]
./postgres(+0x2a0848)[0x55c6647c7848]
./postgres(cluster_rel+0x34e)[0x55c6647c727f]
./postgres(+0x3414cf)[0x55c6648684cf]
./postgres(vacuum+0x4f3)[0x55c664866591]
./postgres(ExecVacuum+0x736)[0x55c66486609b]
./postgres(standard_ProcessUtility+0x840)[0x55c664ab74fc]
./postgres(ProcessUtility+0x131)[0x55c664ab6cb5]
./postgres(+0x58ea69)[0x55c664ab5a69]
./postgres(+0x58ec95)[0x55c664ab5c95]
./postgres(PortalRun+0x307)[0x55c664ab5184]
./postgres(+0x587ef6)[0x55c664aaeef6]
./postgres(PostgresMain+0x819)[0x55c664ab3271]
./postgres(main+0x2e1)[0x55c6648f9df5]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf1)[0x7f65376192e1]
./postgres(_start+0x2a)[0x55c6645df86a]
Aborted (core dumped)
------------------------------------
I think that when there is a btree index in a table, vacuum full tries to rebuild the btree index in a parallel way.
This will launch several workers and each of then will try to apply a dynamic shared memory segment, which is not allowed in stand-alone mode.
I think it is better not to use btree index build in parallel in stand-alone mode. My patch is attached below.
Best Regards!
Yulin PEI
Attachments:
standalone_vacuum_full.patchapplication/octet-stream; name=standalone_vacuum_full.patchDownload
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 731610c701..06feb779a6 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -3012,9 +3012,11 @@ index_build(Relation heapRelation,
* only btree has support for parallel builds.
*
* Note that planner considers parallel safety for us.
+ * In stand-alone mode, parallel index building should not be considered since
+ * there is only one process.
*/
if (parallel && IsNormalProcessingMode() &&
- indexRelation->rd_rel->relam == BTREE_AM_OID)
+ indexRelation->rd_rel->relam == BTREE_AM_OID && IsPostmasterEnvironment && !IsBackendStandAlone())
indexInfo->ii_ParallelWorkers =
plan_create_index_workers(RelationGetRelid(heapRelation),
RelationGetRelid(indexRelation));
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 72e3352398..778aec987d 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -405,6 +405,7 @@ extern ProcessingMode Mode;
} while(0)
+#define IsBackendStandAlone() (!IsBootstrapProcessingMode() && !IsPostmasterEnvironment)
/*
* Auxiliary-process type identifiers. These used to be in bootstrap.h
* but it seems saner to have them here, with the ProcessingMode stuff.
On Mon, Nov 30, 2020 at 5:45 PM Yulin PEI <ypeiae@connect.ust.hk> wrote:
Hi hackers,
I found that when running command vaccum full in stand-alone mode there
will be a core dump.
The core stack looks like this:
------------------------------------
backend> vacuum full
TRAP: FailedAssertion("IsUnderPostmaster", File: "dsm.c", Line: 439)
./postgres(ExceptionalCondition+0xac)[0x55c664c36913]
./postgres(dsm_create+0x3c)[0x55c664a79fee]
./postgres(GetSessionDsmHandle+0xdc)[0x55c6645f8296]
./postgres(InitializeParallelDSM+0xf9)[0x55c6646c59ca]
./postgres(+0x16bdef)[0x55c664692def]
./postgres(+0x16951e)[0x55c66469051e]
./postgres(btbuild+0xcc)[0x55c6646903ec]
./postgres(index_build+0x322)[0x55c664719749]
./postgres(reindex_index+0x2ee)[0x55c66471a765]
./postgres(reindex_relation+0x1e5)[0x55c66471acca]
./postgres(finish_heap_swap+0x118)[0x55c6647c8db1]
./postgres(+0x2a0848)[0x55c6647c7848]
./postgres(cluster_rel+0x34e)[0x55c6647c727f]
./postgres(+0x3414cf)[0x55c6648684cf]
./postgres(vacuum+0x4f3)[0x55c664866591]
./postgres(ExecVacuum+0x736)[0x55c66486609b]
./postgres(standard_ProcessUtility+0x840)[0x55c664ab74fc]
./postgres(ProcessUtility+0x131)[0x55c664ab6cb5]
./postgres(+0x58ea69)[0x55c664ab5a69]
./postgres(+0x58ec95)[0x55c664ab5c95]
./postgres(PortalRun+0x307)[0x55c664ab5184]
./postgres(+0x587ef6)[0x55c664aaeef6]
./postgres(PostgresMain+0x819)[0x55c664ab3271]
./postgres(main+0x2e1)[0x55c6648f9df5]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf1)[0x7f65376192e1]
./postgres(_start+0x2a)[0x55c6645df86a]
Aborted (core dumped)
------------------------------------I think that when there is a btree index in a table, vacuum full tries to
rebuild the btree index in a parallel way.This will launch several workers and each of then will try to apply a
dynamic shared memory segment, which is not allowed in stand-alone mode.I think it is better not to use btree index build in parallel in
stand-alone mode. My patch is attached below.
Good catch. This is a bug in parallel index(btree) creation. I could
reproduce this assertion failure with HEAD even by using CREATE INDEX
command.
- indexRelation->rd_rel->relam == BTREE_AM_OID)
+ indexRelation->rd_rel->relam == BTREE_AM_OID && IsPostmasterEnvironment
&& !IsBackendStandAlone())
+#define IsBackendStandAlone() (!IsBootstrapProcessingMode() &&
!IsPostmasterEnvironment)
/*
* Auxiliary-process type identifiers. These used to be in bootstrap.h
* but it seems saner to have them here, with the ProcessingMode stuff.
I think we can use IsUnderPostmaster instead. If it's false we should not
enable parallel index creation.
Regards,
--
Masahiko Sawada
EnterpriseDB: https://www.enterprisedb.com/
Yes, I agree because (IsNormalProcessingMode() ) means that current process is not in bootstrap mode and postmaster process will not build index.
So my new modified patch is attached.
________________________________
发件人: Masahiko Sawada <sawada.mshk@gmail.com>
发送时间: 2020年11月30日 17:27
收件人: Yulin PEI <ypeiae@connect.ust.hk>
抄送: pgsql-hackers@lists.postgresql.org <pgsql-hackers@lists.postgresql.org>
主题: Re: [PATCH] BUG FIX: Core dump could happen when VACUUM FULL in standalone mode
On Mon, Nov 30, 2020 at 5:45 PM Yulin PEI <ypeiae@connect.ust.hk<mailto:ypeiae@connect.ust.hk>> wrote:
Hi hackers,
I found that when running command vaccum full in stand-alone mode there will be a core dump.
The core stack looks like this:
------------------------------------
backend> vacuum full
TRAP: FailedAssertion("IsUnderPostmaster", File: "dsm.c", Line: 439)
./postgres(ExceptionalCondition+0xac)[0x55c664c36913]
./postgres(dsm_create+0x3c)[0x55c664a79fee]
./postgres(GetSessionDsmHandle+0xdc)[0x55c6645f8296]
./postgres(InitializeParallelDSM+0xf9)[0x55c6646c59ca]
./postgres(+0x16bdef)[0x55c664692def]
./postgres(+0x16951e)[0x55c66469051e]
./postgres(btbuild+0xcc)[0x55c6646903ec]
./postgres(index_build+0x322)[0x55c664719749]
./postgres(reindex_index+0x2ee)[0x55c66471a765]
./postgres(reindex_relation+0x1e5)[0x55c66471acca]
./postgres(finish_heap_swap+0x118)[0x55c6647c8db1]
./postgres(+0x2a0848)[0x55c6647c7848]
./postgres(cluster_rel+0x34e)[0x55c6647c727f]
./postgres(+0x3414cf)[0x55c6648684cf]
./postgres(vacuum+0x4f3)[0x55c664866591]
./postgres(ExecVacuum+0x736)[0x55c66486609b]
./postgres(standard_ProcessUtility+0x840)[0x55c664ab74fc]
./postgres(ProcessUtility+0x131)[0x55c664ab6cb5]
./postgres(+0x58ea69)[0x55c664ab5a69]
./postgres(+0x58ec95)[0x55c664ab5c95]
./postgres(PortalRun+0x307)[0x55c664ab5184]
./postgres(+0x587ef6)[0x55c664aaeef6]
./postgres(PostgresMain+0x819)[0x55c664ab3271]
./postgres(main+0x2e1)[0x55c6648f9df5]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf1)[0x7f65376192e1]
./postgres(_start+0x2a)[0x55c6645df86a]
Aborted (core dumped)
------------------------------------
I think that when there is a btree index in a table, vacuum full tries to rebuild the btree index in a parallel way.
This will launch several workers and each of then will try to apply a dynamic shared memory segment, which is not allowed in stand-alone mode.
I think it is better not to use btree index build in parallel in stand-alone mode. My patch is attached below.
Good catch. This is a bug in parallel index(btree) creation. I could reproduce this assertion failure with HEAD even by using CREATE INDEX command.
- indexRelation->rd_rel->relam == BTREE_AM_OID)
+ indexRelation->rd_rel->relam == BTREE_AM_OID && IsPostmasterEnvironment && !IsBackendStandAlone())
+#define IsBackendStandAlone() (!IsBootstrapProcessingMode() && !IsPostmasterEnvironment)
/*
* Auxiliary-process type identifiers. These used to be in bootstrap.h
* but it seems saner to have them here, with the ProcessingMode stuff.
I think we can use IsUnderPostmaster instead. If it's false we should not enable parallel index creation.
Regards,
--
Masahiko Sawada
EnterpriseDB: https://www.enterprisedb.com/
Attachments:
standalone_vacuum_full.patchapplication/octet-stream; name=standalone_vacuum_full.patchDownload
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 731610c701..e6f96ade7b 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -3012,9 +3012,11 @@ index_build(Relation heapRelation,
* only btree has support for parallel builds.
*
* Note that planner considers parallel safety for us.
+ * In stand-alone mode, parallel index building should not be considered since
+ * there is only one process.
*/
if (parallel && IsNormalProcessingMode() &&
- indexRelation->rd_rel->relam == BTREE_AM_OID)
+ indexRelation->rd_rel->relam == BTREE_AM_OID && IsUnderPostmaster)
indexInfo->ii_ParallelWorkers =
plan_create_index_workers(RelationGetRelid(heapRelation),
RelationGetRelid(indexRelation));
Yulin PEI <ypeiae@connect.ust.hk> writes:
Yes, I agree because (IsNormalProcessingMode() ) means that current process is not in bootstrap mode and postmaster process will not build index.
So my new modified patch is attached.
This is a good catch, but the proposed fix still seems pretty random
and unlike how it's done elsewhere. It seems to me that since
index_build() is relying on plan_create_index_workers() to assess
parallel safety, that's where to check IsUnderPostmaster. Moreover,
the existing code in compute_parallel_vacuum_workers (which gets
this right) associates the IsUnderPostmaster check with the initial
check on max_parallel_maintenance_workers. So I think that the
right fix is to adopt the compute_parallel_vacuum_workers coding
in plan_create_index_workers, and thereby create a model for future
uses of max_parallel_maintenance_workers to follow.
regards, tom lane
Attachments:
avoid-parallel-index-build-in-standalone-mode.patchtext/x-diff; charset=us-ascii; name=avoid-parallel-index-build-in-standalone-mode.patchDownload
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 247f7d4625..1a94b58f8b 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -6375,8 +6375,11 @@ plan_create_index_workers(Oid tableOid, Oid indexOid)
double reltuples;
double allvisfrac;
- /* Return immediately when parallelism disabled */
- if (max_parallel_maintenance_workers == 0)
+ /*
+ * We don't allow performing parallel operation in standalone backend or
+ * when parallelism is disabled.
+ */
+ if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
return 0;
/* Set up largely-dummy planner state */
On Tue, Dec 1, 2020 at 3:54 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Yulin PEI <ypeiae@connect.ust.hk> writes:
Yes, I agree because (IsNormalProcessingMode() ) means that current process is not in bootstrap mode and postmaster process will not build index.
So my new modified patch is attached.This is a good catch, but the proposed fix still seems pretty random
and unlike how it's done elsewhere. It seems to me that since
index_build() is relying on plan_create_index_workers() to assess
parallel safety, that's where to check IsUnderPostmaster. Moreover,
the existing code in compute_parallel_vacuum_workers (which gets
this right) associates the IsUnderPostmaster check with the initial
check on max_parallel_maintenance_workers. So I think that the
right fix is to adopt the compute_parallel_vacuum_workers coding
in plan_create_index_workers, and thereby create a model for future
uses of max_parallel_maintenance_workers to follow.
+1
Regards,
--
Masahiko Sawada
EnterpriseDB: https://www.enterprisedb.com/
I think you are right after reading code in compute_parallel_vacuum_workers() :)
________________________________
发件人: Tom Lane <tgl@sss.pgh.pa.us>
发送时间: 2020年12月1日 2:54
收件人: Yulin PEI <ypeiae@connect.ust.hk>
抄送: Masahiko Sawada <sawada.mshk@gmail.com>; pgsql-hackers@lists.postgresql.org <pgsql-hackers@lists.postgresql.org>
主题: Re: 回复: [PATCH] BUG FIX: Core dump could happen when VACUUM FULL in standalone mode
Yulin PEI <ypeiae@connect.ust.hk> writes:
Yes, I agree because (IsNormalProcessingMode() ) means that current process is not in bootstrap mode and postmaster process will not build index.
So my new modified patch is attached.
This is a good catch, but the proposed fix still seems pretty random
and unlike how it's done elsewhere. It seems to me that since
index_build() is relying on plan_create_index_workers() to assess
parallel safety, that's where to check IsUnderPostmaster. Moreover,
the existing code in compute_parallel_vacuum_workers (which gets
this right) associates the IsUnderPostmaster check with the initial
check on max_parallel_maintenance_workers. So I think that the
right fix is to adopt the compute_parallel_vacuum_workers coding
in plan_create_index_workers, and thereby create a model for future
uses of max_parallel_maintenance_workers to follow.
regards, tom lane