[PATCH] BUG FIX: Core dump could happen when VACUUM FULL in standalone mode

Started by Yulin PEIabout 5 years ago6 messages
#1Yulin PEI
ypeiae@connect.ust.hk
1 attachment(s)

Hi hackers,

I found that when running command vaccum full in stand-alone mode there will be a core dump.
The core stack looks like this:
------------------------------------
backend> vacuum full
TRAP: FailedAssertion("IsUnderPostmaster", File: "dsm.c", Line: 439)
./postgres(ExceptionalCondition+0xac)[0x55c664c36913]
./postgres(dsm_create+0x3c)[0x55c664a79fee]
./postgres(GetSessionDsmHandle+0xdc)[0x55c6645f8296]
./postgres(InitializeParallelDSM+0xf9)[0x55c6646c59ca]
./postgres(+0x16bdef)[0x55c664692def]
./postgres(+0x16951e)[0x55c66469051e]
./postgres(btbuild+0xcc)[0x55c6646903ec]
./postgres(index_build+0x322)[0x55c664719749]
./postgres(reindex_index+0x2ee)[0x55c66471a765]
./postgres(reindex_relation+0x1e5)[0x55c66471acca]
./postgres(finish_heap_swap+0x118)[0x55c6647c8db1]
./postgres(+0x2a0848)[0x55c6647c7848]
./postgres(cluster_rel+0x34e)[0x55c6647c727f]
./postgres(+0x3414cf)[0x55c6648684cf]
./postgres(vacuum+0x4f3)[0x55c664866591]
./postgres(ExecVacuum+0x736)[0x55c66486609b]
./postgres(standard_ProcessUtility+0x840)[0x55c664ab74fc]
./postgres(ProcessUtility+0x131)[0x55c664ab6cb5]
./postgres(+0x58ea69)[0x55c664ab5a69]
./postgres(+0x58ec95)[0x55c664ab5c95]
./postgres(PortalRun+0x307)[0x55c664ab5184]
./postgres(+0x587ef6)[0x55c664aaeef6]
./postgres(PostgresMain+0x819)[0x55c664ab3271]
./postgres(main+0x2e1)[0x55c6648f9df5]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf1)[0x7f65376192e1]
./postgres(_start+0x2a)[0x55c6645df86a]
Aborted (core dumped)
------------------------------------

I think that when there is a btree index in a table, vacuum full tries to rebuild the btree index in a parallel way.

This will launch several workers and each of then will try to apply a dynamic shared memory segment, which is not allowed in stand-alone mode.

I think it is better not to use btree index build in parallel in stand-alone mode. My patch is attached below.

Best Regards!
Yulin PEI

Attachments:

standalone_vacuum_full.patchapplication/octet-stream; name=standalone_vacuum_full.patchDownload
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 731610c701..06feb779a6 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -3012,9 +3012,11 @@ index_build(Relation heapRelation,
 	 * only btree has support for parallel builds.
 	 *
 	 * Note that planner considers parallel safety for us.
+	 * In stand-alone mode, parallel index building should not be considered since
+	 * there is only one process.
 	 */
 	if (parallel && IsNormalProcessingMode() &&
-		indexRelation->rd_rel->relam == BTREE_AM_OID)
+		indexRelation->rd_rel->relam == BTREE_AM_OID && IsPostmasterEnvironment && !IsBackendStandAlone())
 		indexInfo->ii_ParallelWorkers =
 			plan_create_index_workers(RelationGetRelid(heapRelation),
 									  RelationGetRelid(indexRelation));
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 72e3352398..778aec987d 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -405,6 +405,7 @@ extern ProcessingMode Mode;
 	} while(0)
 
 
+#define IsBackendStandAlone()	(!IsBootstrapProcessingMode() && !IsPostmasterEnvironment)
 /*
  * Auxiliary-process type identifiers.  These used to be in bootstrap.h
  * but it seems saner to have them here, with the ProcessingMode stuff.
#2Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Yulin PEI (#1)
Re: [PATCH] BUG FIX: Core dump could happen when VACUUM FULL in standalone mode

On Mon, Nov 30, 2020 at 5:45 PM Yulin PEI <ypeiae@connect.ust.hk> wrote:

Hi hackers,

I found that when running command vaccum full in stand-alone mode there
will be a core dump.
The core stack looks like this:
------------------------------------
backend> vacuum full
TRAP: FailedAssertion("IsUnderPostmaster", File: "dsm.c", Line: 439)
./postgres(ExceptionalCondition+0xac)[0x55c664c36913]
./postgres(dsm_create+0x3c)[0x55c664a79fee]
./postgres(GetSessionDsmHandle+0xdc)[0x55c6645f8296]
./postgres(InitializeParallelDSM+0xf9)[0x55c6646c59ca]
./postgres(+0x16bdef)[0x55c664692def]
./postgres(+0x16951e)[0x55c66469051e]
./postgres(btbuild+0xcc)[0x55c6646903ec]
./postgres(index_build+0x322)[0x55c664719749]
./postgres(reindex_index+0x2ee)[0x55c66471a765]
./postgres(reindex_relation+0x1e5)[0x55c66471acca]
./postgres(finish_heap_swap+0x118)[0x55c6647c8db1]
./postgres(+0x2a0848)[0x55c6647c7848]
./postgres(cluster_rel+0x34e)[0x55c6647c727f]
./postgres(+0x3414cf)[0x55c6648684cf]
./postgres(vacuum+0x4f3)[0x55c664866591]
./postgres(ExecVacuum+0x736)[0x55c66486609b]
./postgres(standard_ProcessUtility+0x840)[0x55c664ab74fc]
./postgres(ProcessUtility+0x131)[0x55c664ab6cb5]
./postgres(+0x58ea69)[0x55c664ab5a69]
./postgres(+0x58ec95)[0x55c664ab5c95]
./postgres(PortalRun+0x307)[0x55c664ab5184]
./postgres(+0x587ef6)[0x55c664aaeef6]
./postgres(PostgresMain+0x819)[0x55c664ab3271]
./postgres(main+0x2e1)[0x55c6648f9df5]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf1)[0x7f65376192e1]
./postgres(_start+0x2a)[0x55c6645df86a]
Aborted (core dumped)
------------------------------------

I think that when there is a btree index in a table, vacuum full tries to
rebuild the btree index in a parallel way.

This will launch several workers and each of then will try to apply a
dynamic shared memory segment, which is not allowed in stand-alone mode.

I think it is better not to use btree index build in parallel in
stand-alone mode. My patch is attached below.

Good catch. This is a bug in parallel index(btree) creation. I could
reproduce this assertion failure with HEAD even by using CREATE INDEX
command.

- indexRelation->rd_rel->relam == BTREE_AM_OID)
+ indexRelation->rd_rel->relam == BTREE_AM_OID && IsPostmasterEnvironment
&& !IsBackendStandAlone())

+#define IsBackendStandAlone() (!IsBootstrapProcessingMode() &&
!IsPostmasterEnvironment)
/*
* Auxiliary-process type identifiers. These used to be in bootstrap.h
* but it seems saner to have them here, with the ProcessingMode stuff.

I think we can use IsUnderPostmaster instead. If it's false we should not
enable parallel index creation.

Regards,

--
Masahiko Sawada
EnterpriseDB: https://www.enterprisedb.com/

#3Yulin PEI
ypeiae@connect.ust.hk
In reply to: Masahiko Sawada (#2)
1 attachment(s)
回复: [PATCH] BUG FIX: Core dump could happen when VACUUM FULL in standalone mode

Yes, I agree because (IsNormalProcessingMode() ) means that current process is not in bootstrap mode and postmaster process will not build index.
So my new modified patch is attached.

________________________________
发件人: Masahiko Sawada <sawada.mshk@gmail.com>
发送时间: 2020年11月30日 17:27
收件人: Yulin PEI <ypeiae@connect.ust.hk>
抄送: pgsql-hackers@lists.postgresql.org <pgsql-hackers@lists.postgresql.org>
主题: Re: [PATCH] BUG FIX: Core dump could happen when VACUUM FULL in standalone mode

On Mon, Nov 30, 2020 at 5:45 PM Yulin PEI <ypeiae@connect.ust.hk<mailto:ypeiae@connect.ust.hk>> wrote:
Hi hackers,

I found that when running command vaccum full in stand-alone mode there will be a core dump.
The core stack looks like this:
------------------------------------
backend> vacuum full
TRAP: FailedAssertion("IsUnderPostmaster", File: "dsm.c", Line: 439)
./postgres(ExceptionalCondition+0xac)[0x55c664c36913]
./postgres(dsm_create+0x3c)[0x55c664a79fee]
./postgres(GetSessionDsmHandle+0xdc)[0x55c6645f8296]
./postgres(InitializeParallelDSM+0xf9)[0x55c6646c59ca]
./postgres(+0x16bdef)[0x55c664692def]
./postgres(+0x16951e)[0x55c66469051e]
./postgres(btbuild+0xcc)[0x55c6646903ec]
./postgres(index_build+0x322)[0x55c664719749]
./postgres(reindex_index+0x2ee)[0x55c66471a765]
./postgres(reindex_relation+0x1e5)[0x55c66471acca]
./postgres(finish_heap_swap+0x118)[0x55c6647c8db1]
./postgres(+0x2a0848)[0x55c6647c7848]
./postgres(cluster_rel+0x34e)[0x55c6647c727f]
./postgres(+0x3414cf)[0x55c6648684cf]
./postgres(vacuum+0x4f3)[0x55c664866591]
./postgres(ExecVacuum+0x736)[0x55c66486609b]
./postgres(standard_ProcessUtility+0x840)[0x55c664ab74fc]
./postgres(ProcessUtility+0x131)[0x55c664ab6cb5]
./postgres(+0x58ea69)[0x55c664ab5a69]
./postgres(+0x58ec95)[0x55c664ab5c95]
./postgres(PortalRun+0x307)[0x55c664ab5184]
./postgres(+0x587ef6)[0x55c664aaeef6]
./postgres(PostgresMain+0x819)[0x55c664ab3271]
./postgres(main+0x2e1)[0x55c6648f9df5]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf1)[0x7f65376192e1]
./postgres(_start+0x2a)[0x55c6645df86a]
Aborted (core dumped)
------------------------------------

I think that when there is a btree index in a table, vacuum full tries to rebuild the btree index in a parallel way.

This will launch several workers and each of then will try to apply a dynamic shared memory segment, which is not allowed in stand-alone mode.

I think it is better not to use btree index build in parallel in stand-alone mode. My patch is attached below.

Good catch. This is a bug in parallel index(btree) creation. I could reproduce this assertion failure with HEAD even by using CREATE INDEX command.

- indexRelation->rd_rel->relam == BTREE_AM_OID)
+ indexRelation->rd_rel->relam == BTREE_AM_OID && IsPostmasterEnvironment && !IsBackendStandAlone())

+#define IsBackendStandAlone() (!IsBootstrapProcessingMode() && !IsPostmasterEnvironment)
/*
* Auxiliary-process type identifiers. These used to be in bootstrap.h
* but it seems saner to have them here, with the ProcessingMode stuff.

I think we can use IsUnderPostmaster instead. If it's false we should not enable parallel index creation.

Regards,

--
Masahiko Sawada
EnterpriseDB: https://www.enterprisedb.com/

Attachments:

standalone_vacuum_full.patchapplication/octet-stream; name=standalone_vacuum_full.patchDownload
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 731610c701..e6f96ade7b 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -3012,9 +3012,11 @@ index_build(Relation heapRelation,
 	 * only btree has support for parallel builds.
 	 *
 	 * Note that planner considers parallel safety for us.
+	 * In stand-alone mode, parallel index building should not be considered since
+	 * there is only one process.
 	 */
 	if (parallel && IsNormalProcessingMode() &&
-		indexRelation->rd_rel->relam == BTREE_AM_OID)
+		indexRelation->rd_rel->relam == BTREE_AM_OID && IsUnderPostmaster)
 		indexInfo->ii_ParallelWorkers =
 			plan_create_index_workers(RelationGetRelid(heapRelation),
 									  RelationGetRelid(indexRelation));
#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Yulin PEI (#3)
1 attachment(s)
Re: 回复: [PATCH] BUG FIX: Core dump could happen when VACUUM FULL in standalone mode

Yulin PEI <ypeiae@connect.ust.hk> writes:

Yes, I agree because (IsNormalProcessingMode() ) means that current process is not in bootstrap mode and postmaster process will not build index.
So my new modified patch is attached.

This is a good catch, but the proposed fix still seems pretty random
and unlike how it's done elsewhere. It seems to me that since
index_build() is relying on plan_create_index_workers() to assess
parallel safety, that's where to check IsUnderPostmaster. Moreover,
the existing code in compute_parallel_vacuum_workers (which gets
this right) associates the IsUnderPostmaster check with the initial
check on max_parallel_maintenance_workers. So I think that the
right fix is to adopt the compute_parallel_vacuum_workers coding
in plan_create_index_workers, and thereby create a model for future
uses of max_parallel_maintenance_workers to follow.

regards, tom lane

Attachments:

avoid-parallel-index-build-in-standalone-mode.patchtext/x-diff; charset=us-ascii; name=avoid-parallel-index-build-in-standalone-mode.patchDownload
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 247f7d4625..1a94b58f8b 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -6375,8 +6375,11 @@ plan_create_index_workers(Oid tableOid, Oid indexOid)
 	double		reltuples;
 	double		allvisfrac;
 
-	/* Return immediately when parallelism disabled */
-	if (max_parallel_maintenance_workers == 0)
+	/*
+	 * We don't allow performing parallel operation in standalone backend or
+	 * when parallelism is disabled.
+	 */
+	if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
 		return 0;
 
 	/* Set up largely-dummy planner state */
#5Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Tom Lane (#4)
Re: 回复: [PATCH] BUG FIX: Core dump could happen when VACUUM FULL in standalone mode

On Tue, Dec 1, 2020 at 3:54 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Yulin PEI <ypeiae@connect.ust.hk> writes:

Yes, I agree because (IsNormalProcessingMode() ) means that current process is not in bootstrap mode and postmaster process will not build index.
So my new modified patch is attached.

This is a good catch, but the proposed fix still seems pretty random
and unlike how it's done elsewhere. It seems to me that since
index_build() is relying on plan_create_index_workers() to assess
parallel safety, that's where to check IsUnderPostmaster. Moreover,
the existing code in compute_parallel_vacuum_workers (which gets
this right) associates the IsUnderPostmaster check with the initial
check on max_parallel_maintenance_workers. So I think that the
right fix is to adopt the compute_parallel_vacuum_workers coding
in plan_create_index_workers, and thereby create a model for future
uses of max_parallel_maintenance_workers to follow.

+1

Regards,

--
Masahiko Sawada
EnterpriseDB: https://www.enterprisedb.com/

#6Yulin PEI
ypeiae@connect.ust.hk
In reply to: Tom Lane (#4)
回复: 回复: [PATCH] BUG FIX: Core dump could happen when VACUUM FULL in standalone mode

I think you are right after reading code in compute_parallel_vacuum_workers() :)

________________________________
发件人: Tom Lane <tgl@sss.pgh.pa.us>
发送时间: 2020年12月1日 2:54
收件人: Yulin PEI <ypeiae@connect.ust.hk>
抄送: Masahiko Sawada <sawada.mshk@gmail.com>; pgsql-hackers@lists.postgresql.org <pgsql-hackers@lists.postgresql.org>
主题: Re: 回复: [PATCH] BUG FIX: Core dump could happen when VACUUM FULL in standalone mode

Yulin PEI <ypeiae@connect.ust.hk> writes:

Yes, I agree because (IsNormalProcessingMode() ) means that current process is not in bootstrap mode and postmaster process will not build index.
So my new modified patch is attached.

This is a good catch, but the proposed fix still seems pretty random
and unlike how it's done elsewhere. It seems to me that since
index_build() is relying on plan_create_index_workers() to assess
parallel safety, that's where to check IsUnderPostmaster. Moreover,
the existing code in compute_parallel_vacuum_workers (which gets
this right) associates the IsUnderPostmaster check with the initial
check on max_parallel_maintenance_workers. So I think that the
right fix is to adopt the compute_parallel_vacuum_workers coding
in plan_create_index_workers, and thereby create a model for future
uses of max_parallel_maintenance_workers to follow.

regards, tom lane