BUG #17339: Assert failed on attempt to detach a sequence concurrently

Started by PG Bug reporting formover 4 years ago7 messagesbugs
Jump to latest
#1PG Bug reporting form
noreply@postgresql.org

The following bug has been logged on the website:

Bug reference: 17339
Logged by: Alexander Lakhin
Email address: exclusion@gmail.com
PostgreSQL version: 14.1
Operating system: Ubuntu 20.04
Description:

The following query:
CREATE SEQUENCE seq;
CREATE TABLE range_parted(a int) PARTITION BY RANGE(a);
ALTER TABLE range_parted DETACH PARTITION seq CONCURRENTLY;

Leads to a failed assertion with the following stacktrace:
Core was generated by `postgres: law regression [local] ALTER TABLE
'.
Program terminated with signal SIGABRT, Aborted.
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
50 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1 0x00007f270bc93859 in __GI_abort () at abort.c:79
#2 0x000056140a973f94 in ExceptionalCondition (
conditionName=0x56140aaf2260 "child_rel->rd_rel->relkind ==
RELKIND_RELATION || child_rel->rd_rel->relkind ==
RELKIND_PARTITIONED_TABLE", errorType=0x56140aaed143 "FailedAssertion",
fileName=0x56140aaed6b7 "tablecmds.c",
lineNumber=14790) at assert.c:69
#3 0x000056140a4ed790 in MarkInheritDetached (child_rel=0x7f26fcb7f7d8,
parent_rel=0x7f26fcb7e658) at tablecmds.c:14790
#4 0x000056140a4f4e9f in ATExecDetachPartition (wqueue=0x7ffcb3947328,
tab=0x56140bc9d558, rel=0x7f26fcb7e658,
name=0x56140bca1ed0, concurrent=true) at tablecmds.c:17785
#5 0x000056140a4d5cab in ATExecCmd (wqueue=0x7ffcb3947328,
tab=0x56140bc9d558, cmd=0x56140bc9ae18, lockmode=4,
cur_pass=10, context=0x7ffcb3947520) at tablecmds.c:5134
#6 0x000056140a4d4bf1 in ATRewriteCatalogs (wqueue=0x7ffcb3947328,
lockmode=4, context=0x7ffcb3947520)
at tablecmds.c:4792
#7 0x000056140a4d4030 in ATController (parsetree=0x56140bc79e20,
rel=0x7f26fcb7e658, cmds=0x56140bc79e58,
recurse=true, lockmode=4, context=0x7ffcb3947520) at tablecmds.c:4388
#8 0x000056140a4d3c37 in AlterTable (stmt=0x56140bc79e20, lockmode=4,
context=0x7ffcb3947520) at tablecmds.c:4035
#9 0x000056140a7bac4a in ProcessUtilitySlow (pstate=0x56140bc9ab90,
pstmt=0x56140bc7a1c8,
queryString=0x56140bc79280 "ALTER TABLE range_parted DETACH PARTITION
seq CONCURRENTLY;",
context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0,
dest=0x56140bc7a2b8, qc=0x7ffcb3947b70)
at utility.c:1317
#10 0x000056140a7ba4f1 in standard_ProcessUtility (pstmt=0x56140bc7a1c8,
queryString=0x56140bc79280 "ALTER TABLE range_parted DETACH PARTITION
seq CONCURRENTLY;", readOnlyTree=false,
context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0,
dest=0x56140bc7a2b8, qc=0x7ffcb3947b70)
at utility.c:1066
#11 0x000056140a7b9461 in ProcessUtility (pstmt=0x56140bc7a1c8,
queryString=0x56140bc79280 "ALTER TABLE range_parted DETACH PARTITION
seq CONCURRENTLY;", readOnlyTree=false,
context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0,
dest=0x56140bc7a2b8, qc=0x7ffcb3947b70) at utility.c:527
#12 0x000056140a7b7d00 in PortalRunUtility (portal=0x56140bce8180,
pstmt=0x56140bc7a1c8, isTopLevel=true,
setHoldSnapshot=false, dest=0x56140bc7a2b8, qc=0x7ffcb3947b70) at
pquery.c:1155
#13 0x000056140a7b7f8b in PortalRunMulti (portal=0x56140bce8180,
isTopLevel=true, setHoldSnapshot=false,
dest=0x56140bc7a2b8, altdest=0x56140bc7a2b8, qc=0x7ffcb3947b70) at
pquery.c:1312
#14 0x000056140a7b73ba in PortalRun (portal=0x56140bce8180,
count=9223372036854775807, isTopLevel=true, run_once=true,
dest=0x56140bc7a2b8, altdest=0x56140bc7a2b8, qc=0x7ffcb3947b70) at
pquery.c:788
#15 0x000056140a7b01f6 in exec_simple_query (
query_string=0x56140bc79280 "ALTER TABLE range_parted DETACH PARTITION
seq CONCURRENTLY;") at postgres.c:1214
#16 0x000056140a7b50e5 in PostgresMain (argc=1, argv=0x7ffcb3947d90,
dbname=0x56140bca53d8 "regression",
username=0x56140bca53b8 "law") at postgres.c:4486
#17 0x000056140a6d9974 in BackendRun (port=0x56140bc9ccf0) at
postmaster.c:4530
#18 0x000056140a6d91cf in BackendStartup (port=0x56140bc9ccf0) at
postmaster.c:4252
#19 0x000056140a6d4fc4 in ServerLoop () at postmaster.c:1745
#20 0x000056140a6d4721 in PostmasterMain (argc=3, argv=0x56140bc73540) at
postmaster.c:1417
#21 0x000056140a5c3f0c in main (argc=3, argv=0x56140bc73540) at main.c:209

without CONCURRENTLY I get:
ERROR: relation "seq" is not a partition of relation "range_parted"
This error is also returned on a build without asserts.

#2Michael Paquier
michael@paquier.xyz
In reply to: PG Bug reporting form (#1)
Re: BUG #17339: Assert failed on attempt to detach a sequence concurrently

On Sun, Dec 19, 2021 at 06:00:02AM +0000, PG Bug reporting form wrote:

CREATE SEQUENCE seq;
CREATE TABLE range_parted(a int) PARTITION BY RANGE(a);
ALTER TABLE range_parted DETACH PARTITION seq CONCURRENTLY;

The same error happens additionally for views or materialized views.
Looking at the code, I think that we should just apply
ATSimplePermissions() on (ATT_TABLE | ATT_FOREIGN_TABLE) when
executing the detach command to check for the supported relkinds.
That would make the logic consistent with the attach code path that
does the same check on the partition attached, while generating an
error message already generic enough for this purpose.

Attached is a patch, with some regression tests.
--
Michael

Attachments:

partition-detach-assert.patchtext/x-diff; charset=us-asciiDownload+35-0
#3Peter Eisentraut
peter_e@gmx.net
In reply to: Michael Paquier (#2)
Re: BUG #17339: Assert failed on attempt to detach a sequence concurrently

On 20.12.21 13:38, Michael Paquier wrote:

On Sun, Dec 19, 2021 at 06:00:02AM +0000, PG Bug reporting form wrote:

CREATE SEQUENCE seq;
CREATE TABLE range_parted(a int) PARTITION BY RANGE(a);
ALTER TABLE range_parted DETACH PARTITION seq CONCURRENTLY;

The same error happens additionally for views or materialized views.
Looking at the code, I think that we should just apply
ATSimplePermissions() on (ATT_TABLE | ATT_FOREIGN_TABLE) when
executing the detach command to check for the supported relkinds.
That would make the logic consistent with the attach code path that
does the same check on the partition attached, while generating an
error message already generic enough for this purpose.

Attached is a patch, with some regression tests.

Is it possible for child tables in partitioned tables to have different
ownerships? Then this change would introduce a new failure mode.

#4Michael Paquier
michael@paquier.xyz
In reply to: Peter Eisentraut (#3)
Re: BUG #17339: Assert failed on attempt to detach a sequence concurrently

On Mon, Dec 20, 2021 at 09:57:47PM +0100, Peter Eisentraut wrote:

On 20.12.21 13:38, Michael Paquier wrote:

Attached is a patch, with some regression tests.

Is it possible for child tables in partitioned tables to have different
ownerships? Then this change would introduce a new failure mode.

Hmm. Yes, you are right here. It is possible to change the ownership
of a partition after it gets attached, so this could cause a
regression. I recalled that this was not possible, so my memories
were wrong.

At the end, perhaps we should just remove the assertion that assumes
which relkind is right for the partition, as of:
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -15013,8 +15013,6 @@ MarkInheritDetached(Relation child_rel,
Relation parent_rel)
    HeapTuple   inheritsTuple;
        bool        found = false;

- Assert(child_rel->rd_rel->relkind == RELKIND_RELATION ||
- child_rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE);
Assert(parent_rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE);

This would also fail when attempting to detach a foreign table, as
well, and these are legal relkinds in a partition tree. Once we do
that, we fall down into the same failure as for the non-concurrent
mode in RemoveInheritance(), telling that the relation is not a member
of the partition tree.
--
Michael

#5Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Michael Paquier (#4)
Re: BUG #17339: Assert failed on attempt to detach a sequence concurrently

At Tue, 21 Dec 2021 09:30:42 +0900, Michael Paquier <michael@paquier.xyz> wrote in

On Mon, Dec 20, 2021 at 09:57:47PM +0100, Peter Eisentraut wrote:

On 20.12.21 13:38, Michael Paquier wrote:

Attached is a patch, with some regression tests.

Is it possible for child tables in partitioned tables to have different
ownerships? Then this change would introduce a new failure mode.

Hmm. Yes, you are right here. It is possible to change the ownership
of a partition after it gets attached, so this could cause a
regression. I recalled that this was not possible, so my memories
were wrong.

At the end, perhaps we should just remove the assertion that assumes
which relkind is right for the partition, as of:
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -15013,8 +15013,6 @@ MarkInheritDetached(Relation child_rel,
Relation parent_rel)
HeapTuple   inheritsTuple;
bool        found = false;

- Assert(child_rel->rd_rel->relkind == RELKIND_RELATION ||
- child_rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE);
Assert(parent_rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE);

This would also fail when attempting to detach a foreign table, as
well, and these are legal relkinds in a partition tree. Once we do
that, we fall down into the same failure as for the non-concurrent
mode in RemoveInheritance(), telling that the relation is not a member
of the partition tree.

I agree to the discussion above thus it seems to me right that we just
remove the assertion. We never have a partition of un-attachable
relkind so such kind of relations are surely rejected just after. The
assertion on "== RELKIND_PARTITIONED_TABLE" is still valid since we
don't reach there for the case of the traditional inheritance.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#6Alexander Lakhin
exclusion@gmail.com
In reply to: Kyotaro Horiguchi (#5)
Re: BUG #17339: Assert failed on attempt to detach a sequence concurrently

Hello Michael and Kyotaro-san,
21.12.2021 06:03, Kyotaro Horiguchi wrote:

This would also fail when attempting to detach a foreign table, as
well, and these are legal relkinds in a partition tree. Once we do
that, we fall down into the same failure as for the non-concurrent
mode in RemoveInheritance(), telling that the relation is not a member
of the partition tree.

I agree to the discussion above thus it seems to me right that we just
remove the assertion. We never have a partition of un-attachable
relkind so such kind of relations are surely rejected just after. The
assertion on "== RELKIND_PARTITIONED_TABLE" is still valid since we
don't reach there for the case of the traditional inheritance.

As detaching a foreign table is a valid scenario (not covered before),
maybe it's worth to exercise it as following:

--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -1490,6 +1490,8 @@ CREATE FOREIGN TABLE foreign_tbl (a int, b int)
 
 CREATE TABLE parent_tbl (a int, b int) PARTITION BY RANGE(a);
 ALTER TABLE parent_tbl ATTACH PARTITION foreign_tbl FOR VALUES FROM (0)
TO (100);
+ALTER TABLE parent_tbl DETACH PARTITION foreign_tbl CONCURRENTLY;
+ALTER TABLE parent_tbl ATTACH PARTITION foreign_tbl FOR VALUES FROM (0)
TO (100);
 
 CREATE VIEW rw_view AS SELECT * FROM parent_tbl
   WHERE a < b WITH CHECK OPTION;

Thanks!

Best regards,
Alexander

#7Michael Paquier
michael@paquier.xyz
In reply to: Alexander Lakhin (#6)
Re: BUG #17339: Assert failed on attempt to detach a sequence concurrently

On Tue, Dec 21, 2021 at 07:00:00AM +0300, Alexander Lakhin wrote:

As detaching a foreign table is a valid scenario (not covered before),
maybe it's worth to exercise it as following:

Yeah, I was going to add something in the main regression test suite
as of foreign_data.sql, but the detached partitions are used with
constraint checks so that was a bit confusing. What you are
suggesting is fine by me, so I have included these and pushed a fix
for the issue.
--
Michael