bug: virtual generated column can be partition key

Started by jian he9 months ago18 messages
#1jian he
jian.universality@gmail.com
1 attachment(s)

hi.
While trying to make the virtual generated column be part of the partition key,
I found this bug.
it also influences the stored generated column, i added a test
on generated_stored.sql.

CREATE TABLE gtest_part_key (
f1 date NOT NULL, f2 bigint,
f3 bigint GENERATED ALWAYS AS (f2 * 2) VIRTUAL)
PARTITION BY RANGE (f3);

ERROR: cannot use generated column in partition key
LINE 4: PARTITION BY RANGE (f3);
^
DETAIL: Column "f3" is a generated column.

the following is essentially the same as above, it should also fail.

CREATE TABLE gtest_part_key (
f1 date NOT NULL, f2 bigint,
f3 bigint GENERATED ALWAYS AS (f2 * 2) VIRTUAL)
PARTITION BY RANGE ((f3));

Attachments:

v1-0001-virtual-generated-column-can-be-partition-key.patchtext/x-patch; charset=US-ASCII; name=v1-0001-virtual-generated-column-can-be-partition-key.patchDownload
From 10f8c6a1c59a31c2ba6d77d69fdc740e094d9cd3 Mon Sep 17 00:00:00 2001
From: jian he <jian.universality@gmail.com>
Date: Mon, 21 Apr 2025 10:28:26 +0800
Subject: [PATCH v1 1/1] virtual generated column can be partition key

CREATE TABLE gtest_part_key (
    f1 date NOT NULL, f2 bigint,
    f3 bigint GENERATED ALWAYS AS (f2 * 2) VIRTUAL)
    PARTITION BY RANGE (f3);

ERROR:  cannot use generated column in partition key
LINE 4:     PARTITION BY RANGE (f3);
                                ^
DETAIL:  Column "f3" is a generated column.

the following is essentially the same as above, it should also fail.

CREATE TABLE gtest_part_key (
    f1 date NOT NULL, f2 bigint,
    f3 bigint GENERATED ALWAYS AS (f2 * 2) VIRTUAL)
    PARTITION BY RANGE ((f3));
---
 src/backend/commands/tablecmds.c                | 12 +++++++++++-
 src/test/regress/expected/generated_stored.out  |  5 +++++
 src/test/regress/expected/generated_virtual.out |  5 +++++
 src/test/regress/sql/generated_stored.sql       |  1 +
 src/test/regress/sql/generated_virtual.sql      |  1 +
 5 files changed, 23 insertions(+), 1 deletion(-)

diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 265b1c397fb..4b96cd73ba4 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -19794,11 +19794,21 @@ ComputePartitionAttrs(ParseState *pstate, Relation rel, List *partParams, AttrNu
 			if (IsA(expr, Var) &&
 				((Var *) expr)->varattno > 0)
 			{
+				Var		   *var = (Var *) expr;
+
+				if (TupleDescAttr(RelationGetDescr(rel), var->varattno)->attgenerated)
+					ereport(ERROR,
+							errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
+							errmsg("cannot use generated column in partition key"),
+							errdetail("Column \"%s\" is a generated column.",
+									  get_attname(RelationGetRelid(rel), var->varattno, false)),
+							parser_errposition(pstate, pelem->location));
+
 				/*
 				 * User wrote "(column)" or "(column COLLATE something)".
 				 * Treat it like simple attribute anyway.
 				 */
-				partattrs[attn] = ((Var *) expr)->varattno;
+				partattrs[attn] = var->varattno;
 			}
 			else
 			{
diff --git a/src/test/regress/expected/generated_stored.out b/src/test/regress/expected/generated_stored.out
index 16de30ab191..72f39f2f0c1 100644
--- a/src/test/regress/expected/generated_stored.out
+++ b/src/test/regress/expected/generated_stored.out
@@ -1074,6 +1074,11 @@ ERROR:  cannot use generated column in partition key
 LINE 1: ...ENERATED ALWAYS AS (f2 * 2) STORED) PARTITION BY RANGE (f3);
                                                                    ^
 DETAIL:  Column "f3" is a generated column.
+CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint GENERATED ALWAYS AS (f2 * 2) STORED) PARTITION BY RANGE ((f3));
+ERROR:  cannot use generated column in partition key
+LINE 1: ...ERATED ALWAYS AS (f2 * 2) STORED) PARTITION BY RANGE ((f3));
+                                                                 ^
+DETAIL:  Column "f3" is a generated column.
 CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint GENERATED ALWAYS AS (f2 * 2) STORED) PARTITION BY RANGE ((f3 * 3));
 ERROR:  cannot use generated column in partition key
 LINE 1: ...ED ALWAYS AS (f2 * 2) STORED) PARTITION BY RANGE ((f3 * 3));
diff --git a/src/test/regress/expected/generated_virtual.out b/src/test/regress/expected/generated_virtual.out
index 6300e7c1d96..2a6dc84b8ca 100644
--- a/src/test/regress/expected/generated_virtual.out
+++ b/src/test/regress/expected/generated_virtual.out
@@ -1027,6 +1027,11 @@ ERROR:  cannot use generated column in partition key
 LINE 1: ...NERATED ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE (f3);
                                                                    ^
 DETAIL:  Column "f3" is a generated column.
+CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint GENERATED ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE ((f3));
+ERROR:  cannot use generated column in partition key
+LINE 1: ...RATED ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE ((f3));
+                                                                 ^
+DETAIL:  Column "f3" is a generated column.
 CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint GENERATED ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE ((f3 * 3));
 ERROR:  cannot use generated column in partition key
 LINE 1: ...D ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE ((f3 * 3));
diff --git a/src/test/regress/sql/generated_stored.sql b/src/test/regress/sql/generated_stored.sql
index 4ec155f2da9..c9427c2d4d8 100644
--- a/src/test/regress/sql/generated_stored.sql
+++ b/src/test/regress/sql/generated_stored.sql
@@ -500,6 +500,7 @@ SELECT tableoid::regclass, * FROM gtest_parent ORDER BY 1, 2, 3;
 
 -- generated columns in partition key (not allowed)
 CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint GENERATED ALWAYS AS (f2 * 2) STORED) PARTITION BY RANGE (f3);
+CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint GENERATED ALWAYS AS (f2 * 2) STORED) PARTITION BY RANGE ((f3));
 CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint GENERATED ALWAYS AS (f2 * 2) STORED) PARTITION BY RANGE ((f3 * 3));
 
 -- ALTER TABLE ... ADD COLUMN
diff --git a/src/test/regress/sql/generated_virtual.sql b/src/test/regress/sql/generated_virtual.sql
index b4eedeee2fb..652057bd707 100644
--- a/src/test/regress/sql/generated_virtual.sql
+++ b/src/test/regress/sql/generated_virtual.sql
@@ -534,6 +534,7 @@ SELECT tableoid::regclass, * FROM gtest_parent ORDER BY 1, 2, 3;
 
 -- generated columns in partition key (not allowed)
 CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint GENERATED ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE (f3);
+CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint GENERATED ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE ((f3));
 CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint GENERATED ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE ((f3 * 3));
 
 -- ALTER TABLE ... ADD COLUMN
-- 
2.34.1

#2Fujii Masao
masao.fujii@oss.nttdata.com
In reply to: jian he (#1)
Re: bug: virtual generated column can be partition key

On 2025/04/21 11:30, jian he wrote:

hi.
While trying to make the virtual generated column be part of the partition key,
I found this bug.

I haven't looked at the patch in detail yet, but when I applied it
and ran the regression tests with RELCACHE_FORCE_RELEASE and
CATCACHE_FORCE_RELEASE enabled, the tests failed with the following diff:

----------------------------
========= Contents of ./src/test/regress/regression.diffs
diff -U3 /home/runner/work/postgresql/postgresql/src/test/regress/expected/create_table.out /home/runner/work/postgresql/postgresql/src/test/regress/results/create_table.out
--- /home/runner/work/postgresql/postgresql/src/test/regress/expected/create_table.out	2025-04-21 07:32:03.731119788 +0000
+++ /home/runner/work/postgresql/postgresql/src/test/regress/results/create_table.out	2025-04-21 07:38:31.358134750 +0000
@@ -810,8 +810,13 @@
  LINE 1: ...TITION OF parted FOR VALUES IN ('c') PARTITION BY RANGE (c);
                                                                      ^
  CREATE TABLE part_c PARTITION OF parted (b WITH OPTIONS NOT NULL DEFAULT 0) FOR VALUES IN ('c') PARTITION BY RANGE ((b));
+ERROR:  cannot use generated column in partition key
+LINE 1: ...ULL DEFAULT 0) FOR VALUES IN ('c') PARTITION BY RANGE ((b));
+                                                                  ^
+DETAIL:  Column "b" is a generated column.
  -- create a level-2 partition
  CREATE TABLE part_c_1_10 PARTITION OF part_c FOR VALUES FROM (1) TO (10);
+ERROR:  relation "part_c" does not exist
  -- check that NOT NULL and default value are inherited correctly
  create table parted_notnull_inh_test (a int default 1, b int not null default 0) partition by list (a);
  create table parted_notnull_inh_test1 partition of parted_notnull_inh_test (a not null, b default 1) for values in (1);
@@ -871,30 +876,8 @@
  -- Both partition bound and partition key in describe output
  \d+ part_c
-                             Partitioned table "public.part_c"
- Column |  Type   | Collation | Nullable | Default | Storage  | Stats target | Description
---------+---------+-----------+----------+---------+----------+--------------+-------------
- a      | text    |           |          |         | extended |              |
- b      | integer |           | not null | 0       | plain    |              |
-Partition of: parted FOR VALUES IN ('c')
-Partition constraint: ((a IS NOT NULL) AND (a = 'c'::text))
-Partition key: RANGE (b)
-Not-null constraints:
-    "part_c_b_not_null" NOT NULL "b" (local, inherited)
-Partitions: part_c_1_10 FOR VALUES FROM (1) TO (10)
-
  -- a level-2 partition's constraint will include the parent's expressions
  \d+ part_c_1_10
-                                Table "public.part_c_1_10"
- Column |  Type   | Collation | Nullable | Default | Storage  | Stats target | Description
---------+---------+-----------+----------+---------+----------+--------------+-------------
- a      | text    |           |          |         | extended |              |
- b      | integer |           | not null | 0       | plain    |              |
-Partition of: part_c FOR VALUES FROM (1) TO (10)
-Partition constraint: ((a IS NOT NULL) AND (a = 'c'::text) AND (b IS NOT NULL) AND (b >= 1) AND (b < 10))
-Not-null constraints:
-    "part_c_b_not_null" NOT NULL "b" (inherited)
-
  -- Show partition count in the parent's describe output
  -- Tempted to include \d+ output listing partitions with bound info but
  -- output could vary depending on the order in which partition oids are
@@ -906,7 +889,7 @@
   a      | text    |           |          |
   b      | integer |           | not null | 0
  Partition key: LIST (a)
-Number of partitions: 3 (Use \d+ to list them.)
+Number of partitions: 2 (Use \d+ to list them.)

\d hash_parted
Partitioned table "public.hash_parted"
----------------------------

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

#3jian he
jian.universality@gmail.com
In reply to: Fujii Masao (#2)
1 attachment(s)
Re: bug: virtual generated column can be partition key

On Mon, Apr 21, 2025 at 4:02 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:

On 2025/04/21 11:30, jian he wrote:

hi.
While trying to make the virtual generated column be part of the partition key,
I found this bug.

I haven't looked at the patch in detail yet, but when I applied it
and ran the regression tests with RELCACHE_FORCE_RELEASE and
CATCACHE_FORCE_RELEASE enabled, the tests failed with the following diff:

----------------------------
========= Contents of ./src/test/regress/regression.diffs
diff -U3 /home/runner/work/postgresql/postgresql/src/test/regress/expected/create_table.out /home/runner/work/postgresql/postgresql/src/test/regress/results/create_table.out
--- /home/runner/work/postgresql/postgresql/src/test/regress/expected/create_table.out  2025-04-21 07:32:03.731119788 +0000
+++ /home/runner/work/postgresql/postgresql/src/test/regress/results/create_table.out   2025-04-21 07:38:31.358134750 +0000
@@ -810,8 +810,13 @@
LINE 1: ...TITION OF parted FOR VALUES IN ('c') PARTITION BY RANGE (c);
^
CREATE TABLE part_c PARTITION OF parted (b WITH OPTIONS NOT NULL DEFAULT 0) FOR VALUES IN ('c') PARTITION BY RANGE ((b));
+ERROR:  cannot use generated column in partition key
+LINE 1: ...ULL DEFAULT 0) FOR VALUES IN ('c') PARTITION BY RANGE ((b));
+                                                                  ^
+DETAIL:  Column "b" is a generated column.
-- create a level-2 partition
CREATE TABLE part_c_1_10 PARTITION OF part_c FOR VALUES FROM (1) TO (10);
+ERROR:  relation "part_c" does not exist
-- check that NOT NULL and default value are inherited correctly
create table parted_notnull_inh_test (a int default 1, b int not null default 0) partition by list (a);
create table parted_notnull_inh_test1 partition of parted_notnull_inh_test (a not null, b default 1) for values in (1);
@@ -871,30 +876,8 @@
-- Both partition bound and partition key in describe output
\d+ part_c
-                             Partitioned table "public.part_c"
- Column |  Type   | Collation | Nullable | Default | Storage  | Stats target | Description
---------+---------+-----------+----------+---------+----------+--------------+-------------
- a      | text    |           |          |         | extended |              |
- b      | integer |           | not null | 0       | plain    |              |
-Partition of: parted FOR VALUES IN ('c')
-Partition constraint: ((a IS NOT NULL) AND (a = 'c'::text))
-Partition key: RANGE (b)
-Not-null constraints:
-    "part_c_b_not_null" NOT NULL "b" (local, inherited)
-Partitions: part_c_1_10 FOR VALUES FROM (1) TO (10)
-
-- a level-2 partition's constraint will include the parent's expressions
\d+ part_c_1_10
-                                Table "public.part_c_1_10"
- Column |  Type   | Collation | Nullable | Default | Storage  | Stats target | Description
---------+---------+-----------+----------+---------+----------+--------------+-------------
- a      | text    |           |          |         | extended |              |
- b      | integer |           | not null | 0       | plain    |              |
-Partition of: part_c FOR VALUES FROM (1) TO (10)
-Partition constraint: ((a IS NOT NULL) AND (a = 'c'::text) AND (b IS NOT NULL) AND (b >= 1) AND (b < 10))
-Not-null constraints:
-    "part_c_b_not_null" NOT NULL "b" (inherited)
-
-- Show partition count in the parent's describe output
-- Tempted to include \d+ output listing partitions with bound info but
-- output could vary depending on the order in which partition oids are
@@ -906,7 +889,7 @@
a      | text    |           |          |
b      | integer |           | not null | 0
Partition key: LIST (a)
-Number of partitions: 3 (Use \d+ to list them.)
+Number of partitions: 2 (Use \d+ to list them.)

\d hash_parted
Partitioned table "public.hash_parted"

Thanks for pointing it out.

i think it's related to my silly mistake:
if (TupleDescAttr(RelationGetDescr(rel),
var->varattno)->attgenerated)
should be
if (TupleDescAttr(RelationGetDescr(rel), var->varattno
- 1)->attgenerated)

Feel free to test it again.

Attachments:

v2-0001-virtual-generated-column-can-be-partition-key.patchtext/x-patch; charset=US-ASCII; name=v2-0001-virtual-generated-column-can-be-partition-key.patchDownload
From 0607e2e3f89e8a30ee9233f1ec253542936a08fd Mon Sep 17 00:00:00 2001
From: jian he <jian.universality@gmail.com>
Date: Mon, 21 Apr 2025 17:09:27 +0800
Subject: [PATCH v2 1/1] virtual generated column can be partition key

CREATE TABLE gtest_part_key (
    f1 date NOT NULL, f2 bigint,
    f3 bigint GENERATED ALWAYS AS (f2 * 2) VIRTUAL)
    PARTITION BY RANGE (f3);

ERROR:  cannot use generated column in partition key
LINE 4:     PARTITION BY RANGE (f3);
                                ^
DETAIL:  Column "f3" is a generated column.

the following is essentially the same as above, it should also fail.

CREATE TABLE gtest_part_key (
    f1 date NOT NULL, f2 bigint,
    f3 bigint GENERATED ALWAYS AS (f2 * 2) VIRTUAL)
    PARTITION BY RANGE ((f3));

discussion: https://postgr.es/m/CACJufxF=WDGthXSAQr9thYUsfx_1_t9E6N8tE3B8EqXcVoVfQw@mail.gmail.com
---
 src/backend/commands/tablecmds.c                | 12 +++++++++++-
 src/test/regress/expected/generated_stored.out  |  5 +++++
 src/test/regress/expected/generated_virtual.out |  5 +++++
 src/test/regress/sql/generated_stored.sql       |  1 +
 src/test/regress/sql/generated_virtual.sql      |  1 +
 5 files changed, 23 insertions(+), 1 deletion(-)

diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 265b1c397fb..721734d338d 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -19794,11 +19794,21 @@ ComputePartitionAttrs(ParseState *pstate, Relation rel, List *partParams, AttrNu
 			if (IsA(expr, Var) &&
 				((Var *) expr)->varattno > 0)
 			{
+				Var		   *var = (Var *) expr;
+
+				if (TupleDescAttr(RelationGetDescr(rel), var->varattno - 1)->attgenerated)
+					ereport(ERROR,
+							errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
+							errmsg("cannot use generated column in partition key"),
+							errdetail("Column \"%s\" is a generated column.",
+									  get_attname(RelationGetRelid(rel), var->varattno, false)),
+							parser_errposition(pstate, pelem->location));
+
 				/*
 				 * User wrote "(column)" or "(column COLLATE something)".
 				 * Treat it like simple attribute anyway.
 				 */
-				partattrs[attn] = ((Var *) expr)->varattno;
+				partattrs[attn] = var->varattno;
 			}
 			else
 			{
diff --git a/src/test/regress/expected/generated_stored.out b/src/test/regress/expected/generated_stored.out
index 16de30ab191..72f39f2f0c1 100644
--- a/src/test/regress/expected/generated_stored.out
+++ b/src/test/regress/expected/generated_stored.out
@@ -1074,6 +1074,11 @@ ERROR:  cannot use generated column in partition key
 LINE 1: ...ENERATED ALWAYS AS (f2 * 2) STORED) PARTITION BY RANGE (f3);
                                                                    ^
 DETAIL:  Column "f3" is a generated column.
+CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint GENERATED ALWAYS AS (f2 * 2) STORED) PARTITION BY RANGE ((f3));
+ERROR:  cannot use generated column in partition key
+LINE 1: ...ERATED ALWAYS AS (f2 * 2) STORED) PARTITION BY RANGE ((f3));
+                                                                 ^
+DETAIL:  Column "f3" is a generated column.
 CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint GENERATED ALWAYS AS (f2 * 2) STORED) PARTITION BY RANGE ((f3 * 3));
 ERROR:  cannot use generated column in partition key
 LINE 1: ...ED ALWAYS AS (f2 * 2) STORED) PARTITION BY RANGE ((f3 * 3));
diff --git a/src/test/regress/expected/generated_virtual.out b/src/test/regress/expected/generated_virtual.out
index 6300e7c1d96..2a6dc84b8ca 100644
--- a/src/test/regress/expected/generated_virtual.out
+++ b/src/test/regress/expected/generated_virtual.out
@@ -1027,6 +1027,11 @@ ERROR:  cannot use generated column in partition key
 LINE 1: ...NERATED ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE (f3);
                                                                    ^
 DETAIL:  Column "f3" is a generated column.
+CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint GENERATED ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE ((f3));
+ERROR:  cannot use generated column in partition key
+LINE 1: ...RATED ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE ((f3));
+                                                                 ^
+DETAIL:  Column "f3" is a generated column.
 CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint GENERATED ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE ((f3 * 3));
 ERROR:  cannot use generated column in partition key
 LINE 1: ...D ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE ((f3 * 3));
diff --git a/src/test/regress/sql/generated_stored.sql b/src/test/regress/sql/generated_stored.sql
index 4ec155f2da9..c9427c2d4d8 100644
--- a/src/test/regress/sql/generated_stored.sql
+++ b/src/test/regress/sql/generated_stored.sql
@@ -500,6 +500,7 @@ SELECT tableoid::regclass, * FROM gtest_parent ORDER BY 1, 2, 3;
 
 -- generated columns in partition key (not allowed)
 CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint GENERATED ALWAYS AS (f2 * 2) STORED) PARTITION BY RANGE (f3);
+CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint GENERATED ALWAYS AS (f2 * 2) STORED) PARTITION BY RANGE ((f3));
 CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint GENERATED ALWAYS AS (f2 * 2) STORED) PARTITION BY RANGE ((f3 * 3));
 
 -- ALTER TABLE ... ADD COLUMN
diff --git a/src/test/regress/sql/generated_virtual.sql b/src/test/regress/sql/generated_virtual.sql
index b4eedeee2fb..652057bd707 100644
--- a/src/test/regress/sql/generated_virtual.sql
+++ b/src/test/regress/sql/generated_virtual.sql
@@ -534,6 +534,7 @@ SELECT tableoid::regclass, * FROM gtest_parent ORDER BY 1, 2, 3;
 
 -- generated columns in partition key (not allowed)
 CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint GENERATED ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE (f3);
+CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint GENERATED ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE ((f3));
 CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint GENERATED ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE ((f3 * 3));
 
 -- ALTER TABLE ... ADD COLUMN
-- 
2.34.1

#4Ashutosh Bapat
ashutosh.bapat.oss@gmail.com
In reply to: jian he (#3)
1 attachment(s)
Re: bug: virtual generated column can be partition key

On Mon, Apr 21, 2025 at 2:42 PM jian he <jian.universality@gmail.com> wrote:

On Mon, Apr 21, 2025 at 4:02 PM Fujii Masao <masao.fujii@oss.nttdata.com>
wrote:

On 2025/04/21 11:30, jian he wrote:

hi.
While trying to make the virtual generated column be part of the

partition key,

I found this bug.

I noticed that the check for a generated column in the partition key
expression already exists a few lines below. Had that check placed before
the if .. else ... block, we wouldn't have had this problem.
if (IsA(expr, Var) && ((Var *) expr)->varattno > 0)
{
...
}
else
{

I admit that pull_varattnos() + loop through bms_next_member() is a bit
wasteful when the expression is just a Var. But probably it's worth taking
that hit for the sake of avoiding code duplication. Further, I believe that
the two loops: one for system attribute check and the other for generated
attribute check can be combined, something like attached. Then it's not as
wasteful as it looks and we probably save a loop.

While looking at this I realised that a generated column may end up being
part of the partition key if the partition key expression contains a whole
row reference. Attached patch also has a fix and a testcase for the
same. PARTITION BY RANGE ((gtest_part_key is not null)) expression in the
test is kinda silly, but it tests the whole-row reference as part of an
expression. I haven't looked for more sensible expressions.

I have included your original tests, but ended up rewriting code. Please
let me know what do you think.

--
Best Wishes,
Ashutosh Bapat

Attachments:

0001-Tighten-check-for-generated-column-in-parti-20250421.patchapplication/x-patch; name=0001-Tighten-check-for-generated-column-in-parti-20250421.patchDownload
From 2adb796b1fdfa5e72c7302e4ecaf7d69b105d0f3 Mon Sep 17 00:00:00 2001
From: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Date: Mon, 21 Apr 2025 18:06:58 +0530
Subject: [PATCH] Tighten check for generated column in partition key
 expression

A generated column may end up being part of the partition key
expression, if it's specified as an expression e.g. "(<generated column
name>)" or if the partition key expression contains a whole-row
reference, even though we do not allow a generated column to be part of
partition key expression. Fix this hole.

Discussion: https://postgr.es/m/CACJufxF=WDGthXSAQr9thYUsfx_1_t9E6N8tE3B8EqXcVoVfQw@mail.gmail.com
---
 src/backend/commands/tablecmds.c              | 96 +++++++++++--------
 .../regress/expected/generated_stored.out     | 15 +++
 .../regress/expected/generated_virtual.out    | 15 +++
 src/test/regress/sql/generated_stored.sql     |  3 +
 src/test/regress/sql/generated_virtual.sql    |  3 +
 5 files changed, 92 insertions(+), 40 deletions(-)

diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 265b1c397fb..42610f50b0b 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -19768,6 +19768,8 @@ ComputePartitionAttrs(ParseState *pstate, Relation rel, List *partParams, AttrNu
 			/* Expression */
 			Node	   *expr = pelem->expr;
 			char		partattname[16];
+			Bitmapset  *expr_attrs = NULL;
+			int			i;
 
 			Assert(expr != NULL);
 			atttype = exprType(expr);
@@ -19791,43 +19793,42 @@ ComputePartitionAttrs(ParseState *pstate, Relation rel, List *partParams, AttrNu
 			while (IsA(expr, CollateExpr))
 				expr = (Node *) ((CollateExpr *) expr)->arg;
 
-			if (IsA(expr, Var) &&
-				((Var *) expr)->varattno > 0)
+			/*
+			 * Examine all the columns in the partition key expression. When
+			 * the whole-row reference is present, examine all the columns of
+			 * the partitioned table.
+			 */
+			pull_varattnos(expr, 1, &expr_attrs);
+			if (bms_is_member(0 - FirstLowInvalidHeapAttributeNumber, expr_attrs))
 			{
-				/*
-				 * User wrote "(column)" or "(column COLLATE something)".
-				 * Treat it like simple attribute anyway.
-				 */
-				partattrs[attn] = ((Var *) expr)->varattno;
+				expr_attrs = bms_add_range(expr_attrs,
+										   1 - FirstLowInvalidHeapAttributeNumber,
+										   RelationGetNumberOfAttributes(rel) - FirstLowInvalidHeapAttributeNumber);
+				expr_attrs = bms_del_member(expr_attrs, 0 - FirstLowInvalidHeapAttributeNumber);
 			}
-			else
-			{
-				Bitmapset  *expr_attrs = NULL;
-				int			i;
 
-				partattrs[attn] = 0;	/* marks the column as expression */
-				*partexprs = lappend(*partexprs, expr);
+			i = -1;
+			while ((i = bms_next_member(expr_attrs, i)) >= 0)
+			{
+				AttrNumber	attno = i + FirstLowInvalidHeapAttributeNumber;
 
-				/*
-				 * transformPartitionSpec() should have already rejected
-				 * subqueries, aggregates, window functions, and SRFs, based
-				 * on the EXPR_KIND_ for partition expressions.
-				 */
+				Assert(attno != 0);
 
 				/*
 				 * Cannot allow system column references, since that would
 				 * make partition routing impossible: their values won't be
 				 * known yet when we need to do that.
 				 */
-				pull_varattnos(expr, 1, &expr_attrs);
-				for (i = FirstLowInvalidHeapAttributeNumber; i < 0; i++)
-				{
-					if (bms_is_member(i - FirstLowInvalidHeapAttributeNumber,
-									  expr_attrs))
-						ereport(ERROR,
-								(errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
-								 errmsg("partition key expressions cannot contain system column references")));
-				}
+				if (attno < 0)
+					ereport(ERROR,
+							(errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
+							 errmsg("partition key expressions cannot contain system column references")));
+
+				/*
+				 * We do not check dropped columns explicitly since they will
+				 * be eliminated when expanding the the whole row expression
+				 * anyway.
+				 */
 
 				/*
 				 * Stored generated columns cannot work: They are computed
@@ -19837,20 +19838,35 @@ ComputePartitionAttrs(ParseState *pstate, Relation rel, List *partParams, AttrNu
 				 * SET EXPRESSION would need to check whether the column is
 				 * used in partition keys).  Seems safer to prohibit for now.
 				 */
-				i = -1;
-				while ((i = bms_next_member(expr_attrs, i)) >= 0)
-				{
-					AttrNumber	attno = i + FirstLowInvalidHeapAttributeNumber;
+				if (TupleDescAttr(RelationGetDescr(rel), attno - 1)->attgenerated)
+					ereport(ERROR,
+							(errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
+							 errmsg("cannot use generated column in partition key"),
+							 errdetail("Column \"%s\" is a generated column.",
+									   get_attname(RelationGetRelid(rel), attno, false)),
+							 parser_errposition(pstate, pelem->location)));
+			}
 
-					if (attno > 0 &&
-						TupleDescAttr(RelationGetDescr(rel), attno - 1)->attgenerated)
-						ereport(ERROR,
-								(errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
-								 errmsg("cannot use generated column in partition key"),
-								 errdetail("Column \"%s\" is a generated column.",
-										   get_attname(RelationGetRelid(rel), attno, false)),
-								 parser_errposition(pstate, pelem->location)));
-				}
+			if (IsA(expr, Var) &&
+				((Var *) expr)->varattno > 0)
+			{
+
+				/*
+				 * User wrote "(column)" or "(column COLLATE something)".
+				 * Treat it like simple attribute anyway.
+				 */
+				partattrs[attn] = ((Var *) expr)->varattno;
+			}
+			else
+			{
+				partattrs[attn] = 0;	/* marks the column as expression */
+				*partexprs = lappend(*partexprs, expr);
+
+				/*
+				 * transformPartitionSpec() should have already rejected
+				 * subqueries, aggregates, window functions, and SRFs, based
+				 * on the EXPR_KIND_ for partition expressions.
+				 */
 
 				/*
 				 * Preprocess the expression before checking for mutability.
diff --git a/src/test/regress/expected/generated_stored.out b/src/test/regress/expected/generated_stored.out
index 16de30ab191..eb981b0689b 100644
--- a/src/test/regress/expected/generated_stored.out
+++ b/src/test/regress/expected/generated_stored.out
@@ -1074,11 +1074,26 @@ ERROR:  cannot use generated column in partition key
 LINE 1: ...ENERATED ALWAYS AS (f2 * 2) STORED) PARTITION BY RANGE (f3);
                                                                    ^
 DETAIL:  Column "f3" is a generated column.
+CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint GENERATED ALWAYS AS (f2 * 2) STORED) PARTITION BY RANGE ((f3));
+ERROR:  cannot use generated column in partition key
+LINE 1: ...ERATED ALWAYS AS (f2 * 2) STORED) PARTITION BY RANGE ((f3));
+                                                                 ^
+DETAIL:  Column "f3" is a generated column.
 CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint GENERATED ALWAYS AS (f2 * 2) STORED) PARTITION BY RANGE ((f3 * 3));
 ERROR:  cannot use generated column in partition key
 LINE 1: ...ED ALWAYS AS (f2 * 2) STORED) PARTITION BY RANGE ((f3 * 3));
                                                              ^
 DETAIL:  Column "f3" is a generated column.
+CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint GENERATED ALWAYS AS (f2 * 2) STORED) PARTITION BY RANGE ((gtest_part_key));
+ERROR:  cannot use generated column in partition key
+LINE 1: ...ED ALWAYS AS (f2 * 2) STORED) PARTITION BY RANGE ((gtest_par...
+                                                             ^
+DETAIL:  Column "f3" is a generated column.
+CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint GENERATED ALWAYS AS (f2 * 2) STORED) PARTITION BY RANGE ((gtest_part_key is not null));
+ERROR:  cannot use generated column in partition key
+LINE 1: ...ED ALWAYS AS (f2 * 2) STORED) PARTITION BY RANGE ((gtest_par...
+                                                             ^
+DETAIL:  Column "f3" is a generated column.
 -- ALTER TABLE ... ADD COLUMN
 CREATE TABLE gtest25 (a int PRIMARY KEY);
 INSERT INTO gtest25 VALUES (3), (4);
diff --git a/src/test/regress/expected/generated_virtual.out b/src/test/regress/expected/generated_virtual.out
index 6300e7c1d96..75a1730e4d2 100644
--- a/src/test/regress/expected/generated_virtual.out
+++ b/src/test/regress/expected/generated_virtual.out
@@ -1027,11 +1027,26 @@ ERROR:  cannot use generated column in partition key
 LINE 1: ...NERATED ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE (f3);
                                                                    ^
 DETAIL:  Column "f3" is a generated column.
+CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint GENERATED ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE ((f3));
+ERROR:  cannot use generated column in partition key
+LINE 1: ...RATED ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE ((f3));
+                                                                 ^
+DETAIL:  Column "f3" is a generated column.
 CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint GENERATED ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE ((f3 * 3));
 ERROR:  cannot use generated column in partition key
 LINE 1: ...D ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE ((f3 * 3));
                                                              ^
 DETAIL:  Column "f3" is a generated column.
+CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint GENERATED ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE ((gtest_part_key));
+ERROR:  cannot use generated column in partition key
+LINE 1: ...D ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE ((gtest_par...
+                                                             ^
+DETAIL:  Column "f3" is a generated column.
+CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint GENERATED ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE ((gtest_part_key is not null));
+ERROR:  cannot use generated column in partition key
+LINE 1: ...D ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE ((gtest_par...
+                                                             ^
+DETAIL:  Column "f3" is a generated column.
 -- ALTER TABLE ... ADD COLUMN
 CREATE TABLE gtest25 (a int PRIMARY KEY);
 INSERT INTO gtest25 VALUES (3), (4);
diff --git a/src/test/regress/sql/generated_stored.sql b/src/test/regress/sql/generated_stored.sql
index 4ec155f2da9..296429fbb23 100644
--- a/src/test/regress/sql/generated_stored.sql
+++ b/src/test/regress/sql/generated_stored.sql
@@ -500,7 +500,10 @@ SELECT tableoid::regclass, * FROM gtest_parent ORDER BY 1, 2, 3;
 
 -- generated columns in partition key (not allowed)
 CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint GENERATED ALWAYS AS (f2 * 2) STORED) PARTITION BY RANGE (f3);
+CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint GENERATED ALWAYS AS (f2 * 2) STORED) PARTITION BY RANGE ((f3));
 CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint GENERATED ALWAYS AS (f2 * 2) STORED) PARTITION BY RANGE ((f3 * 3));
+CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint GENERATED ALWAYS AS (f2 * 2) STORED) PARTITION BY RANGE ((gtest_part_key));
+CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint GENERATED ALWAYS AS (f2 * 2) STORED) PARTITION BY RANGE ((gtest_part_key is not null));
 
 -- ALTER TABLE ... ADD COLUMN
 CREATE TABLE gtest25 (a int PRIMARY KEY);
diff --git a/src/test/regress/sql/generated_virtual.sql b/src/test/regress/sql/generated_virtual.sql
index b4eedeee2fb..033a251fb20 100644
--- a/src/test/regress/sql/generated_virtual.sql
+++ b/src/test/regress/sql/generated_virtual.sql
@@ -534,7 +534,10 @@ SELECT tableoid::regclass, * FROM gtest_parent ORDER BY 1, 2, 3;
 
 -- generated columns in partition key (not allowed)
 CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint GENERATED ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE (f3);
+CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint GENERATED ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE ((f3));
 CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint GENERATED ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE ((f3 * 3));
+CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint GENERATED ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE ((gtest_part_key));
+CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint GENERATED ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE ((gtest_part_key is not null));
 
 -- ALTER TABLE ... ADD COLUMN
 CREATE TABLE gtest25 (a int PRIMARY KEY);

base-commit: 80b727eb9deab589a8648750bc20f1623d5acd3e
-- 
2.34.1

#5jian he
jian.universality@gmail.com
In reply to: Ashutosh Bapat (#4)
Re: bug: virtual generated column can be partition key

On Tue, Apr 22, 2025 at 11:45 AM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:

While looking at this I realised that a generated column may end up being part of the partition key if the partition key expression contains a whole row reference. Attached patch also has a fix and a testcase for the same. PARTITION BY RANGE ((gtest_part_key is not null)) expression in the test is kinda silly, but it tests the whole-row reference as part of an expression. I haven't looked for more sensible expressions.

I begin to wonder if wholerow reference should be allowed.
then error occurred:

drop table if exists t4;
CREATE TABLE t4(f1 int, f2 bigint) PARTITION BY list ((t4));
create table t4_1 partition of t4 for values in ((1,2));
alter table t4 alter column f2 set data type text using f2;

insert into t4 select 1, '2';
ERROR: invalid memory alloc request size 18446744073709551615

#6Ashutosh Bapat
ashutosh.bapat.oss@gmail.com
In reply to: jian he (#5)
Re: bug: virtual generated column can be partition key

On Tue, Apr 22, 2025 at 11:19 AM jian he <jian.universality@gmail.com>
wrote:

On Tue, Apr 22, 2025 at 11:45 AM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:

While looking at this I realised that a generated column may end up

being part of the partition key if the partition key expression contains a
whole row reference. Attached patch also has a fix and a testcase for the
same. PARTITION BY RANGE ((gtest_part_key is not null)) expression in the
test is kinda silly, but it tests the whole-row reference as part of an
expression. I haven't looked for more sensible expressions.

I begin to wonder if wholerow reference should be allowed.
then error occurred:

drop table if exists t4;
CREATE TABLE t4(f1 int, f2 bigint) PARTITION BY list ((t4));
create table t4_1 partition of t4 for values in ((1,2));
alter table t4 alter column f2 set data type text using f2;

insert into t4 select 1, '2';
ERROR: invalid memory alloc request size 18446744073709551615

I can reproduce this error without my patc. It seems to be a different
existing bug.

--
Best Wishes,
Ashutosh Bapat

#7jian he
jian.universality@gmail.com
In reply to: Ashutosh Bapat (#4)
Re: bug: virtual generated column can be partition key

On Tue, Apr 22, 2025 at 11:45 AM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:

I have included your original tests, but ended up rewriting code. Please let me know what do you think.

+ if (attno < 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
+ errmsg("partition key expressions cannot contain system column references")));
+
+ /*
+ * We do not check dropped columns explicitly since they will
+ * be eliminated when expanding the the whole row expression
+ * anyway.
+ */
typo: "the the".
I am confused by the above comments.
ComputePartitionAttrs only called in DefineRelation.
DefineRelation will only CREATE a table, there will be no dropped
column via DefineRelation.
+ /*
+ * transformPartitionSpec() should have already rejected
+ * subqueries, aggregates, window functions, and SRFs, based
+ * on the EXPR_KIND_ for partition expressions.
+ */
"EXPR_KIND_" can change to EXPR_KIND_PARTITION_EXPRESSION
?

Other than that, it looks good to me for fixing this bug.

#8jian he
jian.universality@gmail.com
In reply to: jian he (#7)
1 attachment(s)
Re: bug: virtual generated column can be partition key

On Tue, Apr 22, 2025 at 3:02 PM jian he <jian.universality@gmail.com> wrote:

Other than that, it looks good to me for fixing this bug.

The error message seems not that intuitive.

+CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint
GENERATED ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE
((gtest_part_key));
+ERROR:  cannot use generated column in partition key
+LINE 1: ...D ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE ((gtest_par...
+                                                             ^
+DETAIL:  Column "f3" is a generated column.

with the attached patch. now,
+CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint
GENERATED ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE
((gtest_part_key));
ERROR: cannot use generated column in partition key
LINE 1: ...D ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE ((gtest_par...
^
DETAIL: Generated column "f3" is part of the partition key of
relation "gtest_part_key"

what do you think?

Attachments:

partition_expr_generated_minor_errormessage.difftext/x-patch; charset=US-ASCII; name=partition_expr_generated_minor_errormessage.diffDownload
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 42610f50b0b..a2908fdeef9 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -19770,6 +19770,7 @@ ComputePartitionAttrs(ParseState *pstate, Relation rel, List *partParams, AttrNu
 			char		partattname[16];
 			Bitmapset  *expr_attrs = NULL;
 			int			i;
+			bool		whole_row = false;
 
 			Assert(expr != NULL);
 			atttype = exprType(expr);
@@ -19799,7 +19800,9 @@ ComputePartitionAttrs(ParseState *pstate, Relation rel, List *partParams, AttrNu
 			 * the partitioned table.
 			 */
 			pull_varattnos(expr, 1, &expr_attrs);
-			if (bms_is_member(0 - FirstLowInvalidHeapAttributeNumber, expr_attrs))
+			whole_row = bms_is_member(0 - FirstLowInvalidHeapAttributeNumber, expr_attrs);
+
+			if (whole_row)
 			{
 				expr_attrs = bms_add_range(expr_attrs,
 										   1 - FirstLowInvalidHeapAttributeNumber,
@@ -19839,12 +19842,23 @@ ComputePartitionAttrs(ParseState *pstate, Relation rel, List *partParams, AttrNu
 				 * used in partition keys).  Seems safer to prohibit for now.
 				 */
 				if (TupleDescAttr(RelationGetDescr(rel), attno - 1)->attgenerated)
-					ereport(ERROR,
-							(errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
-							 errmsg("cannot use generated column in partition key"),
-							 errdetail("Column \"%s\" is a generated column.",
-									   get_attname(RelationGetRelid(rel), attno, false)),
-							 parser_errposition(pstate, pelem->location)));
+				{
+					if (!whole_row)
+						ereport(ERROR,
+								errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
+								errmsg("cannot use generated column in partition key"),
+								errdetail("Column \"%s\" is a generated column.",
+										get_attname(RelationGetRelid(rel), attno, false)),
+								parser_errposition(pstate, pelem->location));
+					else
+						ereport(ERROR,
+								errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
+								errmsg("cannot use generated column in partition key"),
+								errdetail("Generated column \"%s\" is part of the partition key of relation \"%s\"",
+										  get_attname(RelationGetRelid(rel), attno, false),
+										  RelationGetRelationName(rel)),
+								parser_errposition(pstate, pelem->location));
+				}
 			}
 
 			if (IsA(expr, Var) &&
#9Ashutosh Bapat
ashutosh.bapat.oss@gmail.com
In reply to: jian he (#7)
1 attachment(s)
Re: bug: virtual generated column can be partition key

Thanks Jian for your review.

On Tue, Apr 22, 2025 at 12:32 PM jian he <jian.universality@gmail.com>
wrote:

On Tue, Apr 22, 2025 at 11:45 AM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:

I have included your original tests, but ended up rewriting code. Please

let me know what do you think.

+ if (attno < 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
+ errmsg("partition key expressions cannot contain system column
references")));
+
+ /*
+ * We do not check dropped columns explicitly since they will
+ * be eliminated when expanding the the whole row expression
+ * anyway.
+ */
typo: "the the".
I am confused by the above comments.
ComputePartitionAttrs only called in DefineRelation.
DefineRelation will only CREATE a table, there will be no dropped
column via DefineRelation.

You are right! Fixed.

+ /*
+ * transformPartitionSpec() should have already rejected
+ * subqueries, aggregates, window functions, and SRFs, based
+ * on the EXPR_KIND_ for partition expressions.
+ */
"EXPR_KIND_" can change to EXPR_KIND_PARTITION_EXPRESSION
?

That's an existing comment which appears to be moved in diff but it's
actually untouched by the patch. Maybe you are right, IDK, but since it's
not related to the fix, let's leave it untouched. I did wonder whether that
comment has any relation to the pull_varattnos call which has been moved
earlier since pull_varattnos doesn't expect any Query node in the tree. But
pondering more, I think the comment is related to the number of rows
participating in the partition key computation - there should be exactly
one key for one tuple. Having SRFs, subqueries in part expression means a
possibility of producing more than one set of partition keys, aggregates
and window functions OTOH will produce one partition key for more than one
row - all of them breaking 1:1 mapping between a tuple and a partition key.
Hence I think the comment is at the right place.

PFA revised patch.

--
Best Wishes,
Ashutosh Bapat

Attachments:

0001-Tighten-check-for-generated-column-in-parti-20250422.patchtext/x-patch; charset=US-ASCII; name=0001-Tighten-check-for-generated-column-in-parti-20250422.patchDownload
From 5467147f6b37898263c78ce64b086072c76caaad Mon Sep 17 00:00:00 2001
From: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Date: Mon, 21 Apr 2025 18:06:58 +0530
Subject: [PATCH] Tighten check for generated column in partition key
 expression

A generated column may end up being part of the partition key
expression, if it's specified as an expression e.g. "(<generated column
name>)" or if the partition key expression contains a whole-row
reference, even though we do not allow a generated column to be part of
partition key expression. Fix this hole.

Discussion: https://postgr.es/m/CACJufxF=WDGthXSAQr9thYUsfx_1_t9E6N8tE3B8EqXcVoVfQw@mail.gmail.com
---
 src/backend/commands/tablecmds.c              | 90 ++++++++++---------
 .../regress/expected/generated_stored.out     | 15 ++++
 .../regress/expected/generated_virtual.out    | 15 ++++
 src/test/regress/sql/generated_stored.sql     |  3 +
 src/test/regress/sql/generated_virtual.sql    |  3 +
 5 files changed, 86 insertions(+), 40 deletions(-)

diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 265b1c397fb..f7b7162a99c 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -19768,6 +19768,8 @@ ComputePartitionAttrs(ParseState *pstate, Relation rel, List *partParams, AttrNu
 			/* Expression */
 			Node	   *expr = pelem->expr;
 			char		partattname[16];
+			Bitmapset  *expr_attrs = NULL;
+			int			i;
 
 			Assert(expr != NULL);
 			atttype = exprType(expr);
@@ -19791,43 +19793,36 @@ ComputePartitionAttrs(ParseState *pstate, Relation rel, List *partParams, AttrNu
 			while (IsA(expr, CollateExpr))
 				expr = (Node *) ((CollateExpr *) expr)->arg;
 
-			if (IsA(expr, Var) &&
-				((Var *) expr)->varattno > 0)
+			/*
+			 * Examine all the columns in the partition key expression. When
+			 * the whole-row reference is present, examine all the columns of
+			 * the partitioned table.
+			 */
+			pull_varattnos(expr, 1, &expr_attrs);
+			if (bms_is_member(0 - FirstLowInvalidHeapAttributeNumber, expr_attrs))
 			{
-				/*
-				 * User wrote "(column)" or "(column COLLATE something)".
-				 * Treat it like simple attribute anyway.
-				 */
-				partattrs[attn] = ((Var *) expr)->varattno;
+				expr_attrs = bms_add_range(expr_attrs,
+										   1 - FirstLowInvalidHeapAttributeNumber,
+										   RelationGetNumberOfAttributes(rel) - FirstLowInvalidHeapAttributeNumber);
+				expr_attrs = bms_del_member(expr_attrs, 0 - FirstLowInvalidHeapAttributeNumber);
 			}
-			else
-			{
-				Bitmapset  *expr_attrs = NULL;
-				int			i;
 
-				partattrs[attn] = 0;	/* marks the column as expression */
-				*partexprs = lappend(*partexprs, expr);
+			i = -1;
+			while ((i = bms_next_member(expr_attrs, i)) >= 0)
+			{
+				AttrNumber	attno = i + FirstLowInvalidHeapAttributeNumber;
 
-				/*
-				 * transformPartitionSpec() should have already rejected
-				 * subqueries, aggregates, window functions, and SRFs, based
-				 * on the EXPR_KIND_ for partition expressions.
-				 */
+				Assert(attno != 0);
 
 				/*
 				 * Cannot allow system column references, since that would
 				 * make partition routing impossible: their values won't be
 				 * known yet when we need to do that.
 				 */
-				pull_varattnos(expr, 1, &expr_attrs);
-				for (i = FirstLowInvalidHeapAttributeNumber; i < 0; i++)
-				{
-					if (bms_is_member(i - FirstLowInvalidHeapAttributeNumber,
-									  expr_attrs))
-						ereport(ERROR,
-								(errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
-								 errmsg("partition key expressions cannot contain system column references")));
-				}
+				if (attno < 0)
+					ereport(ERROR,
+							(errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
+							 errmsg("partition key expressions cannot contain system column references")));
 
 				/*
 				 * Stored generated columns cannot work: They are computed
@@ -19837,20 +19832,35 @@ ComputePartitionAttrs(ParseState *pstate, Relation rel, List *partParams, AttrNu
 				 * SET EXPRESSION would need to check whether the column is
 				 * used in partition keys).  Seems safer to prohibit for now.
 				 */
-				i = -1;
-				while ((i = bms_next_member(expr_attrs, i)) >= 0)
-				{
-					AttrNumber	attno = i + FirstLowInvalidHeapAttributeNumber;
+				if (TupleDescAttr(RelationGetDescr(rel), attno - 1)->attgenerated)
+					ereport(ERROR,
+							(errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
+							 errmsg("cannot use generated column in partition key"),
+							 errdetail("Column \"%s\" is a generated column.",
+									   get_attname(RelationGetRelid(rel), attno, false)),
+							 parser_errposition(pstate, pelem->location)));
+			}
 
-					if (attno > 0 &&
-						TupleDescAttr(RelationGetDescr(rel), attno - 1)->attgenerated)
-						ereport(ERROR,
-								(errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
-								 errmsg("cannot use generated column in partition key"),
-								 errdetail("Column \"%s\" is a generated column.",
-										   get_attname(RelationGetRelid(rel), attno, false)),
-								 parser_errposition(pstate, pelem->location)));
-				}
+			if (IsA(expr, Var) &&
+				((Var *) expr)->varattno > 0)
+			{
+
+				/*
+				 * User wrote "(column)" or "(column COLLATE something)".
+				 * Treat it like simple attribute anyway.
+				 */
+				partattrs[attn] = ((Var *) expr)->varattno;
+			}
+			else
+			{
+				partattrs[attn] = 0;	/* marks the column as expression */
+				*partexprs = lappend(*partexprs, expr);
+
+				/*
+				 * transformPartitionSpec() should have already rejected
+				 * subqueries, aggregates, window functions, and SRFs, based
+				 * on the EXPR_KIND_ for partition expressions.
+				 */
 
 				/*
 				 * Preprocess the expression before checking for mutability.
diff --git a/src/test/regress/expected/generated_stored.out b/src/test/regress/expected/generated_stored.out
index 16de30ab191..eb981b0689b 100644
--- a/src/test/regress/expected/generated_stored.out
+++ b/src/test/regress/expected/generated_stored.out
@@ -1074,11 +1074,26 @@ ERROR:  cannot use generated column in partition key
 LINE 1: ...ENERATED ALWAYS AS (f2 * 2) STORED) PARTITION BY RANGE (f3);
                                                                    ^
 DETAIL:  Column "f3" is a generated column.
+CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint GENERATED ALWAYS AS (f2 * 2) STORED) PARTITION BY RANGE ((f3));
+ERROR:  cannot use generated column in partition key
+LINE 1: ...ERATED ALWAYS AS (f2 * 2) STORED) PARTITION BY RANGE ((f3));
+                                                                 ^
+DETAIL:  Column "f3" is a generated column.
 CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint GENERATED ALWAYS AS (f2 * 2) STORED) PARTITION BY RANGE ((f3 * 3));
 ERROR:  cannot use generated column in partition key
 LINE 1: ...ED ALWAYS AS (f2 * 2) STORED) PARTITION BY RANGE ((f3 * 3));
                                                              ^
 DETAIL:  Column "f3" is a generated column.
+CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint GENERATED ALWAYS AS (f2 * 2) STORED) PARTITION BY RANGE ((gtest_part_key));
+ERROR:  cannot use generated column in partition key
+LINE 1: ...ED ALWAYS AS (f2 * 2) STORED) PARTITION BY RANGE ((gtest_par...
+                                                             ^
+DETAIL:  Column "f3" is a generated column.
+CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint GENERATED ALWAYS AS (f2 * 2) STORED) PARTITION BY RANGE ((gtest_part_key is not null));
+ERROR:  cannot use generated column in partition key
+LINE 1: ...ED ALWAYS AS (f2 * 2) STORED) PARTITION BY RANGE ((gtest_par...
+                                                             ^
+DETAIL:  Column "f3" is a generated column.
 -- ALTER TABLE ... ADD COLUMN
 CREATE TABLE gtest25 (a int PRIMARY KEY);
 INSERT INTO gtest25 VALUES (3), (4);
diff --git a/src/test/regress/expected/generated_virtual.out b/src/test/regress/expected/generated_virtual.out
index 6300e7c1d96..75a1730e4d2 100644
--- a/src/test/regress/expected/generated_virtual.out
+++ b/src/test/regress/expected/generated_virtual.out
@@ -1027,11 +1027,26 @@ ERROR:  cannot use generated column in partition key
 LINE 1: ...NERATED ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE (f3);
                                                                    ^
 DETAIL:  Column "f3" is a generated column.
+CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint GENERATED ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE ((f3));
+ERROR:  cannot use generated column in partition key
+LINE 1: ...RATED ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE ((f3));
+                                                                 ^
+DETAIL:  Column "f3" is a generated column.
 CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint GENERATED ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE ((f3 * 3));
 ERROR:  cannot use generated column in partition key
 LINE 1: ...D ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE ((f3 * 3));
                                                              ^
 DETAIL:  Column "f3" is a generated column.
+CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint GENERATED ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE ((gtest_part_key));
+ERROR:  cannot use generated column in partition key
+LINE 1: ...D ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE ((gtest_par...
+                                                             ^
+DETAIL:  Column "f3" is a generated column.
+CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint GENERATED ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE ((gtest_part_key is not null));
+ERROR:  cannot use generated column in partition key
+LINE 1: ...D ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE ((gtest_par...
+                                                             ^
+DETAIL:  Column "f3" is a generated column.
 -- ALTER TABLE ... ADD COLUMN
 CREATE TABLE gtest25 (a int PRIMARY KEY);
 INSERT INTO gtest25 VALUES (3), (4);
diff --git a/src/test/regress/sql/generated_stored.sql b/src/test/regress/sql/generated_stored.sql
index 4ec155f2da9..296429fbb23 100644
--- a/src/test/regress/sql/generated_stored.sql
+++ b/src/test/regress/sql/generated_stored.sql
@@ -500,7 +500,10 @@ SELECT tableoid::regclass, * FROM gtest_parent ORDER BY 1, 2, 3;
 
 -- generated columns in partition key (not allowed)
 CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint GENERATED ALWAYS AS (f2 * 2) STORED) PARTITION BY RANGE (f3);
+CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint GENERATED ALWAYS AS (f2 * 2) STORED) PARTITION BY RANGE ((f3));
 CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint GENERATED ALWAYS AS (f2 * 2) STORED) PARTITION BY RANGE ((f3 * 3));
+CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint GENERATED ALWAYS AS (f2 * 2) STORED) PARTITION BY RANGE ((gtest_part_key));
+CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint GENERATED ALWAYS AS (f2 * 2) STORED) PARTITION BY RANGE ((gtest_part_key is not null));
 
 -- ALTER TABLE ... ADD COLUMN
 CREATE TABLE gtest25 (a int PRIMARY KEY);
diff --git a/src/test/regress/sql/generated_virtual.sql b/src/test/regress/sql/generated_virtual.sql
index b4eedeee2fb..033a251fb20 100644
--- a/src/test/regress/sql/generated_virtual.sql
+++ b/src/test/regress/sql/generated_virtual.sql
@@ -534,7 +534,10 @@ SELECT tableoid::regclass, * FROM gtest_parent ORDER BY 1, 2, 3;
 
 -- generated columns in partition key (not allowed)
 CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint GENERATED ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE (f3);
+CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint GENERATED ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE ((f3));
 CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint GENERATED ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE ((f3 * 3));
+CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint GENERATED ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE ((gtest_part_key));
+CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint GENERATED ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE ((gtest_part_key is not null));
 
 -- ALTER TABLE ... ADD COLUMN
 CREATE TABLE gtest25 (a int PRIMARY KEY);

base-commit: e29df428a1dca4112aad640c889a9a54642759c9
-- 
2.34.1

#10Ashutosh Bapat
ashutosh.bapat.oss@gmail.com
In reply to: jian he (#8)
Re: bug: virtual generated column can be partition key

Sorry I missed this email while sending the patches - our emails crossed in
the air.

On Tue, Apr 22, 2025 at 2:15 PM jian he <jian.universality@gmail.com> wrote:

On Tue, Apr 22, 2025 at 3:02 PM jian he <jian.universality@gmail.com>
wrote:

Other than that, it looks good to me for fixing this bug.

The error message seems not that intuitive.

+CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint
GENERATED ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE
((gtest_part_key));
+ERROR:  cannot use generated column in partition key
+LINE 1: ...D ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE ((gtest_par...
+                                                             ^
+DETAIL:  Column "f3" is a generated column.

Yes. It's not apparent where does f3 appear in the partition key, to a lay
users. Users who explicitly use whole row expression in a partition key,
would know that a whole row expression contains all columns. And the error
location points to the whole-row reference. So the current state isn't that
bad.

with the attached patch. now,
+CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint
GENERATED ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE
((gtest_part_key));
ERROR: cannot use generated column in partition key
LINE 1: ...D ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE ((gtest_par...
^
DETAIL: Generated column "f3" is part of the partition key of
relation "gtest_part_key"

The DETAIL here is just mentioning what's already known from the command,
so the change in the DETAIL may not be useful.

If I understand your intention correctly, we have to mention something to
the effect "partition key expression contains a whole-row reference which
in turn contains a generated column." But that might be too verbose.

--
Best Wishes,
Ashutosh Bapat

#11jian he
jian.universality@gmail.com
In reply to: Ashutosh Bapat (#10)
Re: bug: virtual generated column can be partition key

On Tue, Apr 22, 2025 at 4:55 PM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:

Sorry I missed this email while sending the patches - our emails crossed in the air.

On Tue, Apr 22, 2025 at 2:15 PM jian he <jian.universality@gmail.com> wrote:

On Tue, Apr 22, 2025 at 3:02 PM jian he <jian.universality@gmail.com> wrote:

Other than that, it looks good to me for fixing this bug.

The error message seems not that intuitive.

+CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint
GENERATED ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE
((gtest_part_key));
+ERROR:  cannot use generated column in partition key
+LINE 1: ...D ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE ((gtest_par...
+                                                             ^
+DETAIL:  Column "f3" is a generated column.

Yes. It's not apparent where does f3 appear in the partition key, to a lay users. Users who explicitly use whole row expression in a partition key, would know that a whole row expression contains all columns. And the error location points to the whole-row reference. So the current state isn't that bad.

with the attached patch. now,
+CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint
GENERATED ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE
((gtest_part_key));
ERROR: cannot use generated column in partition key
LINE 1: ...D ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE ((gtest_par...
^
DETAIL: Generated column "f3" is part of the partition key of
relation "gtest_part_key"

The DETAIL here is just mentioning what's already known from the command, so the change in the DETAIL may not be useful.

If I understand your intention correctly, we have to mention something to the effect "partition key expression contains a whole-row reference which in turn contains a generated column." But that might be too verbose.

you understand my intention correctly,
I aggrees current error message in that location is not that bad.

overall your latest patch looks good to me.

#12jian he
jian.universality@gmail.com
In reply to: jian he (#11)
#13Yura Sokolov
y.sokolov@postgrespro.ru
In reply to: jian he (#1)
Re: bug: virtual generated column can be partition key

21.04.2025 05:30, jian he пишет:

hi.
While trying to make the virtual generated column be part of the partition key,
I found this bug.
it also influences the stored generated column, i added a test
on generated_stored.sql.

CREATE TABLE gtest_part_key (
f1 date NOT NULL, f2 bigint,
f3 bigint GENERATED ALWAYS AS (f2 * 2) VIRTUAL)
PARTITION BY RANGE (f3);

ERROR: cannot use generated column in partition key
LINE 4: PARTITION BY RANGE (f3);
^
DETAIL: Column "f3" is a generated column.

the following is essentially the same as above, it should also fail.

CREATE TABLE gtest_part_key (
f1 date NOT NULL, f2 bigint,
f3 bigint GENERATED ALWAYS AS (f2 * 2) VIRTUAL)
PARTITION BY RANGE ((f3));

I don't understand why latter should fail?

Documentation says [1]https://www.postgresql.org/docs/17/ddl-partitioning.html#DDL-PARTITIONING-DECLARATIVE:

PostgreSQL allows you to declare that a table is divided into partitions.
The table that is divided is referred to as a partitioned table.
The declaration includes the partitioning method as described above,
plus a list of columns or expressions to be used as the partition key.

Note: "list of columns or EXPRESSIONS"!

In first case you pass list of columns (which contains single column f3). I
don't get which internal restriction forces it to fail, really, but ok:
there is restriction on COLUMNS LIST and it must be obeyed.

But in second case you pass EXPRESSION, and I don't think same restriction
should be applied.

More over, if you look into comments on restriction on GENERATED columns
[2]: https://github.com/postgres/postgres/blob/caa76b91a60681dff0bf193b64d4dcdc1014e036/src/backend/commands/tablecmds.c#L19735-L19741
generated columns, and it doesn't bound to VIRTUAL ones. Indeed, there is
suggestion for future to treat GENERATED VIRTUAL columns as expressions.

So, if you want to force some restriction, you should force it only against
STORED columns, but not VIRTUAL ones.
Of cause, if VIRTUAL column depends on STORED one, it is still should be
forbidden.

Certainly, it is my opinion and I could be mistaken somewhere.
For example, you may say "it is too hard to check dependency of VIRTUAL
column at the moment, so it is simpler to forbid them as well". But then it
should be clearly stated in commit messages and code comments.

[1]: https://www.postgresql.org/docs/17/ddl-partitioning.html#DDL-PARTITIONING-DECLARATIVE
https://www.postgresql.org/docs/17/ddl-partitioning.html#DDL-PARTITIONING-DECLARATIVE
[2]: https://github.com/postgres/postgres/blob/caa76b91a60681dff0bf193b64d4dcdc1014e036/src/backend/commands/tablecmds.c#L19735-L19741
https://github.com/postgres/postgres/blob/caa76b91a60681dff0bf193b64d4dcdc1014e036/src/backend/commands/tablecmds.c#L19735-L19741
[3]: https://github.com/postgres/postgres/blob/caa76b91a60681dff0bf193b64d4dcdc1014e036/src/backend/commands/tablecmds.c#L19821-L19828
https://github.com/postgres/postgres/blob/caa76b91a60681dff0bf193b64d4dcdc1014e036/src/backend/commands/tablecmds.c#L19821-L19828

--
regards
Yura Sokolov aka funny-falcon

#14jian he
jian.universality@gmail.com
In reply to: Yura Sokolov (#13)
Re: bug: virtual generated column can be partition key

On Tue, May 6, 2025 at 5:57 PM Yura Sokolov <y.sokolov@postgrespro.ru> wrote:

21.04.2025 05:30, jian he пишет:

hi.
While trying to make the virtual generated column be part of the partition key,
I found this bug.
it also influences the stored generated column, i added a test
on generated_stored.sql.

CREATE TABLE gtest_part_key (
f1 date NOT NULL, f2 bigint,
f3 bigint GENERATED ALWAYS AS (f2 * 2) VIRTUAL)
PARTITION BY RANGE (f3);

ERROR: cannot use generated column in partition key
LINE 4: PARTITION BY RANGE (f3);
^
DETAIL: Column "f3" is a generated column.

the following is essentially the same as above, it should also fail.

CREATE TABLE gtest_part_key (
f1 date NOT NULL, f2 bigint,
f3 bigint GENERATED ALWAYS AS (f2 * 2) VIRTUAL)
PARTITION BY RANGE ((f3));

I don't understand why latter should fail?

Documentation says [1]:

PostgreSQL allows you to declare that a table is divided into partitions.
The table that is divided is referred to as a partitioned table.
The declaration includes the partitioning method as described above,
plus a list of columns or expressions to be used as the partition key.

Note: "list of columns or EXPRESSIONS"!

In first case you pass list of columns (which contains single column f3). I
don't get which internal restriction forces it to fail, really, but ok:
there is restriction on COLUMNS LIST and it must be obeyed.

But in second case you pass EXPRESSION, and I don't think same restriction
should be applied.

More over, if you look into comments on restriction on GENERATED columns
[2] [3], you will find this restriction is because of nature of STORED
generated columns, and it doesn't bound to VIRTUAL ones. Indeed, there is
suggestion for future to treat GENERATED VIRTUAL columns as expressions.

hi.
you already pointed out the problem.
As of now GENERATED VIRTUAL columns are not supported as partition keys,
therefore we need to disallow generated columns being referenced in
the partition key in any representation.

currently in a matser:
CREATE TABLE xx (f1 date NOT NULL, f2 bigint, f3 bigint GENERATED
ALWAYS AS (f2 * 2) virtual) PARTITION BY RANGE (f3);
will yield error, however
CREATE TABLE xx (f1 date NOT NULL, f2 bigint, f3 bigint GENERATED
ALWAYS AS (f2 * 2) virtual) PARTITION BY RANGE ((f3));
will not.
but these two form essentially the same thing.

"PARTITION BY RANGE ((f3));" and "PARTITION BY RANGE ((whole_row));"
are the corner cases we are trying to catch.

you may also see ComputePartitionAttrs.

#15Yura Sokolov
y.sokolov@postgrespro.ru
In reply to: jian he (#14)
Re: bug: virtual generated column can be partition key

06.05.2025 13:31, jian he пишет:

On Tue, May 6, 2025 at 5:57 PM Yura Sokolov <y.sokolov@postgrespro.ru> wrote:

21.04.2025 05:30, jian he пишет:

hi.
While trying to make the virtual generated column be part of the partition key,
I found this bug.
it also influences the stored generated column, i added a test
on generated_stored.sql.

CREATE TABLE gtest_part_key (
f1 date NOT NULL, f2 bigint,
f3 bigint GENERATED ALWAYS AS (f2 * 2) VIRTUAL)
PARTITION BY RANGE (f3);

ERROR: cannot use generated column in partition key
LINE 4: PARTITION BY RANGE (f3);
^
DETAIL: Column "f3" is a generated column.

the following is essentially the same as above, it should also fail.

CREATE TABLE gtest_part_key (
f1 date NOT NULL, f2 bigint,
f3 bigint GENERATED ALWAYS AS (f2 * 2) VIRTUAL)
PARTITION BY RANGE ((f3));

I don't understand why latter should fail?

Documentation says [1]:

PostgreSQL allows you to declare that a table is divided into partitions.
The table that is divided is referred to as a partitioned table.
The declaration includes the partitioning method as described above,
plus a list of columns or expressions to be used as the partition key.

Note: "list of columns or EXPRESSIONS"!

In first case you pass list of columns (which contains single column f3). I
don't get which internal restriction forces it to fail, really, but ok:
there is restriction on COLUMNS LIST and it must be obeyed.

But in second case you pass EXPRESSION, and I don't think same restriction
should be applied.

More over, if you look into comments on restriction on GENERATED columns
[2] [3], you will find this restriction is because of nature of STORED
generated columns, and it doesn't bound to VIRTUAL ones. Indeed, there is
suggestion for future to treat GENERATED VIRTUAL columns as expressions.

hi.
you already pointed out the problem.
As of now GENERATED VIRTUAL columns are not supported as partition keys,
therefore we need to disallow generated columns being referenced in
the partition key in any representation.

I don't see why "we need to disallow". There are no fundamental reasons.

May be it is better to allow GENERATED VIRTUAL columns? But still forbid
GENERATED STORED columns because there are reasons for.

currently in a matser:
CREATE TABLE xx (f1 date NOT NULL, f2 bigint, f3 bigint GENERATED
ALWAYS AS (f2 * 2) virtual) PARTITION BY RANGE (f3);
will yield error, however
CREATE TABLE xx (f1 date NOT NULL, f2 bigint, f3 bigint GENERATED
ALWAYS AS (f2 * 2) virtual) PARTITION BY RANGE ((f3));
will not.
but these two form essentially the same thing.

"PARTITION BY RANGE ((f3));" and "PARTITION BY RANGE ((whole_row));"
are the corner cases we are trying to catch.

you may also see ComputePartitionAttrs.

I saw it. And I gave links to comments in this function. This comments
clearly state, there is no real reason to forbid GENERATED VIRTUAL columns
(in opposite to STORED). They are forbidden just because author had no
time/wish to finish their support.

--
regards
Yura Sokolov aka funny-falcon

#16jian he
jian.universality@gmail.com
In reply to: Yura Sokolov (#15)
Re: bug: virtual generated column can be partition key

On Tue, May 6, 2025 at 6:49 PM Yura Sokolov <y.sokolov@postgrespro.ru> wrote:

06.05.2025 13:31, jian he пишет:

On Tue, May 6, 2025 at 5:57 PM Yura Sokolov <y.sokolov@postgrespro.ru> wrote:

21.04.2025 05:30, jian he пишет:

hi.
While trying to make the virtual generated column be part of the partition key,
I found this bug.
it also influences the stored generated column, i added a test
on generated_stored.sql.

CREATE TABLE gtest_part_key (
f1 date NOT NULL, f2 bigint,
f3 bigint GENERATED ALWAYS AS (f2 * 2) VIRTUAL)
PARTITION BY RANGE (f3);

ERROR: cannot use generated column in partition key
LINE 4: PARTITION BY RANGE (f3);
^
DETAIL: Column "f3" is a generated column.

the following is essentially the same as above, it should also fail.

CREATE TABLE gtest_part_key (
f1 date NOT NULL, f2 bigint,
f3 bigint GENERATED ALWAYS AS (f2 * 2) VIRTUAL)
PARTITION BY RANGE ((f3));

I don't understand why latter should fail?

Documentation says [1]:

PostgreSQL allows you to declare that a table is divided into partitions.
The table that is divided is referred to as a partitioned table.
The declaration includes the partitioning method as described above,
plus a list of columns or expressions to be used as the partition key.

Note: "list of columns or EXPRESSIONS"!

In first case you pass list of columns (which contains single column f3). I
don't get which internal restriction forces it to fail, really, but ok:
there is restriction on COLUMNS LIST and it must be obeyed.

But in second case you pass EXPRESSION, and I don't think same restriction
should be applied.

More over, if you look into comments on restriction on GENERATED columns
[2] [3], you will find this restriction is because of nature of STORED
generated columns, and it doesn't bound to VIRTUAL ones. Indeed, there is
suggestion for future to treat GENERATED VIRTUAL columns as expressions.

hi.
you already pointed out the problem.
As of now GENERATED VIRTUAL columns are not supported as partition keys,
therefore we need to disallow generated columns being referenced in
the partition key in any representation.

I don't see why "we need to disallow". There are no fundamental reasons.

May be it is better to allow GENERATED VIRTUAL columns? But still forbid
GENERATED STORED columns because there are reasons for.

hi.

I posted a patch about virtual generated column as partition key in
https://commitfest.postgresql.org/patch/5720/

but in this context, we still need error out for virtual generated column.
maybe we can change the error code, like the below,
but i feel it's not necessary.

ComputePartitionAttrs
if (TupleDescAttr(RelationGetDescr(rel), attno -
1)->attgenerated == ATTRIBUTE_GENERATED_VIRTUAL)
ereport(ERROR,
errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("cannot use generated column in
partition key"),
parser_errposition(pstate, pelem->location));

if (TupleDescAttr(RelationGetDescr(rel), attno -
1)->attgenerated)
ereport(ERROR,
(errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
errmsg("cannot use generated column in
partition key"),
errdetail("Column \"%s\" is a generated column.",

get_attname(RelationGetRelid(rel), attno, false)),
parser_errposition(pstate, pelem->location)));

the patch still works, i added this to commitfest
(https://commitfest.postgresql.org/patch/5989)
for tracking purposes.

#17Ashutosh Bapat
ashutosh.bapat.oss@gmail.com
In reply to: jian he (#16)
Re: bug: virtual generated column can be partition key

On Tue, May 6, 2025 at 4:19 PM Yura Sokolov <y.sokolov@postgrespro.ru> wrote:

you may also see ComputePartitionAttrs.

I saw it. And I gave links to comments in this function. This comments
clearly state, there is no real reason to forbid GENERATED VIRTUAL columns
(in opposite to STORED). They are forbidden just because author had no
time/wish to finish their support.

Jian wrote:

I posted a patch about virtual generated column as partition key in
https://commitfest.postgresql.org/patch/5720/

but in this context, we still need error out for virtual generated column.
maybe we can change the error code, like the below,
but i feel it's not necessary.

RIght. There is no reason to not support partition keys containing
virtual columns. It's just a limitation in backbranches. It will be
removed by Jian's proposal.

However, in the backbranches we do not allow a virtual generated
column to be part of partition key when it appears directly in the
partition key as a column or in an expression. By that same reasoning,
a whole row reference to a relation having a virtual column should not
be allowed since whole row reference is a row expression containing
all the columns of that relation. That's what this thread and the
patch is about.

ComputePartitionAttrs
if (TupleDescAttr(RelationGetDescr(rel), attno -
1)->attgenerated == ATTRIBUTE_GENERATED_VIRTUAL)
ereport(ERROR,
errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("cannot use generated column in
partition key"),
parser_errposition(pstate, pelem->location));

if (TupleDescAttr(RelationGetDescr(rel), attno -
1)->attgenerated)
ereport(ERROR,
(errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
errmsg("cannot use generated column in
partition key"),
errdetail("Column \"%s\" is a generated column.",

get_attname(RelationGetRelid(rel), attno, false)),
parser_errposition(pstate, pelem->location)));

Hmm, that may be considered backward compatibility breaking change
since it's changing an error code that a user may be relying on. I
would avoid such a change in backbranches.

--
Best Wishes,
Ashutosh Bapat

#18Peter Eisentraut
peter@eisentraut.org
In reply to: Ashutosh Bapat (#9)
Re: bug: virtual generated column can be partition key

committed

Show quoted text

On 22.04.25 10:47, Ashutosh Bapat wrote:

Thanks Jian for your review.

On Tue, Apr 22, 2025 at 12:32 PM jian he <jian.universality@gmail.com
<mailto:jian.universality@gmail.com>> wrote:

On Tue, Apr 22, 2025 at 11:45 AM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com <mailto:ashutosh.bapat.oss@gmail.com>>
wrote:

I have included your original tests, but ended up rewriting code.

Please let me know what do you think.

+ if (attno < 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
+ errmsg("partition key expressions cannot contain system column
references")));
+
+ /*
+ * We do not check dropped columns explicitly since they will
+ * be eliminated when expanding the the whole row expression
+ * anyway.
+ */
typo: "the the".
I am confused by the above comments.
ComputePartitionAttrs only called in DefineRelation.
DefineRelation will only CREATE a table, there will be no dropped
column via DefineRelation.

You are right! Fixed.

+ /*
+ * transformPartitionSpec() should have already rejected
+ * subqueries, aggregates, window functions, and SRFs, based
+ * on the EXPR_KIND_ for partition expressions.
+ */
"EXPR_KIND_" can change to EXPR_KIND_PARTITION_EXPRESSION
?

That's an existing comment which appears to be moved in diff but it's
actually untouched by the patch. Maybe you are right, IDK, but since
it's not related to the fix, let's leave it untouched. I did wonder
whether that comment has any relation to the pull_varattnos call which
has been moved earlier since pull_varattnos doesn't expect any Query
node in the tree. But pondering more, I think the comment is related to
the number of rows participating in the partition key computation -
there should be exactly one key for one tuple. Having SRFs, subqueries
in part expression means a possibility of producing more than one set of
partition keys, aggregates and window functions OTOH will produce one
partition key for more than one row - all of them breaking 1:1 mapping
between a tuple and a partition key. Hence I think the comment is at the
right place.

PFA revised patch.

--
Best Wishes,
Ashutosh Bapat