inconsistent tableoid handling in COPY WHERE clause

Started by Peter Eisentraut2 months ago5 messages
#1Peter Eisentraut
peter@eisentraut.org

I think the COPY FROM WHERE clause is handling the tableoid column in a
way that is inconsistent with its usual definition.

Consider a partitioning hierarchy like

DROP TABLE IF EXISTS xp;

CREATE TABLE xp (a int, b int) PARTITION BY LIST (a);
CREATE TABLE xp1 PARTITION OF xp FOR VALUES IN (1);
CREATE TABLE xp2 PARTITION OF xp FOR VALUES IN (2);
CREATE TABLE xp3 PARTITION OF xp FOR VALUES IN (3);

Then you can use tableoid in a trigger to reveal the actual partition
that an inserted row ended up in:

CREATE OR REPLACE FUNCTION tf() RETURNS TRIGGER LANGUAGE plpgsql
AS
$$
BEGIN
RAISE NOTICE 'new.tableoid = %', NEW.tableoid;
RETURN NEW;
END
$$;

CREATE TRIGGER tg1 AFTER INSERT ON xp FOR EACH ROW EXECUTE FUNCTION tf();

INSERT INTO xp VALUES (1, 11), (2, 22);

You can also write a check constraint that references tableoid to check
what partition a row ends up in:

ALTER TABLE xp ADD CONSTRAINT xpc
CHECK (tableoid IN ('xp1'::regclass, 'xp2'::regclass));

INSERT INTO xp VALUES (3, 33); -- error: violates check constraint

So far so good.

You can also refer to tableoid in the WHERE clause of a COPY command,
but that doesn't work correctly:

COPY xp FROM STDIN WHERE tableoid = 'xp2'::regclass;
1 111
2 222
\.

I would have expected that to copy only rows that are targeted for the
xp2 partition, but in fact it does not copy anything.

This works:

COPY xp FROM STDIN WHERE tableoid = 'xp'::regclass;
1 111
2 222
\.

because tableoid in fact refers to the table named in the command, not
the actual target partition.

That seems incorrect to me.

I don't see this documented one way or another, but there is a code
comment in copyfrom.c at least mentioning tableoid in the context of the
COPY WHERE clause:

/*
* Constraints and where clause might reference the tableoid column,
* so (re-)initialize tts_tableOid before evaluating them.
*/
myslot->tts_tableOid =
RelationGetRelid(target_resultRelInfo->ri_RelationDesc);

This comment appeared in a not-quite-related refactoring (commit
86b85044e82), it only said "Constraints might ..." beforehand.

Even the "Constraints might ..." variant of this is dubious, since as
shown above, check constraints do have partition awareness, and there
are places in nodeModifyTable.c that initialize tts_tableOid for that
purpose. Are there other constraints where tableoid can be used (and
where this way of initializing it doesn't result in wrong behavior)?

I suggest that we should prohibit using tableoid in COPY WHERE clauses
for the time being. I don't know if there would be a way to make them
work correctly at all, but most likely not in a backpatchable way anyway.

I also suggest that the above piece of code assigning tts_tableOid
should be changed somehow. Maybe just delete it, or set it to
InvalidOid, because as it is it's misleading and probably wrong.

#2Kirill Reshke
reshkekirill@gmail.com
In reply to: Peter Eisentraut (#1)
Re: inconsistent tableoid handling in COPY WHERE clause

On Wed, 5 Nov 2025 at 16:06, Peter Eisentraut <peter@eisentraut.org> wrote:

I think the COPY FROM WHERE clause is handling the tableoid column in a
way that is inconsistent with its usual definition.

Consider a partitioning hierarchy like

DROP TABLE IF EXISTS xp;

CREATE TABLE xp (a int, b int) PARTITION BY LIST (a);
CREATE TABLE xp1 PARTITION OF xp FOR VALUES IN (1);
CREATE TABLE xp2 PARTITION OF xp FOR VALUES IN (2);
CREATE TABLE xp3 PARTITION OF xp FOR VALUES IN (3);

Then you can use tableoid in a trigger to reveal the actual partition
that an inserted row ended up in:

CREATE OR REPLACE FUNCTION tf() RETURNS TRIGGER LANGUAGE plpgsql
AS
$$
BEGIN
RAISE NOTICE 'new.tableoid = %', NEW.tableoid;
RETURN NEW;
END
$$;

CREATE TRIGGER tg1 AFTER INSERT ON xp FOR EACH ROW EXECUTE FUNCTION tf();

INSERT INTO xp VALUES (1, 11), (2, 22);

You can also write a check constraint that references tableoid to check
what partition a row ends up in:

ALTER TABLE xp ADD CONSTRAINT xpc
CHECK (tableoid IN ('xp1'::regclass, 'xp2'::regclass));

INSERT INTO xp VALUES (3, 33); -- error: violates check constraint

So far so good.

You can also refer to tableoid in the WHERE clause of a COPY command,
but that doesn't work correctly:

COPY xp FROM STDIN WHERE tableoid = 'xp2'::regclass;
1 111
2 222
\.

I would have expected that to copy only rows that are targeted for the
xp2 partition, but in fact it does not copy anything.

This works:

COPY xp FROM STDIN WHERE tableoid = 'xp'::regclass;
1 111
2 222
\.

because tableoid in fact refers to the table named in the command, not
the actual target partition.

That seems incorrect to me.

I don't see this documented one way or another, but there is a code
comment in copyfrom.c at least mentioning tableoid in the context of the
COPY WHERE clause:

/*
* Constraints and where clause might reference the tableoid column,
* so (re-)initialize tts_tableOid before evaluating them.
*/
myslot->tts_tableOid =
RelationGetRelid(target_resultRelInfo->ri_RelationDesc);

This comment appeared in a not-quite-related refactoring (commit
86b85044e82), it only said "Constraints might ..." beforehand.

Even the "Constraints might ..." variant of this is dubious, since as
shown above, check constraints do have partition awareness, and there
are places in nodeModifyTable.c that initialize tts_tableOid for that
purpose. Are there other constraints where tableoid can be used (and
where this way of initializing it doesn't result in wrong behavior)?

I suggest that we should prohibit using tableoid in COPY WHERE clauses
for the time being. I don't know if there would be a way to make them
work correctly at all, but most likely not in a backpatchable way anyway.

I also suggest that the above piece of code assigning tts_tableOid
should be changed somehow. Maybe just delete it, or set it to
InvalidOid, because as it is it's misleading and probably wrong.

Hi!
Looks like this issue is currently discussed at [0]/messages/by-id/CACJufxHGGc25a2m+3Dezfctuykn51n5k3FJK6w3KSqfSFc5gvQ@mail.gmail.com. Should we
continue here or there?

[0]: /messages/by-id/CACJufxHGGc25a2m+3Dezfctuykn51n5k3FJK6w3KSqfSFc5gvQ@mail.gmail.com

--
Best regards,
Kirill Reshke

#3Peter Eisentraut
peter@eisentraut.org
In reply to: Kirill Reshke (#2)
Re: inconsistent tableoid handling in COPY WHERE clause

On 05.11.25 12:13, Kirill Reshke wrote:

Looks like this issue is currently discussed at [0]. Should we
continue here or there?

[0]https://www.postgresql.org/message-id/
CACJufxHGGc25a2m%2B3Dezfctuykn51n5k3FJK6w3KSqfSFc5gvQ%40mail.gmail.com

I'm aware of that thread. I started this new thread to get more focus
on this particular sub-issue.

#4Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Peter Eisentraut (#1)
Re: inconsistent tableoid handling in COPY WHERE clause

On Wed, Nov 5, 2025 at 3:06 AM Peter Eisentraut <peter@eisentraut.org> wrote:

I think the COPY FROM WHERE clause is handling the tableoid column in a
way that is inconsistent with its usual definition.

Consider a partitioning hierarchy like

DROP TABLE IF EXISTS xp;

CREATE TABLE xp (a int, b int) PARTITION BY LIST (a);
CREATE TABLE xp1 PARTITION OF xp FOR VALUES IN (1);
CREATE TABLE xp2 PARTITION OF xp FOR VALUES IN (2);
CREATE TABLE xp3 PARTITION OF xp FOR VALUES IN (3);

Then you can use tableoid in a trigger to reveal the actual partition
that an inserted row ended up in:

CREATE OR REPLACE FUNCTION tf() RETURNS TRIGGER LANGUAGE plpgsql
AS
$$
BEGIN
RAISE NOTICE 'new.tableoid = %', NEW.tableoid;
RETURN NEW;
END
$$;

CREATE TRIGGER tg1 AFTER INSERT ON xp FOR EACH ROW EXECUTE FUNCTION tf();

INSERT INTO xp VALUES (1, 11), (2, 22);

You can also write a check constraint that references tableoid to check
what partition a row ends up in:

ALTER TABLE xp ADD CONSTRAINT xpc
CHECK (tableoid IN ('xp1'::regclass, 'xp2'::regclass));

INSERT INTO xp VALUES (3, 33); -- error: violates check constraint

So far so good.

You can also refer to tableoid in the WHERE clause of a COPY command,
but that doesn't work correctly:

COPY xp FROM STDIN WHERE tableoid = 'xp2'::regclass;
1 111
2 222
\.

I would have expected that to copy only rows that are targeted for the
xp2 partition, but in fact it does not copy anything.

This works:

COPY xp FROM STDIN WHERE tableoid = 'xp'::regclass;
1 111
2 222
\.

because tableoid in fact refers to the table named in the command, not
the actual target partition.

That seems incorrect to me.

I don't see this documented one way or another, but there is a code
comment in copyfrom.c at least mentioning tableoid in the context of the
COPY WHERE clause:

/*
* Constraints and where clause might reference the tableoid column,
* so (re-)initialize tts_tableOid before evaluating them.
*/
myslot->tts_tableOid =
RelationGetRelid(target_resultRelInfo->ri_RelationDesc);

This comment appeared in a not-quite-related refactoring (commit
86b85044e82), it only said "Constraints might ..." beforehand.

Even the "Constraints might ..." variant of this is dubious, since as
shown above, check constraints do have partition awareness, and there
are places in nodeModifyTable.c that initialize tts_tableOid for that
purpose. Are there other constraints where tableoid can be used (and
where this way of initializing it doesn't result in wrong behavior)?

FYI we can specify system columns in the WITH CHECK clause in CREATE
POLICY, and it seems to work fine (COPY FROM is not supported with RLS
so I tested using with).

I suggest that we should prohibit using tableoid in COPY WHERE clauses
for the time being. I don't know if there would be a way to make them
work correctly at all, but most likely not in a backpatchable way anyway.

I also suggest that the above piece of code assigning tts_tableOid
should be changed somehow. Maybe just delete it, or set it to
InvalidOid, because as it is it's misleading and probably wrong.

Probably should we filter rows by WHERE clause after determining the
partition to insert the tuple into? Currently, after getting a row
from NextCopyFrom(), we check the WHERE clause and then determine the
partition to insert the tuple into if the partitioned table is
specified in the COPY FROM. Then, we set the slot's tableoid to the
right leaf partition's oid:

/* ensure that triggers etc see the right relation */
myslot->tts_tableOid =
RelationGetRelid(resultRelInfo->ri_RelationDesc);

Therefore, triggers and constraints can refer to the right tableoid.
For example, I tested it with the following check constraint:

ALTER TABLE xp ADD CONSTRAINT xpc CHECK (tableoid IN ('xp1'::regclass,
'xp2'::regclass, 'xp'::regclass));

This filters out the row as the tableoid in the where clause refers to
the xp table's oid:

copy xp from stdin where tableoid = 'xp3'::regclass;
3 333
\.

But this raises the check constraint violation error as the tuple
slot's tableoid now refers to xp3 table's oid:

copy xp from stdin;
3 333
\.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

#5jian he
jian.universality@gmail.com
In reply to: Masahiko Sawada (#4)
1 attachment(s)
Re: inconsistent tableoid handling in COPY WHERE clause

On Thu, Nov 6, 2025 at 3:44 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

I suggest that we should prohibit using tableoid in COPY WHERE clauses
for the time being. I don't know if there would be a way to make them
work correctly at all, but most likely not in a backpatchable way anyway.

I also suggest that the above piece of code assigning tts_tableOid
should be changed somehow. Maybe just delete it, or set it to
InvalidOid, because as it is it's misleading and probably wrong.

Probably should we filter rows by WHERE clause after determining the
partition to insert the tuple into? Currently, after getting a row
from NextCopyFrom(), we check the WHERE clause and then determine the
partition to insert the tuple into if the partitioned table is
specified in the COPY FROM. Then, we set the slot's tableoid to the
right leaf partition's oid:

I think it is doable.

BEFORE ROW TRIGGER no need to correct slot->tts_tableOid information.
see ExecInsert:
at the beginning of ExecInsert, slot->tts_tableOid == 0
ExecMaterializeSlot won't touch the Tableoid field.
ExecBRInsertTriggers happen before explicitly setting slot->tts_tableOid.

ExecBRInsertTriggers->ExecFetchSlotHeapTuple->tts_virtual_copy_heap_tuple->heap_form_tuple
will set (TriggerData->tg_trigtuple) t_tableOid as InvalidOid
trigger execution function plpgsql_exec_trigger only use
TriggerData->tg_trigtuple, not using TriggerData->tg_trigslot.
That means during ExecBRInsertTriggers execution, the t_tableOid is InvalidOid.

exec_assign_value have "cannot assign to system column" ERROR check
forbidden us change system column.

CopyMultiInsertInfoFlush->CopyMultiInsertBufferFlush->table_multi_insert
will set the proper tableoid for the slot.
That mean don't need to worry about AFTER ROW TRIGGER,

We can place filtering rows by WHERE clause logic right below ExecFindPartition.
Please check the attached patch.

--
jian
https://www.enterprisedb.com

Attachments:

v1-0001-fix-COPY-WHERE-clause-with-tableoid-field.patchtext/x-patch; charset=US-ASCII; name=v1-0001-fix-COPY-WHERE-clause-with-tableoid-field.patchDownload
From 92bfdd62bfd0d420029020e9b10b273718c962dc Mon Sep 17 00:00:00 2001
From: jian he <jian.universality@gmail.com>
Date: Thu, 6 Nov 2025 15:10:26 +0800
Subject: [PATCH v1 1/1] fix COPY WHERE clause with tableoid field

discussion: https://postgr.es/m/30c39ee8-bb11-4b8f-9697-45f7e018a8d3@eisentraut.org
---
 src/backend/commands/copyfrom.c    | 32 +++++++++++++++++++++++++++---
 src/test/regress/expected/copy.out | 16 +++++++++++++++
 src/test/regress/sql/copy.sql      | 14 +++++++++++++
 3 files changed, 59 insertions(+), 3 deletions(-)

diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index 12781963b4f..94aed15d379 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -1178,15 +1178,16 @@ CopyFrom(CopyFromState cstate)
 		ExecStoreVirtualTuple(myslot);
 
 		/*
-		 * Constraints and where clause might reference the tableoid column,
-		 * so (re-)initialize tts_tableOid before evaluating them.
+		 * where clause might reference the tableoid column, so (re-)initialize
+		 * tts_tableOid before evaluating them. It may change later if we are
+		 * COPY INTO partitioned table.
 		 */
 		myslot->tts_tableOid = RelationGetRelid(target_resultRelInfo->ri_RelationDesc);
 
 		/* Triggers and stuff need to be invoked in query context. */
 		MemoryContextSwitchTo(oldcontext);
 
-		if (cstate->whereClause)
+		if (proute == NULL && cstate->whereClause)
 		{
 			econtext->ecxt_scantuple = myslot;
 			/* Skip items that don't match COPY's WHERE clause */
@@ -1215,6 +1216,24 @@ CopyFrom(CopyFromState cstate)
 			resultRelInfo = ExecFindPartition(mtstate, target_resultRelInfo,
 											  proute, myslot, estate);
 
+			myslot->tts_tableOid = RelationGetRelid(resultRelInfo->ri_RelationDesc);
+
+			if (cstate->whereClause)
+			{
+				econtext->ecxt_scantuple = myslot;
+				/* Skip items that don't match COPY's WHERE clause */
+				if (!ExecQual(cstate->qualexpr, econtext))
+				{
+					/*
+					 * Report that this tuple was filtered out by the WHERE
+					 * clause.
+					 */
+					pgstat_progress_update_param(PROGRESS_COPY_TUPLES_EXCLUDED,
+												++excluded);
+					continue;
+				}
+			}
+
 			if (prevResultRelInfo != resultRelInfo)
 			{
 				/* Determine which triggers exist on this partition */
@@ -1343,6 +1362,13 @@ CopyFrom(CopyFromState cstate)
 			}
 			else
 			{
+				/*
+				 * Constraints and GENERATED expressions might reference the
+				 * tableoid column, so (re-)initialize tts_tableOid before
+				 * evaluating them.
+				 */
+				myslot->tts_tableOid = RelationGetRelid(resultRelInfo->ri_RelationDesc);
+
 				/* Compute stored generated columns */
 				if (resultRelInfo->ri_RelationDesc->rd_att->constr &&
 					resultRelInfo->ri_RelationDesc->rd_att->constr->has_generated_stored)
diff --git a/src/test/regress/expected/copy.out b/src/test/regress/expected/copy.out
index 24e0f472f14..0735160bc10 100644
--- a/src/test/regress/expected/copy.out
+++ b/src/test/regress/expected/copy.out
@@ -121,6 +121,22 @@ insert into parted_copytest select x,2,'Two' from generate_series(1001,1010) x;
 insert into parted_copytest select x,1,'One' from generate_series(1011,1020) x;
 \set filename :abs_builddir '/results/parted_copytest.csv'
 copy (select * from parted_copytest order by a) to :'filename';
+truncate parted_copytest;
+COPY parted_copytest FROM STDIN WHERE (tableoid in ('parted_copytest'::regclass, 'parted_copytest_a1'::regclass));
+SELECT * FROM parted_copytest; --expect one row
+ a | b | c 
+---+---+---
+ 1 | 1 | a
+(1 row)
+
+COPY parted_copytest FROM STDIN WHERE (tableoid in ('parted_copytest_a2'::regclass));
+SELECT * FROM parted_copytest; --expect two row
+ a | b | c 
+---+---+---
+ 1 | 1 | a
+ 2 | 2 | b
+(2 rows)
+
 truncate parted_copytest;
 copy parted_copytest from :'filename';
 -- Ensure COPY FREEZE errors for partitioned tables.
diff --git a/src/test/regress/sql/copy.sql b/src/test/regress/sql/copy.sql
index 676a8b342b5..a2d3194635a 100644
--- a/src/test/regress/sql/copy.sql
+++ b/src/test/regress/sql/copy.sql
@@ -148,6 +148,20 @@ copy (select * from parted_copytest order by a) to :'filename';
 
 truncate parted_copytest;
 
+COPY parted_copytest FROM STDIN WHERE (tableoid in ('parted_copytest'::regclass, 'parted_copytest_a1'::regclass));
+1	1	a
+2	2	b
+\.
+
+SELECT * FROM parted_copytest; --expect one row
+COPY parted_copytest FROM STDIN WHERE (tableoid in ('parted_copytest_a2'::regclass));
+2	2	b
+1	1	a
+\.
+SELECT * FROM parted_copytest; --expect two row
+truncate parted_copytest;
+
+
 copy parted_copytest from :'filename';
 
 -- Ensure COPY FREEZE errors for partitioned tables.
-- 
2.34.1