Bug: Missing collation assignment for GRAPH_TABLE COLUMNS expressions

Started by SATYANARAYANA NARLAPURAM20 days ago6 messageshackers
Jump to latest
#1SATYANARAYANA NARLAPURAM
satyanarlapuram@gmail.com

Hi hackers,

GRAPH_TABLE COLUMNS expressions that involve collation-dependent functions
or operators fail with:

ERROR: could not determine which collation to use for upper() function
HINT: Use the COLLATE clause to set the collation explicitly.

Setup:

CREATE TABLE vtx (id int PRIMARY KEY, name text);
CREATE TABLE edg (id int PRIMARY KEY,
src int REFERENCES vtx(id),
dst int REFERENCES vtx(id));
INSERT INTO vtx VALUES (1,'Alice'),(2,'Bob'),(3,'Carol');
INSERT INTO edg VALUES (1,1,2),(2,2,3);

CREATE PROPERTY GRAPH g
VERTEX TABLES (vtx KEY (id))
EDGE TABLES (edg KEY (id)
SOURCE KEY (src) REFERENCES vtx (id)
DESTINATION KEY (dst) REFERENCES vtx (id));

postgres=# SELECT * FROM GRAPH_TABLE (g
MATCH (a IS vtx)-[e IS edg]->(b IS vtx) COLUMNS (upper(a.name) AS
src_upper));
ERROR: could not determine which collation to use for upper() function
HINT: Use the COLLATE clause to set the collation explicitly.

In transformRangeGraphTable(), the COLUMNS transformation loop calls
transformExpr()
on each column expression but omits the subsequent assign_expr_collations()
call. Both
WHERE clause transformation sites in parse_graphtable.c correctly include
it.

Attached a patch to fix this.

Thanks,
Satya

Attachments:

0001-fix-graph-table-columns-collation.patchapplication/octet-stream; name=0001-fix-graph-table-columns-collation.patchDownload+55-0
#2Ashutosh Bapat
ashutosh.bapat@enterprisedb.com
In reply to: SATYANARAYANA NARLAPURAM (#1)
Re: Bug: Missing collation assignment for GRAPH_TABLE COLUMNS expressions

Hi Satya,
Thanks for the report and patch.

On Fri, Apr 10, 2026 at 9:12 PM SATYANARAYANA NARLAPURAM
<satyanarlapuram@gmail.com> wrote:

Hi hackers,

GRAPH_TABLE COLUMNS expressions that involve collation-dependent functions or operators fail with:

ERROR: could not determine which collation to use for upper() function
HINT: Use the COLLATE clause to set the collation explicitly.

Setup:

CREATE TABLE vtx (id int PRIMARY KEY, name text);
CREATE TABLE edg (id int PRIMARY KEY,
src int REFERENCES vtx(id),
dst int REFERENCES vtx(id));
INSERT INTO vtx VALUES (1,'Alice'),(2,'Bob'),(3,'Carol');
INSERT INTO edg VALUES (1,1,2),(2,2,3);

CREATE PROPERTY GRAPH g
VERTEX TABLES (vtx KEY (id))
EDGE TABLES (edg KEY (id)
SOURCE KEY (src) REFERENCES vtx (id)
DESTINATION KEY (dst) REFERENCES vtx (id));

postgres=# SELECT * FROM GRAPH_TABLE (g
MATCH (a IS vtx)-[e IS edg]->(b IS vtx) COLUMNS (upper(a.name) AS src_upper));
ERROR: could not determine which collation to use for upper() function
HINT: Use the COLLATE clause to set the collation explicitly.

In transformRangeGraphTable(), the COLUMNS transformation loop calls transformExpr()
on each column expression but omits the subsequent assign_expr_collations() call. Both
WHERE clause transformation sites in parse_graphtable.c correctly include it.

Attached a patch to fix this.

I think the fix is in the right direction. It's better to call
assign_expr_collation only once on all the columns at the end of loop
of rgt->columns, just like assign_expr_collation is called on all the
conditions in WHERE clause once.

Good to see tests also included in the patch. Do we need all three
queries? Also those queries should be placed near the section "-- test
collation specified in the expression" and add a query for explicit
collation in COLUMNs expression.

--
Best Wishes,
Ashutosh Bapat

#3SATYANARAYANA NARLAPURAM
satyanarlapuram@gmail.com
In reply to: Ashutosh Bapat (#2)
Re: Bug: Missing collation assignment for GRAPH_TABLE COLUMNS expressions

Hi Ashutosh,

On Fri, Apr 10, 2026 at 9:25 AM Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
wrote:

Hi Satya,
Thanks for the report and patch.

On Fri, Apr 10, 2026 at 9:12 PM SATYANARAYANA NARLAPURAM
<satyanarlapuram@gmail.com> wrote:

Hi hackers,

GRAPH_TABLE COLUMNS expressions that involve collation-dependent

functions or operators fail with:

ERROR: could not determine which collation to use for upper() function
HINT: Use the COLLATE clause to set the collation explicitly.

Setup:

CREATE TABLE vtx (id int PRIMARY KEY, name text);
CREATE TABLE edg (id int PRIMARY KEY,
src int REFERENCES vtx(id),
dst int REFERENCES vtx(id));
INSERT INTO vtx VALUES (1,'Alice'),(2,'Bob'),(3,'Carol');
INSERT INTO edg VALUES (1,1,2),(2,2,3);

CREATE PROPERTY GRAPH g
VERTEX TABLES (vtx KEY (id))
EDGE TABLES (edg KEY (id)
SOURCE KEY (src) REFERENCES vtx (id)
DESTINATION KEY (dst) REFERENCES vtx (id));

postgres=# SELECT * FROM GRAPH_TABLE (g
MATCH (a IS vtx)-[e IS edg]->(b IS vtx) COLUMNS (upper(a.name) AS

src_upper));

ERROR: could not determine which collation to use for upper() function
HINT: Use the COLLATE clause to set the collation explicitly.

In transformRangeGraphTable(), the COLUMNS transformation loop calls

transformExpr()

on each column expression but omits the subsequent

assign_expr_collations() call. Both

WHERE clause transformation sites in parse_graphtable.c correctly

include it.

Attached a patch to fix this.

I think the fix is in the right direction. It's better to call
assign_expr_collation only once on all the columns at the end of loop
of rgt->columns, just like assign_expr_collation is called on all the
conditions in WHERE clause once

Addressed this in v2 patch.

Good to see tests also included in the patch. Do we need all three

queries? Also those queries should be placed near the section "-- test
collation specified in the expression" and add a query for explicit
collation in COLUMNs expression.

Removed two tests and moved the test. Explicit collate test already exists.

Thanks,
Satya

Attachments:

v2-0001-fix-graph-table-columns-collation.patchapplication/octet-stream; name=v2-0001-fix-graph-table-columns-collation.patchDownload+25-0
#4Ashutosh Bapat
ashutosh.bapat@enterprisedb.com
In reply to: SATYANARAYANA NARLAPURAM (#3)
Re: Bug: Missing collation assignment for GRAPH_TABLE COLUMNS expressions

On Fri, Apr 10, 2026 at 11:36 PM SATYANARAYANA NARLAPURAM
<satyanarlapuram@gmail.com> wrote:

Hi Ashutosh,

On Fri, Apr 10, 2026 at 9:25 AM Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote:

Hi Satya,
Thanks for the report and patch.

On Fri, Apr 10, 2026 at 9:12 PM SATYANARAYANA NARLAPURAM
<satyanarlapuram@gmail.com> wrote:

Hi hackers,

GRAPH_TABLE COLUMNS expressions that involve collation-dependent functions or operators fail with:

ERROR: could not determine which collation to use for upper() function
HINT: Use the COLLATE clause to set the collation explicitly.

Setup:

CREATE TABLE vtx (id int PRIMARY KEY, name text);
CREATE TABLE edg (id int PRIMARY KEY,
src int REFERENCES vtx(id),
dst int REFERENCES vtx(id));
INSERT INTO vtx VALUES (1,'Alice'),(2,'Bob'),(3,'Carol');
INSERT INTO edg VALUES (1,1,2),(2,2,3);

CREATE PROPERTY GRAPH g
VERTEX TABLES (vtx KEY (id))
EDGE TABLES (edg KEY (id)
SOURCE KEY (src) REFERENCES vtx (id)
DESTINATION KEY (dst) REFERENCES vtx (id));

postgres=# SELECT * FROM GRAPH_TABLE (g
MATCH (a IS vtx)-[e IS edg]->(b IS vtx) COLUMNS (upper(a.name) AS src_upper));
ERROR: could not determine which collation to use for upper() function
HINT: Use the COLLATE clause to set the collation explicitly.

In transformRangeGraphTable(), the COLUMNS transformation loop calls transformExpr()
on each column expression but omits the subsequent assign_expr_collations() call. Both
WHERE clause transformation sites in parse_graphtable.c correctly include it.

Attached a patch to fix this.

I think the fix is in the right direction. It's better to call
assign_expr_collation only once on all the columns at the end of loop
of rgt->columns, just like assign_expr_collation is called on all the
conditions in WHERE clause once

Addressed this in v2 patch.

If we call assign_expr_collations() on a list, the List expression
also gets a collation, which isn't what we want here. We want to
assign collations to the individual COLUMNs expression independently.
assign_list_collations() is better suited for that. I must say that
your earlier patch had got it right in this regard since it was
calling assign_expr_collations independently on each COLUMNs
expression. However, considering that an all properties reference is
replaced by a list of GraphPropertyRefs in place, I think calling
assign_list_collations() once on all COLUMNs expressions is a
future-proof fix. This is also inline with how collations are assigned
to targetlist expressions in a Query.

Good to see tests also included in the patch. Do we need all three
queries? Also those queries should be placed near the section "-- test
collation specified in the expression" and add a query for explicit
collation in COLUMNs expression.

Removed two tests and moved the test. Explicit collate test already exists.

I merged this test into an existing test to avoid adding yet another
query in the file that has many many queries already. Yes, an explicit
collation test is not needed separately.

While at it I added comments to explain why we aren't performing
en-masse collation assignment on a GraphTable or GraphPathPattern.

Please let me know what you think of the attached patch.

--
Best Wishes,
Ashutosh Bapat

Attachments:

v20260415-0001-Collation-of-expressions-in-GRAPH_TABLE-CO.patchtext/x-patch; charset=US-ASCII; name=v20260415-0001-Collation-of-expressions-in-GRAPH_TABLE-CO.patchDownload+23-6
#5SATYANARAYANA NARLAPURAM
satyanarlapuram@gmail.com
In reply to: Ashutosh Bapat (#4)
Re: Bug: Missing collation assignment for GRAPH_TABLE COLUMNS expressions

Hi Ashutosh,

On Wed, Apr 15, 2026 at 8:07 AM Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
wrote:

On Fri, Apr 10, 2026 at 11:36 PM SATYANARAYANA NARLAPURAM
<satyanarlapuram@gmail.com> wrote:

Hi Ashutosh,

On Fri, Apr 10, 2026 at 9:25 AM Ashutosh Bapat <

ashutosh.bapat.oss@gmail.com> wrote:

Hi Satya,
Thanks for the report and patch.

On Fri, Apr 10, 2026 at 9:12 PM SATYANARAYANA NARLAPURAM
<satyanarlapuram@gmail.com> wrote:

Hi hackers,

GRAPH_TABLE COLUMNS expressions that involve collation-dependent

functions or operators fail with:

ERROR: could not determine which collation to use for upper()

function

HINT: Use the COLLATE clause to set the collation explicitly.

Setup:

CREATE TABLE vtx (id int PRIMARY KEY, name text);
CREATE TABLE edg (id int PRIMARY KEY,
src int REFERENCES vtx(id),
dst int REFERENCES vtx(id));
INSERT INTO vtx VALUES (1,'Alice'),(2,'Bob'),(3,'Carol');
INSERT INTO edg VALUES (1,1,2),(2,2,3);

CREATE PROPERTY GRAPH g
VERTEX TABLES (vtx KEY (id))
EDGE TABLES (edg KEY (id)
SOURCE KEY (src) REFERENCES vtx (id)
DESTINATION KEY (dst) REFERENCES vtx (id));

postgres=# SELECT * FROM GRAPH_TABLE (g
MATCH (a IS vtx)-[e IS edg]->(b IS vtx) COLUMNS (upper(a.name) AS

src_upper));

ERROR: could not determine which collation to use for upper()

function

HINT: Use the COLLATE clause to set the collation explicitly.

In transformRangeGraphTable(), the COLUMNS transformation loop calls

transformExpr()

on each column expression but omits the subsequent

assign_expr_collations() call. Both

WHERE clause transformation sites in parse_graphtable.c correctly

include it.

Attached a patch to fix this.

I think the fix is in the right direction. It's better to call
assign_expr_collation only once on all the columns at the end of loop
of rgt->columns, just like assign_expr_collation is called on all the
conditions in WHERE clause once

Addressed this in v2 patch.

If we call assign_expr_collations() on a list, the List expression
also gets a collation, which isn't what we want here. We want to
assign collations to the individual COLUMNs expression independently.
assign_list_collations() is better suited for that. I must say that
your earlier patch had got it right in this regard since it was
calling assign_expr_collations independently on each COLUMNs
expression. However, considering that an all properties reference is
replaced by a list of GraphPropertyRefs in place, I think calling
assign_list_collations() once on all COLUMNs expressions is a
future-proof fix. This is also inline with how collations are assigned
to targetlist expressions in a Query.

Nice catch!

Good to see tests also included in the patch. Do we need all three
queries? Also those queries should be placed near the section "-- test
collation specified in the expression" and add a query for explicit
collation in COLUMNs expression.

Removed two tests and moved the test. Explicit collate test already

exists.

I merged this test into an existing test to avoid adding yet another
query in the file that has many many queries already. Yes, an explicit
collation test is not needed separately.

Thanks!

While at it I added comments to explain why we aren't performing
en-masse collation assignment on a GraphTable or GraphPathPattern.

Please let me know what you think of the attached patch.

This LGTM. I reran the tests and all of them are passing.

Thanks,
Satya

#6Peter Eisentraut
peter_e@gmx.net
In reply to: Ashutosh Bapat (#4)
Re: Bug: Missing collation assignment for GRAPH_TABLE COLUMNS expressions

On 15.04.26 17:07, Ashutosh Bapat wrote:

I merged this test into an existing test to avoid adding yet another
query in the file that has many many queries already. Yes, an explicit
collation test is not needed separately.

While at it I added comments to explain why we aren't performing
en-masse collation assignment on a GraphTable or GraphPathPattern.

Please let me know what you think of the attached patch.

committed