pg_get_viewdef() produces non-round-trippable SQL for views with USING join on mismatched integer types

Started by Swirl Smog Dowryabout 2 months ago8 messagesbugs
Jump to latest
#1Swirl Smog Dowry
swirl-smog-dowry@duck.com

PostgreSQL version: 18.1 (also verified on 18.2)
OS: Linux

Description:
============
When a view uses a USING join on columns with different integer types
(integer vs
bigint) and the SELECT clause contains an explicit narrowing cast,
pg_get_viewdef()
produces SQL that PostgreSQL itself rejects. This makes pg_dump produce
dumps that
fail on restore for any such view.

The SELECT clause gets col::integer, while the GROUP BY gets (col::bigint)
— they
refer to the same underlying column but with different casts, so the GROUP
BY check
fails to recognise the SELECT expression as covered.

Minimal reproducer:
===================

CREATE TABLE t1 (year integer, val numeric);
CREATE TABLE t2 (year bigint, label text);
INSERT INTO t1 VALUES (2025, 100), (2025, 200), (2026, 300);
INSERT INTO t2 VALUES (2025, 'A'), (2026, 'B');

-- View creation succeeds and queries work fine:
CREATE VIEW v AS
SELECT year::integer AS year, t2.label, sum(val) AS total
FROM t1
LEFT JOIN t2 USING (year)
GROUP BY year, t2.label;

SELECT * FROM v; -- returns correct results

-- pg_get_viewdef output:
SELECT pg_get_viewdef('v'::regclass, true);

Output of pg_get_viewdef:
=========================

SELECT t1.year::integer AS year,
t2.label,
sum(t1.val) AS total
FROM t1
LEFT JOIN t2 USING (year)
GROUP BY (t1.year::bigint), t2.label;

Note: SELECT has t1.year::integer, GROUP BY has (t1.year::bigint).

Attempting to re-execute the pg_get_viewdef output fails:
=========================================================

CREATE VIEW v2 AS
SELECT t1.year::integer AS year,
t2.label,
sum(t1.val) AS total
FROM t1
LEFT JOIN t2 USING (year)
GROUP BY (t1.year::bigint), t2.label;

ERROR: column "t1.year" must appear in the GROUP BY clause or be used
in an aggregate function
LINE 2: SELECT t1.year::integer AS year,
^

Impact:
=======
pg_dump uses pg_get_viewdef() to serialise view definitions. Any database
containing such a view (USING join between integer and bigint columns, with
an explicit cast in SELECT) will produce a dump that fails during restore
with
the above error.

Without the explicit cast in SELECT (i.e. just SELECT year, ...)
pg_get_viewdef
emits t1.year::bigint in both SELECT and GROUP BY, and the round-trip works.
The bug is triggered specifically by the narrowing cast (bigint -> integer)
in
the SELECT list combined with a USING join.

Kind regards,

Swirl

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Swirl Smog Dowry (#1)
Re: pg_get_viewdef() produces non-round-trippable SQL for views with USING join on mismatched integer types

Swirl Smog Dowry <swirl-smog-dowry@duck.com> writes:

When a view uses a USING join on columns with different integer
types (integer vs bigint) and the SELECT clause contains an explicit
narrowing cast, pg_get_viewdef() produces SQL that PostgreSQL itself
rejects. This makes pg_dump produce dumps that fail on restore for
any such view.

Hmm, yeah. This used to work as-expected, too. "git bisect" finds
that it broke at

247dea89f7616fdf06b7272b74abafc29e8e5860 is the first bad commit
commit 247dea89f7616fdf06b7272b74abafc29e8e5860
Author: Richard Guo <rguo@postgresql.org>
Date: Tue Sep 10 12:35:34 2024 +0900

Introduce an RTE for the grouping step

Looking at the parse tree for the problem query, I see

{RANGETBLENTRY
:alias <>
:eref
{ALIAS
:aliasname *GROUP*
:colnames ("?column?" "label")
}
:rtekind 9
:groupexprs (
{FUNCEXPR
:funcid 481
:funcresulttype 20
:funcretset false
:funcvariadic false
:funcformat 2
:funccollid 0
:inputcollid 0
:args (
{VAR
:varno 1
:varattno 1
:vartype 23
:vartypmod -1
:varcollid 0
:varnullingrels (b)
:varlevelsup 0
:varreturningtype 0
:varnosyn 1
:varattnosyn 1
:location -1
}
)
:location -1
}
{VAR
:varno 2
:varattno 2
:vartype 25
:vartypmod -1
:varcollid 100
:varnullingrels (b 3)
:varlevelsup 0
:varreturningtype 0
:varnosyn 2
:varattnosyn 2
:location 32
}
)
:lateral false
:inFromCl false
:securityQuals <>
}

The first groupexpr is the same as the joinaliasvars entry for that
column in the JOIN RTE. This surprises me: I'd expect to see a
reference to the join output column there, ie Var 3/1, because I'm
pretty sure that's what parsing of "GROUP BY year" would have produced
initially. If it were like that, I think ruleutils would produce the
desired output. So I'd tentatively classify this as "join alias Vars
are being flattened too soon". Richard, any thoughts?

regards, tom lane

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#2)
Re: pg_get_viewdef() produces non-round-trippable SQL for views with USING join on mismatched integer types

I wrote:

The first groupexpr is the same as the joinaliasvars entry for that
column in the JOIN RTE. This surprises me: I'd expect to see a
reference to the join output column there, ie Var 3/1, because I'm
pretty sure that's what parsing of "GROUP BY year" would have produced
initially. If it were like that, I think ruleutils would produce the
desired output. So I'd tentatively classify this as "join alias Vars
are being flattened too soon". Richard, any thoughts?

The problem is obvious after looking at parseCheckAggregates: the
RTE_GROUP RTE is manufactured using the groupClauses list after we
have flattened that, which we are only doing for comparison purposes;
it shouldn't affect what goes into the parse tree. I experimented
with just changing the order of operations, and that seems to fix it.
The lack of any effect on check-world shows we need more test cases
here ...

regards, tom lane

Attachments:

wip-preserve-join-aliases-in-RTE_GROUP.patchtext/x-diff; charset=us-ascii; name=wip-preserve-join-aliases-in-RTE_GROUP.patchDownload+19-17
#4Richard Guo
guofenglinux@gmail.com
In reply to: Tom Lane (#3)
Re: pg_get_viewdef() produces non-round-trippable SQL for views with USING join on mismatched integer types

On Fri, Feb 27, 2026 at 3:27 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

The problem is obvious after looking at parseCheckAggregates: the
RTE_GROUP RTE is manufactured using the groupClauses list after we
have flattened that, which we are only doing for comparison purposes;
it shouldn't affect what goes into the parse tree. I experimented
with just changing the order of operations, and that seems to fix it.

Right. We should keep the unmodified GROUP BY expressions in the
parse tree, and then rely on the planner to flatten the join alias
vars within them.

+1 to the fix.

The lack of any effect on check-world shows we need more test cases
here ...

How about this new test case in the attached (parse_agg.c untouched)?

- Richard

Attachments:

wip-preserve-join-aliases-in-RTE_GROUP.patchapplication/octet-stream; name=wip-preserve-join-aliases-in-RTE_GROUP.patchDownload+36-17
#5Richard Guo
guofenglinux@gmail.com
In reply to: Richard Guo (#4)
Re: pg_get_viewdef() produces non-round-trippable SQL for views with USING join on mismatched integer types

On Fri, Feb 27, 2026 at 12:23 PM Richard Guo <guofenglinux@gmail.com> wrote:

On Fri, Feb 27, 2026 at 3:27 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

The problem is obvious after looking at parseCheckAggregates: the
RTE_GROUP RTE is manufactured using the groupClauses list after we
have flattened that, which we are only doing for comparison purposes;
it shouldn't affect what goes into the parse tree. I experimented
with just changing the order of operations, and that seems to fix it.

Right. We should keep the unmodified GROUP BY expressions in the
parse tree, and then rely on the planner to flatten the join alias
vars within them.

+1 to the fix.

I am on the fence about whether this fix is safe to back-patch to v18.
I cautiously think it is safe, as it does not change the parsetree's
external representation and does not require an initdb.

However, this fix will only apply to newly created views. Users who
have existing views affected by this bug will have to recreate them
after upgrading to get the corrected pg_get_viewdef output. (really
kicking myself for this.)

- Richard

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Richard Guo (#5)
Re: pg_get_viewdef() produces non-round-trippable SQL for views with USING join on mismatched integer types

Richard Guo <guofenglinux@gmail.com> writes:

I am on the fence about whether this fix is safe to back-patch to v18.

I don't think we have a lot of choice. The cases where it makes a
difference are pretty broken. Fortunately, I think these cases
are rare. JOIN USING combining two different-type columns has got
to be an edge-case usage, and I think it likely doesn't matter much
in other cases.

However, this fix will only apply to newly created views. Users who
have existing views affected by this bug will have to recreate them
after upgrading to get the corrected pg_get_viewdef output.

Yeah :-(. What's really annoying is that probably people will not
notice until they try to upgrade to v19, and by then recreating
the view correctly might be difficult. But I'm not seeing a way
to smooth their path.

regards, tom lane

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#6)
Re: pg_get_viewdef() produces non-round-trippable SQL for views with USING join on mismatched integer types

I wrote:

Richard Guo <guofenglinux@gmail.com> writes:

I am on the fence about whether this fix is safe to back-patch to v18.

I don't think we have a lot of choice. The cases where it makes a
difference are pretty broken. Fortunately, I think these cases
are rare. JOIN USING combining two different-type columns has got
to be an edge-case usage, and I think it likely doesn't matter much
in other cases.

I spent a bit of effort on determining which cases actually cause
wrong output, and AFAICT it's very narrow: you need "SELECT ...
t1 LEFT JOIN t2 USING (x) GROUP BY x" where t1.x and t2.x are
different data types and t1.x is the side requiring coercion.
With no coercion, or if the join side to be coerced is nullable, we
show the flattened alias Var but that doesn't actually break anything.

So I went ahead and pushed this, using your test case.

regards, tom lane

#8Richard Guo
guofenglinux@gmail.com
In reply to: Tom Lane (#7)
Re: pg_get_viewdef() produces non-round-trippable SQL for views with USING join on mismatched integer types

On Sat, Feb 28, 2026 at 3:08 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

So I went ahead and pushed this, using your test case.

Thank you for handling this, Tom.

- Richard