Re: Function scan FDW pushdown
Álvaro Herrera писал(а) 2025-08-05 22:55:
Hello,
On 2024-Nov-05, g.kashkin@postgrespro.ru wrote:
This is a long-overdue follow-up to the original patch.
Note that this patch contains only the changes required for
function scan pushdown, examples and code related to asymmetric
join are dropped.I've marked this as returned with feedback, as several months have
passed without a further version; the current one has a number of
gotchas, including some problematic coding detected by a compiler
warning, as well as unfinished design:The issue with setting newrte->functions to NIL still persists.
[...]
I am aware that the rte->functions will now be copied even on
instances that don't utilize a FDW, but I don't see a way to solve it.
Any suggestions are welcome.Feel free to reopen this CF entry[1] once you're able to figure this
out.
Hi.
I've updated patch with latest fixes, including more checks for corner
cases.
Now function pushdown is forbidden for functions, returning sets of
complex types
or which arguments contain parameters. Together with Gleb Kashkin we've
also moved
information about functions to foreign scan private data. The tricky
part here was
to get correct rti mappings after setrefs, but it seems we've managed to
do it.
Also I've removed some changes, which are required only in presence of
asymmetric
join.
Of course, I'm targeting PostgreSQL 20.
--
Best regards,
Alexander Pyhalov,
Postgres Professional
Attachments:
v2-0001-Push-join-with-function-scan-to-remote-server.patchtext/x-diff; name=v2-0001-Push-join-with-function-scan-to-remote-server.patchDownload+1348-105
Import Notes
Reply to msg id not found: 202508051955.sao7ucalcce7@alvherre.pgsqlReference msg id not found: 202508051955.sao7ucalcce7@alvherre.pgsql
Alexander Pyhalov писал(а) 2026-03-18 15:08:
Álvaro Herrera писал(а) 2025-08-05 22:55:
Hello,
On 2024-Nov-05, g.kashkin@postgrespro.ru wrote:
This is a long-overdue follow-up to the original patch.
Note that this patch contains only the changes required for
function scan pushdown, examples and code related to asymmetric
join are dropped.I've marked this as returned with feedback, as several months have
passed without a further version; the current one has a number of
gotchas, including some problematic coding detected by a compiler
warning, as well as unfinished design:The issue with setting newrte->functions to NIL still persists.
[...]
I am aware that the rte->functions will now be copied even on
instances that don't utilize a FDW, but I don't see a way to solve
it.
Any suggestions are welcome.Feel free to reopen this CF entry[1] once you're able to figure this
out.Hi.
I've updated patch with latest fixes, including more checks for corner
cases.
Now function pushdown is forbidden for functions, returning sets of
complex types
or which arguments contain parameters. Together with Gleb Kashkin we've
also moved
information about functions to foreign scan private data. The tricky
part here was
to get correct rti mappings after setrefs, but it seems we've managed
to do it.
Also I've removed some changes, which are required only in presence of
asymmetric
join.Of course, I'm targeting PostgreSQL 20.
Hi. Updated patch to apply cleanly after recent changes.
--
Best regards,
Alexander Pyhalov,
Postgres Professional
Attachments:
v3-0001-Push-join-with-function-scan-to-remote-server.patchtext/x-diff; name=v3-0001-Push-join-with-function-scan-to-remote-server.patchDownload+1348-105
Hi,
I tested the v3 patch on current master using a postgres_fdw loopback setup.
The patch mostly applied cleanly for me, with only a small manual
conflict resolution needed in postgres_fdw.c (missing
optimizer/clauses.h include). PostgreSQL built and started
successfully after that.
For testing, I used queries involving generate_series() together with
a foreign table, for example:
SELECT *
FROM ft
JOIN generate_series(1,3) g
ON ft.id = g;
I also tried:
1. implicit join syntax
2. LATERAL usage
3. disabling hashjoin/mergejoin to force different join paths
In all cases, the plans still showed a local Function Scan and local
join execution. The Remote SQL remained:
SELECT id FROM public.localtab
So I wasn't able to observe function scan pushdown with these testcases.
Maybe I'm missing a query shape or planner condition required to
trigger the new functionality. It would help to know which cases are
currently expected to be pushed down.
Regards,
Solaimurugan V
solaimurugan vellaipandiyan писал(а) 2026-05-08 08:17:
Hi,
I tested the v3 patch on current master using a postgres_fdw loopback
setup.The patch mostly applied cleanly for me, with only a small manual
conflict resolution needed in postgres_fdw.c (missing
optimizer/clauses.h include). PostgreSQL built and started
successfully after that.For testing, I used queries involving generate_series() together with
a foreign table, for example:SELECT *
FROM ft
JOIN generate_series(1,3) g
ON ft.id = g;I also tried:
1. implicit join syntax
2. LATERAL usage
3. disabling hashjoin/mergejoin to force different join pathsIn all cases, the plans still showed a local Function Scan and local
join execution. The Remote SQL remained:
SELECT id FROM public.localtab
So I wasn't able to observe function scan pushdown with these
testcases.
Maybe I'm missing a query shape or planner condition required to
trigger the new functionality. It would help to know which cases are
currently expected to be pushed down.Regards,
Solaimurugan V
Hi. Of course, this depends on foreign scan cost. Foreign join should be
cheaper than local one.
The cost of foreign join consists from
1) cost of evaluating inner and outer part of the join,
2) total costs of retreiving rows (nrows * cost of qual evaluation),
3) cost of preparing to evaluate quals (remote and local),
4) fdw_startup_cost
5) (fdw_tuple_cost + cpu_tuple_cost) * retrieved_rows.
Look at estimate_path_cost_size().
The cost of retreiving foreign rows consists from
1) cost of estimating baserestrictinfos
2) cost of remote sequential scan
3) cost of evaluating foreign relation target list
4) fdw_startup_cost
5) (fdw_tuple_cost + cpu_tuple_cost) * retrieved_rows.
If we compare function scan pushdown for foreign table and function with
foreign join pushdown of two foreign tables, we don't have two
penalties from fdw_startup_cost here. So, for function scan pushdown to
be chosen, it should filter out significant part of data, as data
transfer cost likely dominates in foreign scan costs estimation. You can
see this on the following toy example.
CREATE EXTENSION postgres_fdw;
SELECT current_database() AS current_database,
current_setting('port') AS current_port \gset
CREATE SERVER loopback FOREIGN DATA WRAPPER postgres_fdw
OPTIONS (dbname :'current_database', port :'current_port');
CREATE USER MAPPING FOR current_user SERVER loopback;
CREATE table t1 (i int, j text);
create foreign table ft1 (i int, j text) server loopback options
(table_name 't1');
-- No adequate data about relation, we assume 1000+ rows in table, so
foreign join is chosen.
# explain select * from ft1 , generate_series(1,10) g where i=g;
QUERY PLAN
------------------------------------------------------
Foreign Scan (cost=100.00..172.16 rows=68 width=40)
Relations: (ft1) INNER JOIN (generate_series() g)
(2 rows)
-- We have actual data, there's no use to push down foreign join
postgres=# analyze ft1;
ANALYZE
postgres=# explain select * from ft1 , generate_series(1,10) g where
i=g;
QUERY PLAN
-----------------------------------------------------------------------------
Hash Join (cost=100.02..100.16 rows=1 width=40)
Hash Cond: (g.g = ft1.i)
-> Function Scan on generate_series g (cost=0.00..0.10 rows=10
width=4)
-> Hash (cost=100.00..100.00 rows=1 width=36)
-> Foreign Scan on ft1 (cost=100.00..100.00 rows=1 width=36)
-- If we had two foreign scans below join here, its startup cost would
be more than 2*100.
-- But evaluating function locally saves us from such great penalty.
-- Let's add some actual data.
postgres=# insert into ft1 select i, 'test'||i from
generate_series(1,1000) i;
INSERT 0 1000
postgres=# analyze ft1;
ANALYZE
-- Now we have actual data, transfering 1000 rows (plain foreign scan
costs ~ 410) and joining locally is more expensive than doing
-- remote join and transfering 10 rows.
postgres=# explain select * from ft1 , generate_series(1,10) g where
i=g;
QUERY PLAN
------------------------------------------------------
Foreign Scan (cost=100.00..143.20 rows=10 width=15)
Relations: (ft1) INNER JOIN (generate_series() g)
(2 rows)
--
Best regards,
Alexander Pyhalov,
Postgres Professional
Hi!
On Fri, May 8, 2026 at 9:03 AM Alexander Pyhalov
<a.pyhalov@postgrespro.ru> wrote:
solaimurugan vellaipandiyan писал(а) 2026-05-08 08:17:
Hi,
I tested the v3 patch on current master using a postgres_fdw loopback
setup.The patch mostly applied cleanly for me, with only a small manual
conflict resolution needed in postgres_fdw.c (missing
optimizer/clauses.h include). PostgreSQL built and started
successfully after that.For testing, I used queries involving generate_series() together with
a foreign table, for example:SELECT *
FROM ft
JOIN generate_series(1,3) g
ON ft.id = g;I also tried:
1. implicit join syntax
2. LATERAL usage
3. disabling hashjoin/mergejoin to force different join pathsIn all cases, the plans still showed a local Function Scan and local
join execution. The Remote SQL remained:
SELECT id FROM public.localtab
So I wasn't able to observe function scan pushdown with these
testcases.
Maybe I'm missing a query shape or planner condition required to
trigger the new functionality. It would help to know which cases are
currently expected to be pushed down.Regards,
Solaimurugan VHi. Of course, this depends on foreign scan cost. Foreign join should be
cheaper than local one.
The cost of foreign join consists from
1) cost of evaluating inner and outer part of the join,
2) total costs of retreiving rows (nrows * cost of qual evaluation),
3) cost of preparing to evaluate quals (remote and local),
4) fdw_startup_cost
5) (fdw_tuple_cost + cpu_tuple_cost) * retrieved_rows.
Look at estimate_path_cost_size().The cost of retreiving foreign rows consists from
1) cost of estimating baserestrictinfos
2) cost of remote sequential scan
3) cost of evaluating foreign relation target list
4) fdw_startup_cost
5) (fdw_tuple_cost + cpu_tuple_cost) * retrieved_rows.If we compare function scan pushdown for foreign table and function with
foreign join pushdown of two foreign tables, we don't have two
penalties from fdw_startup_cost here. So, for function scan pushdown to
be chosen, it should filter out significant part of data, as data
transfer cost likely dominates in foreign scan costs estimation. You can
see this on the following toy example.CREATE EXTENSION postgres_fdw;
SELECT current_database() AS current_database,
current_setting('port') AS current_port \gset
CREATE SERVER loopback FOREIGN DATA WRAPPER postgres_fdw
OPTIONS (dbname :'current_database', port :'current_port');
CREATE USER MAPPING FOR current_user SERVER loopback;
CREATE table t1 (i int, j text);
create foreign table ft1 (i int, j text) server loopback options
(table_name 't1');-- No adequate data about relation, we assume 1000+ rows in table, so
foreign join is chosen.# explain select * from ft1 , generate_series(1,10) g where i=g;
QUERY PLAN
------------------------------------------------------
Foreign Scan (cost=100.00..172.16 rows=68 width=40)
Relations: (ft1) INNER JOIN (generate_series() g)
(2 rows)-- We have actual data, there's no use to push down foreign join
postgres=# analyze ft1;
ANALYZE
postgres=# explain select * from ft1 , generate_series(1,10) g where
i=g;
QUERY PLAN
-----------------------------------------------------------------------------
Hash Join (cost=100.02..100.16 rows=1 width=40)
Hash Cond: (g.g = ft1.i)
-> Function Scan on generate_series g (cost=0.00..0.10 rows=10
width=4)
-> Hash (cost=100.00..100.00 rows=1 width=36)
-> Foreign Scan on ft1 (cost=100.00..100.00 rows=1 width=36)-- If we had two foreign scans below join here, its startup cost would
be more than 2*100.
-- But evaluating function locally saves us from such great penalty.
-- Let's add some actual data.postgres=# insert into ft1 select i, 'test'||i from
generate_series(1,1000) i;
INSERT 0 1000
postgres=# analyze ft1;
ANALYZE-- Now we have actual data, transfering 1000 rows (plain foreign scan
costs ~ 410) and joining locally is more expensive than doing
-- remote join and transfering 10 rows.postgres=# explain select * from ft1 , generate_series(1,10) g where
i=g;
QUERY PLAN
------------------------------------------------------
Foreign Scan (cost=100.00..143.20 rows=10 width=15)
Relations: (ft1) INNER JOIN (generate_series() g)
(2 rows)
Thank you for your work on improving this matter. I've updated the
patch attempting to address major concerns raised in this thread.
1) Limit the support of the pushdown with immutable functions. This
must eliminate the risk of different evaluation results on different
servers (as long as immutable is set correctly).
2) Store pointer to stub fpinfo for the function in the absorbing
foreign table. Thus, if there are multiple foreign tables, which
could absorb the function, both ways can be considered. This is
illustrated in the regression tests: the decision on which foreign
table would absorb the function can be changed depending on clause
selectivity.
------
Regards,
Alexander Korotkov
Supabase
Attachments:
v4-0001-postgres_fdw-push-down-FUNCTION-RTE-into-foreign-.patchapplication/octet-stream; name=v4-0001-postgres_fdw-push-down-FUNCTION-RTE-into-foreign-.patchDownload+462-11
Hi,
Thanks for the detailed explanation and the example queries. That
helped me better understand the costing behavior behind the pushdown
decisions.
I tested the new v4 patch on current master using a postgres_fdw
loopback setup with local table t1 and foreign table ft1.
Here’s what I observed:
- Before ANALYZE, the planner chose a local Hash Join with separate
Foreign Scan and Function Scan nodes.
- After running ANALYZE on the empty table, it still preferred local
execution, which makes sense based on the costing explanation.
- I then inserted 1000 rows into t1, ran ANALYZE again, and repeated the test.
Even after that, I still got a local Hash Join plan like:
Hash Join
- Foreign Scan on ft1
- Function Scan on generate_series
I wasn’t able to observe the pushed-down Foreign Scan plan shown in
the example from the thread.
The patch itself applied and built successfully on my side, so this
may just be due to planner cost differences or environment-specific
behavior on current master.
Regards,
Solaimurugan V
On Mon, May 11, 2026 at 9:01 AM solaimurugan vellaipandiyan <
drsolaimurugan.v@gmail.com> wrote:
Thanks for the detailed explanation and the example queries. That
helped me better understand the costing behavior behind the pushdown
decisions.I tested the new v4 patch on current master using a postgres_fdw
loopback setup with local table t1 and foreign table ft1.Here’s what I observed:
- Before ANALYZE, the planner chose a local Hash Join with separate
Foreign Scan and Function Scan nodes.
- After running ANALYZE on the empty table, it still preferred local
execution, which makes sense based on the costing explanation.
- I then inserted 1000 rows into t1, ran ANALYZE again, and repeated the
test.
Even after that, I still got a local Hash Join plan like:
Hash Join
- Foreign Scan on ft1
- Function Scan on generate_series
I wasn’t able to observe the pushed-down Foreign Scan plan shown in
the example from the thread.The patch itself applied and built successfully on my side, so this
may just be due to planner cost differences or environment-specific
behavior on current master.
This also comes from the cost model. Check this example.
# CREATE TABLE t1 (id int);
# INSERT INTO t1 SELECT g FROM generate_series(1, 1000) g;
# CREATE FOREIGN TABLE ft1 (id int) SERVER loopback OPTIONS (table_name
't1');
# ANALYZE t1;
# ANALYZE ft1;
By default the local join is selected.
# EXPLAIN (VERBOSE, COSTS ON)
SELECT * FROM ft1, generate_series(1, 100) AS g(id)
WHERE ft1.id = g.id;
QUERY PLAN
-----------------------------------------------------------------------------------------------
Hash Join (cost=102.25..332.00 rows=100 width=8)
Output: ft1.id, g.id
Hash Cond: (ft1.id = g.id)
-> Foreign Scan on public.ft1 (cost=100.00..325.00 rows=1000 width=4)
Output: ft1.id
Remote SQL: SELECT id FROM public.t1
-> Hash (cost=1.00..1.00 rows=100 width=4)
Output: g.id
-> Function Scan on pg_catalog.generate_series g
(cost=0.00..1.00 rows=100 width=4)
Output: g.id
Function Call: generate_series(1, 100)
(11 rows)
However, we can force remote join using enable_* options. You can see it
has higher cost. This is because estimate_path_cost_size() expects join
operator to be applied to the whole cross-product.
# SET enable_hashjoin = off;
# SET enable_mergejoin = off;
# SET enable_nestloop = off;
# EXPLAIN (VERBOSE, COSTS ON)
SELECT * FROM ft1, generate_series(1, 100) AS g(id)
WHERE ft1.id = g.id;
QUERY PLAN
-----------------------------------------------------------------------------------------------
Hash Join (cost=102.25..332.00 rows=100 width=8)
Output: ft1.id, g.id
Hash Cond: (ft1.id = g.id)
-> Foreign Scan on public.ft1 (cost=100.00..325.00 rows=1000 width=4)
Output: ft1.id
Remote SQL: SELECT id FROM public.t1
-> Hash (cost=1.00..1.00 rows=100 width=4)
Output: g.id
-> Function Scan on pg_catalog.generate_series g
(cost=0.00..1.00 rows=100 width=4)
Output: g.id
Function Call: generate_series(1, 100)
(11 rows)
# RESET enable_hashjoin;
# RESET enable_mergejoin;
# RESET enable_nestloop;
Also, this can be fixed using remote estimate. You also can check that v3
regression tests by Pyhalov use this approach as well.
# ALTER FOREIGN TABLE ft1 OPTIONS (ADD use_remote_estimate 'true');
# EXPLAIN (VERBOSE, COSTS ON)
SELECT * FROM ft1, generate_series(1, 100) AS g(id)
WHERE ft1.id = g.id;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
Foreign Scan (cost=102.25..143.00 rows=100 width=8)
Output: ft1.id, g.id
Relations: (public.ft1) INNER JOIN (Function g)
Remote SQL: SELECT r1.id, f2.c1 FROM (public.t1 r1 INNER JOIN
generate_series(1, 100) f2(c1) ON (((r1.id = f2.c1))))
(4 rows)
Thus, I don't see it to be a problem of this specific patch. I think this
is general inaccuracy of postgres_fdw cost model.
------
Regards,
Alexander Korotkov
Supabase
Alexander Korotkov писал(а) 2026-05-11 01:22:
Hi!
On Fri, May 8, 2026 at 9:03 AM Alexander Pyhalov
<a.pyhalov@postgrespro.ru> wrote:solaimurugan vellaipandiyan писал(а) 2026-05-08 08:17:
Hi,
I tested the v3 patch on current master using a postgres_fdw loopback
setup.The patch mostly applied cleanly for me, with only a small manual
conflict resolution needed in postgres_fdw.c (missing
optimizer/clauses.h include). PostgreSQL built and started
successfully after that.For testing, I used queries involving generate_series() together with
a foreign table, for example:SELECT *
FROM ft
JOIN generate_series(1,3) g
ON ft.id = g;I also tried:
1. implicit join syntax
2. LATERAL usage
3. disabling hashjoin/mergejoin to force different join pathsIn all cases, the plans still showed a local Function Scan and local
join execution. The Remote SQL remained:
SELECT id FROM public.localtab
So I wasn't able to observe function scan pushdown with these
testcases.
Maybe I'm missing a query shape or planner condition required to
trigger the new functionality. It would help to know which cases are
currently expected to be pushed down.Regards,
Solaimurugan VHi. Of course, this depends on foreign scan cost. Foreign join should
be
cheaper than local one.
The cost of foreign join consists from
1) cost of evaluating inner and outer part of the join,
2) total costs of retreiving rows (nrows * cost of qual evaluation),
3) cost of preparing to evaluate quals (remote and local),
4) fdw_startup_cost
5) (fdw_tuple_cost + cpu_tuple_cost) * retrieved_rows.
Look at estimate_path_cost_size().The cost of retreiving foreign rows consists from
1) cost of estimating baserestrictinfos
2) cost of remote sequential scan
3) cost of evaluating foreign relation target list
4) fdw_startup_cost
5) (fdw_tuple_cost + cpu_tuple_cost) * retrieved_rows.If we compare function scan pushdown for foreign table and function
with
foreign join pushdown of two foreign tables, we don't have two
penalties from fdw_startup_cost here. So, for function scan pushdown
to
be chosen, it should filter out significant part of data, as data
transfer cost likely dominates in foreign scan costs estimation. You
can
see this on the following toy example.CREATE EXTENSION postgres_fdw;
SELECT current_database() AS current_database,
current_setting('port') AS current_port \gset
CREATE SERVER loopback FOREIGN DATA WRAPPER postgres_fdw
OPTIONS (dbname :'current_database', port :'current_port');
CREATE USER MAPPING FOR current_user SERVER loopback;
CREATE table t1 (i int, j text);
create foreign table ft1 (i int, j text) server loopback options
(table_name 't1');-- No adequate data about relation, we assume 1000+ rows in table, so
foreign join is chosen.# explain select * from ft1 , generate_series(1,10) g where i=g;
QUERY PLAN
------------------------------------------------------
Foreign Scan (cost=100.00..172.16 rows=68 width=40)
Relations: (ft1) INNER JOIN (generate_series() g)
(2 rows)-- We have actual data, there's no use to push down foreign join
postgres=# analyze ft1;
ANALYZE
postgres=# explain select * from ft1 , generate_series(1,10) g where
i=g;
QUERY PLAN
-----------------------------------------------------------------------------
Hash Join (cost=100.02..100.16 rows=1 width=40)
Hash Cond: (g.g = ft1.i)
-> Function Scan on generate_series g (cost=0.00..0.10 rows=10
width=4)
-> Hash (cost=100.00..100.00 rows=1 width=36)
-> Foreign Scan on ft1 (cost=100.00..100.00 rows=1
width=36)-- If we had two foreign scans below join here, its startup cost would
be more than 2*100.
-- But evaluating function locally saves us from such great penalty.
-- Let's add some actual data.postgres=# insert into ft1 select i, 'test'||i from
generate_series(1,1000) i;
INSERT 0 1000
postgres=# analyze ft1;
ANALYZE-- Now we have actual data, transfering 1000 rows (plain foreign scan
costs ~ 410) and joining locally is more expensive than doing
-- remote join and transfering 10 rows.postgres=# explain select * from ft1 , generate_series(1,10) g where
i=g;
QUERY PLAN
------------------------------------------------------
Foreign Scan (cost=100.00..143.20 rows=10 width=15)
Relations: (ft1) INNER JOIN (generate_series() g)
(2 rows)Thank you for your work on improving this matter. I've updated the
patch attempting to address major concerns raised in this thread.1) Limit the support of the pushdown with immutable functions. This
must eliminate the risk of different evaluation results on different
servers (as long as immutable is set correctly).
2) Store pointer to stub fpinfo for the function in the absorbing
foreign table. Thus, if there are multiple foreign tables, which
could absorb the function, both ways can be considered. This is
illustrated in the regression tests: the decision on which foreign
table would absorb the function can be changed depending on clause
selectivity.------
Regards,
Alexander Korotkov
Supabase
Hi.
I've tested updated patch a bit.
1) deparseColumnRef() doesn't account for whole row vars.
In queries like
UPDATE remote_tbl r SET b=5 FROM UNNEST(array[box '((2,3),(-2,-3))']) AS
t (bx) WHERE r.a = area(t.bx)
it fails with assert that varattno should be > 0. When we lock
non-relation RTE, we select whole row var, and we have to deparse it for
function RTE.
You've removed check for function return type. This seems to be
dangerous. Old example
CREATE OR REPLACE FUNCTION f_ret_record() RETURNS record AS $$ SELECT
(1,2)::record $$ language SQL IMMUTABLE;
ALTER EXTENSION postgres_fdw ADD function f_ret_record();
EXPLAIN (VERBOSE, COSTS OFF)
SELECT s FROM remote_tbl rt, f_ret_record() AS s(a int, b int)
WHERE s.a = rt.a;
fails with
ERROR: a column definition list is required for functions returning
"record"
2) postgresBeginForeignScan() can step on function RTE, and doesn't know
what to do with it:
SELECT * FROM unnest(array[2,3,4]) n, remote_tbl r WHERE r.a = n;
ERROR: cache lookup failed for foreign table 0
So, we need to look for the first RTE_RELATION, as in older patch
version.
3) A lot of complexity in the old patch version was in making it
possible to find out RTE_FUNCTION attribute types after planing, as it's
necessary to correctly handle joins. In this version
get_tupdesc_for_join_scan_tuples() doesn't handle function RTEs. This
means, when we try to find out type for attribute types for joins, we'll
get errors. This can be seen in queries like
UPDATE remote_tbl r SET b=CASE WHEN random()>=0 THEN 5 ELSE 0 END FROM
UNNEST(array[box '((2,3),(-2,-3))']) AS t (bx) WHERE r.a = area(t.bx)
RETURNING a,b;
Now it fails on earlier stages (with "column f2.c0 does not exist"), but
if we fix it, we'll get something like
"ERROR: input of anonymous composite types is not implemented"
Overall, function_rte_pushdown_ok() now allows more strange
constructions. Could it skip Vars from outside of joinrel->relids? Can
we safely ship function with parameters in arguments? I'm not sure.
4) Why do we restrict list_length(rte->functions) to 1? The main reason
for supporting several rte->functions was to allow pushdown of UNNEST()
with several arrays, which is used by HammerDB tproc-c test.
5) We can support pushing down joins of foreign tables and another RTE
types, for example, VALUES. But with new specific way of handling
RTE_FUNCTIONS in core, this requires both changes to
set_foreign_rel_properties() and postgres_fdw. Not sure, if this is a
big problem, but at least is worth mentioning .
--
Best regards,
Alexander Pyhalov,
Postgres Professional
Hi, Alexander!
The revised patch is attached.
On Tue, May 12, 2026 at 11:09 AM Alexander Pyhalov
<a.pyhalov@postgrespro.ru> wrote:
1) deparseColumnRef() doesn't account for whole row vars.
In queries likeUPDATE remote_tbl r SET b=5 FROM UNNEST(array[box '((2,3),(-2,-3))']) AS
t (bx) WHERE r.a = area(t.bx)it fails with assert that varattno should be > 0. When we lock
non-relation RTE, we select whole row var, and we have to deparse it for
function RTE.You've removed check for function return type. This seems to be
dangerous. Old exampleCREATE OR REPLACE FUNCTION f_ret_record() RETURNS record AS $$ SELECT
(1,2)::record $$ language SQL IMMUTABLE;
ALTER EXTENSION postgres_fdw ADD function f_ret_record();
EXPLAIN (VERBOSE, COSTS OFF)
SELECT s FROM remote_tbl rt, f_ret_record() AS s(a int, b int)
WHERE s.a = rt.a;fails with
ERROR: a column definition list is required for functions returning
"record"
function_rte_pushdown_ok() now calls get_expr_result_type() and
rejects anything that isn't TYPEFUNC_SCALAR (also RECORDOID/VOIDOID),
so f_ret_record() no longer reaches the remote side.
deparseColumnRef() now handles varattno == 0 for RTE_FUNCTION and
emits ROW(f<rti>.c1, ..., f<rti>.c<N>) from rte->eref->colnames.
2) postgresBeginForeignScan() can step on function RTE, and doesn't know
what to do with it:
SELECT * FROM unnest(array[2,3,4]) n, remote_tbl r WHERE r.a = n;
ERROR: cache lookup failed for foreign table 0So, we need to look for the first RTE_RELATION, as in older patch
version.
The scanrelid == 0 branch in postgresBeginForeignScan() now scans
fs_base_relids until it finds an RTE_RELATION. An explicit
elog(ERROR) guards the (theoretically impossible) case where no
foreign RTE is found.
3) A lot of complexity in the old patch version was in making it
possible to find out RTE_FUNCTION attribute types after planing, as it's
necessary to correctly handle joins. In this version
get_tupdesc_for_join_scan_tuples() doesn't handle function RTEs. This
means, when we try to find out type for attribute types for joins, we'll
get errors. This can be seen in queries likeUPDATE remote_tbl r SET b=CASE WHEN random()>=0 THEN 5 ELSE 0 END FROM
UNNEST(array[box '((2,3),(-2,-3))']) AS t (bx) WHERE r.a = area(t.bx)
RETURNING a,b;Now it fails on earlier stages (with "column f2.c0 does not exist"), but
if we fix it, we'll get something like
"ERROR: input of anonymous composite types is not implemented"Overall, function_rte_pushdown_ok() now allows more strange
constructions. Could it skip Vars from outside of joinrel->relids? Can
we safely ship function with parameters in arguments? I'm not sure.
Restored the per-function metadata you had in v2/v3.
FdwScanPrivateFunctions (list of (funcid, funcrettype, funccollation)
indexed by RTI-offset) and FdwScanPrivateMinRTIndex are now saved in
fdw_private by postgresGetForeignPlan().
get_tupdesc_for_join_scan_tuples() now has an RTE_FUNCTION branch that
rebuilds the tuple descriptor from this metadata, exactly as in your
patch.
4) Why do we restrict list_length(rte->functions) to 1? The main reason
for supporting several rte->functions was to allow pushdown of UNNEST()
with several arrays, which is used by HammerDB tproc-c test.
function_rte_pushdown_ok() now loops over rte->functions and applies
the same checks to every member. deparseFunctionRangeTblRef() emits
ROWS FROM (f1(), f2(), ...) AS f<rti>(c1, c2, ...) with the
column-name list covering the concatenation of every function's
columns, in the order they appear.
5) We can support pushing down joins of foreign tables and another RTE
types, for example, VALUES. But with new specific way of handling
RTE_FUNCTIONS in core, this requires both changes to
set_foreign_rel_properties() and postgres_fdw. Not sure, if this is a
big problem, but at least is worth mentioning .
I've kept the planner-side change minimal:
set_foreign_rel_properties() propagates fdwroutine onto a joinrel
pairing a foreign rel with an RTE_FUNCTION rel, so the standard
GetForeignJoinPaths callback gets invoked. No new FDW callback was
needed. This is also the reason the planner-side fpinfo of the
function side lives on the joinrel
(outer_func_fpinfo/inner_func_fpinfo fields added to
PgFdwRelationInfo) rather than on the function
RelOptInfo->fdw_private: the same RTE_FUNCTION rel can be paired
independently with foreign rels from several servers. Extending the
same scheme to RTE_VALUES / RTE_CTE is, I agree, worth doing late. It
would be a one branch addition to set_foreign_rel_properties() plus an
FDW-side shippability check analogous to function_rte_pushdown_ok().
A few extra changes to the patch:
- LATERAL function RTEs are explicitly rejected in
function_rte_pushdown_ok() (returns false when rel->lateral_relids is
non-empty).
- Outer joins (LEFT/RIGHT/FULL) and SEMI joins fall through to the
existing !fpinfo_o || !fpinfo_i rejection, since
inner_is_function/outer_is_function are only set for JOIN_INNER.
- Regression coverage in contrib/postgres_fdw/sql/postgres_fdw.sql now
reuses your examples.
------
Regards,
Alexander Korotkov
Supabase
Attachments:
v5-0001-postgres_fdw-push-down-FUNCTION-RTE-into-foreign-.patchapplication/octet-stream; name=v5-0001-postgres_fdw-push-down-FUNCTION-RTE-into-foreign-.patchDownload+1127-25
Alexander Korotkov писал(а) 2026-05-18 13:34:
Hi, Alexander!
The revised patch is attached.
On Tue, May 12, 2026 at 11:09 AM Alexander Pyhalov
<a.pyhalov@postgrespro.ru> wrote:1) deparseColumnRef() doesn't account for whole row vars.
In queries likeUPDATE remote_tbl r SET b=5 FROM UNNEST(array[box '((2,3),(-2,-3))'])
AS
t (bx) WHERE r.a = area(t.bx)it fails with assert that varattno should be > 0. When we lock
non-relation RTE, we select whole row var, and we have to deparse it
for
function RTE.You've removed check for function return type. This seems to be
dangerous. Old exampleCREATE OR REPLACE FUNCTION f_ret_record() RETURNS record AS $$ SELECT
(1,2)::record $$ language SQL IMMUTABLE;
ALTER EXTENSION postgres_fdw ADD function f_ret_record();
EXPLAIN (VERBOSE, COSTS OFF)
SELECT s FROM remote_tbl rt, f_ret_record() AS s(a int, b int)
WHERE s.a = rt.a;fails with
ERROR: a column definition list is required for functions returning
"record"function_rte_pushdown_ok() now calls get_expr_result_type() and
rejects anything that isn't TYPEFUNC_SCALAR (also RECORDOID/VOIDOID),
so f_ret_record() no longer reaches the remote side.
deparseColumnRef() now handles varattno == 0 for RTE_FUNCTION and
emits ROW(f<rti>.c1, ..., f<rti>.c<N>) from rte->eref->colnames.2) postgresBeginForeignScan() can step on function RTE, and doesn't
know
what to do with it:
SELECT * FROM unnest(array[2,3,4]) n, remote_tbl r WHERE r.a = n;
ERROR: cache lookup failed for foreign table 0So, we need to look for the first RTE_RELATION, as in older patch
version.The scanrelid == 0 branch in postgresBeginForeignScan() now scans
fs_base_relids until it finds an RTE_RELATION. An explicit
elog(ERROR) guards the (theoretically impossible) case where no
foreign RTE is found.3) A lot of complexity in the old patch version was in making it
possible to find out RTE_FUNCTION attribute types after planing, as
it's
necessary to correctly handle joins. In this version
get_tupdesc_for_join_scan_tuples() doesn't handle function RTEs. This
means, when we try to find out type for attribute types for joins,
we'll
get errors. This can be seen in queries likeUPDATE remote_tbl r SET b=CASE WHEN random()>=0 THEN 5 ELSE 0 END FROM
UNNEST(array[box '((2,3),(-2,-3))']) AS t (bx) WHERE r.a = area(t.bx)
RETURNING a,b;Now it fails on earlier stages (with "column f2.c0 does not exist"),
but
if we fix it, we'll get something like
"ERROR: input of anonymous composite types is not implemented"Overall, function_rte_pushdown_ok() now allows more strange
constructions. Could it skip Vars from outside of joinrel->relids? Can
we safely ship function with parameters in arguments? I'm not sure.Restored the per-function metadata you had in v2/v3.
FdwScanPrivateFunctions (list of (funcid, funcrettype, funccollation)
indexed by RTI-offset) and FdwScanPrivateMinRTIndex are now saved in
fdw_private by postgresGetForeignPlan().
get_tupdesc_for_join_scan_tuples() now has an RTE_FUNCTION branch that
rebuilds the tuple descriptor from this metadata, exactly as in your
patch.
Hi. I am a bit confused about this comment (and code):
/*
* DirectModify on a foreign join: pass NIL/0 for
the function
* metadata. We don't currently push function
RTEs through the
* direct-modify path, so there are no whole-row
Vars pointing at
* function-RTE tuples to reconstruct.
*/
tupdesc = get_tupdesc_for_join_scan_tuples(node,
NIL, 0);
We evidently go through this code path when executing example
UPDATE remote_tbl r SET b=5 FROM UNNEST(array[box '((2,3),(-2,-3))']) AS
t (bx) WHERE r.a = area(t.bx)
RETURNING a,b;
But don't need whole row var in returning list.... However, we still can
step on this issue.
UPDATE remote_tbl r SET b=5 FROM UNNEST(array[box '((2,3),(-2,-3))'],
array[int '1']) AS t (bx, i) WHERE r.a = area(t.bx)
RETURNING a,b,t;
ERROR: input of anonymous composite types is not implemented
CONTEXT: whole-row reference to foreign table "t"
--
Best regards,
Alexander Pyhalov,
Postgres Professional
Hi, Alexander.
On Mon, May 18, 2026 at 11:06 PM Alexander Pyhalov
<a.pyhalov@postgrespro.ru> wrote:
Hi. I am a bit confused about this comment (and code):
/*
* DirectModify on a foreign join: pass NIL/0 for
the function
* metadata. We don't currently push function
RTEs through the
* direct-modify path, so there are no whole-row
Vars pointing at
* function-RTE tuples to reconstruct.
*/
tupdesc = get_tupdesc_for_join_scan_tuples(node,
NIL, 0);We evidently go through this code path when executing example
UPDATE remote_tbl r SET b=5 FROM UNNEST(array[box '((2,3),(-2,-3))']) AS
t (bx) WHERE r.a = area(t.bx)
RETURNING a,b;But don't need whole row var in returning list.... However, we still can
step on this issue.
Yes, we go through this code path, and it works as long as whole-row
var is not needed.
UPDATE remote_tbl r SET b=5 FROM UNNEST(array[box '((2,3),(-2,-3))'],
array[int '1']) AS t (bx, i) WHERE r.a = area(t.bx)
RETURNING a,b,t;ERROR: input of anonymous composite types is not implemented
CONTEXT: whole-row reference to foreign table "t"
But if whole row var is actually used, then the assumption is broken.
So, we need to build a whole-row var anyway. I've fixed this in the
attached patch, and added your sample query as a regression test case.
------
Regards,
Alexander Korotkov
Supabase
Attachments:
v6-0001-postgres_fdw-push-down-FUNCTION-RTE-into-foreign-.patchapplication/octet-stream; name=v6-0001-postgres_fdw-push-down-FUNCTION-RTE-into-foreign-.patchDownload+1177-25
Alexander Korotkov писал(а) 2026-05-19 16:00:
Hi, Alexander.
On Mon, May 18, 2026 at 11:06 PM Alexander Pyhalov
<a.pyhalov@postgrespro.ru> wrote:Hi. I am a bit confused about this comment (and code):
/*
* DirectModify on a foreign join: pass NIL/0
for
the function
* metadata. We don't currently push function
RTEs through the
* direct-modify path, so there are no
whole-row
Vars pointing at
* function-RTE tuples to reconstruct.
*/
tupdesc =
get_tupdesc_for_join_scan_tuples(node,
NIL, 0);We evidently go through this code path when executing example
UPDATE remote_tbl r SET b=5 FROM UNNEST(array[box '((2,3),(-2,-3))'])
AS
t (bx) WHERE r.a = area(t.bx)
RETURNING a,b;But don't need whole row var in returning list.... However, we still
can
step on this issue.Yes, we go through this code path, and it works as long as whole-row
var is not needed.UPDATE remote_tbl r SET b=5 FROM UNNEST(array[box '((2,3),(-2,-3))'],
array[int '1']) AS t (bx, i) WHERE r.a = area(t.bx)
RETURNING a,b,t;ERROR: input of anonymous composite types is not implemented
CONTEXT: whole-row reference to foreign table "t"But if whole row var is actually used, then the assumption is broken.
So, we need to build a whole-row var anyway. I've fixed this in the
attached patch, and added your sample query as a regression test case.
Good evening.
Found one more issue in whole row var deparsing. It can appear on a
nullable outer side, and we should use the same logic as when deparsing
table column reference. Otherwise we get records from nulls instead of
nulls (for example, "(NULL, NULL)" instead of NULL).
Also I wonder if it is possible for get_tupdesc_for_join_scan_tuples()
to get NULL rtfuncdata when it looks at RTE_FUNCTION RTE here:
1759 else if (rte->rtekind == RTE_FUNCTION && rtfuncdata
!= NIL)
1760 {
1761 /*
1762 * A whole-row Var points at a FUNCTION RTE
absorbed into the
1763 * foreign join. Synthesize an anonymous
composite TupleDesc from
1764 * the per-function return-type metadata we
saved at plan time;
1765 * the deparser emits these as
ROW(f<rti>.c1, f<rti>.c2, ...).
1766 */
1767 List *funcdata;
1768 TupleDesc rte_tupdesc;
1769 int num_funcs;
1770 int attnum;
1771 ListCell *lc1,
?
--
Best regards,
Alexander Pyhalov,
Postgres Professional
Attachments:
v7-0001-postgres_fdw-push-down-FUNCTION-RTE-into-foreign-.patchtext/x-diff; name=v7-0001-postgres_fdw-push-down-FUNCTION-RTE-into-foreign-.patchDownload+1221-25
Good evening!
On Tue, May 19, 2026 at 6:25 PM Alexander Pyhalov
<a.pyhalov@postgrespro.ru> wrote:
Found one more issue in whole row var deparsing. It can appear on a
nullable outer side, and we should use the same logic as when deparsing
table column reference. Otherwise we get records from nulls instead of
nulls (for example, "(NULL, NULL)" instead of NULL).
Good catch, accepted.
Also I wonder if it is possible for get_tupdesc_for_join_scan_tuples()
to get NULL rtfuncdata when it looks at RTE_FUNCTION RTE here:1759 else if (rte->rtekind == RTE_FUNCTION && rtfuncdata
!= NIL)
1760 {
1761 /*
1762 * A whole-row Var points at a FUNCTION RTE
absorbed into the
1763 * foreign join. Synthesize an anonymous
composite TupleDesc from
1764 * the per-function return-type metadata we
saved at plan time;
1765 * the deparser emits these as
ROW(f<rti>.c1, f<rti>.c2, ...).
1766 */
1767 List *funcdata;
1768 TupleDesc rte_tupdesc;
1769 int num_funcs;
1770 int attnum;
1771 ListCell *lc1,?
I don't see how that's possible. I think it would be safer to just
assert rtfuncdata/funcdata are not NULL. Revised patch is attached.
------
Regards,
Alexander Korotkov
Supabase
Attachments:
v8-0001-postgres_fdw-push-down-FUNCTION-RTE-into-foreign-.patchapplication/octet-stream; name=v8-0001-postgres_fdw-push-down-FUNCTION-RTE-into-foreign-.patchDownload+1227-25
Alexander Korotkov писал(а) 2026-05-19 21:21:
Good evening!
On Tue, May 19, 2026 at 6:25 PM Alexander Pyhalov
<a.pyhalov@postgrespro.ru> wrote:Found one more issue in whole row var deparsing. It can appear on a
nullable outer side, and we should use the same logic as when
deparsing
table column reference. Otherwise we get records from nulls instead of
nulls (for example, "(NULL, NULL)" instead of NULL).Good catch, accepted.
Hi.
I've found another issue. The fact that in the new versions of the patch
RTE RelOptInfo misses fdw_private seems to be unfortunate. For example,
in the last version we haven't thought about classifying
baserestrictinfo. And if we do, we should pass fdw_private down to
foreign_expr_walker. Perhaps, we could attach it to RTE_FUNCTION rel
prior to calling classifyConditions(), but should we later set it back
to NULL? Another problem comes if we try to handle joins, which can
crearte subqueries (like INNER/OUTER UNIQUE). In this case we should
somehow cook fpinfo for get_relation_column_alias_ids(). Attaching patch
which tries to handle baserestrictinfos by passing fpinfo down to
foreign_expr_walker().
One more interesting example (included in the patch) is
EXPLAIN (VERBOSE, COSTS OFF)
WITH s AS MATERIALIZED (SELECT r1.* FROM remote_tbl r1
JOIN LATERAL
(SELECT r2.a FROM remote_tbl r2, f(r1.a) LIMIT 1) s
ON true)
SELECT * FROM s ORDER BY 1;
We get the following plan:
Sort
Output: s.a, s.b
Sort Key: s.a
CTE s
-> Nested Loop
Output: r1.a, r1.b
-> Foreign Scan on public.remote_tbl r1
Output: r1.a, r1.b
Remote SQL: SELECT a, b FROM public.base_tbl_fn
-> Foreign Scan
Output: NULL::integer
Relations: (public.remote_tbl r2) INNER JOIN (Function
f)
Remote SQL: SELECT NULL FROM (public.base_tbl_fn r1
INNER JOIN public.f($1::integer) f2(c1) ON (TRUE)) LIMIT 1::bigint
-> CTE Scan on s
Output: s.a, s.b
Here you can see that we use parameter in function argument. Don't know
if it's a real problem, but at least looks suspicious. In v3 patch used
contain_param_walker() in is_nonrel_relinfo_ok() (which mutated to
function_rte_pushdown_ok()) to avoid such plans.
One minor issue I've noticed is in function_rte_pushdown_ok():
+ if (rel->rtekind != RTE_FUNCTION)
+ return false;
+ rte = planner_rt_fetch(rel->relid, root);
+ if (rte->rtekind != RTE_FUNCTION)
+ return false;
Is the second rtekind check necessary?
--
Best regards,
Alexander Pyhalov,
Postgres Professional