postgres_fdw aggregate pushdown for group by with expressions

Started by Michał Kłeczekabout 2 years ago10 messagesgeneral
Jump to latest
#1Michał Kłeczek
michal@kleczek.org

Hi,

I have the following foreign table:

CREATE FOREIGN TABLE t1 (
grouping_column text,
date_column date,
whatever_data int
);

The query is:

SELECT
sum(whatever_data)
FROM
t1
GROUP BY
grouping_colulmn, extract(YEAR FROM date_column), extract(MONTH FROM date_column);

From my (preliminary) testing postgres_fdw will not push down this aggregate query
- it will happily push down query with only “grouping_column” or “grouping_column, date_column" in GROUP BY

Is there a way to somehow push down the query with expressions in GROUP BY?

Thanks,
Michal

#2Michał Kłeczek
michal@kleczek.org
In reply to: Michał Kłeczek (#1)
Re: postgres_fdw aggregate pushdown for group by with expressions

On 3 Mar 2024, at 10:34, Michał Kłeczek <michal@kleczek.org> wrote:

Hi,

I have the following foreign table:

CREATE FOREIGN TABLE t1 (
grouping_column text,
date_column date,
whatever_data int
);

The query is:

SELECT
sum(whatever_data)
FROM
t1
GROUP BY
grouping_colulmn, extract(YEAR FROM date_column), extract(MONTH FROM date_column);

From my (preliminary) testing postgres_fdw will not push down this aggregate query
- it will happily push down query with only “grouping_column” or “grouping_column, date_column" in GROUP BY

Is there a way to somehow push down the query with expressions in GROUP BY?

I’ve performed some more tests and it seems expressions with “extract” function are not pushed down at all -
the WHERE criteria from the following query are not pushed down as well and filter is performed locally:

SELECT
*
FROM
t1
WHERE extract(YEAR FROM date_column) = 2023

I was under impression that “extract” meets all criteria for pushable expressions - looks like I am wrong?

What would be the expression to achieve the same but such that postgres_fdw would push it down?

Thanks,
Michal

#3Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Michał Kłeczek (#2)
Re: postgres_fdw aggregate pushdown for group by with expressions

On 3/3/24 05:16, Michał Kłeczek wrote:

On 3 Mar 2024, at 10:34, Michał Kłeczek <michal@kleczek.org> wrote:

Hi,

I have the following foreign table:

CREATE FOREIGN TABLE t1 (
grouping_column text,
date_column date,
whatever_data int
);

The query is:

SELECT
sum(whatever_data)
FROM
t1
GROUP BY
grouping_colulmn, extract(YEAR FROM date_column), extract(MONTH FROM date_column);

From my (preliminary) testing postgres_fdw will not push down this aggregate query
- it will happily push down query with only “grouping_column” or “grouping_column, date_column" in GROUP BY

Is there a way to somehow push down the query with expressions in GROUP BY?

I’ve performed some more tests and it seems expressions with “extract” function are not pushed down at all -
the WHERE criteria from the following query are not pushed down as well and filter is performed locally:

SELECT
*
FROM
t1
WHERE extract(YEAR FROM date_column) = 2023

I was under impression that “extract” meets all criteria for pushable expressions - looks like I am wrong?

What would be the expression to achieve the same but such that postgres_fdw would push it down?

What Postgres versions on both ends?

What does

select proname, prosrc, provolatile from pg_proc where proname = 'extract';

return?

Thanks,
Michal

--
Adrian Klaver
adrian.klaver@aklaver.com

#4Michał Kłeczek
michal@kleczek.org
In reply to: Adrian Klaver (#3)
Re: postgres_fdw aggregate pushdown for group by with expressions

Thanks for help.

I’ve performed some more tests and it seems expressions with “extract” function are not pushed down at all -
the WHERE criteria from the following query are not pushed down as well and filter is performed locally:
SELECT
*
FROM
t1
WHERE extract(YEAR FROM date_column) = 2023
I was under impression that “extract” meets all criteria for pushable expressions - looks like I am wrong?
What would be the expression to achieve the same but such that postgres_fdw would push it down?

What Postgres versions on both ends?

16.1 on both sides

What does

select proname, prosrc, provolatile from pg_proc where proname = 'extract';

return?

extract | extract_date | i

(and bunch of similar records for other types - for timestampz provolatile is “s”)


Michal

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Michał Kłeczek (#2)
Re: postgres_fdw aggregate pushdown for group by with expressions

=?utf-8?Q?Micha=C5=82_K=C5=82eczek?= <michal@kleczek.org> writes:

I’ve performed some more tests and it seems expressions with “extract” function are not pushed down at all -

Yeah :-(. I traced through this, and it seems it's a collation
problem. Internally, that call looks like
extract('year'::text, date_column)
The text constant is marked as having collation "default", which means
that extract() is marked as having input collation "default", and then
it falls foul of this rule:

* ... An expression is considered safe to send
* only if all operator/function input collations used in it are traceable to
* Var(s) of the foreign table. That implies that if the remote server gets
* a different answer than we do, the foreign table's columns are not marked
* with collations that match the remote table's columns, which we can
* consider to be user error.

Of course, extract() doesn't actually care about collation, but
postgres_fdw has no good way to know that. Nor does it trust the
remote server to have the same set of collations the local one does,
so it doesn't want to try to fix this by sending explicit COLLATE
clauses.

Somebody ought to work on improving that mess sometime. One thought
that comes to mind is to have a server option authorizing postgres_fdw
to believe that all local collations exist on the remote side.

regards, tom lane

#6Michał Kłeczek
michal@kleczek.org
In reply to: Tom Lane (#5)
Re: postgres_fdw aggregate pushdown for group by with expressions

Ok, so that means the workaround is to create an extension with year(date) and month(date) functions that internally would call extract.

Not ideal but workable I guess.


Michał

Show quoted text

On 3 Mar 2024, at 18:42, Tom Lane <tgl@sss.pgh.pa.us> wrote:

=?utf-8?Q?Micha=C5=82_K=C5=82eczek?= <michal@kleczek.org> writes:

I’ve performed some more tests and it seems expressions with “extract” function are not pushed down at all -

Yeah :-(. I traced through this, and it seems it's a collation
problem. Internally, that call looks like
extract('year'::text, date_column)
The text constant is marked as having collation "default", which means
that extract() is marked as having input collation "default", and then
it falls foul of this rule:

* ... An expression is considered safe to send
* only if all operator/function input collations used in it are traceable to
* Var(s) of the foreign table. That implies that if the remote server gets
* a different answer than we do, the foreign table's columns are not marked
* with collations that match the remote table's columns, which we can
* consider to be user error.

Of course, extract() doesn't actually care about collation, but
postgres_fdw has no good way to know that. Nor does it trust the
remote server to have the same set of collations the local one does,
so it doesn't want to try to fix this by sending explicit COLLATE
clauses.

Somebody ought to work on improving that mess sometime. One thought
that comes to mind is to have a server option authorizing postgres_fdw
to believe that all local collations exist on the remote side.

regards, tom lane

#7Michał Kłeczek
michal@kleczek.org
In reply to: Tom Lane (#5)
Re: postgres_fdw aggregate pushdown for group by with expressions

On 3 Mar 2024, at 18:42, Tom Lane <tgl@sss.pgh.pa.us> wrote:

=?utf-8?Q?Micha=C5=82_K=C5=82eczek?= <michal@kleczek.org> writes:

I’ve performed some more tests and it seems expressions with “extract” function are not pushed down at all -

Yeah :-(. I traced through this, and it seems it's a collation
problem. Internally, that call looks like
extract('year'::text, date_column)
The text constant is marked as having collation "default", which means
that extract() is marked as having input collation "default", and then
it falls foul of this rule:

[snip]

Thanks for explanation - I have a follow-up question.

Does that mean that *any* expression containing text constants is not going to be pushed down??
That would be a really serious issues I’d say.


Michal

#8Michał Kłeczek
michal@kleczek.org
In reply to: Michał Kłeczek (#7)
Re: postgres_fdw aggregate pushdown for group by with expressions

On 4 Mar 2024, at 07:46, Michał Kłeczek <michal@kleczek.org> wrote:

On 3 Mar 2024, at 18:42, Tom Lane <tgl@sss.pgh.pa.us> wrote:

=?utf-8?Q?Micha=C5=82_K=C5=82eczek?= <michal@kleczek.org> writes:

I’ve performed some more tests and it seems expressions with “extract” function are not pushed down at all -

Yeah :-(. I traced through this, and it seems it's a collation
problem. Internally, that call looks like
extract('year'::text, date_column)
The text constant is marked as having collation "default", which means
that extract() is marked as having input collation "default", and then
it falls foul of this rule:

[snip]

Thanks for explanation - I have a follow-up question.

Does that mean that *any* expression containing text constants is not going to be pushed down??
That would be a really serious issues I’d say.

I’m afraid it is affecting all expressions containing functions.
This is really troublesome as it blocks push down for some basic (and collation independent) functions like coalesce.


Michal

#9Michał Kłeczek
michal@kleczek.org
In reply to: Michał Kłeczek (#1)
Re: postgres_fdw aggregate pushdown for group by with expressions

Hi All,

On 3 Mar 2024, at 10:34, Michał Kłeczek <michal@kleczek.org> wrote:

Hi,

I have the following foreign table:

CREATE FOREIGN TABLE t1 (
grouping_column text,
date_column date,
whatever_data int
);

The query is:

SELECT
sum(whatever_data)
FROM
t1
GROUP BY
grouping_colulmn, extract(YEAR FROM date_column), extract(MONTH FROM date_column);

From my (preliminary) testing postgres_fdw will not push down this aggregate query
- it will happily push down query with only “grouping_column” or “grouping_column, date_column" in GROUP BY

I was able to make it work by creating an extension and installing it on both sides with functions:

year(date) IMMUTABLE
month(date) IMMUTABLE

But it looks there are more issues - the following queries work fine (ie. are pushed down to remote server):

SELECT * FROM t1 WHERE grouping_column LIKE ‘abcd%’;
and
SELECT sum(whatever_data) FROM t1 GROUP BY grouping_column, year(date_column), month(date_column)

But together - unfortunately not:

SELECT grouping_column, sum(whatever_data) FROM t1 WHERE grouping_column LIKE ‘abcd%' GROUP BY grouping_column, year(date_column)

In this case aggregate is performed locally.
Not sure if this is because of statistics (but I set fdw_tuple_cost to a very high value 99999 to try to force planner to push down aggregates)


Michal

#10Tom Lane
tgl@sss.pgh.pa.us
In reply to: Michał Kłeczek (#7)
Re: postgres_fdw aggregate pushdown for group by with expressions

=?utf-8?Q?Micha=C5=82_K=C5=82eczek?= <michal@kleczek.org> writes:

Does that mean that *any* expression containing text constants is not going to be pushed down??
That would be a really serious issues I’d say.

If it contains text constants and not text variables, possibly so.
Keep in mind though that this rule is applied after constant-folding,
so that what we're left with is normally going to contain variables
at every level. extract() is problematic because it combines a
text constant with a datetime (collation-less) variable.

regards, tom lane