BUG #15352: postgresql FDW error "ERROR: ORDER BY position 0 is not in select list"
The following bug has been logged on the website:
Bug reference: 15352
Logged by: Maksym Boguk
Email address: maxim.boguk@gmail.com
PostgreSQL version: 9.6.8
Operating system: Linux
Description:
Hi,
I found strange error with postgresql_fdw used with use_remote_estimate
'true'.
Trimmed test case:
\с postgres
create extension postgres_fdw;
CREATE SERVER test FOREIGN DATA WRAPPER postgres_fdw OPTIONS (
dbname 'postgres',
use_remote_estimate 'true'
);
CREATE USER MAPPING FOR public SERVER test OPTIONS (
"user" 'postgres'
);
IMPORT FOREIGN SCHEMA pg_catalog
LIMIT TO (pg_stat_user_tables)
FROM SERVER test
INTO public;
select *
from pg_catalog.pg_stat_user_tables
join public.pg_stat_user_tables USING (relid)
WHERE
greatest(pg_catalog.pg_stat_user_tables.idx_scan,
public.pg_stat_user_tables.idx_scan)=0;
ERROR: ORDER BY position 0 is not in select list
CONTEXT: Remote SQL command: EXPLAIN SELECT relid, schemaname, relname,
seq_scan, seq_tup_read, idx_scan, idx_tup_fetch, n_tup_ins, n_tup_upd,
n_tup_del, n_tup_hot_upd, n_live_tup, n_dead_tup, n_mod_since_analyze,
last_vacuum, last_autovacuum, last_analyze, last_autoanalyze, vacuum_count,
autovacuum_count, analyze_count, autoanalyze_count FROM
pg_catalog.pg_stat_user_tables ORDER BY 0 ASC NULLS LAST
"PG" == PG Bug reporting form <noreply@postgresql.org> writes:
PG> ERROR: ORDER BY position 0 is not in select list
PG> CONTEXT: Remote SQL command: EXPLAIN SELECT relid, schemaname, relname,
PG> seq_scan, seq_tup_read, idx_scan, idx_tup_fetch, n_tup_ins, n_tup_upd,
PG> n_tup_del, n_tup_hot_upd, n_live_tup, n_dead_tup, n_mod_since_analyze,
PG> last_vacuum, last_autovacuum, last_analyze, last_autoanalyze, vacuum_count,
PG> autovacuum_count, analyze_count, autoanalyze_count FROM
PG> pg_catalog.pg_stat_user_tables ORDER BY 0 ASC NULLS LAST
So what's happening here is that there's an equivalence class containing
members "greatest(pg_catalog.pg_stat_user_tables.idx_scan,
public.pg_stat_user_tables.idx_scan)" and "0" (as an integer constant),
and somehow a pathkey for this eclass is becoming attached to the query
going to the remote for statistics purposes.
It seems obviously wrong that a constant pathkey with no actual
reference to the foreign table should be being pushed down, so so far I
suspect that get_useful_pathkeys_for_relation isn't being selective
enough about what is "useful". In this context I find it suspicious that
find_em_expr_for_rel will return an expr with no vars as being "for"
every rel, since it's just looking for a subset.
--
Andrew (irc:RhodiumToad)
"Andrew" == Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
Andrew> So what's happening here is that there's an equivalence class
Andrew> containing members
Andrew> "greatest(pg_catalog.pg_stat_user_tables.idx_scan,
Andrew> public.pg_stat_user_tables.idx_scan)" and "0" (as an integer
Andrew> constant), and somehow a pathkey for this eclass is becoming
Andrew> attached to the query going to the remote for statistics
Andrew> purposes.
Andrew> It seems obviously wrong that a constant pathkey with no actual
Andrew> reference to the foreign table should be being pushed down, so
Andrew> so far I suspect that get_useful_pathkeys_for_relation isn't
Andrew> being selective enough about what is "useful". In this context
Andrew> I find it suspicious that find_em_expr_for_rel will return an
Andrew> expr with no vars as being "for" every rel, since it's just
Andrew> looking for a subset.
So this looks to me like an oversight in aa09cd242 (CCing rhaas and
Ashutosh accordingly), which changed find_em_expr_for_rel from using
bms_equal to bms_is_subset without considering the degenerate case of
members with no relids at all. I propose to simply add a !bms_is_empty
condition there; anyone have any better idea?
--
Andrew (irc:RhodiumToad)
"Andrew" == Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
Andrew> It seems obviously wrong that a constant pathkey with no actual
Andrew> reference to the foreign table should be being pushed down, so
Andrew> so far I suspect that get_useful_pathkeys_for_relation isn't
Andrew> being selective enough about what is "useful". In this context
Andrew> I find it suspicious that find_em_expr_for_rel will return an
Andrew> expr with no vars as being "for" every rel, since it's just
Andrew> looking for a subset.
Andrew> So this looks to me like an oversight in aa09cd242 (CCing rhaas
Andrew> and Ashutosh accordingly), which changed find_em_expr_for_rel
Andrew> from using bms_equal to bms_is_subset without considering the
Andrew> degenerate case of members with no relids at all. I propose to
Andrew> simply add a !bms_is_empty condition there; anyone have any
Andrew> better idea?
Pushed this.
--
Andrew (irc:RhodiumToad)
On Tue, Aug 28, 2018 at 7:53 PM, Andrew Gierth
<andrew@tao11.riddles.org.uk> wrote:
"Andrew" == Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
Andrew> It seems obviously wrong that a constant pathkey with no actual
Andrew> reference to the foreign table should be being pushed down, so
Andrew> so far I suspect that get_useful_pathkeys_for_relation isn't
Andrew> being selective enough about what is "useful". In this context
Andrew> I find it suspicious that find_em_expr_for_rel will return an
Andrew> expr with no vars as being "for" every rel, since it's just
Andrew> looking for a subset.
Sorry for replying late. I am not able to understand why it's wrong to
push a constant or for the matter any shippable expression which
doesn't refer to the foreign table/s (for that matter any tables)
under consideration down to the foreign server. The context in the
original mail doesn't help. I haven't checked the original thread on
bugs mailing list. I agree that ordering by such an expression is
useless, but if we are getting that done from a foreign server, what's
the harm? But by not doing it we might be loosing some optimization
since postgres_fdw pushes all or none of pathkeys.
--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
"Ashutosh" == Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> writes:
Andrew> It seems obviously wrong that a constant pathkey with no actual
Andrew> reference to the foreign table should be being pushed down, so
Andrew> so far I suspect that get_useful_pathkeys_for_relation isn't
Andrew> being selective enough about what is "useful". In this context
Andrew> I find it suspicious that find_em_expr_for_rel will return an
Andrew> expr with no vars as being "for" every rel, since it's just
Andrew> looking for a subset.
Ashutosh> Sorry for replying late. I am not able to understand why it's
Ashutosh> wrong to push a constant or for the matter any shippable
Ashutosh> expression which doesn't refer to the foreign table/s (for
Ashutosh> that matter any tables) under consideration down to the
Ashutosh> foreign server.
Well, it's certainly pointless.
But the failure in this case is specifically about pushing down an
_integer_ constant, because the deparse code for pushing down an ORDER
BY does not understand that integer literals in ORDER BY clauses are a
special case.
Ashutosh> The context in the original mail doesn't help. I haven't
Ashutosh> checked the original thread on bugs mailing list. I agree
Ashutosh> that ordering by such an expression is useless, but if we are
Ashutosh> getting that done from a foreign server, what's the harm? But
Ashutosh> by not doing it we might be loosing some optimization since
Ashutosh> postgres_fdw pushes all or none of pathkeys.
I'm pretty sure that constant (hence redundant) clauses have been
removed from pathkeys before postgres_fdw will see them. The problem
only occurs because postgres_fdw tries inventing _new_ pathkeys for
possible orderings from eclasses (in order to try for mergejoin
opportunities) in addition to using the requested pathkeys, and it's
clearly pointless to do that with constants.
--
Andrew (irc:RhodiumToad)
On Wed, Aug 29, 2018 at 4:32 PM, Andrew Gierth
<andrew@tao11.riddles.org.uk> wrote:
"Ashutosh" == Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> writes:
Andrew> It seems obviously wrong that a constant pathkey with no actual
Andrew> reference to the foreign table should be being pushed down, so
Andrew> so far I suspect that get_useful_pathkeys_for_relation isn't
Andrew> being selective enough about what is "useful". In this context
Andrew> I find it suspicious that find_em_expr_for_rel will return an
Andrew> expr with no vars as being "for" every rel, since it's just
Andrew> looking for a subset.Ashutosh> Sorry for replying late. I am not able to understand why it's
Ashutosh> wrong to push a constant or for the matter any shippable
Ashutosh> expression which doesn't refer to the foreign table/s (for
Ashutosh> that matter any tables) under consideration down to the
Ashutosh> foreign server.Well, it's certainly pointless.
But the failure in this case is specifically about pushing down an
_integer_ constant, because the deparse code for pushing down an ORDER
BY does not understand that integer literals in ORDER BY clauses are a
special case.
Deparser needs to be fixed then, irrespective of whether or not we fix
the costant pathkey problem.
Ashutosh> The context in the original mail doesn't help. I haven't
Ashutosh> checked the original thread on bugs mailing list. I agree
Ashutosh> that ordering by such an expression is useless, but if we are
Ashutosh> getting that done from a foreign server, what's the harm? But
Ashutosh> by not doing it we might be loosing some optimization since
Ashutosh> postgres_fdw pushes all or none of pathkeys.I'm pretty sure that constant (hence redundant) clauses have been
removed from pathkeys before postgres_fdw will see them. The problem
only occurs because postgres_fdw tries inventing _new_ pathkeys for
possible orderings from eclasses (in order to try for mergejoin
opportunities) in addition to using the requested pathkeys, and it's
clearly pointless to do that with constants.
Yes, I forgot about that. But even in that case, we should consider
the case when the constant pathkey is just one in the bunch and we are
trying to push the whole bunch.
--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
[removing the OP from CC list]
"Ashutosh" == Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> writes:
Well, it's certainly pointless.
But the failure in this case is specifically about pushing down an
_integer_ constant, because the deparse code for pushing down an
ORDER BY does not understand that integer literals in ORDER BY
clauses are a special case.
Ashutosh> Deparser needs to be fixed then, irrespective of whether or
Ashutosh> not we fix the costant pathkey problem.
Since we have no business sending sort expressions to the remote that do
not include remote vars, this seems superfluous. Any such expression is
either mutable (and hence not pushable anyway) or known locally to be
constant (in which case we never legitimately see it in a pathkey).
(Maybe Asserting it or throwing an error might be appropriate.)
I'm pretty sure that constant (hence redundant) clauses have been
removed from pathkeys before postgres_fdw will see them. The problem
only occurs because postgres_fdw tries inventing _new_ pathkeys for
possible orderings from eclasses (in order to try for mergejoin
opportunities) in addition to using the requested pathkeys, and it's
clearly pointless to do that with constants.
Ashutosh> Yes, I forgot about that. But even in that case, we should
Ashutosh> consider the case when the constant pathkey is just one in
Ashutosh> the bunch and we are trying to push the whole bunch.
How do you think that could happen, given that redundant pathkeys are
already removed?
--
Andrew (irc:RhodiumToad)
On Thu, Aug 30, 2018 at 2:14 PM, Andrew Gierth
<andrew@tao11.riddles.org.uk> wrote:
[removing the OP from CC list]
"Ashutosh" == Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> writes:
Well, it's certainly pointless.
But the failure in this case is specifically about pushing down an
_integer_ constant, because the deparse code for pushing down an
ORDER BY does not understand that integer literals in ORDER BY
clauses are a special case.
Looking at deparseSortGroupClause() this issue looks to be fixed in
HEAD. Either the version where bug was found doesn't have this fix or
somehow the fix isn't working.
Ashutosh> Deparser needs to be fixed then, irrespective of whether or
Ashutosh> not we fix the costant pathkey problem.Since we have no business sending sort expressions to the remote that do
not include remote vars, this seems superfluous. Any such expression is
either mutable (and hence not pushable anyway) or known locally to be
constant (in which case we never legitimately see it in a pathkey).
(Maybe Asserting it or throwing an error might be appropriate.)I'm pretty sure that constant (hence redundant) clauses have been
removed from pathkeys before postgres_fdw will see them. The problem
only occurs because postgres_fdw tries inventing _new_ pathkeys for
possible orderings from eclasses (in order to try for mergejoin
opportunities) in addition to using the requested pathkeys, and it's
clearly pointless to do that with constants.Ashutosh> Yes, I forgot about that. But even in that case, we should
Ashutosh> consider the case when the constant pathkey is just one in
Ashutosh> the bunch and we are trying to push the whole bunch.How do you think that could happen, given that redundant pathkeys are
already removed?
I don't have exact answer. But deparseSortGroupClause() has code to
deparse constants in GROUP BY indicates that we do encounter such
pathkeys somewhere. I am thinking about ORDER BY being pushed down for
GROUP BY.
--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
"Ashutosh" == Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> writes:
Ashutosh> Looking at deparseSortGroupClause() this issue looks to be
Ashutosh> fixed in HEAD. Either the version where bug was found doesn't
Ashutosh> have this fix or somehow the fix isn't working.
You're misreading the code. The OP's example clearly demonstrates the
bug when tested with HEAD prior to bf2d0462c or with bf2d0462c reverted.
As the code stands, it _cannot_ use deparseSortGroupClause for
generating EXPLAINs for pathkeys that are chosen by postgres_fdw,
because deparseSortGroupClause wants a tlist, not a plain list of
expressions.
(Perhaps you haven't realized that without the additional check, when
remote estimates are enabled we end up sending EXPLAIN commands to the
remote for paths that cannot possibly be of any use in the query?
Round-trips to the remote are not free.)
How do you think that could happen, given that redundant pathkeys
are already removed?
Ashutosh> I don't have exact answer.
Then you should find out.
Ashutosh> But deparseSortGroupClause() has code to deparse constants in
Ashutosh> GROUP BY indicates that we do encounter such pathkeys
Ashutosh> somewhere.
GROUP BY isn't based on pathkeys (for one thing, grouping columns might
not be sortable).
Ashutosh> I am thinking about ORDER BY being pushed down for GROUP BY.
Perhaps you should take a look at how the pathkeys for that case are
generated.
--
Andrew (irc:RhodiumToad)